
Subjects

- Genome-wide association studies

- Metabolomics

- Quantitative trait loci

- Risk factors

Abstract

Interpreting the association of genetic variants with complex traits can be improved by gaining a greater understanding of the molecular consequences of these variants. Although genome-wide association studies (GWAS) for complex diseases routinely profile over one million individuals1,2,3,4,5, studies of molecular traits have lagged behind. Here we performed a GWAS meta-analysis for 249 circulating metabolic traits in the Estonian Biobank and the UK Biobank in up to 619,372 individuals. We identified 88,127 common and low-frequency locus–trait associations from 8,398 loci that converged on shared genes and pathways. Using statistical fine mapping, systematic phenome-wide colocalization and cis-Mendelian randomization, we explored putative causal links between metabolic traits and disease outcomes. We predict that although plasma branched-chain amino acids (BCAAs) have been associated with type 2 diabetes in observational studies6,7, lowering BCAA levels by targeting the BCAA catabolism pathway is unlikely to reduce type 2 diabetes risk. Leveraging our large sample size and high-quality genotype imputation, we found that 19.4% of the confidently fine-mapped variants had minor allele frequencies between 0.1 and 1%, and these variants were twofold enriched for predicted missense and splice-altering variants. Our results highlight the value of integrating low-frequency variants into genetic association studies.

Main

Recent large-scale GWAS of metabolic traits have continued to uncover novel associations and biological insights8,9,10,11,12,13,14. However, for more than half of the metabolic traits that are captured by nuclear magnetic resonance (NMR) spectroscopy, the proportion of heritability explained by genome-wide significant variants remains below 50% (ref. 12), indicating that much larger sample sizes are needed to identify the remaining genetic effects. Furthermore, most existing association studies using the Nightingale Health NMR platform have been limited to common variants8,9,10,12 and exome sequencing13,15, leaving the full genome-wide spectrum of low-frequency genetic variation unexplored. Finally, larger sample sizes and increased statistical power also bring new challenges for interpreting genetic associations, particularly when genetic variants have pleiotropic effects on several correlated metabolic traits8,9,10. In particular, there is a growing concern that naive use of these associations in the Mendelian randomization16 framework can lead to spurious and misleading findings17,18.

Association testing and meta-analysis

We performed GWAS for 249 metabolic traits (Supplementary Table 1) in the Estonian Biobank (EstBB; n = 185,352) and 6 genetic ancestry groups from the UK Biobank (UKBB; n = 434,020) (Extended Data Fig. 1). The UKBB genetic ancestry groups were defined previously by the Pan-UKBB project19 and are listed in Table 1. Relying on the population-specific genotype imputation panel for the EstBB20 and the Genomics England21 and TopMed22 imputation panels for the UKBB allowed us to test 10–96 million variants across genetic ancestry groups (up to nine times more than previous studies using the same NMR platform8,12,13). On the basis of minor allele frequency (MAF), we stratified these variants into three bins: common variants (MAF > 1%), low-frequency variants (MAF between 0.1% and 1%) and rare variants (MAF < 0.1%). The number of significant locus–trait pairs ranged from 37 (UKBB_AMR) to 62,543 (UKBB_EUR), and the number of independent lead variants (r² < 0.8) ranged from 24 to 6,014, with most associations detected in the UKBB_EUR and EstBB subsets (Table 1). We observed high genetic correlation for matched metabolic traits between the EstBB (n = 185,352) and UKBB_EUR (n = 413,897) subsets (median genetic correlation (rg) = 0.91, mean rg = 0.89), indicating that genetic effects are largely shared between the two biobanks (Supplementary Table 2).
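The MAF stratification described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names `minor_allele_frequency` and `maf_bin` are hypothetical, and the handling of the exact boundary values (0.1% and 1% are both assigned to the low-frequency bin) is an assumption, because the text gives only the strict inequalities for the common and rare bins.

```python
def minor_allele_frequency(alt_allele_freq: float) -> float:
    """Return the frequency of the less common allele at a biallelic site."""
    if not 0.0 <= alt_allele_freq <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    return min(alt_allele_freq, 1.0 - alt_allele_freq)


def maf_bin(maf: float) -> str:
    """Assign a variant to one of the three MAF bins named in the text.

    Assumption: the boundary values 0.1% and 1% fall in the
    low-frequency bin, since only 'MAF > 1%' and 'MAF < 0.1%'
    are stated as strict inequalities.
    """
    if not 0.0 <= maf <= 0.5:
        raise ValueError("MAF must lie in [0, 0.5]")
    if maf > 0.01:       # common: MAF > 1%
        return "common"
    if maf >= 0.001:     # low-frequency: 0.1% <= MAF <= 1%
        return "low-frequency"
    return "rare"        # rare: MAF < 0.1%


if __name__ == "__main__":
    # Illustrative allele frequencies only, not data from the study.
    for af in (0.25, 0.995, 0.0004):
        maf = minor_allele_frequency(af)
        print(f"allele frequency {af}: bin = {maf_bin(maf)}")
```

Converting to the minor allele frequency first matters because imputation output typically reports the frequency of the alternative allele, which can exceed 50%.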

Table 1 Number of significant locus–metabolic trait pairs and unique lead variants (r² < 0.8) detected in each genetic ancestry group and the two meta-analyses

In the meta-analysis of EstBB and UKBB_EUR (meta_EUR; n = 599,249), we identified 86,886 locus–trait pairs, corresponding to 8,260 independent lead variants (r² < 0.8). This represented an approximately tenfold increase compared with Karjalainen et al.8 (n = 136,016; 8,578 locus–trait pairs) and a 63% increase compared with a parallel study by Zoodsma et al.13 on the overlapping set of UKBB samples (n = 450,016; 52,662 locus–trait pairs). The estimated heritability of individual metabolic traits ranged from 2.8% for acetoacetate to 19.5% for HDL_size (median 10.2%), and we observed a clear linear relationship between heritability and the number of loci associated with each metabolic trait (Supplementary Fig. 1 and Supplementary Table 3). On

Genetic analysis of circulating metabolic traits in 619,372 individuals | Nature