TY - JOUR T1 - Evolution and functional impact of rare coding variation from deep sequencing of human exomes. JF - Science Y1 - 2012 A1 - Tennessen, Jacob A A1 - Bigham, Abigail W A1 - O'Connor, Timothy D A1 - Fu, Wenqing A1 - Kenny, Eimear E A1 - Gravel, Simon A1 - McGee, Sean A1 - Do, Ron A1 - Liu, Xiaoming A1 - Jun, Goo A1 - Kang, Hyun Min A1 - Jordan, Daniel A1 - Leal, Suzanne M A1 - Gabriel, Stacey A1 - Rieder, Mark J A1 - Abecasis, Goncalo A1 - Altshuler, David A1 - Nickerson, Deborah A A1 - Boerwinkle, Eric A1 - Sunyaev, Shamil A1 - Bustamante, Carlos D A1 - Bamshad, Michael J A1 - Akey, Joshua M KW - African Americans KW - Disease KW - European Continental Ancestry Group KW - Evolution, Molecular KW - Exome KW - Female KW - Gene Frequency KW - Genetic Association Studies KW - Genetic Predisposition to Disease KW - Genetic Variation KW - Genome, Human KW - High-Throughput Nucleotide Sequencing KW - Humans KW - Male KW - Polymorphism, Single Nucleotide KW - Population Growth KW - Selection, Genetic AB -

As a first step toward understanding how rare variants contribute to risk for complex diseases, we sequenced 15,585 human protein-coding genes to an average median depth of 111× in 2440 individuals of European (n = 1351) and African (n = 1088) ancestry. We identified over 500,000 single-nucleotide variants (SNVs), the majority of which were rare (86% with a minor allele frequency less than 0.5%), previously unknown (82%), and population-specific (82%). On average, 2.3% of the 13,595 SNVs each person carried were predicted to affect protein function of ~313 genes per genome, and ~95.7% of SNVs predicted to be functionally important were rare. This excess of rare functional variants is due to the combined effects of explosive, recent accelerated population growth and weak purifying selection. Furthermore, we show that large sample sizes will be required to associate rare variants with complex traits.

VL - 337 IS - 6090 U1 - http://www.ncbi.nlm.nih.gov/pubmed/22604720?dopt=Abstract ER - TY - JOUR T1 - Whole-genome sequence-based analysis of high-density lipoprotein cholesterol. JF - Nat Genet Y1 - 2013 A1 - Morrison, Alanna C A1 - Voorman, Arend A1 - Johnson, Andrew D A1 - Liu, Xiaoming A1 - Yu, Jin A1 - Li, Alexander A1 - Muzny, Donna A1 - Yu, Fuli A1 - Rice, Kenneth A1 - Zhu, Chengsong A1 - Bis, Joshua A1 - Heiss, Gerardo A1 - O'Donnell, Christopher J A1 - Psaty, Bruce M A1 - Cupples, L Adrienne A1 - Gibbs, Richard A1 - Boerwinkle, Eric KW - Cholesterol, HDL KW - Computational Biology KW - Databases, Genetic KW - Genetic Variation KW - Genome, Human KW - Genome-Wide Association Study KW - Genomics KW - Heterozygote KW - Humans KW - Open Reading Frames AB -

We describe initial steps for interrogating whole-genome sequence data to characterize the genetic architecture of a complex trait, levels of high-density lipoprotein cholesterol (HDL-C). We report whole-genome sequencing and analysis of 962 individuals from the Cohorts for Heart and Aging Research in Genetic Epidemiology (CHARGE) studies. From this analysis, we estimate that common variation contributes more to heritability of HDL-C levels than rare variation, and screening for mendelian variants for dyslipidemia identified individuals with extreme HDL-C levels. Whole-genome sequencing analyses highlight the value of regulatory and non-protein-coding regions of the genome in addition to protein-coding regions.

VL - 45 IS - 8 U1 - http://www.ncbi.nlm.nih.gov/pubmed/23770607?dopt=Abstract ER - TY - JOUR T1 - Associations of NINJ2 sequence variants with incident ischemic stroke in the Cohorts for Heart and Aging in Genomic Epidemiology (CHARGE) consortium. JF - PLoS One Y1 - 2014 A1 - Bis, Joshua C A1 - DeStefano, Anita A1 - Liu, Xiaoming A1 - Brody, Jennifer A A1 - Choi, Seung Hoan A1 - Verhaaren, Benjamin F J A1 - Debette, Stephanie A1 - Ikram, M Arfan A1 - Shahar, Eyal A1 - Butler, Kenneth R A1 - Gottesman, Rebecca F A1 - Muzny, Donna A1 - Kovar, Christie L A1 - Psaty, Bruce M A1 - Hofman, Albert A1 - Lumley, Thomas A1 - Gupta, Mayetri A1 - Wolf, Philip A A1 - van Duijn, Cornelia A1 - Gibbs, Richard A A1 - Mosley, Thomas H A1 - Longstreth, W T A1 - Boerwinkle, Eric A1 - Seshadri, Sudha A1 - Fornage, Myriam KW - Cell Adhesion Molecules, Neuronal KW - European Continental Ancestry Group KW - Female KW - Genetic Association Studies KW - Genetic Heterogeneity KW - Humans KW - Introns KW - Ischemia KW - Male KW - Myocardial Infarction KW - Polymorphism, Single Nucleotide KW - Prospective Studies KW - Sequence Analysis, DNA AB -

BACKGROUND: Stroke, the leading neurologic cause of death and disability, has a substantial genetic component. We previously conducted a genome-wide association study (GWAS) in four prospective studies from the Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) consortium and demonstrated that sequence variants near the NINJ2 gene are associated with incident ischemic stroke. Here, we sought to fine-map functional variants in the region and evaluate the contribution of rare variants to ischemic stroke risk.

METHODS AND RESULTS: We sequenced 196 kb around NINJ2 on chromosome 12p13 among 3,986 European ancestry participants, including 475 ischemic stroke cases, from the Atherosclerosis Risk in Communities Study, Cardiovascular Health Study, and Framingham Heart Study. Meta-analyses of single-variant tests for 425 common variants (minor allele frequency [MAF] ≥ 1%) confirmed the original GWAS results and identified an independent intronic variant, rs34166160 (MAF = 0.012), most significantly associated with incident ischemic stroke (HR = 1.80, p = 0.0003). Aggregating 278 putatively functional variants with MAF ≤ 1% using count statistics, we observed a nominally statistically significant association, with the burden of rare NINJ2 variants contributing to decreased ischemic stroke incidence (HR = 0.81; p = 0.026).

CONCLUSION: Common and rare variants in the NINJ2 region were nominally associated with incident ischemic stroke among a subset of CHARGE participants. Allelic heterogeneity at this locus, caused by multiple rare, low frequency, and common variants with disparate effects on risk, may explain the difficulties in replicating the original GWAS results. Additional studies that take into account the complex allelic architecture at this locus are needed to confirm these findings.

VL - 9 IS - 6 U1 - http://www.ncbi.nlm.nih.gov/pubmed/24959832?dopt=Abstract ER - TY - JOUR T1 - Sequencing of 2 subclinical atherosclerosis candidate regions in 3669 individuals: Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Consortium Targeted Sequencing Study. JF - Circ Cardiovasc Genet Y1 - 2014 A1 - Bis, Joshua C A1 - White, Charles C A1 - Franceschini, Nora A1 - Brody, Jennifer A1 - Zhang, Xiaoling A1 - Muzny, Donna A1 - Santibanez, Jireh A1 - Gibbs, Richard A1 - Liu, Xiaoming A1 - Lin, Honghuang A1 - Boerwinkle, Eric A1 - Psaty, Bruce M A1 - North, Kari E A1 - Cupples, L Adrienne A1 - O'Donnell, Christopher J KW - Aged KW - Aged, 80 and over KW - Aging KW - Atherosclerosis KW - Class Ib Phosphatidylinositol 3-Kinase KW - Cohort Studies KW - European Continental Ancestry Group KW - Female KW - Genetic Variation KW - Genome-Wide Association Study KW - Genomics KW - Humans KW - Male KW - Middle Aged KW - Polymorphism, Single Nucleotide KW - Sequence Analysis, DNA KW - Sodium-Phosphate Cotransporter Proteins, Type I AB -

BACKGROUND: Atherosclerosis, the precursor to coronary heart disease and stroke, is characterized by an accumulation of fatty cells in the arterial intimal-medial layers. Common carotid intima media thickness (cIMT) and plaque are subclinical atherosclerosis measures that predict cardiovascular disease events. Previously, genome-wide association studies demonstrated evidence for association with cIMT (SLC17A4) and plaque (PIK3CG).

METHODS AND RESULTS: We sequenced 120 kb around SLC17A4 (6p22.2) and 251 kb around PIK3CG (7q22.3) among 3669 European ancestry participants from the Atherosclerosis Risk in Communities (ARIC) study, Cardiovascular Health Study (CHS), and Framingham Heart Study (FHS) in Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Consortium. Primary analyses focused on 438 common variants (minor allele frequency ≥1%), which were independently meta-analyzed. A 3' untranslated region CCDC71L variant (rs2286149), upstream from PIK3CG, was the most significant finding in cIMT (P=0.00033) and plaque (P=0.0004) analyses. A SLC17A4 intronic variant was also associated with cIMT (P=0.008). Both were in low linkage disequilibrium with the genome-wide association study single nucleotide polymorphisms. Gene-based tests including T1 count and sequence kernel association test for rare variants (minor allele frequency <1%) did not yield statistically significant associations. However, we observed nominal associations for rare variants in CCDC71L and SLC17A3 with cIMT and of the entire 7q22 region with plaque (P=0.05).

CONCLUSIONS: Common and rare variants in PIK3CG and SLC17A4 regions demonstrated modest association with subclinical atherosclerosis traits. Although not conclusive, these findings may help to understand the genetic architecture of regions previously implicated by genome-wide association studies and identify variants within these regions for further investigation in larger samples.

VL - 7 IS - 3 U1 - http://www.ncbi.nlm.nih.gov/pubmed/24951662?dopt=Abstract ER - TY - JOUR T1 - Strategies to design and analyze targeted sequencing data: cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Consortium Targeted Sequencing Study. JF - Circ Cardiovasc Genet Y1 - 2014 A1 - Lin, Honghuang A1 - Wang, Min A1 - Brody, Jennifer A A1 - Bis, Joshua C A1 - Dupuis, Josée A1 - Lumley, Thomas A1 - McKnight, Barbara A1 - Rice, Kenneth M A1 - Sitlani, Colleen M A1 - Reid, Jeffrey G A1 - Bressler, Jan A1 - Liu, Xiaoming A1 - Davis, Brian C A1 - Johnson, Andrew D A1 - O'Donnell, Christopher J A1 - Kovar, Christie L A1 - Dinh, Huyen A1 - Wu, Yuanqing A1 - Newsham, Irene A1 - Chen, Han A1 - Broka, Andi A1 - DeStefano, Anita L A1 - Gupta, Mayetri A1 - Lunetta, Kathryn L A1 - Liu, Ching-Ti A1 - White, Charles C A1 - Xing, Chuanhua A1 - Zhou, Yanhua A1 - Benjamin, Emelia J A1 - Schnabel, Renate B A1 - Heckbert, Susan R A1 - Psaty, Bruce M A1 - Muzny, Donna M A1 - Cupples, L Adrienne A1 - Morrison, Alanna C A1 - Boerwinkle, Eric KW - Adult KW - Aged KW - Aged, 80 and over KW - Aging KW - Cohort Studies KW - Female KW - Genetic Variation KW - Genome-Wide Association Study KW - Genomics KW - Heart Diseases KW - Humans KW - Male KW - Middle Aged KW - Polymorphism, Single Nucleotide KW - Research Design KW - Sequence Analysis, DNA AB -

BACKGROUND: Genome-wide association studies have identified thousands of genetic variants that influence a variety of diseases and health-related quantitative traits. However, the causal variants underlying the majority of genetic associations remain unknown. Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Consortium Targeted Sequencing Study aims to follow up genome-wide association study signals and identify novel associations of the allelic spectrum of identified variants with cardiovascular-related traits.

METHODS AND RESULTS: The study included 4231 participants from 3 CHARGE cohorts: the Atherosclerosis Risk in Communities Study, the Cardiovascular Health Study, and the Framingham Heart Study. We used a case-cohort design in which we selected both a random sample of participants and participants with extreme phenotypes for each of 14 traits. We sequenced and analyzed 77 genomic loci, which had previously been associated with ≥1 of 14 phenotypes. A total of 52 736 variants were characterized by sequencing and passed our stringent quality control criteria. For common variants (minor allele frequency ≥1%), we performed unweighted regression analyses to obtain P values for associations and weighted regression analyses to obtain effect estimates that accounted for the sampling design. For rare variants, we applied 2 approaches: collapsed aggregate statistics and joint analysis of variants using the sequence kernel association test.

CONCLUSIONS: We sequenced 77 genomic loci in participants from 3 cohorts. We established a set of filters to identify high-quality variants and implemented statistical and bioinformatics strategies to analyze the sequence data and identify potentially functional variants within genome-wide association study loci.

VL - 7 IS - 3 U1 - http://www.ncbi.nlm.nih.gov/pubmed/24951659?dopt=Abstract ER - TY - JOUR T1 - Population genomic analysis of 962 whole genome sequences of humans reveals natural selection in non-coding regions. JF - PLoS One Y1 - 2015 A1 - Yu, Fuli A1 - Lu, Jian A1 - Liu, Xiaoming A1 - Gazave, Elodie A1 - Chang, Diana A1 - Raj, Srilakshmi A1 - Hunter-Zinck, Haley A1 - Blekhman, Ran A1 - Arbiza, Leonardo A1 - Van Hout, Cris A1 - Morrison, Alanna A1 - Johnson, Andrew D A1 - Bis, Joshua A1 - Cupples, L Adrienne A1 - Psaty, Bruce M A1 - Muzny, Donna A1 - Yu, Jin A1 - Gibbs, Richard A A1 - Keinan, Alon A1 - Clark, Andrew G A1 - Boerwinkle, Eric KW - DNA, Intergenic KW - Genetic Loci KW - Humans KW - Metagenomics KW - Open Reading Frames KW - Polymorphism, Single Nucleotide AB -

Whole genome analysis in large samples from a single population is needed to provide adequate power to assess relative strengths of natural selection across different functional components of the genome. In this study, we analyzed next-generation sequencing data from 962 European Americans, and found that, as expected, approximately 60% of the top 1% of positive selection signals lie in intergenic regions, 33% in intronic regions, and slightly over 1% in coding regions. Several detailed functional annotation categories in intergenic regions showed statistically significant enrichment in positively selected loci when compared to the null distribution of the genomic span of ENCODE categories. There was a significant enrichment of purifying selection signals detected in enhancers, transcription factor binding sites, microRNAs and target sites, but not on lincRNA or piRNAs, suggesting different evolutionary constraints for these domains. Loci in "repressed or low activity regions" and loci near or overlapping the transcription start site were the most significantly over-represented annotations among the top 1% of signals for positive selection.

VL - 10 IS - 3 U1 - http://www.ncbi.nlm.nih.gov/pubmed/25807536?dopt=Abstract ER - TY - JOUR T1 - Analysis commons, a team approach to discovery in a big-data environment for genetic epidemiology. JF - Nat Genet Y1 - 2017 A1 - Brody, Jennifer A A1 - Morrison, Alanna C A1 - Bis, Joshua C A1 - O'Connell, Jeffrey R A1 - Brown, Michael R A1 - Huffman, Jennifer E A1 - Ames, Darren C A1 - Carroll, Andrew A1 - Conomos, Matthew P A1 - Gabriel, Stacey A1 - Gibbs, Richard A A1 - Gogarten, Stephanie M A1 - Gupta, Namrata A1 - Jaquish, Cashell E A1 - Johnson, Andrew D A1 - Lewis, Joshua P A1 - Liu, Xiaoming A1 - Manning, Alisa K A1 - Papanicolaou, George J A1 - Pitsillides, Achilleas N A1 - Rice, Kenneth M A1 - Salerno, William A1 - Sitlani, Colleen M A1 - Smith, Nicholas L A1 - Heckbert, Susan R A1 - Laurie, Cathy C A1 - Mitchell, Braxton D A1 - Vasan, Ramachandran S A1 - Rich, Stephen S A1 - Rotter, Jerome I A1 - Wilson, James G A1 - Boerwinkle, Eric A1 - Psaty, Bruce M A1 - Cupples, L Adrienne VL - 49 IS - 11 ER - TY - JOUR T1 - Pharmacogenomics of statin-related myopathy: Meta-analysis of rare variants from whole-exome sequencing. 
JF - PLoS One Y1 - 2019 A1 - Floyd, James S A1 - Bloch, Katarzyna M A1 - Brody, Jennifer A A1 - Maroteau, Cyrielle A1 - Siddiqui, Moneeza K A1 - Gregory, Richard A1 - Carr, Daniel F A1 - Molokhia, Mariam A1 - Liu, Xiaoming A1 - Bis, Joshua C A1 - Ahmed, Ammar A1 - Liu, Xuan A1 - Hallberg, Pär A1 - Yue, Qun-Ying A1 - Magnusson, Patrik K E A1 - Brisson, Diane A1 - Wiggins, Kerri L A1 - Morrison, Alanna C A1 - Khoury, Etienne A1 - McKeigue, Paul A1 - Stricker, Bruno H A1 - Lapeyre-Mestre, Maryse A1 - Heckbert, Susan R A1 - Gallagher, Arlene M A1 - Chinoy, Hector A1 - Gibbs, Richard A A1 - Bondon-Guitton, Emmanuelle A1 - Tracy, Russell A1 - Boerwinkle, Eric A1 - Gaudet, Daniel A1 - Conforti, Anita A1 - van Staa, Tjeerd A1 - Sitlani, Colleen M A1 - Rice, Kenneth M A1 - Maitland-van der Zee, Anke-Hilse A1 - Wadelius, Mia A1 - Morris, Andrew P A1 - Pirmohamed, Munir A1 - Palmer, Colin A N A1 - Psaty, Bruce M A1 - Alfirevic, Ana AB -

AIMS: Statin-related myopathy (SRM), which includes rhabdomyolysis, is an uncommon but important adverse drug reaction because the number of people prescribed statins world-wide is large. Previous association studies of common genetic variants have had limited success in identifying a genetic basis for this adverse drug reaction. We conducted a multi-site whole-exome sequencing study to investigate whether rare coding variants confer an increased risk of SRM.

METHODS AND RESULTS: SRM 3-5 cases (N = 505) and statin treatment-tolerant controls (N = 2047) were recruited from multiple sites in North America and Europe. SRM 3-5 was defined as symptoms consistent with muscle injury and an elevated creatine phosphokinase level >4 times the upper limit of normal without another likely cause of muscle injury. Whole-exome sequencing and variant calling were coordinated from two analysis centres, and results of single-variant and gene-based burden tests were meta-analysed. No genome-wide significant associations were identified. Given the large number of cases, we had 80% power to identify a variant with minor allele frequency of 0.01 that increases the risk of SRM 6-fold at genome-wide significance.

CONCLUSIONS: In this large whole-exome sequencing study of severe statin-related muscle injury, we did not find evidence that rare coding variants are responsible for this adverse drug reaction. Larger sample sizes would be required to identify rare variants with small effects, but it is unclear whether such findings would be clinically actionable.

VL - 14 IS - 6 ER - TY - JOUR T1 - Sequencing of 53,831 diverse genomes from the NHLBI TOPMed Program. JF - Nature Y1 - 2021 A1 - Taliun, Daniel A1 - Harris, Daniel N A1 - Kessler, Michael D A1 - Carlson, Jedidiah A1 - Szpiech, Zachary A A1 - Torres, Raul A1 - Taliun, Sarah A Gagliano A1 - Corvelo, André A1 - Gogarten, Stephanie M A1 - Kang, Hyun Min A1 - Pitsillides, Achilleas N A1 - LeFaive, Jonathon A1 - Lee, Seung-Been A1 - Tian, Xiaowen A1 - Browning, Brian L A1 - Das, Sayantan A1 - Emde, Anne-Katrin A1 - Clarke, Wayne E A1 - Loesch, Douglas P A1 - Shetty, Amol C A1 - Blackwell, Thomas W A1 - Smith, Albert V A1 - Wong, Quenna A1 - Liu, Xiaoming A1 - Conomos, Matthew P A1 - Bobo, Dean M A1 - Aguet, Francois A1 - Albert, Christine A1 - Alonso, Alvaro A1 - Ardlie, Kristin G A1 - Arking, Dan E A1 - Aslibekyan, Stella A1 - Auer, Paul L A1 - Barnard, John A1 - Barr, R Graham A1 - Barwick, Lucas A1 - Becker, Lewis C A1 - Beer, Rebecca L A1 - Benjamin, Emelia J A1 - Bielak, Lawrence F A1 - Blangero, John A1 - Boehnke, Michael A1 - Bowden, Donald W A1 - Brody, Jennifer A A1 - Burchard, Esteban G A1 - Cade, Brian E A1 - Casella, James F A1 - Chalazan, Brandon A1 - Chasman, Daniel I A1 - Chen, Yii-Der Ida A1 - Cho, Michael H A1 - Choi, Seung Hoan A1 - Chung, Mina K A1 - Clish, Clary B A1 - Correa, Adolfo A1 - Curran, Joanne E A1 - Custer, Brian A1 - Darbar, Dawood A1 - Daya, Michelle A1 - de Andrade, Mariza A1 - DeMeo, Dawn L A1 - Dutcher, Susan K A1 - Ellinor, Patrick T A1 - Emery, Leslie S A1 - Eng, Celeste A1 - Fatkin, Diane A1 - Fingerlin, Tasha A1 - Forer, Lukas A1 - Fornage, Myriam A1 - Franceschini, Nora A1 - Fuchsberger, Christian A1 - Fullerton, Stephanie M A1 - Germer, Soren A1 - Gladwin, Mark T A1 - Gottlieb, Daniel J A1 - Guo, Xiuqing A1 - Hall, Michael E A1 - He, Jiang A1 - Heard-Costa, Nancy L A1 - Heckbert, Susan R A1 - Irvin, Marguerite R A1 - Johnsen, Jill M A1 - Johnson, Andrew D A1 - Kaplan, Robert A1 - Kardia, Sharon L R A1 - Kelly, Tanika A1 - Kelly, 
Shannon A1 - Kenny, Eimear E A1 - Kiel, Douglas P A1 - Klemmer, Robert A1 - Konkle, Barbara A A1 - Kooperberg, Charles A1 - Köttgen, Anna A1 - Lange, Leslie A A1 - Lasky-Su, Jessica A1 - Levy, Daniel A1 - Lin, Xihong A1 - Lin, Keng-Han A1 - Liu, Chunyu A1 - Loos, Ruth J F A1 - Garman, Lori A1 - Gerszten, Robert A1 - Lubitz, Steven A A1 - Lunetta, Kathryn L A1 - Mak, Angel C Y A1 - Manichaikul, Ani A1 - Manning, Alisa K A1 - Mathias, Rasika A A1 - McManus, David D A1 - McGarvey, Stephen T A1 - Meigs, James B A1 - Meyers, Deborah A A1 - Mikulla, Julie L A1 - Minear, Mollie A A1 - Mitchell, Braxton D A1 - Mohanty, Sanghamitra A1 - Montasser, May E A1 - Montgomery, Courtney A1 - Morrison, Alanna C A1 - Murabito, Joanne M A1 - Natale, Andrea A1 - Natarajan, Pradeep A1 - Nelson, Sarah C A1 - North, Kari E A1 - O'Connell, Jeffrey R A1 - Palmer, Nicholette D A1 - Pankratz, Nathan A1 - Peloso, Gina M A1 - Peyser, Patricia A A1 - Pleiness, Jacob A1 - Post, Wendy S A1 - Psaty, Bruce M A1 - Rao, D C A1 - Redline, Susan A1 - Reiner, Alexander P A1 - Roden, Dan A1 - Rotter, Jerome I A1 - Ruczinski, Ingo A1 - Sarnowski, Chloe A1 - Schoenherr, Sebastian A1 - Schwartz, David A A1 - Seo, Jeong-Sun A1 - Seshadri, Sudha A1 - Sheehan, Vivien A A1 - Sheu, Wayne H A1 - Shoemaker, M Benjamin A1 - Smith, Nicholas L A1 - Smith, Jennifer A A1 - Sotoodehnia, Nona A1 - Stilp, Adrienne M A1 - Tang, Weihong A1 - Taylor, Kent D A1 - Telen, Marilyn A1 - Thornton, Timothy A A1 - Tracy, Russell P A1 - Van Den Berg, David J A1 - Vasan, Ramachandran S A1 - Viaud-Martinez, Karine A A1 - Vrieze, Scott A1 - Weeks, Daniel E A1 - Weir, Bruce S A1 - Weiss, Scott T A1 - Weng, Lu-Chen A1 - Willer, Cristen J A1 - Zhang, Yingze A1 - Zhao, Xutong A1 - Arnett, Donna K A1 - Ashley-Koch, Allison E A1 - Barnes, Kathleen C A1 - Boerwinkle, Eric A1 - Gabriel, Stacey A1 - Gibbs, Richard A1 - Rice, Kenneth M A1 - Rich, Stephen S A1 - Silverman, Edwin K A1 - Qasba, Pankaj A1 - Gan, Weiniu A1 - Papanicolaou, George J A1 - 
Nickerson, Deborah A A1 - Browning, Sharon R A1 - Zody, Michael C A1 - Zöllner, Sebastian A1 - Wilson, James G A1 - Cupples, L Adrienne A1 - Laurie, Cathy C A1 - Jaquish, Cashell E A1 - Hernandez, Ryan D A1 - O'Connor, Timothy D A1 - Abecasis, Goncalo R AB -

The Trans-Omics for Precision Medicine (TOPMed) programme seeks to elucidate the genetic architecture and biology of heart, lung, blood and sleep disorders, with the ultimate goal of improving diagnosis, treatment and prevention of these diseases. The initial phases of the programme focused on whole-genome sequencing of individuals with rich phenotypic data and diverse backgrounds. Here we describe the TOPMed goals and design as well as the available resources and early insights obtained from the sequence data. The resources include a variant browser, a genotype imputation server, and genomic and phenotypic data that are available through dbGaP (Database of Genotypes and Phenotypes). In the first 53,831 TOPMed samples, we detected more than 400 million single-nucleotide and insertion or deletion variants after alignment with the reference genome. Additional previously undescribed variants were detected through assembly of unmapped reads and customized analysis in highly variable loci. Among the more than 400 million detected variants, 97% have frequencies of less than 1% and 46% are singletons that are present in only one individual (53% among unrelated individuals). These rare variants provide insights into mutational processes and recent human evolutionary history. The extensive catalogue of genetic variation in TOPMed studies provides unique opportunities for exploring the contributions of rare and noncoding sequence variants to phenotypic variation. Furthermore, combining TOPMed haplotypes with modern imputation methods improves the power and reach of genome-wide association studies to include variants down to a frequency of approximately 0.01%.

VL - 590 IS - 7845 ER - TY - JOUR T1 - Whole genome sequence analyses of eGFR in 23,732 people representing multiple ancestries in the NHLBI trans-omics for precision medicine (TOPMed) consortium. JF - EBioMedicine Y1 - 2021 A1 - Lin, Bridget M A1 - Grinde, Kelsey E A1 - Brody, Jennifer A A1 - Breeze, Charles E A1 - Raffield, Laura M A1 - Mychaleckyj, Josyf C A1 - Thornton, Timothy A A1 - Perry, James A A1 - Baier, Leslie J A1 - de Las Fuentes, Lisa A1 - Guo, Xiuqing A1 - Heavner, Benjamin D A1 - Hanson, Robert L A1 - Hung, Yi-Jen A1 - Qian, Huijun A1 - Hsiung, Chao A A1 - Hwang, Shih-Jen A1 - Irvin, Margaret R A1 - Jain, Deepti A1 - Kelly, Tanika N A1 - Kobes, Sayuko A1 - Lange, Leslie A1 - Lash, James P A1 - Li, Yun A1 - Liu, Xiaoming A1 - Mi, Xuenan A1 - Musani, Solomon K A1 - Papanicolaou, George J A1 - Parsa, Afshin A1 - Reiner, Alex P A1 - Salimi, Shabnam A1 - Sheu, Wayne H-H A1 - Shuldiner, Alan R A1 - Taylor, Kent D A1 - Smith, Albert V A1 - Smith, Jennifer A A1 - Tin, Adrienne A1 - Vaidya, Dhananjay A1 - Wallace, Robert B A1 - Yamamoto, Kenichi A1 - Sakaue, Saori A1 - Matsuda, Koichi A1 - Kamatani, Yoichiro A1 - Momozawa, Yukihide A1 - Yanek, Lisa R A1 - Young, Betsi A A1 - Zhao, Wei A1 - Okada, Yukinori A1 - Abecasis, Gonzalo A1 - Psaty, Bruce M A1 - Arnett, Donna K A1 - Boerwinkle, Eric A1 - Cai, Jianwen A1 - Yii-Der Chen, Ida A1 - Correa, Adolfo A1 - Cupples, L Adrienne A1 - He, Jiang A1 - Kardia, Sharon Lr A1 - Kooperberg, Charles A1 - Mathias, Rasika A A1 - Mitchell, Braxton D A1 - Nickerson, Deborah A A1 - Turner, Steve T A1 - Vasan, Ramachandran S A1 - Rotter, Jerome I A1 - Levy, Daniel A1 - Kramer, Holly J A1 - Köttgen, Anna A1 - Rich, Stephen S A1 - Lin, Dan-Yu A1 - Browning, Sharon R A1 - Franceschini, Nora AB -

BACKGROUND: Genetic factors that influence kidney traits have been understudied for low frequency and ancestry-specific variants.

METHODS: We combined whole genome sequencing (WGS) data from 23,732 participants from 10 NHLBI Trans-Omics for Precision Medicine (TOPMed) Program multi-ethnic studies to identify novel loci for estimated glomerular filtration rate (eGFR). Participants were of European, African, East Asian, and Hispanic ancestry. We applied linear mixed models using a genetic relationship matrix estimated from the WGS data and adjusted for age, sex, study, and ethnicity.

FINDINGS: When testing single variants, we identified three novel loci driven by low frequency variants more commonly observed in non-European ancestry (PRKAA2, rs180996919, minor allele frequency [MAF] 0.04%, P = 6.1 × 10; METTL8, rs116951054, MAF 0.09%, P = 4.5 × 10; and MATK, rs539182790, MAF 0.05%, P = 3.4 × 10). We also replicated two known loci for common variants (rs2461702, MAF=0.49, P = 1.2 × 10, nearest gene GATM, and rs71147340, MAF=0.34, P = 3.3 × 10, CDK12). Testing aggregated variants within a gene identified the MAF gene. A statistical approach based on local ancestry helped to identify replication samples for ancestry-specific variants.

INTERPRETATION: This study highlights the challenges of studying variants that influence kidney traits when those variants are low frequency overall and more common in populations of non-European ancestry.

VL - 63 ER -