1
Lin JR, Sin-Chan P, Napolioni V, Torres GG, Mitra J, Zhang Q, Jabalameli MR, Wang Z, Nguyen N, Gao T, Laudes M, Görg S, Franke A, Nebel A, Greicius MD, Atzmon G, Ye K, Gorbunova V, Ladiges WC, Shuldiner AR, Niedernhofer LJ, Robbins PD, Milman S, Suh Y, Vijg J, Barzilai N, Zhang ZD. Rare genetic coding variants associated with human longevity and protection against age-related diseases. Nat Aging 2021; 1:783-794. [PMID: 37117627 DOI: 10.1038/s43587-021-00108-5] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 12/08/2020] [Accepted: 08/05/2021] [Indexed: 12/18/2022]
Abstract
Extreme longevity in humans has a strong genetic component, but whether this involves genetic variation in the same longevity pathways as found in model organisms is unclear. Using whole-exome sequences of a large cohort of Ashkenazi Jewish centenarians to examine enrichment for rare coding variants, we found that most longevity-associated rare coding variants converge upon the conserved insulin/insulin-like growth factor 1 signaling and AMP-activated protein kinase signaling pathways. Centenarians have a number of pathogenic rare coding variants similar to control individuals, suggesting that rare variants detected in the conserved longevity pathways are protective against age-related pathology. Indeed, we detected a pro-longevity effect of rare coding variants in the Wnt signaling pathway on individuals harboring the known common risk allele APOE4. The genetic component of extreme human longevity constitutes, at least in part, rare coding variants in pathways that protect against aging, including those that control longevity in model organisms.
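The abstract does not say how pathway enrichment was tested, so the following is only a generic illustration of the idea: a one-sided hypergeometric (Fisher-type) over-representation test asking whether a pathway holds more associated genes than chance would predict. The function name and all counts are hypothetical and are not taken from the study.

```python
from math import comb


def hypergeom_enrichment_p(total_genes, pathway_genes, hits, hits_in_pathway):
    """Upper-tail hypergeometric probability P(X >= hits_in_pathway).

    total_genes     -- number of genes tested overall
    pathway_genes   -- number of tested genes that belong to the pathway
    hits            -- number of genes flagged as associated
    hits_in_pathway -- number of flagged genes that fall in the pathway
    """
    max_overlap = min(pathway_genes, hits)
    denominator = comb(total_genes, hits)
    # Sum the probability of every overlap at least as large as observed.
    numerator = sum(
        comb(pathway_genes, k) * comb(total_genes - pathway_genes, hits - k)
        for k in range(hits_in_pathway, max_overlap + 1)
    )
    return numerator / denominator


# Hypothetical counts: 10 genes tested, 5 in the pathway, 2 flagged, both in the pathway.
p_value = hypergeom_enrichment_p(10, 5, 2, 2)
```

A small p-value here would only indicate over-representation under the assumption that genes are exchangeable; real analyses usually also account for gene length and variant counts.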
Affiliation(s)
- Jhih-Rong Lin
  - Department of Genetics, Albert Einstein College of Medicine, New York, NY, USA
- Valerio Napolioni
  - School of Biosciences and Veterinary Medicine, University of Camerino, Camerino, Italy
- Joydeep Mitra
  - Department of Genetics, Albert Einstein College of Medicine, New York, NY, USA
- Quanwei Zhang
  - Department of Genetics, Albert Einstein College of Medicine, New York, NY, USA
- M Reza Jabalameli
  - Department of Genetics, Albert Einstein College of Medicine, New York, NY, USA
- Zhen Wang
  - Department of Genetics, Albert Einstein College of Medicine, New York, NY, USA
- Nha Nguyen
  - Department of Genetics, Albert Einstein College of Medicine, New York, NY, USA
- Tina Gao
  - Department of Medicine, Albert Einstein College of Medicine, New York, NY, USA
- Matthias Laudes
  - Division of Endocrinology, Diabetes and Clinical Nutrition, Department of Internal Medicine I, Kiel University, Kiel, Germany
- Siegfried Görg
  - Institute of Transfusion Medicine, University Hospital Schleswig-Holstein, Lübeck, Germany
- Andre Franke
  - Institute of Clinical Molecular Biology, Kiel University, Kiel, Germany
- Almut Nebel
  - Institute of Clinical Molecular Biology, Kiel University, Kiel, Germany
- Michael D Greicius
  - Department of Neurology and Neurological Sciences, Stanford University School of Medicine, Stanford, CA, USA
- Gil Atzmon
  - Department of Medicine, Albert Einstein College of Medicine, New York, NY, USA
  - Department of Biology, Faculty of Natural Sciences, University of Haifa, Haifa, Israel
- Kenny Ye
  - Department of Epidemiology & Population Health, Albert Einstein College of Medicine, New York, NY, USA
- Vera Gorbunova
  - Department of Biology, University of Rochester, Rochester, NY, USA
- Warren C Ladiges
  - Department of Comparative Medicine, School of Medicine, University of Washington, Seattle, WA, USA
- Laura J Niedernhofer
  - Institute on the Biology of Aging and Metabolism and Department of Biochemistry, Molecular Biology, and Biophysics, University of Minnesota, Minneapolis, MN, USA
- Paul D Robbins
  - Institute on the Biology of Aging and Metabolism and Department of Biochemistry, Molecular Biology, and Biophysics, University of Minnesota, Minneapolis, MN, USA
- Sofiya Milman
  - Department of Genetics, Albert Einstein College of Medicine, New York, NY, USA
  - Department of Medicine, Albert Einstein College of Medicine, New York, NY, USA
- Yousin Suh
  - Department of Genetics, Albert Einstein College of Medicine, New York, NY, USA
  - Department of Obstetrics and Gynecology, Columbia University Irving Medical Center, New York, NY, USA
  - Department of Genetics and Development, Columbia University Irving Medical Center, New York, NY, USA
- Jan Vijg
  - Department of Genetics, Albert Einstein College of Medicine, New York, NY, USA
- Nir Barzilai
  - Department of Genetics, Albert Einstein College of Medicine, New York, NY, USA
  - Department of Medicine, Albert Einstein College of Medicine, New York, NY, USA
- Zhengdong D Zhang
  - Department of Genetics, Albert Einstein College of Medicine, New York, NY, USA
2
Wong ML, Arcos-Burgos M, Liu S, Licinio AW, Yu C, Chin EWM, Yao WD, Lu XY, Bornstein SR, Licinio J. Rare Functional Variants Associated with Antidepressant Remission in Mexican-Americans: Short title: Antidepressant remission and pharmacogenetics in Mexican-Americans. J Affect Disord 2021; 279:491-500. [PMID: 33128939 PMCID: PMC7953425 DOI: 10.1016/j.jad.2020.10.027] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 04/10/2020] [Revised: 08/24/2020] [Accepted: 10/11/2020] [Indexed: 12/14/2022]
Abstract
INTRODUCTION: Rare genetic functional variants can contribute to 30-40% of functional variability in genes relevant to drug action. Therefore, we investigated the role of rare functional variants in antidepressant response. METHOD: Mexican-American individuals meeting the Diagnostic and Statistical Manual-IV criteria for major depressive disorder (MDD) participated in a prospective randomized, double-blind study with desipramine or fluoxetine. The rare variant analysis was performed using whole-exome genotyping data. Network and pathway analyses were carried out with the list of significant genes. RESULTS: The Kernel-Based Adaptive Cluster method identified functional rare variants in 35 genes significantly associated with treatment remission (false discovery rate, FDR < 0.01). Pathway analysis of these genes supports the involvement of the following gene ontology processes: olfactory/sensory transduction, regulation of response to cytokine stimulus, and meiotic cell cycle process. LIMITATIONS: Our study did not have a placebo arm. We were not able to use antidepressant blood level as a covariate. Our study is based on a small sample size of only 65 Mexican-American individuals. Further studies using larger cohorts are warranted. CONCLUSION: Our data identified several rare functional variants in antidepressant drug response in MDD patients. These have the potential to serve as genetic markers for predicting drug response. TRIAL REGISTRATION: ClinicalTrials.gov NCT00265291.
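The abstract reports genes passing an FDR threshold but does not name the correction procedure. As an illustration only, the sketch below assumes the widely used Benjamini-Hochberg step-up adjustment; the function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical gene-level p-values; genes with adjusted value below 0.01 would be reported.
gene_p = [0.01, 0.04, 0.03, 0.005]
gene_q = benjamini_hochberg(gene_p)
significant = [q < 0.01 for q in gene_q]
```

With only four made-up p-values nothing passes the 0.01 cutoff; the point is the ranking and monotone adjustment, not the outcome.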
Affiliation(s)
- Ma-Li Wong
  - Department of Psychiatry and Behavioral Sciences, State University of New York, Upstate Medical University, Syracuse, NY, USA; Department of Neuroscience and Physiology, State University of New York, Upstate Medical University, Syracuse, NY, USA; Mind & Brain Theme, South Australian Health and Medical Research Institute Adelaide, South Australia, Australia; Department of Psychiatry, Flinders University College of Medicine and Public Health, Bedford Park, South Australia, Australia
- Mauricio Arcos-Burgos
  - Grupo de Investigación en Psiquiatría, Departamento de Psiquiatría, Instituto de Investigaciones Médicas, Facultad de Medicina, Universidad de Antioquia, Medellin, Antioquia, Colombia
- Sha Liu
  - Mind & Brain Theme, South Australian Health and Medical Research Institute Adelaide, South Australia, Australia
- Alice W Licinio
  - Mind & Brain Theme, South Australian Health and Medical Research Institute Adelaide, South Australia, Australia
- Chenglong Yu
  - Mind & Brain Theme, South Australian Health and Medical Research Institute Adelaide, South Australia, Australia; Department of Psychiatry, Flinders University College of Medicine and Public Health, Bedford Park, South Australia, Australia
- Eunice W M Chin
  - Department of Psychiatry and Behavioral Sciences, State University of New York, Upstate Medical University, Syracuse, NY, USA
- Wei-Dong Yao
  - Department of Psychiatry and Behavioral Sciences, State University of New York, Upstate Medical University, Syracuse, NY, USA; Department of Neuroscience and Physiology, State University of New York, Upstate Medical University, Syracuse, NY, USA
- Xin-Yun Lu
  - Department of Neuroscience & Regenerative Medicine, Medical College of Georgia at Augusta University, Augusta, GA, USA
- Stefan R Bornstein
  - Medical Clinic III, Carl Gustav Carus University Hospital, Dresden University of Technology, Dresden, Germany
- Julio Licinio
  - Department of Psychiatry and Behavioral Sciences, State University of New York, Upstate Medical University, Syracuse, NY, USA; Department of Neuroscience and Physiology, State University of New York, Upstate Medical University, Syracuse, NY, USA; Mind & Brain Theme, South Australian Health and Medical Research Institute Adelaide, South Australia, Australia; Department of Psychiatry, Flinders University College of Medicine and Public Health, Bedford Park, South Australia, Australia
3
Wheeler NR, Benchek P, Kunkle BW, Hamilton-Nelson KL, Warfe M, Fondran JR, Haines JL, Bush WS. Hadoop and PySpark for reproducibility and scalability of genomic sequencing studies. Pac Symp Biocomput 2020; 25:523-534. [PMID: 31797624 PMCID: PMC6956992] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022]
Abstract
Modern genomic studies are rapidly growing in scale, and the analytical approaches used to analyze genomic data are increasing in complexity. Genomic data management poses logistic and computational challenges, and analyses are increasingly reliant on genomic annotation resources that create their own data management and versioning issues. As a result, genomic datasets are increasingly handled in ways that limit the rigor and reproducibility of many analyses. In this work, we examine the use of the Spark infrastructure for the management, access, and analysis of genomic data in comparison to traditional genomic workflows on typical cluster environments. We validate the framework by reproducing previously published results from the Alzheimer's Disease Sequencing Project. Using the framework and analyses designed using Jupyter notebooks, Spark provides improved workflows, reduces user-driven data partitioning, and enhances the portability and reproducibility of distributed analyses required for large-scale genomic studies.
Affiliation(s)
- Nicholas R. Wheeler
  - Cleveland Institute for Computational Biology, Department of Population and Quantitative Health Sciences, Case Western Reserve University, Wolstein Research Building, 2103 Cornell Road Cleveland OH 44106, USA
- Penelope Benchek
  - Cleveland Institute for Computational Biology, Department of Population and Quantitative Health Sciences, Case Western Reserve University, Wolstein Research Building, 2103 Cornell Road Cleveland OH 44106, USA
- Brian W. Kunkle
  - John P. Hussman Institute for Human Genomics, Miller School of Medicine, University of Miami, 1501 NW 10th Ave, Miami, FL 33136, USA
- Kara L. Hamilton-Nelson
  - John P. Hussman Institute for Human Genomics, Miller School of Medicine, University of Miami, 1501 NW 10th Ave, Miami, FL 33136, USA
- Mike Warfe
  - Cleveland Institute for Computational Biology, Center for Advanced Research Computing, University Technology, Case Western Reserve University, Wolstein Research Building, 2103 Cornell Road Cleveland OH 44106, USA
- Jeremy R. Fondran
  - Cleveland Institute for Computational Biology, Center for Advanced Research Computing, University Technology, Case Western Reserve University, Wolstein Research Building, 2103 Cornell Road Cleveland OH 44106, USA
- Jonathan L. Haines
  - Cleveland Institute for Computational Biology, Department of Population and Quantitative Health Sciences, Case Western Reserve University, Wolstein Research Building, 2103 Cornell Road Cleveland OH 44106, USA
- William S. Bush
  - Cleveland Institute for Computational Biology, Department of Population and Quantitative Health Sciences, Case Western Reserve University, Wolstein Research Building, 2103 Cornell Road Cleveland OH 44106, USA
4
Vélez JI, Lopera F, Creagh PK, Piñeros LB, Das D, Cervantes-Henríquez ML, Acosta-López JE, Isaza-Ruget MA, Espinosa LG, Easteal S, Quintero GA, Silva CT, Mastronardi CA, Arcos-Burgos M. Targeting Neuroplasticity, Cardiovascular, and Cognitive-Associated Genomic Variants in Familial Alzheimer's Disease. Mol Neurobiol 2018; 56:3235-3243. [PMID: 30112632 PMCID: PMC6476862 DOI: 10.1007/s12035-018-1298-z] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 05/28/2018] [Accepted: 08/02/2018] [Indexed: 11/24/2022]
Abstract
The identification of novel genetic variants contributing to the wide spread in the age of onset (AOO) of Alzheimer's disease (AD) could aid in the prognosis and/or development of new therapeutic strategies focused on early interventions. We recruited 78 individuals with AD from the Paisa genetic isolate in Antioquia, Colombia. These individuals belong to the world's largest multigenerational and extended pedigree segregating AD as a consequence of a dominant, fully penetrant mutation in the PSEN1 gene, and exhibit an AOO ranging from the early 30s to the late 70s. To shed light on the genetic underpinnings that could explain the large spread of the AOO of AD, 64 single nucleotide polymorphisms (SNPs) associated with neuroanatomical, cardiovascular, and cognitive measures in AD were genotyped. Standard quality control and filtering procedures were applied, and single- and multi-locus linear mixed-effects models were used to identify AOO-associated SNPs. A full two-locus interaction model was fitted to define how the identified SNPs interact to modulate AOO. We identified two key epistatic interactions between the APOE*E2 allele and SNPs ASTN2-rs7852878 and SNTG1-rs16914781 that delay AOO by up to ~8 years (95% CI 3.2–12.7, P = 1.83 × 10⁻³) and ~7.6 years (95% CI 3.3–11.8, P = 8.69 × 10⁻⁴), respectively, and validated our previous finding indicating that APOE*E2 delays AOO of AD in PSEN1 E280A mutation carriers. This new evidence involving APOE*E2 as an AOO delayer could be used for developing precision medicine approaches and predictive genomics models to potentially determine AOO in individuals genetically predisposed to AD.
Affiliation(s)
- Jorge I. Vélez
  - Genomics and Predictive Medicine Group, Department of Genome Sciences, John Curtin School of Medical Research, The Australian National University, Canberra, ACT 2600 Australia
  - Universidad del Norte, Barranquilla, Colombia
- Francisco Lopera
  - Neuroscience Research Group, University of Antioquia, Medellín, Colombia
- Penelope K. Creagh
  - Genomics and Predictive Medicine Group, Department of Genome Sciences, John Curtin School of Medical Research, The Australian National University, Canberra, ACT 2600 Australia
- Laura B. Piñeros
  - GENIUROS, Center for Research in Genetics and Genomics, Institute of Translational Medicine, School of Medicine and Health Sciences, Universidad del Rosario, Bogotá, Colombia
- Debjani Das
  - Genome Diversity and Health Group, Department of Genome Sciences, John Curtin School of Medical Research, The Australian National University, ACT, Canberra, 2600 Australia
- Martha L. Cervantes-Henríquez
  - Universidad del Norte, Barranquilla, Colombia
  - Grupo de Neurociencias del Caribe, Universidad Simón Bolívar, Barranquilla, Colombia
- Johan E. Acosta-López
  - Grupo de Neurociencias del Caribe, Universidad Simón Bolívar, Barranquilla, Colombia
- Lady G. Espinosa
  - INPAC Research Group, Fundación Universitaria Sanitas, Bogotá, Colombia
- Simon Easteal
  - Genome Diversity and Health Group, Department of Genome Sciences, John Curtin School of Medical Research, The Australian National University, ACT, Canberra, 2600 Australia
- Gustavo A. Quintero
  - Studies in Translational Microbiology and Emerging Diseases (MICROS) Research Group, School of Medicine and Health Sciences, Universidad del Rosario, Bogotá, Colombia
- Claudia Tamar Silva
  - GENIUROS, Center for Research in Genetics and Genomics, Institute of Translational Medicine, School of Medicine and Health Sciences, Universidad del Rosario, Bogotá, Colombia
- Claudio A. Mastronardi
  - Genomics and Predictive Medicine Group, Department of Genome Sciences, John Curtin School of Medical Research, The Australian National University, Canberra, ACT 2600 Australia
  - Neuroscience Group (NeUROS), Institute of Translational Medicine, School of Medicine and Health Sciences, Universidad del Rosario, Bogotá, Colombia
- Mauricio Arcos-Burgos
  - Genomics and Predictive Medicine Group, Department of Genome Sciences, John Curtin School of Medical Research, The Australian National University, Canberra, ACT 2600 Australia
  - GENIUROS, Center for Research in Genetics and Genomics, Institute of Translational Medicine, School of Medicine and Health Sciences, Universidad del Rosario, Bogotá, Colombia
5
Li C, Grove ML, Yu B, Jones BC, Morrison A, Boerwinkle E, Liu X. Genetic variants in microRNA genes and targets associated with cardiovascular disease risk factors in the African-American population. Hum Genet 2018; 137:85-94. [PMID: 29264654 PMCID: PMC5790599 DOI: 10.1007/s00439-017-1858-8] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 10/09/2017] [Accepted: 12/06/2017] [Indexed: 02/07/2023]
Abstract
The purpose of this study is to identify microRNA (miRNA)-related polymorphisms, including single nucleotide variants (SNVs) in mature miRNA-encoding sequences or in miRNA-target sites, and their association with cardiovascular disease (CVD) risk factors in the African-American population. To achieve our objective, we examined 1900 African-Americans from the Atherosclerosis Risk in Communities study using SNVs identified from whole-genome sequencing data. A total of 971 SNVs found in 726 different mature miRNA-encoding sequences and 16,057 SNVs found in the three prime untranslated region (3'UTR) of 3647 protein-coding genes were identified and interrogated for association with 17 CVD risk factors. Using a single-variant-based approach, we found 5 SNVs in miRNA-encoding sequences to be associated with serum lipoprotein(a) [Lp(a)], high-density lipoprotein (HDL) or triglycerides, and 2 SNVs in miRNA-target sites to be associated with Lp(a) and HDL, all with false discovery rates of 5%. Using a gene-based approach, we identified 3 pairs of associations: between gene NSD1 and platelet count, gene HSPA4L and cardiac troponin T, and gene AHSA2 and magnesium. We successfully validated the association between a variant specific to the African-American population, NR_039880.1:n.18A>C, in the mature hsa-miR-4727-5p encoding sequence and serum HDL level in an independent sample of 2135 African-Americans. Our study provided candidate miRNAs and their targets for further investigation of their potential contribution to ethnic disparities in CVD risk factors.
Affiliation(s)
- Chang Li
  - Human Genetics Center and Department of Epidemiology, Human Genetics and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
- Megan L Grove
  - Human Genetics Center and Department of Epidemiology, Human Genetics and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
- Bing Yu
  - Human Genetics Center and Department of Epidemiology, Human Genetics and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
- Barbara C Jones
  - Human Genetics Center and Department of Epidemiology, Human Genetics and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
- Alanna Morrison
  - Human Genetics Center and Department of Epidemiology, Human Genetics and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
- Eric Boerwinkle
  - Human Genetics Center and Department of Epidemiology, Human Genetics and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
- Xiaoming Liu
  - Human Genetics Center and Department of Epidemiology, Human Genetics and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
  - Center for Precision Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
6
Hsieh AR, Chen DP, Chattopadhyay AS, Li YJ, Chang CC, Fann CSJ. A non-threshold region-specific method for detecting rare variants in complex diseases. PLoS One 2017; 12:e0188566. [PMID: 29190701 PMCID: PMC5708778 DOI: 10.1371/journal.pone.0188566] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 03/07/2017] [Accepted: 11/09/2017] [Indexed: 11/23/2022]
Abstract
A region-specific method, the NTR (non-threshold rare) variant detection method, was developed; it does not use a threshold for defining rare variants and accounts for directions of effects. NTR also considers linkage disequilibrium within the region and accommodates common and rare variants simultaneously. NTR weighs variants according to minor allele frequency and odds ratio to combine the effects of common and rare variants on disease occurrence into a single score and provides a test statistic to assess the significance of the score. In the simulations, the power of NTR increased as the effect size increased, and the type I error of the method was well controlled. Moreover, NTR was compared with several other existing methods, including the combined multivariate and collapsing method (CMC), the weighted sum statistic method (WSS), the sequence kernel association test (SKAT), and its modification, SKAT-O. NTR yielded comparable or better power in simulations, especially when linkage disequilibrium between variants was at least moderate. In an analysis of diabetic nephropathy data, NTR detected more confirmed disease-related genes than the other aforementioned methods. NTR can thus be used as a complementary tool to help in dissecting the etiology of complex diseases.
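The abstract does not give the NTR statistic, so it is not reproduced here. To illustrate the general idea of folding many variants in a region into one per-individual score, the sketch below uses a frequency-based weighting loosely modeled on the weighted sum statistic (WSS) that the abstract lists as a comparator. The function name, the exact weight formula as written, and the toy genotypes are assumptions for illustration, and the odds-ratio weighting and significance test that NTR describes are omitted.

```python
from math import sqrt


def maf_weighted_scores(genotypes, is_case):
    """Per-individual burden score that up-weights variants rare in controls.

    genotypes[j][i] -- minor-allele count (0, 1 or 2) of individual j at variant i
    is_case[j]      -- True if individual j is affected
    """
    n_individuals = len(genotypes)
    n_variants = len(genotypes[0])
    controls = [g for g, case in zip(genotypes, is_case) if not case]

    weights = []
    for i in range(n_variants):
        minor_in_controls = sum(g[i] for g in controls)
        # Smoothed allele frequency in controls; the +1/+2 avoids a zero frequency.
        q = (minor_in_controls + 1) / (2 * len(controls) + 2)
        weights.append(sqrt(n_individuals * q * (1 - q)))

    # Rare variants have small weights, so dividing by the weight amplifies them.
    return [
        sum(g[i] / weights[i] for i in range(n_variants))
        for g in genotypes
    ]


# Toy data (hypothetical): two cases each carry one rare allele, two controls carry none.
toy_genotypes = [[1, 0], [0, 0], [0, 1], [0, 0]]
toy_status = [True, False, True, False]
toy_scores = maf_weighted_scores(toy_genotypes, toy_status)
```

In a full analysis the scores would feed a rank-based or permutation test comparing cases with controls; that step is left out to keep the sketch short.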
Affiliation(s)
- Ai-Ru Hsieh
  - Graduate Institute of Biostatistics, China Medical University, Taichung, Taiwan
- Dao-Peng Chen
  - Institute of Biomedical Sciences, Academia Sinica, Nankang, Taipei, Taiwan
- Ying-Ju Li
  - Institute of Biomedical Sciences, Academia Sinica, Nankang, Taipei, Taiwan
- Chien-Ching Chang
  - Institute of Biomedical Sciences, Academia Sinica, Nankang, Taipei, Taiwan
- Cathy S. J. Fann
  - Institute of Biomedical Sciences, Academia Sinica, Nankang, Taipei, Taiwan
7
Chen MH, Yanek LR, Backman JD, Eicher JD, Huffman JE, Ben-Shlomo Y, Beswick AD, Yerges-Armstrong LM, Shuldiner AR, O'Connell JR, Mathias RA, Becker DM, Becker LC, Lewis JP, Johnson AD, Faraday N. Exome-chip meta-analysis identifies association between variation in ANKRD26 and platelet aggregation. Platelets 2017; 30:164-173. [PMID: 29185836 DOI: 10.1080/09537104.2017.1384538] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Indexed: 12/30/2022]
Abstract
Previous genome-wide association studies (GWAS) have identified several variants associated with platelet function phenotypes; however, the proportion of variance explained by the identified variants is mostly small. Rare coding variants, particularly those with high potential for impact on protein structure/function, may have substantial impact on phenotype but are difficult to detect by GWAS. The main purpose of this study was to identify low-frequency or rare variants associated with platelet function using genotype data from the Illumina HumanExome Bead Chip. Three family-based cohorts of European ancestry, including ~4,000 total subjects, comprised the discovery cohort, and two independent cohorts, one of European and one of African American ancestry, were used for replication. Optical aggregometry in platelet-rich plasma was performed in all the discovery cohorts in response to adenosine diphosphate (ADP), epinephrine, and collagen. Meta-analyses were performed using both gene-based and single nucleotide variant association methods. The gene-based meta-analysis identified a significant association (P = 7.13 × 10⁻⁷) between rare genetic variants in ANKRD26 and ADP-induced platelet aggregation. One of the ANKRD26 SNVs, rs191015656, encoding a threonine to isoleucine substitution predicted to alter protein structure/function, was replicated in Europeans. Aggregation increases of ~20-50% were observed in heterozygotes in all cohorts. Novel genetic signals in ABCG1 and HCP5 were also associated with platelet aggregation to ADP in meta-analyses, although only results for HCP5 could be replicated. The SNV in HCP5 intersects epigenetic signatures in CD41+ megakaryocytes, suggesting a new functional role in platelet biology for HCP5. This is the first study to use gene-based association methods from SNV array genotypes to identify rare variants related to platelet function. The molecular mechanisms and pathophysiological relevance of the identified genetic associations require further study.
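The abstract does not specify which meta-analysis model was used. As a generic illustration of combining per-cohort single-variant estimates, the sketch below assumes a fixed-effect inverse-variance model with a normal approximation for the p-value; the function name and the cohort numbers are hypothetical.

```python
from math import erfc, sqrt


def fixed_effect_meta(betas, standard_errors):
    """Inverse-variance weighted fixed-effect meta-analysis.

    Returns the pooled effect, its standard error and a two-sided p-value
    based on the normal approximation.
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    # Two-sided tail of the standard normal: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = erfc(abs(z) / sqrt(2))
    return pooled_beta, pooled_se, p_value


# Hypothetical per-cohort effect estimates for one variant.
pooled = fixed_effect_meta([1.0, 1.0], [1.0, 1.0])
```

Two cohorts with identical estimates keep the pooled effect unchanged while shrinking its standard error by a factor of the square root of two, which is the gain in precision that motivates pooling.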
Affiliation(s)
- Ming-Huei Chen
  - National Heart, Lung and Blood Institute's The Framingham Heart Study, Population Sciences Branch, Division of Intramural Research, National Heart, Lung and Blood Institute, Framingham, MA, USA
- Lisa R Yanek
  - GeneSTAR Research Program, Department of Medicine, Division of General Internal Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Joshua D Backman
  - School of Medicine, Division of Endocrinology, Diabetes and Nutrition, and Program for Personalized and Genomic Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
- John D Eicher
  - National Heart, Lung and Blood Institute's The Framingham Heart Study, Population Sciences Branch, Division of Intramural Research, National Heart, Lung and Blood Institute, Framingham, MA, USA
- Jennifer E Huffman
  - National Heart, Lung and Blood Institute's The Framingham Heart Study, Population Sciences Branch, Division of Intramural Research, National Heart, Lung and Blood Institute, Framingham, MA, USA
- Yoav Ben-Shlomo
  - School of Social and Community Medicine, University of Bristol, Bristol, UK
- Andrew D Beswick
  - School of Clinical Sciences, University of Bristol, Bristol, UK
- Laura M Yerges-Armstrong
  - School of Medicine, Division of Endocrinology, Diabetes and Nutrition, and Program for Personalized and Genomic Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
- Alan R Shuldiner
  - School of Medicine, Division of Endocrinology, Diabetes and Nutrition, and Program for Personalized and Genomic Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
- Jeffrey R O'Connell
  - School of Medicine, Division of Endocrinology, Diabetes and Nutrition, and Program for Personalized and Genomic Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
- Rasika A Mathias
  - GeneSTAR Research Program, Department of Medicine, Divisions of Allergy and Clinical Immunology and General Internal Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Diane M Becker
  - GeneSTAR Research Program, Department of Medicine, Division of General Internal Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Lewis C Becker
  - GeneSTAR Research Program, Department of Medicine, Divisions of Cardiology and General Internal Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Joshua P Lewis
  - School of Medicine, Division of Endocrinology, Diabetes and Nutrition, and Program for Personalized and Genomic Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
- Andrew D Johnson
  - National Heart, Lung and Blood Institute's The Framingham Heart Study, Population Sciences Branch, Division of Intramural Research, National Heart, Lung and Blood Institute, Framingham, MA, USA
- Nauder Faraday
  - GeneSTAR Research Program, Department of Anesthesiology & Critical Care Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
8
Wong ML, Arcos-Burgos M, Liu S, Vélez JI, Yu C, Baune BT, Jawahar MC, Arolt V, Dannlowski U, Chuah A, Huttley GA, Fogarty R, Lewis MD, Bornstein SR, Licinio J. The PHF21B gene is associated with major depression and modulates the stress response. Mol Psychiatry 2017; 22:1015-1025. [PMID: 27777418 PMCID: PMC5461220 DOI: 10.1038/mp.2016.174] [Citation(s) in RCA: 50] [Impact Index Per Article: 6.3] [Received: 05/18/2016] [Revised: 08/14/2016] [Accepted: 08/16/2016] [Indexed: 12/04/2022]
Abstract
Major depressive disorder (MDD) affects around 350 million people worldwide; however, the underlying genetic basis remains largely unknown. In this study, we took into account that MDD is a gene-environment disorder, in which stress is a critical component, and used whole-genome screening of functional variants to investigate the 'missing heritability' in MDD. Genome-wide association studies (GWAS) using single- and multi-locus linear mixed-effect models were performed in a Los Angeles Mexican-American cohort (196 controls, 203 MDD) and in a replication European-ancestry cohort (499 controls, 473 MDD). Our analyses took into consideration the stress levels in the control populations. The Mexican-American controls, comprised primarily of recent immigrants, had high levels of stress due to acculturation issues, and the European-ancestry controls with high stress levels were given higher weights in our analysis. We identified 44 common and rare functional variants associated with mild to moderate MDD in the Mexican-American cohort (genome-wide false discovery rate, FDR, <0.05), and their pathway analysis revealed that the three top overrepresented Gene Ontology (GO) processes were innate immune response, glutamate receptor signaling and detection of chemical stimulus in smell sensory perception. Rare variant analysis replicated the association of the PHF21B gene in the ethnically unrelated European-ancestry cohort. The TRPM2 gene, previously implicated in mood disorders, may also be considered replicated by our analyses. Whole-genome sequencing analyses of a subset of the cohorts revealed that European-ancestry individuals have a significantly reduced (50%) number of single nucleotide variants compared with Mexican-American individuals, and for this reason the role of rare variants may vary across populations. PHF21B variants contribute significantly to differences in the levels of expression of this gene in several brain areas, including the hippocampus. Furthermore, using an animal model of stress, we found that Phf21b hippocampal gene expression is significantly decreased in animals resilient to chronic restraint stress when compared with non-chronically stressed animals. Together, our results reveal that including stress level data enables the identification of novel rare functional variants associated with MDD.
Affiliation(s)
- M-L Wong
- Mind & Brain Theme, South Australian Health and Medical Research Institute (SAHMRI), Adelaide, SA, Australia
- Department of Psychiatry, Flinders University School of Medicine, Bedford Park, SA, Australia
| | - M Arcos-Burgos
- Department of Genome Sciences, John Curtin School of Medical Research, Australian National University, Canberra, ACT, Australia
- University of Rosario International Institute of Translational Medicine, Bogotá, Colombia
| | - S Liu
- Mind & Brain Theme, South Australian Health and Medical Research Institute (SAHMRI), Adelaide, SA, Australia
- Department of Psychiatry, Flinders University School of Medicine, Bedford Park, SA, Australia
| | - J I Vélez
- Department of Genome Sciences, John Curtin School of Medical Research, Australian National University, Canberra, ACT, Australia
- Universidad del Norte, Barranquilla, Colombia
| | - C Yu
- Mind & Brain Theme, South Australian Health and Medical Research Institute (SAHMRI), Adelaide, SA, Australia
- Department of Psychiatry, Flinders University School of Medicine, Bedford Park, SA, Australia
| | - B T Baune
- Discipline of Psychiatry, University of Adelaide, Adelaide, SA, Australia
| | - M C Jawahar
- Discipline of Psychiatry, University of Adelaide, Adelaide, SA, Australia
| | - V Arolt
- Department of Psychiatry and Psychotherapy, University of Münster, Münster, Germany
| | - U Dannlowski
- Department of Psychiatry and Psychotherapy, University of Münster, Münster, Germany
- Department of Psychiatry and Psychotherapy, University of Marburg, Marburg, Germany
| | - A Chuah
- Department of Genome Sciences, John Curtin School of Medical Research, Australian National University, Canberra, ACT, Australia
| | - G A Huttley
- Department of Genome Sciences, John Curtin School of Medical Research, Australian National University, Canberra, ACT, Australia
| | - R Fogarty
- Mind & Brain Theme, South Australian Health and Medical Research Institute (SAHMRI), Adelaide, SA, Australia
| | - M D Lewis
- Mind & Brain Theme, South Australian Health and Medical Research Institute (SAHMRI), Adelaide, SA, Australia
- Department of Psychiatry, Flinders University School of Medicine, Bedford Park, SA, Australia
| | - S R Bornstein
- Department of Psychiatry and Psychotherapy, University of Münster, Münster, Germany
- Medical Clinic III, Carl Gustav Carus University Hospital, Dresden University of Technology, Dresden, Germany
| | - J Licinio
- Mind & Brain Theme, South Australian Health and Medical Research Institute (SAHMRI), Adelaide, SA, Australia
- Department of Psychiatry, Flinders University School of Medicine, Bedford Park, SA, Australia
| |
|
9
|
Abstract
Despite thousands of genetic loci identified to date, a large proportion of genetic variation predisposing to complex disease and traits remains unaccounted for. Advances in sequencing technology enable focused explorations on the contribution of low-frequency and rare variants to human traits. Here we review experimental approaches and current knowledge on the contribution of these genetic variants in complex disease and discuss challenges and opportunities for personalised medicine.
Affiliation(s)
- Lorenzo Bomba
- Human Genetics, Wellcome Trust Sanger Institute, Genome Campus, Hinxton, CB10 1HH, UK
| | - Klaudia Walter
- Human Genetics, Wellcome Trust Sanger Institute, Genome Campus, Hinxton, CB10 1HH, UK
| | - Nicole Soranzo
- Human Genetics, Wellcome Trust Sanger Institute, Genome Campus, Hinxton, CB10 1HH, UK.,Department of Haematology, University of Cambridge, Hills Rd, Cambridge, CB2 0AH, UK.,The National Institute for Health Research Blood and Transplant Unit (NIHR BTRU) in Donor Health and Genomics at the University of Cambridge, University of Cambridge, Strangeways Research Laboratory, Wort's Causeway, Cambridge, CB1 8RN, UK
| |
|
10
|
Vélez JI, Lopera F, Patel HR, Johar AS, Cai Y, Rivera D, Tobón C, Villegas A, Sepulveda-Falla D, Lehmann SG, Easteal S, Mastronardi CA, Arcos-Burgos M. Mutations modifying sporadic Alzheimer's disease age of onset. Am J Med Genet B Neuropsychiatr Genet 2016; 171:1116-1130. [PMID: 27573710 DOI: 10.1002/ajmg.b.32493] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 06/14/2015] [Accepted: 08/15/2016] [Indexed: 11/10/2022]
Abstract
The identification of mutations modifying the age of onset (AOO) in Alzheimer's disease (AD) is crucial for understanding the natural history of AD and, therefore, for early interventions. Patients with sporadic AD (sAD) from a genetic isolate in the extremes of the AOO distribution were whole-exome genotyped. Single- and multi-locus linear mixed-effects models were used to identify functional variants modifying AOO. A posteriori enrichment and bioinformatic analyses were applied to evaluate the non-random clustering of the associate variants to physiopathological pathways involved in AD. We identified more than 20 pathogenic, genome-wide statistically significant mutations of major modifier effect on the AOO. These variants are harbored in genes implicated in neuron apoptosis, neurogenesis, inflammatory processes linked to AD, oligodendrocyte differentiation, and memory processes. This set of new genes harboring these mutations could be of importance for prediction, follow-up and eventually as therapeutical targets of AD. © 2016 Wiley Periodicals, Inc.
Affiliation(s)
- Jorge I Vélez
- Genomics and Predictive Medicine Group, Department of Genome Sciences, John Curtin School of Medical Research, The Australian National University, Canberra, Australian Capital Territory, Australia.,Neuroscience Research Group, University of Antioquia, Medellín, Colombia
| | - Francisco Lopera
- Neuroscience Research Group, University of Antioquia, Medellín, Colombia
| | - Hardip R Patel
- Genomics and Predictive Medicine Group, Department of Genome Sciences, John Curtin School of Medical Research, The Australian National University, Canberra, Australian Capital Territory, Australia
| | - Angad S Johar
- Genomics and Predictive Medicine Group, Department of Genome Sciences, John Curtin School of Medical Research, The Australian National University, Canberra, Australian Capital Territory, Australia
| | - Yeping Cai
- Genomics and Predictive Medicine Group, Department of Genome Sciences, John Curtin School of Medical Research, The Australian National University, Canberra, Australian Capital Territory, Australia
| | - Dora Rivera
- Neuroscience Research Group, University of Antioquia, Medellín, Colombia
| | - Carlos Tobón
- Neuroscience Research Group, University of Antioquia, Medellín, Colombia
| | - Andrés Villegas
- Neuroscience Research Group, University of Antioquia, Medellín, Colombia
| | - Diego Sepulveda-Falla
- Neuroscience Research Group, University of Antioquia, Medellín, Colombia.,Institute of Neuropathology, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
| | - Shaun G Lehmann
- Genome Diversity and Health Group, Department of Genome Sciences, John Curtin School of Medical Research, The Australian National University, Canberra, Australian Capital Territory, Australia
| | - Simon Easteal
- Genome Diversity and Health Group, Department of Genome Sciences, John Curtin School of Medical Research, The Australian National University, Canberra, Australian Capital Territory, Australia
| | - Claudio A Mastronardi
- Genomics and Predictive Medicine Group, Department of Genome Sciences, John Curtin School of Medical Research, The Australian National University, Canberra, Australian Capital Territory, Australia
| | - Mauricio Arcos-Burgos
- Genomics and Predictive Medicine Group, Department of Genome Sciences, John Curtin School of Medical Research, The Australian National University, Canberra, Australian Capital Territory, Australia.,Neuroscience Research Group, University of Antioquia, Medellín, Colombia
| |
|
11
|
Montemuiño C, Espinosa A, Moure JC, Vera G, Hernández P, Ramos-Onsins S. Approaching Long Genomic Regions and Large Recombination Rates with msParSm as an Alternative to MaCS. Evol Bioinform Online 2016; 12:223-228. [PMID: 27721650 PMCID: PMC5047705 DOI: 10.4137/ebo.s40268] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2016] [Revised: 07/19/2016] [Accepted: 07/21/2016] [Indexed: 11/05/2022] Open
Abstract
The msParSm application is an evolution of msPar, the parallel version of the coalescent simulation program ms, which removes the limitation for simulating long stretches of DNA sequences with large recombination rates, without compromising the accuracy of the standard coalescence. This work introduces msParSm, describes its significant performance improvements over msPar and its shared memory parallelization details, and shows how it can get better, if not similar, execution times than MaCS. Two case studies with different mutation rates were analyzed, one approximating the human average and the other approximating the Drosophila melanogaster average. Source code is available at https://github.com/cmontemuino/msparsm.
Affiliation(s)
- Carlos Montemuiño
- Computer Architecture and Operating Systems Department (CAOS), Universitat Autònoma de Barcelona, Bellaterra, Spain
| | - Antonio Espinosa
- Computer Architecture and Operating Systems Department (CAOS), Universitat Autònoma de Barcelona, Bellaterra, Spain
| | - Juan C Moure
- Computer Architecture and Operating Systems Department (CAOS), Universitat Autònoma de Barcelona, Bellaterra, Spain
| | - Gonzalo Vera
- Centre for Research in Agricultural Genomics (CRAG) Consortium CSIC-IRTA-UAB-UB Edifici CRAG, Campus UAB, Bellaterra, Spain
| | - Porfidio Hernández
- Computer Architecture and Operating Systems Department (CAOS), Universitat Autònoma de Barcelona, Bellaterra, Spain
| | - Sebastián Ramos-Onsins
- Centre for Research in Agricultural Genomics (CRAG) Consortium CSIC-IRTA-UAB-UB Edifici CRAG, Campus UAB, Bellaterra, Spain
| |
|
12
|
Lee S, Choi S, Kim YJ, Kim BJ, Hwang H, Park T. Pathway-based approach using hierarchical components of collapsed rare variants. Bioinformatics 2016; 32:i586-i594. [PMID: 27587678 PMCID: PMC5013912 DOI: 10.1093/bioinformatics/btw425] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.6] [Indexed: 02/06/2023] Open
Abstract
MOTIVATION: To address 'missing heritability' issue, many statistical methods for pathway-based analyses using rare variants have been proposed to analyze pathways individually. However, neglecting correlations between multiple pathways can result in misleading solutions, and pathway-based analyses of large-scale genetic datasets require massive computational burden. We propose a Pathway-based approach using HierArchical components of collapsed RAre variants Of High-throughput sequencing data (PHARAOH) for the analysis of rare variants by constructing a single hierarchical model that consists of collapsed gene-level summaries and pathways and analyzes entire pathways simultaneously by imposing ridge-type penalties on both gene and pathway coefficient estimates; hence our method considers the correlation of pathways without constraint by a multiple testing problem. RESULTS: Through simulation studies, the proposed method was shown to have higher statistical power than the existing pathway-based methods. In addition, our method was applied to the large-scale whole-exome sequencing data with levels of a liver enzyme using two well-known pathway databases Biocarta and KEGG. This application demonstrated that our method not only identified associated pathways but also successfully detected biologically plausible pathways for a phenotype of interest. These findings were successfully replicated by an independent large-scale exome chip study. AVAILABILITY AND IMPLEMENTATION: An implementation of PHARAOH is available at http://statgen.snu.ac.kr/software/pharaoh/. CONTACT: tspark@stats.snu.ac.kr. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
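The two ingredients named in this abstract, collapsing rare variants into gene-level summaries and shrinking coefficients with a ridge-type penalty, can be illustrated with a minimal sketch. This is not the PHARAOH implementation: the burden-count collapsing rule, the MAF cutoff of 0.01, the closed-form ridge solver, and all function names, gene labels, and toy data below are assumptions chosen for illustration only.

```python
import numpy as np


def collapse_to_gene_burden(genotypes, maf, gene_of_variant, maf_cutoff=0.01):
    """Sum minor-allele counts of rare variants within each gene.

    genotypes: (n_samples, n_variants) array of 0/1/2 minor-allele counts.
    maf: minor allele frequency of each variant.
    gene_of_variant: gene label of each variant column.
    Returns (sorted gene names, (n_samples, n_genes) burden matrix).
    """
    genotypes = np.asarray(genotypes, dtype=float)
    rare = np.asarray(maf) < maf_cutoff  # assumed rarity threshold
    genes = sorted({g for g, r in zip(gene_of_variant, rare) if r})
    burden = np.zeros((genotypes.shape[0], len(genes)))
    for j, (gene, is_rare) in enumerate(zip(gene_of_variant, rare)):
        if is_rare:
            burden[:, genes.index(gene)] += genotypes[:, j]
    return genes, burden


def ridge_fit(X, y, lam=1.0):
    """Closed-form ridge estimate (X'X + lam*I)^-1 X'y on centred data."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    p = Xc.shape[1]
    return np.linalg.solve(Xc.T @ Xc + lam * np.eye(p), Xc.T @ yc)


# Hypothetical toy data: 4 samples, 4 variants in 2 made-up genes.
genotypes = [[0, 1, 0, 2], [1, 0, 0, 0], [0, 0, 1, 1], [0, 1, 0, 0]]
maf = [0.005, 0.008, 0.002, 0.20]  # the last variant is common and is dropped
gene_of_variant = ["GENE_A", "GENE_A", "GENE_B", "GENE_B"]
phenotype = [1.2, 0.9, 2.5, 1.1]

genes, burden = collapse_to_gene_burden(genotypes, maf, gene_of_variant)
coef = ridge_fit(burden, phenotype, lam=0.5)
print(genes, coef)
```

In this toy case the two burden columns are perfectly collinear after centring, so an unpenalized least-squares fit has no unique solution, while the ridge term keeps the system solvable. That is one practical reason a ridge-type penalty suits correlated gene and pathway summaries.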
Affiliation(s)
- Sungyoung Lee
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul 151-747, Korea
| | - Sungkyoung Choi
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul 151-747, Korea
| | - Young Jin Kim
- Center for Genome Science, National Institute of Health, Osong Health Technology Administration Complex, Chungcheongbuk-Do 363-951, Korea
| | - Bong-Jo Kim
- Center for Genome Science, National Institute of Health, Osong Health Technology Administration Complex, Chungcheongbuk-Do 363-951, Korea
| | - Heungsun Hwang
- Department of Psychology, McGill University, Montreal, QC H3A 1B1, Canada
| | - Taesung Park
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul 151-747, Korea Department of Statistics, Seoul National University, Seoul 151-747, Korea
| |
|
13
|
Acosta MT, Swanson J, Stehli A, Molina BSG, Martinez AF, Arcos-Burgos M, Muenke M. ADGRL3 (LPHN3) variants are associated with a refined phenotype of ADHD in the MTA study. Mol Genet Genomic Med 2016; 4:540-7. [PMID: 27652281 PMCID: PMC5023939 DOI: 10.1002/mgg3.230] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.3] [Received: 03/18/2016] [Revised: 05/19/2016] [Accepted: 05/23/2016] [Indexed: 12/22/2022] Open
Abstract
Background: ADHD is the most common neuropsychiatric condition affecting individuals of all ages. Long‐term outcomes of affected individuals and association with severe comorbidities as SUD or conduct disorders are the main concern. Genetic associations have been extensively described. Multiple studies show that intronic variants harbored in the ADGRL3 (LPHN3) gene are associated with ADHD, especially associated with poor outcomes. Methods: In this study, we evaluated this association in the Multimodal Treatment Study of children with ADHD (MTA), initiated as a 14‐month randomized clinical trial of 579 children diagnosed with DSM‐IV ADHD‐Combined Type (ADHD‐C), that transitioned to a 16‐year prospective observational follow‐up, and 289 classmates added at the 2‐year assessment to serve as a local normative comparison group (LNCG). Diagnostic evaluations at entry were based on the Diagnostic Interview Schedule for Children‐Parent (DISC‐P), which was repeated at several points over the years. For an add‐on genetic study, blood samples were collected from 232 in the MTA group and 139 in the LNCG. Results: For the 205 MTA participants, 14.6% retained the DISC‐P diagnosis of ADHD‐C in adolescence. For 127 LNCG participants, 88.2% remained undiagnosed by the DISC‐P. We genotyped 15 polymorphic SNP markers harbored in the ADGRL3 gene, and compared allele frequencies for the 30 cases with continued diagnosis of ADHD‐C in adolescence to the other participants. Replication of the association of rs2345039 ADGRL3 variant was observed (P value = 0.004, FDR corrected = 0.03; Odds ratio = 2.25, upper CI 1.28–3.97). Conclusion: The detection of susceptibility conferred by ADGRL3 variants in the extreme phenotype of continued diagnosis of ADHD‐C from childhood to adolescence provides additional support that the association of ADGRL3 and ADHD is not spurious. Exploring genetic effects in longitudinal cohorts, in which refined, age‐dependent phenotypes are documented, is crucial to understand the natural history of ADHD.
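The abstract above reports an allelic odds ratio with a confidence interval but does not give the underlying allele counts or the interval method. As a rough illustration of how such a figure is commonly obtained, the sketch below computes an odds ratio and a Wald (log-scale) 95% interval from a 2x2 table of allele counts. The counts, the function name, and the choice of the Wald interval are all assumptions and do not reproduce the study's numbers.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 allele-count table.

    a, b: risk-allele and other-allele counts in cases.
    c, d: risk-allele and other-allele counts in the comparison group.
    z:    normal quantile (1.96 gives an approximate 95% interval).
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical allele counts, not taken from the MTA study.
or_value, ci_low, ci_high = odds_ratio_ci(a=30, b=30, c=100, d=200)
print(f"OR = {or_value:.2f}, 95% CI {ci_low:.2f}-{ci_high:.2f}")
```

With small cell counts, as is typical for an extreme-phenotype subgroup, an exact or permutation-based interval may be preferable to the Wald approximation used here.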
Affiliation(s)
- Maria T Acosta
- Medical Genetics Branch National Human Genome Research Institute National Institutes of Health Bethesda Maryland; Department of Pediatric and Neurology George Washington University Children's National Medical Center Washington District of Columbia
| | - James Swanson
- Department of Psychiatry Florida International University Miami Florida; Department of Pediatrics University of California at Irvine Irvine California
| | - Annamarie Stehli
- Department of Pediatrics University of California at Irvine Irvine California
| | - Brooke S G Molina
- Departments of Psychiatry and Psychology University of Pittsburgh Pittsburgh Pennsylvania
| | - Ariel F Martinez
- Medical Genetics Branch National Human Genome Research Institute National Institutes of Health Bethesda Maryland
| | - Mauricio Arcos-Burgos
- Genomics and Predictive Medicine Genome Biology Department John Curtin School of Medical Research ANU College of Medicine, Biology and Environment The Australian National University Canberra ACT Australia
| | - Maximilian Muenke
- Medical Genetics Branch National Human Genome Research Institute National Institutes of Health Bethesda Maryland
| |
|
14
|
Vélez JI, Lopera F, Sepulveda-Falla D, Patel HR, Johar AS, Chuah A, Tobón C, Rivera D, Villegas A, Cai Y, Peng K, Arkell R, Castellanos FX, Andrews SJ, Silva Lara MF, Creagh PK, Easteal S, de Leon J, Wong ML, Licinio J, Mastronardi CA, Arcos-Burgos M. APOE*E2 allele delays age of onset in PSEN1 E280A Alzheimer's disease. Mol Psychiatry 2016; 21:916-24. [PMID: 26619808 PMCID: PMC5414071 DOI: 10.1038/mp.2015.177] [Citation(s) in RCA: 81] [Impact Index Per Article: 9.0] [Received: 04/20/2015] [Revised: 10/07/2015] [Accepted: 10/14/2015] [Indexed: 01/10/2023]
Abstract
Alzheimer's disease (AD) age of onset (ADAOO) varies greatly between individuals, with unique causal mutations suggesting the role of modifying genetic and environmental interactions. We analyzed ~50 000 common and rare functional genomic variants from 71 individuals of the 'Paisa' pedigree, the world's largest pedigree segregating a severe form of early-onset AD, who were affected carriers of the fully penetrant E280A mutation in the presenilin-1 (PSEN1) gene. Affected carriers with ages at the extremes of the ADAOO distribution (30s-70s age range), and linear mixed-effects models were used to build single-locus regression models outlining the ADAOO. We identified the rs7412 (APOE*E2 allele) as a whole exome-wide ADAOO modifier that delays ADAOO by ~12 years (β=11.74, 95% confidence interval (CI): 8.07-15.41, P=6.31 × 10⁻⁸, PFDR=2.48 × 10⁻³). Subsequently, to evaluate comprehensively the APOE (apolipoprotein E) haplotype variants (E1/E2/E3/E4), the markers rs7412 and rs429358 were genotyped in 93 AD affected carriers of the E280A mutation. We found that the APOE*E2 allele, and not APOE*E4, modifies ADAOO in carriers of the E280A mutation (β=8.24, 95% CI: 4.45-12.01, P=3.84 × 10⁻⁵). Exploratory linear mixed-effects multilocus analysis suggested that other functional variants harbored in genes involved in cell proliferation, protein degradation, apoptotic and immune dysregulation processes (i.e., GPR20, TRIM22, FCRL5, AOAH, PINLYP, IFI16, RC3H1 and DFNA5) might interact with the APOE*E2 allele. Interestingly, suggestive evidence as an ADAOO modifier was found for one of these variants (GPR20) in a set of patients with sporadic AD from the Paisa genetic isolate. This is the first study demonstrating that the APOE*E2 allele modifies the natural history of AD typified by the age of onset in E280A mutation carriers. To the best of our knowledge, this is the largest analyzed sample of patients with a unique mutation sharing uniform environment. Formal replication of our results in other populations and in other forms of AD will be crucial for prediction, follow-up and presumably developing new therapeutic strategies for patients either at risk or affected by AD.
Affiliation(s)
- J I Vélez
- Genomics and Predictive Medicine Group, Department of Genome Sciences, John Curtin School of Medical Research, The Australian National University, Canberra, ACT, Australia.,Neuroscience Research Group, University of Antioquia, Medellín, Colombia
| | - F Lopera
- Neuroscience Research Group, University of Antioquia, Medellín, Colombia
| | - D Sepulveda-Falla
- Neuroscience Research Group, University of Antioquia, Medellín, Colombia.,Institute of Neuropathology, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
| | - H R Patel
- Genomics and Predictive Medicine Group, Department of Genome Sciences, John Curtin School of Medical Research, The Australian National University, Canberra, ACT, Australia
| | - A S Johar
- Genomics and Predictive Medicine Group, Department of Genome Sciences, John Curtin School of Medical Research, The Australian National University, Canberra, ACT, Australia
| | - A Chuah
- Genome Discovery Unit, Department of Genome Sciences, John Curtin School of Medical Research, The Australian National University, Canberra, ACT, Australia
| | - C Tobón
- Neuroscience Research Group, University of Antioquia, Medellín, Colombia
| | - D Rivera
- Neuroscience Research Group, University of Antioquia, Medellín, Colombia
| | - A Villegas
- Neuroscience Research Group, University of Antioquia, Medellín, Colombia
| | - Y Cai
- Genomics and Predictive Medicine Group, Department of Genome Sciences, John Curtin School of Medical Research, The Australian National University, Canberra, ACT, Australia
| | - K Peng
- Biomolecular Resource Facility, John Curtin School of Medical Research, The Australian National University, Canberra, ACT, Australia
| | - R Arkell
- Early Mammalian Development Laboratory, Research School of Biology, The Australian National University, Canberra, ACT, Australia
| | - F X Castellanos
- NYU Child Study Center, NYU Langone Medical Center, New York, NY, USA.,Nathan Kline Institute for Psychiatric Research, Orangeburg, NY, USA
| | - S J Andrews
- Genome Diversity and Health Group, Department of Genome Sciences, John Curtin School of Medical Research, The Australian National University, Canberra, ACT, Australia
| | - M F Silva Lara
- Genomics and Predictive Medicine Group, Department of Genome Sciences, John Curtin School of Medical Research, The Australian National University, Canberra, ACT, Australia
| | - P K Creagh
- Genomics and Predictive Medicine Group, Department of Genome Sciences, John Curtin School of Medical Research, The Australian National University, Canberra, ACT, Australia
| | - S Easteal
- Genome Diversity and Health Group, Department of Genome Sciences, John Curtin School of Medical Research, The Australian National University, Canberra, ACT, Australia
| | - J de Leon
- Mental Health Research Center at Eastern State Hospital, University of Kentucky, Lexington, KY, USA
| | - M L Wong
- South Australian Health and Medical Research Institute and Department of Psychiatry, School of Medicine, Flinders University, Adelaide, SA, Australia
| | - J Licinio
- South Australian Health and Medical Research Institute and Department of Psychiatry, School of Medicine, Flinders University, Adelaide, SA, Australia
| | - C A Mastronardi
- Genomics and Predictive Medicine Group, Department of Genome Sciences, John Curtin School of Medical Research, The Australian National University, Canberra, ACT, Australia.,South Australian Health and Medical Research Institute and Department of Psychiatry, School of Medicine, Flinders University, Adelaide, SA, Australia
| | - M Arcos-Burgos
- Genomics and Predictive Medicine Group, Department of Genome Sciences, John Curtin School of Medical Research, The Australian National University, Canberra, ACT, Australia.,Neuroscience Research Group, University of Antioquia, Medellín, Colombia
| |
|
15
|
Discovery of rare variants for complex phenotypes. Hum Genet 2016; 135:625-34. [PMID: 27221085 DOI: 10.1007/s00439-016-1679-1] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.0] [Received: 03/02/2016] [Accepted: 04/28/2016] [Indexed: 12/27/2022]
Abstract
With the rise of sequencing technologies, it is now feasible to assess the role rare variants play in the genetic contribution to complex trait variation. While some of the earlier targeted sequencing studies successfully identified rare variants of large effect, unbiased gene discovery using exome sequencing has experienced limited success for complex traits. Nevertheless, rare variant association studies have demonstrated that rare variants do contribute to phenotypic variability, but sample sizes will likely have to be even larger than those of common variant association studies to be powered for the detection of genes and loci. Large-scale sequencing efforts of tens of thousands of individuals, such as the UK10K Project and aggregation efforts such as the Exome Aggregation Consortium, have made great strides in advancing our knowledge of the landscape of rare variation, but there remain many considerations when studying rare variation in the context of complex traits. We discuss these considerations in this review, presenting a broad range of topics at a high level as an introduction to rare variant analysis in complex traits including the issues of power, study design, sample ascertainment, de novo variation, and statistical testing approaches. Ultimately, as sequencing costs continue to decline, larger sequencing studies will yield clearer insights into the biological consequence of rare mutations and may reveal which genes play a role in the etiology of complex traits.
|
16
|
A Mutation in DAOA Modifies the Age of Onset in PSEN1 E280A Alzheimer's Disease. Neural Plast 2016; 2016:9760314. [PMID: 26949549 PMCID: PMC4753688 DOI: 10.1155/2016/9760314] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.4] [Received: 04/21/2015] [Revised: 09/30/2015] [Accepted: 10/21/2015] [Indexed: 11/17/2022] Open
Abstract
We previously reported age of onset (AOO) modifier genes in the world's largest pedigree segregating early-onset Alzheimer's disease (AD), caused by the p.Glu280Ala (E280A) mutation in the PSEN1 gene. Here we report the results of a targeted analysis of functional exonic variants in those AOO modifier genes in sixty individuals with PSEN1 E280A AD who were whole-exome genotyped for ~250,000 variants. Standard quality control, filtering, and annotation for functional variants were applied, and common functional variants located in those previously reported as AOO modifier loci were selected. Multiloci linear mixed-effects models were used to test the association between these variants and AOO. An exonic missense mutation in the G72 (DAOA) gene (rs2391191, P = 1.94 × 10⁻⁴, PFDR = 9.34 × 10⁻³) was found to modify AOO in PSEN1 E280A AD. Nominal associations of missense mutations in the CLUAP1 (rs9790, P = 7.63 × 10⁻³, PFDR = 0.1832) and EXOC2 (rs17136239, P = 0.0325, PFDR = 0.391) genes were also found. Previous studies have linked polymorphisms in the DAOA gene with the occurrence of neuropsychiatric symptoms such as depression, apathy, aggression, delusions, hallucinations, and psychosis in AD. Our findings strongly suggest that this new conspicuous functional AOO modifier within the G72 (DAOA) gene could be pivotal for understanding the genetic basis of AD.
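The abstract above reports both raw P values and FDR-adjusted values (PFDR) but does not state which FDR procedure was used. A minimal sketch of one common choice, the Benjamini-Hochberg step-up adjustment, is given below under that assumption. The function name and the example P values are hypothetical and are not the study's data.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, so each adjusted value is the
    # minimum of p * m / rank over this rank and all larger ranks.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four variants.
raw = [0.001, 0.02, 0.03, 0.5]
adj = benjamini_hochberg(raw)
print(adj)
```

Because the adjustment depends on the total number of tests, an adjusted value can only be reproduced from a reported raw P value when the full set of tested variants is known.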
|
17
|
Schmidt EM, Willer CJ. Insights into blood lipids from rare variant discovery. Curr Opin Genet Dev 2015; 33:25-31. [PMID: 26241468 DOI: 10.1016/j.gde.2015.06.008] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 01/19/2015] [Revised: 06/19/2015] [Accepted: 06/22/2015] [Indexed: 12/18/2022]
Abstract
Large-scale genome wide screens have discovered over 160 common variants associated with plasma lipids, which are risk factors often linked to heart disease. A large fraction of lipid heritability remains unexplained, and it is hypothesized that rare variants of functional consequence may account for some of the missing heritability. Finding lipid-associated variants that occur less frequently in the human population poses a challenge, primarily due to lack of power and difficulties to identify and test them. Interrogation of the protein-coding regions of the genome using array and sequencing techniques has led to important discoveries of rare variants that affect lipid levels and related disease risk. Here, we summarize the latest methods and findings that contribute to our current understanding of rare variant lipid genetics.
Affiliation(s)
- Ellen M Schmidt
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Cristen J Willer
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA; Department of Internal Medicine, Division of Cardiovascular Medicine, University of Michigan, Ann Arbor, MI 48109, USA; Department of Human Genetics, University of Michigan, Ann Arbor, MI 48109, USA.
| |
|
18
|
Abstract
Genome-wide association studies (GWASs) have successfully uncovered thousands of robust associations between common variants and complex traits and diseases. Despite these successes, much of the heritability of these traits remains unexplained. Because low-frequency and rare variants are not tagged by conventional genome-wide genotyping arrays, they may represent an important and understudied component of complex trait genetics. In contrast to common variant GWASs, there are many different types of study designs, assays and analytic techniques that can be utilized for rare variant association studies (RVASs). In this review, we briefly present the different technologies available to identify rare genetic variants, including novel exome arrays. We also compare the different study designs for RVASs and argue that the best design will likely be phenotype-dependent. We discuss the main analytical issues relevant to RVASs, including the different statistical methods that can be used to test genetic associations with rare variants and the various bioinformatic approaches to predicting in silico biological functions for variants. Finally, we describe recent rare variant association findings, highlighting the unexpected conclusion that most rare variants have modest-to-small effect sizes on phenotypic variation. This observation has major implications for our understanding of the genetic architecture of complex traits in the context of the unexplained heritability challenge.
Affiliation(s)
- Paul L Auer
- School of Public Health, University of Wisconsin-Milwaukee, Milwaukee, WI 53201-0413 USA
| | - Guillaume Lettre
- Montreal Heart Institute and Université de Montréal, Montreal, Quebec H1T 1C8 Canada
| |
|
19
|
Porth I, El-Kassaby YA. Using Populus as a lignocellulosic feedstock for bioethanol. Biotechnol J 2015; 10:510-24. [PMID: 25676392 DOI: 10.1002/biot.201400194] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.3] [Received: 08/01/2014] [Revised: 11/11/2014] [Accepted: 12/30/2014] [Indexed: 11/10/2022]
Abstract
Populus species along with species from the sister genus Salix will provide valuable feedstock resources for advanced second-generation biofuels. Their inherent fast growth characteristics can particularly be exploited for short rotation management, a time- and energy-saving cultivation alternative for lignocellulosic feedstock supply. Salicaceae possess inherent cell wall characteristics with favorable cellulose to lignin ratios for utilization as a bioethanol crop. We review economically important traits relevant for intensively managed biofuel crop plantations, genomic and phenotypic resources available for Populus, breeding strategies for forest trees dedicated to bioenergy provision, and bioprocesses and downstream applications related to opportunities using Salicaceae as a renewable resource. Challenges need to be resolved for every single step of the conversion process chain, i.e., starting from tree domestication for improved performance as a bioenergy crop, bioconversion process, policy development for land use changes associated with advanced biofuels, and harvest and supply logistics associated with industrial-scale biorefinery plants using Populus as feedstock. Significant hurdles towards cost and energy efficiency, environmental friendliness, and yield maximization with regard to biomass pretreatment, saccharification, and fermentation of celluloses and the sustainability of biorefineries as a whole still need to be overcome.
Affiliation(s)
- Ilga Porth
- Forest and Conservation Sciences, University of British Columbia, Vancouver, Canada.
|
20
|
Ionita-Laza I, Capanu M, De Rubeis S, McCallum K, Buxbaum JD. Identification of rare causal variants in sequence-based studies: methods and applications to VPS13B, a gene involved in Cohen syndrome and autism. PLoS Genet 2014; 10:e1004729. [PMID: 25502226 PMCID: PMC4263785 DOI: 10.1371/journal.pgen.1004729] [Citation(s) in RCA: 40] [Impact Index Per Article: 3.6] [Received: 05/15/2014] [Accepted: 09/02/2014] [Indexed: 11/18/2022] Open
Abstract
Pinpointing the small number of causal variants among the abundant naturally occurring genetic variation is a difficult challenge, but a crucial one for understanding precise molecular mechanisms of disease and follow-up functional studies. We propose and investigate two complementary statistical approaches for identification of rare causal variants in sequencing studies: a backward elimination procedure based on groupwise association tests, and a hierarchical approach that can integrate sequencing data with diverse functional and evolutionary conservation annotations for individual variants. Using simulations, we show that incorporation of multiple bioinformatic predictors of deleteriousness, such as PolyPhen-2, SIFT and GERP++ scores, can improve the power to discover truly causal variants. As proof of principle, we apply the proposed methods to VPS13B, a gene mutated in the rare neurodevelopmental disorder called Cohen syndrome, and recently reported with recessive variants in autism. We identify a small set of promising candidates for causal variants, including two loss-of-function variants and a rare, homozygous probably-damaging variant that could contribute to autism risk.
Affiliation(s)
- Iuliana Ionita-Laza
- Department of Biostatistics, Columbia University, New York, New York, United States of America
- Marinela Capanu
- Memorial Sloan-Kettering Cancer Center, New York, New York, United States of America
- Silvia De Rubeis
- Seaver Autism Center for Research and Treatment, Icahn School of Medicine at Mount Sinai, New York, New York, United States of America
- Departments of Psychiatry, Mount Sinai School of Medicine, New York, New York, United States of America
- Kenneth McCallum
- Department of Biostatistics, Columbia University, New York, New York, United States of America
- Joseph D. Buxbaum
- Seaver Autism Center for Research and Treatment, Icahn School of Medicine at Mount Sinai, New York, New York, United States of America
- Departments of Psychiatry, Mount Sinai School of Medicine, New York, New York, United States of America
- Departments of Genetics and Genomic Sciences, and Neuroscience, and Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, New York, United States of America
- Mindich Child Health and Development Institute, Mount Sinai School of Medicine, New York, New York, United States of America
|
21
|
Quadri M, Yang X, Cossu G, Olgiati S, Saddi VM, Breedveld GJ, Ouyang L, Hu J, Xu N, Graafland J, Ricchi V, Murgia D, Guedes LC, Mariani C, Marti MJ, Tarantino P, Asselta R, Valldeoriola F, Gagliardi M, Pezzoli G, Ezquerra M, Quattrone A, Ferreira J, Annesi G, Goldwurm S, Tolosa E, Oostra BA, Melis M, Wang J, Bonifati V. An exome study of Parkinson's disease in Sardinia, a Mediterranean genetic isolate. Neurogenetics 2014; 16:55-64. [PMID: 25294124 DOI: 10.1007/s10048-014-0425-x] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Received: 07/21/2014] [Accepted: 09/15/2014] [Indexed: 12/21/2022]
Abstract
Parkinson's disease (PD) is a common neurodegenerative disorder of complex aetiology. Rare, highly penetrant PD-causing mutations and common risk factors of small effect size have been identified in several genes/loci. However, these mutations and risk factors only explain a fraction of the disease burden, suggesting that additional, substantial genetic determinants remain to be found. Genetically isolated populations offer advantages for dissecting the genetic architecture of complex disorders, such as PD. We performed exome sequencing in 100 unrelated PD patients from Sardinia, a genetic isolate. SNPs absent from dbSNP129 and 1000 Genomes, shared by at least five patients, and of functional effects were genotyped in an independent Sardinian case-control sample (n = 500). Variants associated with PD with nominal p value <0.05 and those with odds ratio (OR) ≥3 were validated by Sanger sequencing and typed in a replication sample of 2965 patients and 2678 controls from Italy, Spain, and Portugal. We identified novel moderately rare variants in several genes, including SCAPER, HYDIN, UBE2H, EZR, MMRN2 and OGFOD1 that were specifically present in PD patients or enriched among them, nominating these as novel candidate risk genes for PD, although no variants achieved genome-wide significance after Bonferroni correction. Our results suggest that the genetic bases of PD are highly heterogeneous, with implications for the design of future large-scale exome or whole-genome analyses of this disease.
Affiliation(s)
- Marialuisa Quadri
- Department of Clinical Genetics, Erasmus MC, PO Box 2040, 3000 CA Rotterdam, The Netherlands
|
22
|
Sham PC, Purcell SM. Statistical power and significance testing in large-scale genetic studies. Nat Rev Genet 2014; 15:335-46. [PMID: 24739678 DOI: 10.1038/nrg3706] [Citation(s) in RCA: 383] [Impact Index Per Article: 34.8] [Indexed: 12/13/2022]
Abstract
Significance testing was developed as an objective method for summarizing statistical evidence for a hypothesis. It has been widely adopted in genetic studies, including genome-wide association studies and, more recently, exome sequencing studies. However, significance testing in both genome-wide and exome-wide studies must adopt stringent significance thresholds to allow for multiple testing, and it is useful only when studies have adequate statistical power, which depends on the characteristics of the phenotype and the putative genetic variant, as well as the study design. Here, we review the principles and applications of significance testing and power calculation, including recently proposed gene-based tests for rare variants.
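The relationship between a stringent threshold and statistical power described in this abstract can be illustrated with a minimal Python sketch. It is not taken from the review: the figure of 20,000 gene-based tests is an illustrative assumption, the function names are hypothetical, and power is approximated for a two-sided z-test with a given non-centrality parameter.

```python
from statistics import NormalDist


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


def power_two_sided_z(ncp, alpha):
    """Approximate power of a two-sided z-test whose statistic has mean `ncp` and unit variance."""
    nd = NormalDist()
    z_crit = nd.inv_cdf(1 - alpha / 2)
    # Probability that the statistic falls in either rejection region.
    return nd.cdf(ncp - z_crit) + nd.cdf(-ncp - z_crit)


# Assumption for illustration: about 20,000 gene-based tests in an exome-wide scan.
threshold = bonferroni_threshold(0.05, 20_000)
for ncp in (3.0, 5.0, 7.0):
    print(f"ncp={ncp}: power at alpha={threshold:.1e} is {power_two_sided_z(ncp, threshold):.3f}")
```

Under these assumptions, an effect that would be detected comfortably at a nominal 0.05 level needs a much larger non-centrality parameter (larger samples or effects) to reach comparable power at the corrected threshold.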
Affiliation(s)
- Pak C Sham
- Centre for Genomic Sciences, Jockey Club Building for Interdisciplinary Research; State Key Laboratory of Brain and Cognitive Sciences, and Department of Psychiatry, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
- Shaun M Purcell
- [1] Center for Statistical Genetics, Icahn School of Medicine at Mount Sinai, New York 10029-6574, USA. [2] Center for Human Genetic Research, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts 02114, USA
|
23
|
Liu DJ, Peloso GM, Zhan X, Holmen OL, Zawistowski M, Feng S, Nikpay M, Auer PL, Goel A, Zhang H, Peters U, Farrall M, Orho-Melander M, Kooperberg C, McPherson R, Watkins H, Willer CJ, Hveem K, Melander O, Kathiresan S, Abecasis GR. Meta-analysis of gene-level tests for rare variant association. Nat Genet 2014; 46:200-4. [PMID: 24336170 PMCID: PMC3939031 DOI: 10.1038/ng.2852] [Citation(s) in RCA: 144] [Impact Index Per Article: 13.1] [Received: 03/18/2013] [Accepted: 11/20/2013] [Indexed: 12/14/2022]
Abstract
The majority of reported complex disease associations for common genetic variants have been identified through meta-analysis, a powerful approach that enables the use of large sample sizes while protecting against common artifacts due to population structure and repeated small-sample analyses sharing individual-level data. As the focus of genetic association studies shifts to rare variants, genes and other functional units are becoming the focus of analysis. Here we propose and evaluate new approaches for performing meta-analysis of rare variant association tests, including burden tests, weighted burden tests, variable-threshold tests and tests that allow variants with opposite effects to be grouped together. We show that our approach retains useful features from single-variant meta-analysis approaches and demonstrate its use in a study of blood lipid levels in ∼18,500 individuals genotyped with exome arrays.
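As a rough illustration of how a gene-level burden test can be meta-analyzed from summary statistics, the following Python sketch sums per-study score statistics and their covariance matrices and then forms a weighted burden z-statistic. It is a simplified sketch under stated assumptions, not the authors' software: the function name, the input layout and the toy numbers are hypothetical, and each study is assumed to share the same ordered variant list.

```python
from math import sqrt
from statistics import NormalDist


def meta_burden_z(study_scores, study_covs, weights):
    """Combine per-study score vectors and covariance matrices into one burden z-statistic."""
    m = len(weights)
    # Sum score statistics and covariance matrices across studies.
    u = [sum(scores[j] for scores in study_scores) for j in range(m)]
    v = [[sum(cov[i][j] for cov in study_covs) for j in range(m)] for i in range(m)]
    # Weighted burden statistic and its variance.
    numerator = sum(weights[j] * u[j] for j in range(m))
    variance = sum(weights[i] * v[i][j] * weights[j] for i in range(m) for j in range(m))
    z = numerator / sqrt(variance)
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


# Toy summary statistics for two studies and two variants (made-up values).
scores = [[1.0, 2.0], [0.5, 0.5]]
covs = [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]]]
z, p = meta_burden_z(scores, covs, weights=[1.0, 1.0])
print(round(z, 3), round(p, 4))
```

Changing `weights` (for example, up-weighting rarer variants) gives a weighted burden test from the same shared summary statistics, which is the practical appeal of working from scores and covariances rather than individual-level data.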
Affiliation(s)
- Dajiang J. Liu
- Center for Statistical Genetics, Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI 48109
- Gina M. Peloso
- Broad Institute of Harvard and MIT, Cambridge, MA
- Center for Human Genetic Research, Massachusetts General Hospital, Boston, MA
- Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA
- Xiaowei Zhan
- Center for Statistical Genetics, Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI 48109
- Oddgeir L. Holmen
- Department of Public Health and General Practice, Norwegian University of Science and Technology, Trondheim 7489, Norway
- St. Olav Hospital, Trondheim University Hospital, Trondheim, Norway
- Matthew Zawistowski
- Center for Statistical Genetics, Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI 48109
- Shuang Feng
- Center for Statistical Genetics, Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI 48109
- Majid Nikpay
- University of Ottawa Heart Institute, Ottawa, Ontario, Canada
- Paul L. Auer
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle WA 98109, USA
- School of Public Health, University of Wisconsin-Milwaukee
- Anuj Goel
- Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford OX3 7BN, United Kingdom
- Department of Cardiovascular Medicine, University of Oxford, Oxford, UK
- He Zhang
- Division of Cardiology, Department of Internal Medicine, University of Michigan Medical School, Ann Arbor, MI 48109
- Department of Human Genetics, University of Michigan Medical School, Ann Arbor, MI 48109
- Ulrike Peters
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle WA 98109, USA
- Department of Epidemiology, University of Washington School of Public Health, Seattle, WA
- Martin Farrall
- Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford OX3 7BN, United Kingdom
- Department of Cardiovascular Medicine, University of Oxford, Oxford, UK
- Marju Orho-Melander
- Department of Cardiovascular Medicine, University of Oxford, Oxford, UK
- Department of Clinical Sciences, Lund University, Malmö, Sweden
- Charles Kooperberg
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle WA 98109, USA
- Department of Biostatistics, University of Washington School of Public Health, Seattle, WA
- Ruth McPherson
- University of Ottawa Heart Institute, Ottawa, Ontario, Canada
- Hugh Watkins
- Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford OX3 7BN, United Kingdom
- Department of Cardiovascular Medicine, University of Oxford, Oxford, UK
- Cristen J. Willer
- Division of Cardiology, Department of Internal Medicine, University of Michigan Medical School, Ann Arbor, MI 48109
- Department of Human Genetics, University of Michigan Medical School, Ann Arbor, MI 48109
- Kristian Hveem
- Department of Public Health and General Practice, Norwegian University of Science and Technology, Trondheim 7489, Norway
- Levanger Hospital, Levanger, Norway
- Olle Melander
- Department of Cardiovascular Medicine, University of Oxford, Oxford, UK
- Department of Clinical Sciences, Lund University, Malmö, Sweden
- Sekar Kathiresan
- Broad Institute of Harvard and MIT, Cambridge, MA
- Center for Human Genetic Research, Massachusetts General Hospital, Boston, MA
- Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA
- Harvard Medical School, Cambridge, MA
- Gonçalo R. Abecasis
- Center for Statistical Genetics, Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI 48109
|
24
|
Li B, Liu DJ, Leal SM. Identifying rare variants associated with complex traits via sequencing. Curr Protoc Hum Genet 2014; Chapter 1:Unit 1.26. [PMID: 23853079 DOI: 10.1002/0471142905.hg0126s78] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.4] [Indexed: 12/12/2022]
Abstract
Although genome-wide association studies have been successful in detecting associations with common variants, there is currently an increasing interest in identifying low-frequency and rare variants associated with complex traits. Next-generation sequencing technologies make it feasible to survey the full spectrum of genetic variation in coding regions or the entire genome. However, association analysis for rare variants is challenging, and traditional methods are ineffective because of the low frequency of rare variants coupled with allelic heterogeneity. Recently, a battery of new statistical methods has been proposed for identifying rare variants associated with complex traits. These methods test for associations by aggregating multiple rare variants across a gene or a genomic region or among a group of variants in the genome. In this unit, we describe key concepts for rare variant association for complex traits, survey some of the recent methods, discuss their statistical power under various scenarios, and provide practical guidance on analyzing next-generation sequencing data for identifying rare variants associated with complex traits.
Affiliation(s)
- Bingshan Li
- Department of Molecular Physiology and Biophysics, Center for Human Genetics Research, Vanderbilt University, Nashville, Tennessee, USA
|
25
|
Cardinale CJ, Kelsen JR, Baldassano RN, Hakonarson H. Impact of exome sequencing in inflammatory bowel disease. World J Gastroenterol 2013; 19:6721-9. [PMID: 24187447 PMCID: PMC3812471 DOI: 10.3748/wjg.v19.i40.6721] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Received: 08/11/2013] [Revised: 09/11/2013] [Accepted: 09/16/2013] [Indexed: 02/06/2023] Open
Abstract
Approaches to understanding the genetic contribution to inflammatory bowel disease (IBD) have continuously evolved from family- and population-based epidemiology, to linkage analysis, and most recently, to genome-wide association studies (GWAS). The next stage in this evolution seems to be the sequencing of the exome, that is, the regions of the human genome which encode proteins. The GWAS approach has been very fruitful in identifying at least 163 loci as being associated with IBD, and now, exome sequencing promises to take our genetic understanding to the next level. In this review we will discuss the possible contributions that can be made by an exome sequencing approach both at the individual patient level to aid with disease diagnosis and future therapies, as well as in advancing knowledge of the pathogenesis of IBD.
|
26
|
Handel AE, Disanto G, Ramagopalan SV. Next-generation sequencing in understanding complex neurological disease. Expert Rev Neurother 2013; 13:215-27. [PMID: 23368808 DOI: 10.1586/ern.12.165] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Indexed: 12/13/2022]
Abstract
Next-generation sequencing techniques have made vast quantities of data on human genomes and transcriptomes available to researchers. Huge progress has been made towards understanding the basis of many Mendelian neurological conditions, but progress has been considerably slower in complex neurological diseases (multiple sclerosis, migraine, Alzheimer's disease, Parkinson's disease, amyotrophic lateral sclerosis, and so on). The authors review current next-generation sequencing methodologies and present selected studies illustrating how these have been used to cast light on the genetic etiology of complex neurological diseases with specific focus on multiple sclerosis. The authors highlight particular pitfalls in next-generation sequencing experiments and speculate on both clinical and research applications of these sequencing platforms for complex neurological disorders in the future.
Affiliation(s)
- Adam E Handel
- Department of Physiology, Anatomy and Genetics, University of Oxford, UK
|
27
|
Panoutsopoulou K, Tachmazidou I, Zeggini E. In search of low-frequency and rare variants affecting complex traits. Hum Mol Genet 2013; 22:R16-21. [PMID: 23922232 PMCID: PMC3782074 DOI: 10.1093/hmg/ddt376] [Citation(s) in RCA: 64] [Impact Index Per Article: 5.3] [Indexed: 11/13/2022] Open
Abstract
The allelic architecture of complex traits is likely to be underpinned by a combination of multiple common and rare variants. Targeted genotyping arrays and next-generation sequencing technologies at the whole-genome (WGS) and whole-exome (WES) scales are increasingly employed to access sequence variation across the full minor allele frequency (MAF) spectrum. Different study design strategies that make use of diverse technologies, imputation and sample selection approaches are an active target of development and evaluation efforts. Initial insights into the contribution of rare variants in common diseases and medically relevant quantitative traits point to low-frequency and rare alleles acting either independently or in aggregate and in several cases alongside common variants. Studies conducted in population isolates have been successful in detecting rare variant associations with complex phenotypes. Statistical methodologies that enable the joint analysis of rare variants across regions of the genome continue to evolve with current efforts focusing on incorporating information such as functional annotation, and on the meta-analysis of these burden tests. In addition, population stratification, defining genome-wide statistical significance thresholds and the design of appropriate replication experiments constitute important considerations for the powerful analysis and interpretation of rare variant association studies. Progress in addressing these emerging challenges and the accrual of sufficiently large data sets are poised to help the field of complex trait genetics enter a promising era of discovery.
Affiliation(s)
- Eleftheria Zeggini
- To whom correspondence should be addressed at: Wellcome Trust Sanger Institute, The Morgan Building, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1HH, UK. Tel: +44-1223496868; Fax: +44-1223496826
|
28
|
Aberer AJ, Stamatakis A. Rapid forward-in-time simulation at the chromosome and genome level. BMC Bioinformatics 2013; 14:216. [PMID: 23834340 PMCID: PMC3718712 DOI: 10.1186/1471-2105-14-216] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.1] [Received: 04/07/2013] [Accepted: 07/03/2013] [Indexed: 11/10/2022] Open
Abstract
Background: In population genetics, simulation is a fundamental tool for analyzing how basic evolutionary forces such as natural selection, recombination, and mutation shape the genetic landscape of a population. Forward simulation represents the most powerful, but, at the same time, most compute-intensive approach for simulating the genetic material of a population. Results: We introduce AnA-FiTS, a highly optimized forward simulation software, that is up to two orders of magnitude faster than current state-of-the-art software. In addition, we present a novel algorithm that further improves runtimes by up to an additional order of magnitude, for simulations where a fraction of the mutations is neutral (e.g., only 10% of mutations have an effect on fitness). Apart from simulated sequences, our tool also generates a graph structure that depicts the complete observable history of neutral mutations. Conclusions: The substantial performance improvements allow for conducting forward simulations at the chromosome and genome level. The graph structure generated by our algorithm can give rise to novel approaches for visualizing and analyzing the output of forward simulations.
Affiliation(s)
- Andre J Aberer
- The Exelixis Lab, Scientific Computing Group, Heidelberg Institute for Theoretical Studies, Schloss-Wolfsbrunnenweg 35, Heidelberg D-69118, Germany.
|
29
|
Wu G, Zhi D. Pathway-based approaches for sequencing-based genome-wide association studies. Genet Epidemiol 2013; 37:478-94. [PMID: 23650134 DOI: 10.1002/gepi.21728] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.6] [Received: 12/29/2012] [Revised: 03/04/2013] [Accepted: 03/29/2013] [Indexed: 01/07/2023]
Abstract
For analyzing complex trait association with sequencing data, most current studies test aggregated effects of variants in a gene or genomic region. Although gene-based tests have insufficient power even for moderately sized samples, pathway-based analyses combine information across multiple genes in biological pathways and may offer additional insight. However, most existing pathway association methods were originally designed for genome-wide association studies, and have not been comprehensively evaluated for sequencing data. Moreover, region-based rare variant association methods, although potentially applicable to pathway-based analysis by extending their region definition to gene sets, have never been rigorously tested. In the context of exome-based studies, we use simulated and real datasets to evaluate pathway-based association tests. Our simulation strategy adopts a genome-wide genetic model that distributes total genetic effects hierarchically into pathways, genes, and individual variants, allowing the evaluation of pathway-based methods with realistic quantifiable assumptions on the underlying genetic architectures. The results show that, although no single pathway-based association method offers superior performance in all simulated scenarios, a modification of the Gene Set Enrichment Analysis approach using statistics from single-marker tests without gene-level collapsing (weighted Kolmogorov-Smirnov [WKS]-Variant method) is consistently powerful. Interestingly, directly applying rare variant association tests (e.g., sequence kernel association test) to pathway analysis offers similar power, but its results are sensitive to assumptions of genetic architecture. We applied pathway association analysis to an exome-sequencing dataset for chronic obstructive pulmonary disease, and found that the WKS-Variant method confirms previously published associated genes.
Affiliation(s)
- Guodong Wu
- Department of Biostatistics, University of Alabama at Birmingham, Birmingham, Alabama 35294, USA
|
30
|
Zhao LP, Huang X. Recursive organizer (ROR): an analytic framework for sequence-based association analysis. Hum Genet 2013; 132:745-59. [PMID: 23494241 DOI: 10.1007/s00439-013-1285-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 10/10/2012] [Accepted: 03/03/2013] [Indexed: 12/13/2022]
Abstract
The advent of next-generation sequencing technologies affords the ability to sequence thousands of subjects cost-effectively, and is revolutionizing the landscape of genetic research. With the evolving genotyping/sequencing technologies, it is not unrealistic to expect that we will soon obtain a pair of fully phased diploid genome sequences from each subject. Here, in light of this potential, we propose an analytic framework called recursive organizer (ROR), which recursively groups sequence variants, based upon sequence similarities and their empirical disease associations, into fewer and potentially more interpretable super sequence variants (SSVs). As an illustration, we applied ROR to assess an association between HLA-DRB1 and type 1 diabetes (T1D), discovering SSVs of HLA-DRB1 with sequence data from the Wellcome Trust Case Control Consortium. Specifically, ROR reduces 36 observed unique HLA-DRB1 sequences into 8 SSVs that empirically associate with T1D, a fourfold reduction of sequence complexity. Using HLA-DRB1 data from the Type 1 Diabetes Genetics Consortium as cases and data from the Fred Hutchinson Cancer Research Center as controls, we are able to validate associations of these SSVs with T1D. Further, SSVs consist of nine nucleotides, and each associates with its corresponding amino acids. Detailed examination of these selected amino acids reveals their potential functional roles in protein structures and possible implications for the mechanism of T1D.
Affiliation(s)
- Lue Ping Zhao
- Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, 1100 Fairview Avenue North, Mailstop M2-B500, P.O. Box 19024, Seattle, WA 98109-1024, USA.
|
31
|
Amish revisited: next-generation sequencing studies of psychiatric disorders among the Plain people. Trends Genet 2013; 29:412-8. [PMID: 23422049 DOI: 10.1016/j.tig.2013.01.007] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.8] [Received: 09/20/2012] [Revised: 01/08/2013] [Accepted: 01/22/2013] [Indexed: 11/23/2022]
Abstract
The rapid development of next-generation sequencing (NGS) technology has led to renewed interest in the potential contribution of rarer forms of genetic variation to complex non-Mendelian phenotypes such as psychiatric illnesses. Although challenging, family-based studies offer some advantages, especially in communities with large families and a limited number of founders. Here we revisit family-based studies of mental illnesses in traditional Amish and Mennonite communities, known collectively as the Plain people. We discuss the new opportunities for NGS in these populations, with particular emphasis on investigating psychiatric disorders. We also address some of the challenges facing NGS-based studies of complex phenotypes in founder populations.
|
32
|
Chen YC, Carter H, Parla J, Kramer M, Goes FS, Pirooznia M, Zandi PP, McCombie WR, Potash JB, Karchin R. A hybrid likelihood model for sequence-based disease association studies. PLoS Genet 2013; 9:e1003224. [PMID: 23358228 PMCID: PMC3554549 DOI: 10.1371/journal.pgen.1003224] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.5] [Received: 07/11/2012] [Accepted: 11/21/2012] [Indexed: 11/18/2022] Open
Abstract
In the past few years, case-control studies of common diseases have shifted their focus from single genes to whole exomes. New sequencing technologies now routinely detect hundreds of thousands of sequence variants in a single study, many of which are rare or even novel. The limitation of classical single-marker association analysis for rare variants has been a challenge in such studies. A new generation of statistical methods for case-control association studies has been developed to meet this challenge. A common approach to association analysis of rare variants is to use burden-style collapsing methods, which combine rare variant data within individuals across or within genes. Here, we propose a new hybrid likelihood model that combines a burden test with a test of the position distribution of variants. In extensive simulations and on empirical data from the Dallas Heart Study, the new model demonstrates consistently good power, in particular when applied to a gene set (e.g., multiple candidate genes with shared biological function or pathway), when rare variants cluster in key functional regions of a gene, and when protective variants are present. When applied to data from an ongoing sequencing study of bipolar disorder (191 cases, 107 controls), the model identifies seven gene sets with nominal p-values < 0.05, of which one MAPK signaling pathway (KEGG) reaches trend-level significance after correcting for multiple testing.
Inexpensive, high-throughput sequencing has transformed the field of case-control association studies. For the first time, it may be possible to identify the genetic underpinnings of complex diseases, by sequencing the DNA of hundreds (even thousands) of cases and controls and comparing patterns of DNA sequence variation. However, complex diseases are likely to be caused by many variants, some of which are very rare. Taken one at a time, the association between variant and disease phenotype may not be detectable by current statistical methods. One strategy is to identify regions where important variants occur by “collapsing” variants into groups. Here, we present a new collapsing approach, capable of detecting subtle genetic differences between cases and controls. We show, in extensive simulations and using a benchmark set of genes involved in human triglyceride levels, that the approach is potentially more powerful than existing methods. We apply the new method to an ongoing sequencing study of bipolar cases and controls and identify a set of genes found in neuronal synapses, which may be implicated in bipolar disorder.
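The general idea of "collapsing" rare variants can be sketched in Python as a simple carrier-based burden test. This is a generic illustration, not the hybrid likelihood model proposed in the paper: individuals are collapsed to carrier or non-carrier of any rare variant in a region, and carrier counts in cases and controls are compared with a two-sided Fisher exact test. All function names and genotype data below are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x carriers among the cases.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables that are no more likely than the observed one.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def count_carriers(genotypes):
    """Collapse step: a person is a carrier if any rare-variant allele count is > 0."""
    return sum(1 for person in genotypes if any(g > 0 for g in person))


def collapsing_burden_test(case_genotypes, control_genotypes):
    """Return the carrier table (a, b, c, d) and its Fisher exact p-value."""
    a = count_carriers(case_genotypes)
    b = len(case_genotypes) - a
    c = count_carriers(control_genotypes)
    d = len(control_genotypes) - c
    return (a, b, c, d), fisher_exact_two_sided(a, b, c, d)


# Toy data: rows are individuals, columns are rare variants in one gene (made-up values).
cases = [[0, 1, 0], [0, 0, 0], [1, 0, 0], [0, 0, 2]]
controls = [[0, 0, 0], [0, 0, 0], [0, 1, 0], [0, 0, 0]]
table, p_value = collapsing_burden_test(cases, controls)
print(table, round(p_value, 4))  # (3, 1, 1, 3) 0.4857
```

Because every variant contributes in the same direction, a test of this kind loses power when protective and risk variants occur in the same region, which is one of the scenarios the abstract highlights.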
Affiliation(s)
- Yun-Ching Chen
- Department of Biomedical Engineering and Institute for Computational Medicine, Johns Hopkins University, Baltimore, Maryland, United States of America
- Hannah Carter
- Department of Biomedical Engineering and Institute for Computational Medicine, Johns Hopkins University, Baltimore, Maryland, United States of America
- Jennifer Parla
- Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
- Melissa Kramer
- Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
- Fernando S. Goes
- Department of Psychiatry and Behavioral Sciences, Johns Hopkins School of Medicine, Baltimore, Maryland, United States of America
- Mehdi Pirooznia
- Department of Psychiatry and Behavioral Sciences, Johns Hopkins School of Medicine, Baltimore, Maryland, United States of America
- Peter P. Zandi
- Department of Psychiatry and Behavioral Sciences, Johns Hopkins School of Medicine, Baltimore, Maryland, United States of America
- W. Richard McCombie
- Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
- James B. Potash
- Department of Psychiatry, University of Iowa, Iowa City, Iowa, United States of America
- Rachel Karchin
- Department of Biomedical Engineering and Institute for Computational Medicine, Johns Hopkins University, Baltimore, Maryland, United States of America
|
33
|
Empirical power of very rare variants for common traits and disease: results from Sanger sequencing 1998 individuals. Eur J Hum Genet 2013; 21:1027-30. [PMID: 23321613 DOI: 10.1038/ejhg.2012.284] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Received: 05/18/2012] [Revised: 10/12/2012] [Accepted: 11/22/2013] [Indexed: 11/09/2022]
Abstract
The optimal study design for identifying rare variants associated with common disease is not yet clear, and researchers have to decide whether to prioritize lower sequencing coverage on larger sample sizes or higher coverage on smaller sample sizes. High-coverage sequencing affords several advantages, such as genotype accuracy and improved identification of very rare variants, but this comes at increased cost. However, the magnitude of the contribution of very rare variants to the statistical power of gene-based association tests is unknown. By using Sanger sequence data on seven genes from 1998 subjects with simulated phenotypes, we provide evidence that excluding very rare variants, in general, reduces the statistical power of rare variant association tests only modestly. However, if the probability of being causal and the effect size of the causal variants are inversely related to the minor allele frequency, then very rare variants do contribute some power, although the absolute power remains low. As very rare variants constitute the majority of variants identified in sequencing studies, these findings suggest that careful attention needs to be paid to the plausible relationships that exist between very rare variants and common disease.
|
34
|
Liu DJ, Leal SM. A unified method for detecting secondary trait associations with rare variants: application to sequence data. PLoS Genet 2012; 8:e1003075. [PMID: 23166519 PMCID: PMC3499373 DOI: 10.1371/journal.pgen.1003075] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 06/18/2012] [Accepted: 09/23/2012] [Indexed: 01/11/2023]
Abstract
Next-generation sequencing has made possible the detection of rare variant (RV) associations with quantitative traits (QT). Due to high sequencing cost, many studies can only sequence a modest number of selected samples with extreme QT. Therefore association testing in individual studies can be underpowered. Besides the primary trait, many clinically important secondary traits are often measured. It is highly beneficial if multiple studies can be jointly analyzed for detecting associations with commonly measured traits. However, analyzing secondary traits in selected samples can be biased if sample ascertainment is not properly modeled. Some methods exist for analyzing secondary traits in selected samples, where some burden tests can be implemented. However p-values can only be evaluated analytically via asymptotic approximations, which may not be accurate. Additionally, potentially more powerful sequence kernel association tests, variable selection-based methods, and burden tests that require permutations cannot be incorporated. To overcome these limitations, we developed a unified method for analyzing secondary trait associations with RVs (STAR) in selected samples, incorporating all RV tests. Statistical significance can be evaluated either through permutations or analytically. STAR makes it possible to apply more powerful RV tests to analyze secondary trait associations. It also enables jointly analyzing multiple cohorts ascertained under different study designs, which greatly boosts power. The performance of STAR and commonly used RV association tests were comprehensively evaluated using simulation studies. STAR was also implemented to analyze a dataset from the SardiNIA project where samples with extreme low-density lipoprotein levels were sequenced. A significant association between LDLR and systolic blood pressure was identified, which is supported by pharmacogenetic studies. 
In summary, for sequencing studies, STAR is an important tool for detecting secondary-trait RV associations.
Next-generation sequencing has greatly expanded our ability to identify missing heritability due to rare variants. In order to increase the power to detect associations, one desirable study design is to combine samples from multiple cohorts for mapping commonly measured traits. However, many current studies sequence selected samples (e.g. samples with extreme QT), which can bias the analysis of secondary traits, unless the sampling ascertainment mechanisms are properly adjusted. We developed a unified method for detecting secondary trait associations with rare variants (STAR) in selected and random samples, which can flexibly incorporate all rare variant association tests and allow joint analysis of multiple cohorts ascertained under different study designs. We demonstrate via simulations that STAR greatly boosts the power for detecting secondary trait associations. As an application of STAR, a dataset from the SardiNIA project was analyzed, where DNA samples from well-phenotyped individuals with extreme low-density lipoprotein levels were sequenced. LDLR was identified to be significantly associated with systolic blood pressure, which is supported by a previous pharmacogenetics study. In conclusion, STAR is an important tool for sequence-based association studies.
Affiliation(s)
- Dajiang J. Liu
- Department of Biostatistics, Center of Statistical Genetics, University of Michigan, Ann Arbor, Michigan, United States of America
- Suzanne M. Leal
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, United States of America
|
35
|
Liu D, Leal S. Estimating genetic effects and quantifying missing heritability explained by identified rare-variant associations. Am J Hum Genet 2012; 91:585-96. [PMID: 23022102 DOI: 10.1016/j.ajhg.2012.08.008] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.8] [Received: 02/16/2012] [Revised: 06/19/2012] [Accepted: 08/08/2012] [Indexed: 01/01/2023]
Abstract
Next-generation sequencing has led to many complex-trait rare-variant (RV) association studies. Although single-variant association analysis can be performed, it is grossly underpowered. Therefore, researchers have developed many RV association tests that aggregate multiple variant sites across a genetic region (e.g., gene), and test for the association between the trait and the aggregated genotype. After these aggregate tests detect an association, it is only possible to estimate the average genetic effect for a group of RVs. As a result of the "winner's curse," such an estimate can be biased. Although for common variants one can obtain unbiased estimates of genetic parameters by analyzing a replication sample, for RVs it is desirable to obtain unbiased genetic estimates for the study where the association is identified. This is because there can be substantial heterogeneity of RV sites and frequencies even among closely related populations. In order to obtain an unbiased estimate for aggregated RV analysis, we developed bootstrap-sample-split algorithms to reduce the bias of the winner's curse. The unbiased estimates are greatly important for understanding the population-specific contribution of RVs to the heritability of complex traits. We also demonstrate both theoretically and via simulations that for aggregate RV analysis the genetic variance for a gene or region will always be underestimated, sometimes substantially, because of the presence of noncausal variants or because of the presence of causal variants with effects of different magnitudes or directions. Therefore, even if RVs play a major role in the complex-trait etiologies, a portion of the heritability will remain missing, and the contribution of RVs to the complex-trait etiologies will be underestimated.
|
36
|
Single Nucleotide Polymorphism (SNP) Detection and Genotype Calling from Massively Parallel Sequencing (MPS) Data. Statistics in Biosciences 2012; 5:3-25. [PMID: 24489615 DOI: 10.1007/s12561-012-9067-4] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Indexed: 10/28/2022]
Abstract
Massively parallel sequencing (MPS), since its debut in 2005, has transformed the field of genomic studies. These new sequencing technologies have resulted in the successful identification of causal variants for several rare Mendelian disorders. They have also begun to deliver on their promise to explain some of the missing heritability from genome-wide association studies (GWAS) of complex traits. We anticipate a rapidly growing number of MPS-based studies for a diverse range of applications in the near future. One crucial and nearly inevitable step is to detect SNPs and call genotypes at the detected polymorphic sites from the sequencing data. Here, we review statistical methods that have been proposed in the past five years for this purpose. In addition, we discuss emerging issues and future directions related to SNP detection and genotype calling from MPS data.
|
37
|
Liu DJ, Leal SM. SEQCHIP: a powerful method to integrate sequence and genotype data for the detection of rare variant associations. Bioinformatics 2012; 28:1745-51. [PMID: 22556370 DOI: 10.1093/bioinformatics/bts263] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Indexed: 12/26/2022]
Abstract
MOTIVATION: Next-generation sequencing greatly increases the capacity to detect rare-variant complex-trait associations. However, it is still expensive to sequence a large number of samples and therefore often small datasets are used. Given cost constraints, a potentially more powerful two-step strategy is to sequence a subset of the sample to discover variants, and genotype the identified variants in the remaining sample. If only cases are sequenced, directly combining sequence and genotype data will lead to inflated type-I errors in rare-variant association analysis. Although several methods have been developed to correct for the bias, they are either underpowered or theoretically invalid. We proposed a new method SEQCHIP to integrate genotype and sequence data, which can be used with most existing rare-variant tests.
RESULTS: It is demonstrated using both simulated and real datasets that the SEQCHIP method has controlled type-I errors, and is substantially more powerful than all other currently available methods.
AVAILABILITY: SEQCHIP is implemented in an R package and is available at http://linkage.rockefeller.edu/suzanne/seqchip/Seqchip.html.
Affiliation(s)
- Dajiang J Liu
- Department of Biostatistics, Center of Statistical Genetics, University of Michigan, Ann Arbor, MI 48109, USA.
|
38
|
Liu DJ, Leal SM. A unified framework for detecting rare variant quantitative trait associations in pedigree and unrelated individuals via sequence data. Hum Hered 2012; 73:105-22. [PMID: 22555759 DOI: 10.1159/000336293] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Received: 10/17/2011] [Accepted: 01/07/2012] [Indexed: 11/19/2022]
Abstract
OBJECTIVES: There is great interest in sequencing unrelated or pedigree samples for detecting rare variant quantitative trait associations. In order to reduce the cost of sequencing and improve power, many studies sequence selected samples with extreme traits. Existing methods for detecting rare variant associations were developed for unrelated samples. Methods are needed to analyze (selected or randomly ascertained) pedigree samples.
METHODS: We propose a unified framework of modeling extreme trait genetic associations (MEGA) with rare variants. Using MEGA and appropriate permutation algorithms, many rare variant tests can be extended to family data. As an application, we compared study designs using both sib-pairs and unrelated individuals. Extensive simulations were carried out using realistic population genetic and complex trait models.
RESULTS: It is demonstrated that when extreme sampling is implemented within equal-sized cohorts of unrelated individuals or sib-pairs, analyzing unrelated individuals is consistently more powerful than studying sib-pairs. A higher portion of rare variants can be identified through sequencing unrelated samples compared to sibs. Alternatively, if samples are ascertained using fixed thresholds from an infinite-sized population, sequencing one sib with the most extreme trait from each extreme concordant sib-pair is consistently the most powerful design.
CONCLUSIONS: MEGA will play an important role in the analysis of sequence-based genetic association studies.
Affiliation(s)
- Dajiang J Liu
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
|
39
|
Chen Z, Craiu RV, Bull SB. Two-Phase Stratified Sampling Designs for Regional Sequencing. Genet Epidemiol 2012; 36:320-32. [DOI: 10.1002/gepi.21624] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 04/29/2011] [Revised: 01/16/2012] [Accepted: 01/17/2012] [Indexed: 12/12/2022]
Affiliation(s)
- Zhijian Chen
- Samuel Lunenfeld Research Institute of Mount Sinai Hospital, Toronto, ON, Canada
- Radu V. Craiu
- Department of Statistics, University of Toronto, Toronto, ON, Canada
|
40
|
Zhi D, Chen R. Statistical guidance for experimental design and data analysis of mutation detection in rare monogenic mendelian diseases by exome sequencing. PLoS One 2012; 7:e31358. [PMID: 22348076 PMCID: PMC3277495 DOI: 10.1371/journal.pone.0031358] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.7] [Received: 08/05/2011] [Accepted: 01/06/2012] [Indexed: 01/19/2023]
Abstract
Recently, whole-genome sequencing, especially exome sequencing, has successfully led to the identification of causal mutations for rare monogenic Mendelian diseases. However, it is unclear whether this approach can be generalized and effectively applied to other Mendelian diseases with high locus heterogeneity. Moreover, the current exome sequencing approach has limitations such as false positive and false negative rates of mutation detection due to sequencing errors and other artifacts, but the impact of these limitations on experimental design has not been systematically analyzed. To address these questions, we present a statistical modeling framework to calculate the power, the probability of identifying truly disease-causing genes, under various inheritance models and experimental conditions, providing guidance for both proper experimental design and data analysis. Based on our model, we found that the exome sequencing approach is well-powered for mutation detection in recessive, but not dominant, Mendelian diseases with high locus heterogeneity. A disease gene responsible for as low as 5% of the disease population can be readily identified by sequencing just 200 unrelated patients. Based on these results, for identifying rare Mendelian disease genes, we propose that a viable approach is to combine, sequence, and analyze patients with the same disease together, leveraging the statistical framework presented in this work.
Affiliation(s)
- Degui Zhi
- Section on Statistical Genetics, Department of Biostatistics, University of Alabama at Birmingham, Birmingham, Alabama, United States of America
- Rui Chen
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, United States of America
|
41
|
Ladouceur M, Dastani Z, Aulchenko YS, Greenwood CMT, Richards JB. The empirical power of rare variant association methods: results from Sanger sequencing in 1,998 individuals. PLoS Genet 2012; 8:e1002496. [PMID: 22319458 PMCID: PMC3271058 DOI: 10.1371/journal.pgen.1002496] [Citation(s) in RCA: 88] [Impact Index Per Article: 6.8] [Received: 05/26/2011] [Accepted: 12/08/2011] [Indexed: 01/09/2023]
Abstract
The role of rare genetic variation in the etiology of complex disease remains unclear. However, the development of next-generation sequencing technologies offers the experimental opportunity to address this question. Several novel statistical methodologies have been recently proposed to assess the contribution of rare variation to complex disease etiology. Nevertheless, no empirical estimates comparing their relative power are available. We therefore assessed the parameters that influence their statistical power in 1,998 individuals Sanger-sequenced at seven genes by modeling different distributions of effect, proportions of causal variants, and direction of the associations (deleterious, protective, or both) in simulated continuous trait and case/control phenotypes. Our results demonstrate that the power of recently proposed statistical methods depends strongly on the underlying hypotheses concerning the relationship of phenotypes with each of these three factors. No method demonstrates consistently acceptable power despite this large sample size, and the performance of each method depends upon the underlying assumption of the relationship between rare variants and complex traits. Sensitivity analyses are therefore recommended to compare the stability of the results arising from different methods, and promising results should be replicated using the same method in an independent sample. These findings provide guidance in the analysis and interpretation of the role of rare base-pair variation in the etiology of complex traits and diseases.
Affiliation(s)
- Martin Ladouceur
- Department of Human Genetics, McGill University, Montreal, Canada
- Lady Davis Institute for Medical Research, Jewish General Hospital, Montreal, Canada
- Zari Dastani
- Lady Davis Institute for Medical Research, Jewish General Hospital, Montreal, Canada
- Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Montreal, Canada
- Yurii S. Aulchenko
- Department of Epidemiology, Erasmus MC, Rotterdam, The Netherlands
- Institute of Cytology and Genetics SD RAS, Novosibirsk, Russia
- Celia M. T. Greenwood
- Lady Davis Institute for Medical Research, Jewish General Hospital, Montreal, Canada
- Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Montreal, Canada
- Department of Oncology, McGill University, Montreal, Canada
- J. Brent Richards
- Department of Human Genetics, McGill University, Montreal, Canada
- Lady Davis Institute for Medical Research, Jewish General Hospital, Montreal, Canada
- Department of Medicine, Jewish General Hospital, McGill University, Montreal, Canada
- Twin Research and Genetic Epidemiology, King's College London, London, United Kingdom
|
42
|
Abstract
We have witnessed tremendous success in genome-wide association studies (GWAS) in recent years. Since the identification of variants in the complement factor H gene on the risk of age-related macular degeneration, GWAS have become ubiquitous in genetic studies and have led to the identification of genetic variants that are associated with a variety of complex human diseases and traits. These discoveries have changed our understanding of the biological architecture of common, complex diseases and have also provided new hypotheses to test. New tools, such as next-generation sequencing, will be an important part of the future of genetics research; however, GWAS studies will continue to play an important role in disease gene discovery. Many traits have yet to be explored by GWAS, especially in minority populations, and large collaborative studies are currently being conducted to maximize the return from existing GWAS data. In addition, GWAS technology continues to improve, increasing genomic coverage for major global populations and decreasing the cost of experiments. Although much of the variance attributable to genetic factors for many important traits is still unexplained, GWAS technology has been instrumental in mapping over a thousand genes to hundreds of traits. More discoveries are made each month and the scale, quality and quantity of current work has a steady trend upward. We briefly review the current key trends in GWAS, which can be summarized with three goals: increase power, increase collaborations and increase populations.
|
43
|
Liu DJ, Leal SM. A flexible likelihood framework for detecting associations with secondary phenotypes in genetic studies using selected samples: application to sequence data. Eur J Hum Genet 2011; 20:449-56. [PMID: 22166943 DOI: 10.1038/ejhg.2011.211] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Indexed: 01/01/2023]
Abstract
For most complex trait association studies using next-generation sequencing, in addition to the primary phenotype of interest, many clinically important secondary traits are also available, which can be analyzed to map susceptibility genes. Owing to high sequencing costs, most studies use selected samples, and the sampling mechanisms of these studies can be complicated. When the primary and secondary traits are correlated, analyses of secondary phenotypes can cause spurious associations in selected samples and existing methods are inadequate to adjust for them. To address this problem, a likelihood-based method, MULTI-TRAIT-ASSOCIATION (MTA) was developed. MTA is flexible and can be applied to any study with known sampling mechanisms. It also allows efficient inferences of genetic parameters. To investigate the power of MTA and different study designs, extensive simulations were performed under rigorous population genetic and phenotypic models. It is demonstrated that there are great benefits for analyzing secondary phenotypes in selected samples. In particular, using case-control samples and samples with extreme primary phenotypes can be more powerful than analyzing random samples of equivalent size. One major challenge for sequence-based association studies is that most data sets are not of sufficient size to be adequately powered. By applying MTA, data sets ascertained under distinct mechanisms or targeted at different primary traits can be jointly analyzed to map common phenotypes and greatly increase power. The combined analysis can be performed using freely available data sets from public repositories, for example, dbGaP. In conclusion, MTA will have an important role in dissecting the etiology of complex traits.
Affiliation(s)
- Dajiang J Liu
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
|
44
|
Ramsey LB, Bruun GH, Yang W, Treviño LR, Vattathil S, Scheet P, Cheng C, Rosner GL, Giacomini KM, Fan Y, Sparreboom A, Mikkelsen TS, Corydon TJ, Pui CH, Evans WE, Relling MV. Rare versus common variants in pharmacogenetics: SLCO1B1 variation and methotrexate disposition. Genome Res 2011; 22:1-8. [PMID: 22147369 DOI: 10.1101/gr.129668.111] [Citation(s) in RCA: 211] [Impact Index Per Article: 15.1] [Indexed: 02/07/2023]
Abstract
Methotrexate is used to treat autoimmune diseases and malignancies, including acute lymphoblastic leukemia (ALL). Inter-individual variation in clearance of methotrexate results in heterogeneous systemic exposure, clinical efficacy, and toxicity. In a genome-wide association study of children with ALL, we identified SLCO1B1 as harboring multiple common polymorphisms associated with methotrexate clearance. The extent of influence of rare versus common variants on pharmacogenomic phenotypes remains largely unexplored. We tested the hypothesis that rare variants in SLCO1B1 could affect methotrexate clearance and compared the influence of common versus rare variants in addition to clinical covariates on clearance. From deep resequencing of SLCO1B1 exons in 699 children, we identified 93 SNPs, 15 of which were non-synonymous (NS). Three of these NS SNPs were common, with a minor allele frequency (MAF) >5%, one had low frequency (MAF 1%-5%), and 11 were rare (MAF <1%). NS SNPs (common or rare) predicted to be functionally damaging were more likely to be found among patients with the lowest methotrexate clearance than patients with high clearance. We verified lower function in vitro of four SLCO1B1 haplotypes that were associated with reduced methotrexate clearance. In a multivariate stepwise regression analysis adjusting for other genetic and non-genetic covariates, SLCO1B1 variants accounted for 10.7% of the population variability in clearance. Of that variability, common NS variants accounted for the majority, but rare damaging NS variants constituted 17.8% of SLCO1B1's effects (1.9% of total variation) and had larger effect sizes than common NS variants. Our results show that rare variants are likely to have an important effect on pharmacogenetic phenotypes.
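The frequency classes and the variance arithmetic quoted in this abstract can be checked with a short sketch. The thresholds (common > 5%, low frequency 1%-5%, rare < 1%) and the two percentages come from the abstract; the function and variable names are hypothetical.

```python
def maf_class(maf):
    """Classify a minor allele frequency using the thresholds quoted in the abstract."""
    if maf > 0.05:
        return "common"
    if maf >= 0.01:
        return "low frequency"
    return "rare"


# Figures quoted in the abstract.
slco1b1_share = 0.107  # share of clearance variability attributed to SLCO1B1 variants
rare_share = 0.178     # share of SLCO1B1's effect attributed to rare damaging NS variants

# Rare damaging variants as a share of total variation in clearance.
rare_total = slco1b1_share * rare_share
print(f"{rare_total:.1%}")  # 1.9%
```

The product 0.107 × 0.178 ≈ 0.019 agrees with the "1.9% of total variation" stated in the abstract.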
Affiliation(s)
- Laura B Ramsey
- Pharmaceutical Sciences Department, St. Jude Children's Research Hospital, Memphis, Tennessee 38105, USA
|
45
|
Abstract
We propose a two-stage design for the analysis of sequence variants in which a proportion of genes that show some evidence of association are identified initially and then followed up in an independent data set. We compare two different approaches. In both approaches the same summary measure (total number of minor alleles) is used for each gene in the initial analysis. In the first (simple) approach the same summary measure is used in the analysis of the independent data set. In the second (alternative) approach a more specific hypothesis is formed for the second stage; the summary measure used is the count of minor alleles in only those variants that in the initial data showed the same direction of association as was seen overall. We applied the methods to the simulated quantitative traits of Genetic Analysis Workshop 17, blind to the simulation model, and then evaluated their performance once the underlying model was known. Performance was similar for most genes, but the simple strategy considerably out-performed the alternative strategy for one gene, where most of the effect was due to very rare variants; this suggests that the alternative approach would not be advisable when the effect is seen in very rare variants. Further simulations are needed to investigate the potential superior power of the alternative method when some variants within a gene have opposing effects. Overall, the power to detect associations was low; this was also true when using a more powerful joint analysis that combined the two stages of the study.
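The two summary measures compared in this abstract can be sketched as follows. The abstract specifies only the measures themselves (total minor-allele count, and the count restricted to variants whose stage-one direction matches the overall direction); using the sign of a sample covariance to define "direction" is an assumption made here for illustration, and all names and toy data are hypothetical.

```python
def covariance_sign(xs, ys):
    """Sign (+1, 0, -1) of the sample covariance between two equal-length lists."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return (cov > 0) - (cov < 0)


def simple_burden(genotypes):
    """Simple measure: total number of minor alleles per individual in the gene."""
    return [sum(person) for person in genotypes]


def concordant_variants(genotypes, trait):
    """Indices of variants whose stage-one direction matches the overall direction."""
    overall = covariance_sign(simple_burden(genotypes), trait)
    n_variants = len(genotypes[0])
    return [
        j
        for j in range(n_variants)
        if covariance_sign([person[j] for person in genotypes], trait) == overall
    ]


def restricted_burden(genotypes, keep):
    """Alternative measure: minor-allele count over the selected variants only."""
    return [sum(person[j] for j in keep) for person in genotypes]


# Stage one (toy data, assumed): 4 individuals, 3 variants, quantitative trait.
stage1_genotypes = [[1, 0, 0], [0, 0, 1], [1, 1, 0], [0, 0, 0]]
stage1_trait = [2.0, -1.0, 3.0, 0.0]
keep = concordant_variants(stage1_genotypes, stage1_trait)

# Stage two: apply both measures to independent genotypes.
stage2_genotypes = [[0, 1, 1], [1, 0, 0]]
print(keep)                                       # [0, 1]
print(simple_burden(stage2_genotypes))            # [2, 1]
print(restricted_burden(stage2_genotypes, keep))  # [1, 1]
```

With very rare variants, a per-variant direction estimated from a handful of carriers is noisy, which is consistent with the abstract's caution about the alternative approach in that setting.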
|
46
|
Stitziel NO, Kiezun A, Sunyaev S. Computational and statistical approaches to analyzing variants identified by exome sequencing. Genome Biol 2011; 12:227. [PMID: 21920052 PMCID: PMC3308043 DOI: 10.1186/gb-2011-12-9-227] [Citation(s) in RCA: 99] [Impact Index Per Article: 7.1] [Indexed: 01/30/2023]
Abstract
New sequencing technology has enabled the identification of thousands of single nucleotide polymorphisms in the exome, and many computational and statistical approaches to identify disease-association signals have emerged.
Affiliation(s)
- Nathan O Stitziel
- Division of Cardiovascular Medicine, Brigham and Women’s Hospital, Harvard Medical School, 75 Francis Street, Boston, MA 02115, USA
|
47
|
Feng BJ, Tavtigian SV, Southey MC, Goldgar DE. Design considerations for massively parallel sequencing studies of complex human disease. PLoS One 2011; 6:e23221. [PMID: 21850262 PMCID: PMC3151293 DOI: 10.1371/journal.pone.0023221] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.4] [Received: 03/25/2011] [Accepted: 07/14/2011] [Indexed: 12/24/2022]
Abstract
Massively Parallel Sequencing (MPS) allows sequencing of entire exomes and genomes to now be done at reasonable cost, and its utility for identifying genes responsible for rare Mendelian disorders has been demonstrated. However, for a complex disease, study designs need to accommodate substantial degrees of locus, allelic, and phenotypic heterogeneity, as well as complex relationships between genotype and phenotype. Such considerations include careful selection of samples for sequencing and a well-developed strategy for identifying the few "true" disease susceptibility genes from among the many irrelevant genes that will be found to harbor rare variants. To examine these issues we have performed simulation-based analyses in order to compare several strategies for MPS sequencing in complex disease. Factors examined include genetic architecture, sample size, number and relationship of individuals selected for sequencing, and a variety of filters based on variant type, multiple observations of genes and concordance of genetic variants within pedigrees. A two-stage design was assumed where genes from the MPS analysis of high-risk families are evaluated in a secondary screening phase of a larger set of probands with more modest family histories. Designs were evaluated using a cost function that assumes the cost of sequencing the whole exome is 400 times that of sequencing a single candidate gene. Results indicate that while requiring variants to be identified in multiple pedigrees and/or in multiple individuals in the same pedigree are effective strategies for reducing false positives, there is a danger of over-filtering so that most true susceptibility genes are missed. In most cases, sequencing more than two individuals per pedigree results in reduced power without any benefit in terms of reduced overall cost. Further, our results suggest that although no single strategy is optimal, simulations can provide important guidelines for study design.
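The cost assumption stated in this abstract (one whole exome costs 400 times one candidate gene) can be turned into a rough two-stage cost calculation. Only the 400:1 ratio comes from the abstract; the additive form of the cost function, the function name, and the example numbers are assumptions made for illustration and may differ from the authors' actual cost function.

```python
def two_stage_cost(n_exomes, n_probands, n_candidate_genes, exome_to_gene_ratio=400):
    """Total cost in units of sequencing one candidate gene in one individual.

    Stage one sequences `n_exomes` whole exomes; stage two sequences
    `n_candidate_genes` genes in each of `n_probands` probands.
    The additive form is an assumption, not taken from the paper.
    """
    stage_one = exome_to_gene_ratio * n_exomes
    stage_two = n_probands * n_candidate_genes
    return stage_one + stage_two


# Hypothetical design: 20 exomes, then 10 candidate genes screened in 500 probands.
print(two_stage_cost(20, 500, 10))  # 13000
```

Under this sketch, each extra exome in stage one costs as much as screening 400 gene-by-proband combinations in stage two, which illustrates why adding sequenced individuals per pedigree can raise cost quickly.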
Affiliation(s)
- Bing-Jian Feng
- Department of Dermatology, University of Utah School of Medicine, Salt Lake City, Utah, United States of America
- Sean V. Tavtigian
- Huntsman Cancer Institute and Department of Oncological Sciences, University of Utah, Salt Lake City, Utah, United States of America
- Melissa C. Southey
- Department of Pathology, University of Melbourne, Melbourne, Victoria, Australia
- David E. Goldgar
- Department of Dermatology, University of Utah School of Medicine, Salt Lake City, Utah, United States of America
|
48
|
Edwards TL, Song Z, Li C. Enriching targeted sequencing experiments for rare disease alleles. Bioinformatics 2011; 27:2112-8. [PMID: 21700677 DOI: 10.1093/bioinformatics/btr324] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Indexed: 11/12/2022]
Abstract
MOTIVATION: Next-generation targeted resequencing of genome-wide association study (GWAS)-associated genomic regions is a common approach for follow-up of indirect association of common alleles. However, it is prohibitively expensive to sequence all the samples from a well-powered GWAS study with sufficient depth of coverage to accurately call rare genotypes. As a result, many studies may use next-generation sequencing for single nucleotide polymorphism (SNP) discovery in a smaller number of samples, with the intent to genotype candidate SNPs with rare alleles captured by resequencing. This approach is reasonable, but may be inefficient for rare alleles if samples are not carefully selected for the resequencing experiment. RESULTS: We have developed a probability-based approach, SampleSeq, to select samples for a targeted resequencing experiment that increases the yield of rare disease alleles substantially over random sampling of cases or controls or sampling based on genotypes at associated SNPs from GWAS data. This technique allows for smaller sample sizes for resequencing experiments, or allows the capture of rarer risk alleles. When following up multiple regions, SampleSeq selects subjects with an even representation of all the regions. SampleSeq also can be used to calculate the sample size needed for the resequencing to increase the chance of successful capture of rare alleles of desired frequencies. SOFTWARE: http://biostat.mc.vanderbilt.edu/SampleSeq
Affiliation(s)
- Todd L Edwards
- Vanderbilt Epidemiology Center, Division of Epidemiology, Department of Medicine, Vanderbilt University, Nashville, TN 37203, USA
|