1. Li C, Zhang K, Zhao J. Genome-wide Mendelian randomization mapping the influence of plasma proteome on major depressive disorder. J Affect Disord 2025; 376:1-9. [PMID: 39892755 DOI: 10.1016/j.jad.2025.01.140] [Received: 06/20/2024] [Revised: 01/26/2025] [Accepted: 01/27/2025] [Indexed: 02/04/2025]
Abstract
Plasma proteins play critical roles in a series of biological processes and represent a major source of translational biomarkers and drug targets. In this study, we performed Mendelian randomization (MR) to explore potential causal associations of protein quantitative trait loci (pQTL, n = 54,219) with major depressive disorder (MDD) using summary statistics from the PGC (n = 143,265), with replication in the FinnGen cohort (n = 406,986). Subsequently, gene expression quantitative trait loci (eQTL) of the identified proteins were leveraged to validate the primary findings in both the PGC and FinnGen cohorts. We implemented reverse causality detection using bidirectional MR analysis and the Steiger test, together with Bayesian co-localization and phenotype scanning, to further strengthen the MR findings. In the primary analyses, MR revealed two plasma proteins significantly associated with MDD risk after Bonferroni correction (P < 3.720 × 10⁻⁵): butyrophilin subfamily 2 member A1 (BTN2A1, OR = 0.860; 95% CI, 0.825-0.895; P = 1.79 × 10⁻⁵) and butyrophilin subfamily 3 member A2 (BTN3A2, OR = 1.071; 95% CI, 1.056-1.086; P = 3.89 × 10⁻⁶). Neither protein showed evidence of reverse causality. Bayesian co-localization indicated that BTN2A1 (coloc.abf-PPH4 = 0.620) and BTN3A2 (coloc.abf-PPH4 = 0.872) shared a variant with MDD, a finding that was subsequently validated by the HEIDI test. In the replication stage, BTN2A1 and BTN3A2 were successfully validated in the FinnGen cohort. This study found that genetically determined BTN2A1 and BTN3A2 were associated with MDD; these findings may have clinical implications for MDD prevention.
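The Bonferroni threshold quoted in this abstract can be sanity-checked with a few lines of Python. This is only a sketch: the family-wise alpha of 0.05 is an assumption (the abstract does not state it), so the number of tests it implies is an inference, not a figure reported by the authors. The function names are illustrative.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


def implied_tests(alpha, threshold):
    """Number of tests implied by a reported Bonferroni threshold."""
    return alpha / threshold


ASSUMED_ALPHA = 0.05          # assumption: not stated in the abstract
REPORTED_THRESHOLD = 3.720e-5  # threshold reported in the abstract

# Number of tests that would produce the reported threshold at alpha = 0.05.
n_implied = round(implied_tests(ASSUMED_ALPHA, REPORTED_THRESHOLD))

# P values reported in the abstract for the two proteins.
p_values = {"BTN2A1": 1.79e-5, "BTN3A2": 3.89e-6}
significant = {name: p < REPORTED_THRESHOLD for name, p in p_values.items()}

print(n_implied, significant)
```

Under that assumed alpha, both reported P values fall below the threshold, which agrees with the abstract's statement that the two proteins pass Bonferroni correction.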
Affiliation(s)
- Chong Li
  - Department of Psychiatry, Zhujiang Hospital, Southern Medical University, No. 253, Industrial Avenue Zhong, Guangzhou, Guangdong 510220, China
- Kunxue Zhang
  - Department of Neurology, Nanfang Hospital, Southern Medical University, No. 1838 Guangzhou Dadao Road North, Guangzhou, Guangdong 510515, China
- Jiubo Zhao
  - Department of Psychiatry, Zhujiang Hospital, Southern Medical University, No. 253, Industrial Avenue Zhong, Guangzhou, Guangdong 510220, China; Department of Psychology, School of Public Health, Southern Medical University, No. 1838 Guangzhou Dadao Road North, Guangzhou, Guangdong 510220, China.
2. Ogloblinsky MSC, Conrad DF, Baudot A, Tournier-Lasserve E, Génin E, Marenne G. Benchmark of computational methods to detect digenism in sequencing data. Eur J Hum Genet 2025:10.1038/s41431-025-01834-9. [PMID: 40204980 DOI: 10.1038/s41431-025-01834-9] [Received: 08/22/2024] [Revised: 03/06/2025] [Accepted: 03/11/2025] [Indexed: 04/11/2025] Open
Abstract
Digenic inheritance is characterized by the combined alteration of two different genes leading to a disease. It could explain the etiology of many currently undiagnosed rare diseases. With the advent of next-generation sequencing technologies, the identification of digenic inheritance patterns has become more technically feasible, yet it still poses significant challenges in the absence of a gold standard method. Here, we present a comprehensive overview of the existing methods developed to detect digenic inheritance in sequencing data and classify them into cohort-based and individual-based methods. The latter category appeared the most applicable to rare diseases, especially the methods that do not need a patient phenotypic description as input. We discuss the availability of the different methods, their output and their scalability to inform potential users. Focusing on methods to detect digenic inheritance in the case of very rare or heterogeneous diseases, we propose a benchmark using different real-life scenarios involving known digenic and putative neutral pairs of genes. Among these methods, DiGePred stood out as giving the fewest false positives, ARBOCK as giving the most true positives, and DIEP as having the best balance between the two. By synthesizing the state-of-the-art techniques and providing insights into their practical utility, this benchmark serves as a valuable resource for researchers and clinicians in selecting suitable methodologies for detecting digenic inheritance in a wide range of disorders using sequencing data.
Affiliation(s)
- Donald F Conrad
  - Division of Genetics, Oregon National Primate Research Center, Oregon Health & Science University, Portland, OR, USA
- Anaïs Baudot
  - Aix Marseille Univ, INSERM, Marseille Medical Genetics (MMG), Marseille, France
- Elisabeth Tournier-Lasserve
  - Université Paris Cité, Inserm, NeuroDiderot, Unité Mixte de Recherche 1141, F-75019, Paris, France
  - Assistance publique-Hôpitaux de Paris, Service de Génétique Moléculaire Neurovasculaire, Hôpital Saint-Louis, F-75010, Paris, France
- Emmanuelle Génin
  - Univ Brest, Inserm, EFS, UMR 1078, GGB, Brest, France
  - Assistance publique-Hôpitaux de Paris, Service de Génétique Moléculaire Neurovasculaire, Hôpital Saint-Louis, F-75010, Paris, France
3. Kouri C, Martinez de Lapiscina I, Naamneh-Elzenaty R, Sommer G, Sauter KS, Flück CE. Oligogenic analysis across broad phenotypes of 46,XY differences in sex development associated with NR5A1/SF-1 variants: findings from the international SF1next study. EBioMedicine 2025; 113:105624. [PMID: 40037090 PMCID: PMC11925193 DOI: 10.1016/j.ebiom.2025.105624] [Received: 10/25/2024] [Revised: 02/08/2025] [Accepted: 02/12/2025] [Indexed: 03/06/2025] Open
Abstract
BACKGROUND: Oligogenic inheritance has been suggested as a possible mechanism to explain the broad phenotype observed in individuals with differences of sex development (DSD) harbouring NR5A1/SF-1 variants.
METHODS: We investigated genetic patterns of possible oligogenicity in a cohort of 30 individuals with NR5A1/SF-1 variants and 46,XY DSD recruited from the international SF1next study, using whole exome sequencing (WES) on family trios whenever available. WES data were analysed using a tailored filtering algorithm designed to identify rare variants in DSD and SF-1-related genes. Identified variants were subsequently tested using the Oligogenic Resource for Variant Analysis (ORVAL) bioinformatics platform for a possible combined pathogenicity with the individual NR5A1/SF-1 variant.
FINDINGS: In 73% (22/30) of the individuals with NR5A1/SF-1 related 46,XY DSD, we identified one to seven additional variants, predominantly in known DSD-related genes, that might contribute to the phenotype. We found identical variants in eight unrelated individuals with DSD in DSD-related genes (e.g., TBCE, FLNB, GLI3 and PDGFRA) and different variants in eight genes frequently associated with DSD (e.g., CDH23, FLNB, GLI2, KAT6B, MYO7A, PKD1, SPRY4 and ZFPM2) in 15 index cases. Our study also identified combinations with NR5A1/SF-1 variants and variants in novel candidate genes.
INTERPRETATION: These findings highlight the complex genetic landscape of DSD associated with NR5A1/SF-1, where in several cases, the use of advanced genetic testing and filtering with specific algorithms and machine learning tools revealed additional genetic hits that may contribute to the phenotype.
FUNDING: Swiss National Science Foundation and Boveri Foundation Zurich.
Affiliation(s)
- Chrysanthi Kouri
  - Pediatric Endocrinology, Diabetology and Metabolism, Department of Pediatrics, Inselspital, Bern University Hospital, University of Bern, Bern 3010, Switzerland; Department for BioMedical Research, University of Bern, Bern 3008, Switzerland; Graduate School for Cellular and Biomedical Sciences, University of Bern, Bern 3012, Switzerland
- Idoia Martinez de Lapiscina
  - Pediatric Endocrinology, Diabetology and Metabolism, Department of Pediatrics, Inselspital, Bern University Hospital, University of Bern, Bern 3010, Switzerland; Department for BioMedical Research, University of Bern, Bern 3008, Switzerland; Research into the Genetics and Control of Diabetes and Other Endocrine Disorders, Biobizkaia Health Research Institute, Cruces University Hospital, Barakaldo 48903, Spain; CIBER de Diabetes y Enfermedades Metabólicas Asociadas (CIBERDEM), Instituto de Salud Carlos III, Madrid 28029, Spain; CIBER de Enfermedades Raras (CIBERER), Instituto de Salud Carlos III, Madrid 28029, Spain; Endo-ERN, Amsterdam 1081 HV, the Netherlands
- Rawda Naamneh-Elzenaty
  - Pediatric Endocrinology, Diabetology and Metabolism, Department of Pediatrics, Inselspital, Bern University Hospital, University of Bern, Bern 3010, Switzerland; Department for BioMedical Research, University of Bern, Bern 3008, Switzerland; Graduate School for Cellular and Biomedical Sciences, University of Bern, Bern 3012, Switzerland
- Grit Sommer
  - Pediatric Endocrinology, Diabetology and Metabolism, Department of Pediatrics, Inselspital, Bern University Hospital, University of Bern, Bern 3010, Switzerland; Swiss Childhood Cancer Registry, Institute of Social and Preventive Medicine, University of Bern, Bern 3012, Switzerland
- Kay-Sara Sauter
  - Pediatric Endocrinology, Diabetology and Metabolism, Department of Pediatrics, Inselspital, Bern University Hospital, University of Bern, Bern 3010, Switzerland; Department for BioMedical Research, University of Bern, Bern 3008, Switzerland
- Christa E Flück
  - Pediatric Endocrinology, Diabetology and Metabolism, Department of Pediatrics, Inselspital, Bern University Hospital, University of Bern, Bern 3010, Switzerland; Department for BioMedical Research, University of Bern, Bern 3008, Switzerland.
4. Chitra U, Arnold B, Raphael BJ. Resolving discrepancies between chimeric and multiplicative measures of higher-order epistasis. Nat Commun 2025; 16:1711. [PMID: 39962081 PMCID: PMC11833126 DOI: 10.1038/s41467-025-56986-5] [Received: 05/23/2024] [Accepted: 02/06/2025] [Indexed: 02/20/2025] Open
Abstract
Epistasis - the interaction between alleles at different genetic loci - plays a fundamental role in biology. However, several recent approaches quantify epistasis using a chimeric formula that measures deviations from a multiplicative fitness model on an additive scale, thus mixing two scales. Here, we show that for pairwise interactions, the chimeric formula yields a different magnitude but the same sign of epistasis compared to the multiplicative formula that measures both fitness and deviations on a multiplicative scale. However, for higher-order interactions, we show that the chimeric formula can have both different magnitude and sign compared to the multiplicative formula. We resolve these inconsistencies by deriving mathematical relationships between the different epistasis formulae and different parametrizations of the multivariate Bernoulli distribution. We argue that the chimeric formula does not appropriately model interactions between the Bernoulli random variables. In simulations, we show that the chimeric formula is less accurate than the classical multiplicative/additive epistasis formulae and may falsely detect higher-order epistasis. Analyzing multi-gene knockouts in yeast, multi-way drug interactions in E. coli, and deep mutational scanning of several proteins, we find that approximately 10% to 60% of inferred higher-order interactions change sign using the multiplicative/additive formula compared to the chimeric formula.
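The contrast this abstract draws between the chimeric and multiplicative formulae can be illustrated with a short Python sketch. The formula forms below are assumptions taken from common usage in the epistasis literature, not from the abstract itself: fitness values are normalized so the wild type equals 1, the multiplicative measure is the log-scale interaction term, and the chimeric third-order measure subtracts the pairwise chimeric terms from the triple-mutant deviation. The fitness values are invented toy numbers chosen only to show the sign behaviour.

```python
import math


def pairwise_multiplicative(f1, f2, f12):
    """Log-scale deviation of the double mutant from the multiplicative model."""
    return math.log(f12 / (f1 * f2))


def pairwise_chimeric(f1, f2, f12):
    """Deviation from the multiplicative model measured on an additive scale."""
    return f12 - f1 * f2


def third_order_multiplicative(f1, f2, f3, f12, f13, f23, f123):
    """Log-scale third-order interaction term (wild-type fitness = 1)."""
    return math.log(f123 * f1 * f2 * f3 / (f12 * f13 * f23))


def third_order_chimeric(f1, f2, f3, f12, f13, f23, f123):
    """Assumed chimeric third-order form built from pairwise chimeric terms."""
    e12 = pairwise_chimeric(f1, f2, f12)
    e13 = pairwise_chimeric(f1, f3, f13)
    e23 = pairwise_chimeric(f2, f3, f23)
    return f123 - f1 * f2 * f3 - e12 * f3 - e13 * f2 - e23 * f1


# Toy pairwise example: both measures are negative (same sign).
pair = (0.8, 0.9, 0.6)
pair_mult = pairwise_multiplicative(*pair)
pair_chim = pairwise_chimeric(*pair)

# Toy third-order example: the two measures take opposite signs.
triple = (0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.6)
triple_mult = third_order_multiplicative(*triple)  # negative
triple_chim = third_order_chimeric(*triple)        # positive

print(pair_mult, pair_chim, triple_mult, triple_chim)
```

For pairs, the agreement in sign follows directly from the definitions: f12 / (f1 f2) exceeds 1 exactly when f12 - f1 f2 is positive. No such guarantee holds for the third-order forms, as the toy triple shows.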
Affiliation(s)
- Uthsav Chitra
  - Department of Computer Science, Princeton University, Princeton, NJ, USA
- Brian Arnold
  - Department of Computer Science, Princeton University, Princeton, NJ, USA
  - Center for Statistics and Machine Learning, Princeton University, Princeton, NJ, USA
- Benjamin J Raphael
  - Department of Computer Science, Princeton University, Princeton, NJ, USA.
5. Li S, Arora S, Attaoua R, Hamet P, Tremblay J, Bihlo A, Liu B, Rutter G. Leveraging hierarchical structures for genetic block interaction studies using the hierarchical transformer. medRxiv 2025:2024.11.18.24317486. [PMID: 39606365 PMCID: PMC11601704 DOI: 10.1101/2024.11.18.24317486] [Indexed: 11/29/2024]
Abstract
Initially introduced in 1909 by William Bateson, classic epistasis (genetic variant interaction) refers to the phenomenon in which one variant prevents a variant at a different locus from manifesting its effects. The potential effects of genetic variant interactions on complex diseases have been recognized for decades. Moreover, it has been demonstrated that leveraging the combined SNP effects within a genetic block can significantly increase calculation power and reduce background noise, ultimately leading to the discovery of novel epistasis that a single-SNP statistical epistasis study might overlook. However, it is still an open question how best to combine gene structure representation modelling and interaction learning into an end-to-end model for gene interaction searching. In the current study, we developed a neural genetic block interaction searching model that can effectively process large SNP chip inputs and output a heatmap of potential genetic block interactions. Our model augments a previously published hierarchical transformer architecture (Liu and Lapata, 2019) with the ability to model genetic blocks. The cross-block relationship mapping was achieved via a hierarchical attention mechanism that allows the sharing of information regarding specific phenotypes, as opposed to simple unsupervised dimensionality reduction methods such as PCA. Results on both simulation and UK Biobank studies show that our model brings substantial improvements compared to traditional exhaustive searching and neural network methods.
Affiliation(s)
- Shiying Li
  - Centre de Recherche du CHUM, and Faculty of Medicine, University of Montreal, QC, Canada
- Shivam Arora
  - Department of Mathematics and Statistics, Memorial University of Newfoundland, NL, Canada
- Redha Attaoua
  - Centre de Recherche du CHUM, and Faculty of Medicine, University of Montreal, QC, Canada
- Pavel Hamet
  - Centre de Recherche du CHUM, and Faculty of Medicine, University of Montreal, QC, Canada
- Johanne Tremblay
  - Centre de Recherche du CHUM, and Faculty of Medicine, University of Montreal, QC, Canada
- Alexander Bihlo
  - Department of Mathematics and Statistics, Memorial University of Newfoundland, NL, Canada
- Bang Liu
  - Département d’informatique et de recherche opérationnelle, Université de Montréal, QC, Canada
- Guy Rutter
  - Centre de Recherche du CHUM, and Faculty of Medicine, University of Montreal, QC, Canada
  - Section of Cell Biology and Functional Genomics, Department of Metabolism, Diabetes and Reproduction, Imperial College of London, du Cane Road, London W120NN, United Kingdom
  - Lee Kong Chian School of Medicine, Nan Yang Technological University, Singapore
6. Ren W, Liang Z. Review on GPU accelerated methods for genome-wide SNP-SNP interactions. Mol Genet Genomics 2024; 300:10. [PMID: 39738695 DOI: 10.1007/s00438-024-02214-6] [Received: 02/25/2024] [Accepted: 12/11/2024] [Indexed: 01/02/2025]
Abstract
Detecting genome-wide SNP-SNP interactions (epistasis) efficiently is essential to harnessing the vast data now available from modern biobanks. With millions of SNPs and genetic information from hundreds of thousands of individuals, researchers are positioned to uncover new insights into complex disease pathways. However, this data scale brings significant computational and statistical challenges. To address these, recent approaches leverage GPU-based parallel computing for high-throughput, cost-effective analysis and refine algorithms to improve time and memory efficiency. In this survey, we systematically review GPU-accelerated methods for exhaustive epistasis detection, detailing the statistical models used and the computational strategies employed to enhance performance. Our findings indicate substantial speedups with GPU implementations over traditional CPU approaches. We conclude that while GPU-based solutions hold promise for advancing genomic research, continued innovation in both algorithm design and hardware optimization is necessary to meet future data challenges in the field.
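To make "exhaustive epistasis detection" concrete, the following is a minimal CPU-only Python sketch of an exhaustive pairwise scan. It is not taken from any method covered by the review: the chi-square test on a 9 x 2 genotype-by-phenotype table is one common choice assumed here for illustration, and the function names and toy data are hypothetical. Every SNP pair is scored independently of the others, which is the property that makes this kind of search a natural fit for parallel hardware such as GPUs.

```python
from itertools import combinations


def chi_square_pair(g1, g2, labels):
    """Chi-square statistic for joint genotype (9 classes) vs case/control.

    g1, g2: genotypes coded 0, 1 or 2; labels: 0 = control, 1 = case.
    """
    counts = [[0, 0] for _ in range(9)]
    for a, b, y in zip(g1, g2, labels):
        counts[3 * a + b][y] += 1
    n = len(labels)
    col_totals = [sum(row[c] for row in counts) for c in (0, 1)]
    stat = 0.0
    for row in counts:
        row_total = row[0] + row[1]
        if row_total == 0:
            continue  # skip genotype combinations that never occur
        for c in (0, 1):
            expected = row_total * col_totals[c] / n
            if expected > 0:
                stat += (row[c] - expected) ** 2 / expected
    return stat


def exhaustive_scan(genotypes, labels):
    """Score all n*(n-1)/2 SNP pairs and return (stat, i, j), best first."""
    results = [
        (chi_square_pair(genotypes[i], genotypes[j], labels), i, j)
        for i, j in combinations(range(len(genotypes)), 2)
    ]
    return sorted(results, reverse=True)


# Toy data: the phenotype is the XOR of SNP 0 and SNP 1; SNP 2 is unrelated.
genotypes = [
    [0, 0, 1, 1, 0, 0, 1, 1],
    [0, 1, 0, 1, 0, 1, 0, 1],
    [0, 0, 0, 0, 1, 1, 1, 1],
]
labels = [0, 1, 1, 0, 0, 1, 1, 0]

ranked = exhaustive_scan(genotypes, labels)
print(ranked[0])
```

In this toy example neither SNP 0 nor SNP 1 is associated with the phenotype on its own, yet the pair ranks first. The number of pairs grows quadratically with the number of SNPs, which is why the scale of modern biobank data motivates the accelerated implementations the review surveys.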
Affiliation(s)
- Wenlong Ren
  - Department of Epidemiology and Medical Statistics, School of Public Health, Nantong University, Nantong, 226019, China.
- Zhikai Liang
  - Department of Plant Sciences, North Dakota State University, Fargo, 58108, USA
7. Raval K, Jamshidi N, Seyran B, Salwinski L, Pillai R, Yang L, Ma F, Pellegrini M, Shin J, Yang X, Tudzarova S. Dysfunctional β-cell longevity in diabetes relies on energy conservation and positive epistasis. Life Sci Alliance 2024; 7:e202402743. [PMID: 39313296 PMCID: PMC11420665 DOI: 10.26508/lsa.202402743] [Received: 03/27/2024] [Revised: 09/11/2024] [Accepted: 09/12/2024] [Indexed: 09/25/2024] Open
Abstract
Long-lived PFKFB3-expressing β-cells are dysfunctional partly because of prevailing glycolysis that compromises metabolic coupling of insulin secretion. Their accumulation in type 2 diabetes (T2D) appears to be related to the loss of apoptotic competency of cell fitness competition that maintains islet function by favoring constant selection of healthy "winner" cells. To investigate how PFKFB3 can disguise the competitive traits of dysfunctional "loser" β-cells, we analyzed the overlap between human β-cells with a bona fide "loser signature" across diabetes pathologies using the HPAP scRNA-seq and spatial transcriptomics of PFKFB3-positive β-cells from nPOD T2D pancreata. The overlapping transcriptional profile of "loser" β-cells was characterized by down-regulation of ribosomal biosynthesis and of genes encoding mitochondrial respiration. PFKFB3-positive "loser" β-cells had reduced expression of HLA class I and II genes. Gene-gene interaction analysis revealed that PFKFB3 rs1983890 can interact with the anti-apoptotic gene MAIP1, implicating positive epistasis as a mechanism for prolonged survival of "loser" β-cells in T2D. Inhibition of PFKFB3 resulted in the clearance of dysfunctional "loser" β-cells, leading to restored glucose tolerance in the mouse model of T2D.
Affiliation(s)
- Kavit Raval
  - Hillblom Islet Research Center, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA
- Neema Jamshidi
  - Radiological Sciences, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA
- Berfin Seyran
  - Hillblom Islet Research Center, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA
- Lukasz Salwinski
  - Molecular Cell and Developmental Biology, College of Life Sciences, University of California Los Angeles, Los Angeles, CA, USA
- Raju Pillai
  - Department of Pathology, City-of-Hope, Duarte, CA, USA
- Lixin Yang
  - Department of Pathology, City-of-Hope, Duarte, CA, USA
- Feiyang Ma
  - Molecular Cell and Developmental Biology, College of Life Sciences, University of California Los Angeles, Los Angeles, CA, USA
- Matteo Pellegrini
  - Molecular Cell and Developmental Biology, College of Life Sciences, University of California Los Angeles, Los Angeles, CA, USA
- Juliana Shin
  - Department of Molecular and Medical Pharmacology, University of California Los Angeles, Los Angeles, CA, USA
- Xia Yang
  - Department of Molecular and Medical Pharmacology, University of California Los Angeles, Los Angeles, CA, USA
- Slavica Tudzarova
  - Hillblom Islet Research Center, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA
8. Mascher M, Jayakodi M, Shim H, Stein N. Promises and challenges of crop translational genomics. Nature 2024; 636:585-593. [PMID: 39313530 PMCID: PMC7616746 DOI: 10.1038/s41586-024-07713-5] [Received: 03/27/2023] [Accepted: 06/13/2024] [Indexed: 09/25/2024]
Abstract
Crop translational genomics applies breeding techniques based on genomic datasets to improve crops. Technological breakthroughs in the past ten years have made it possible to sequence the genomes of increasing numbers of crop varieties and have assisted in the genetic dissection of crop performance. However, translating research findings to breeding applications remains challenging. Here we review recent progress and future prospects for crop translational genomics in bringing results from the laboratory to the field. Genetic mapping, genomic selection and sequence-assisted characterization and deployment of plant genetic resources utilize rapid genotyping of large populations. These approaches have all had an impact on breeding for qualitative traits, where single genes with large phenotypic effects exert their influence. Characterization of the complex genetic architectures that underlie quantitative traits such as yield and flowering time, especially in newly domesticated crops, will require further basic research, including research into regulation and interactions of genes and the integration of genomic approaches and high-throughput phenotyping, before targeted interventions can be designed. Future priorities for translation include supporting genomics-assisted breeding in low-income countries and adaptation of crops to changing environments.
Affiliation(s)
- Martin Mascher
  - Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Gatersleben, Germany.
  - German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Leipzig, Germany.
- Murukarthick Jayakodi
  - Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Gatersleben, Germany
- Hyeonah Shim
  - Department of Agriculture, Forestry and Bioresources, Plant Genomics and Breeding Institute, Research Institute of Agriculture and Life Sciences, College of Agriculture and Life Sciences, Seoul National University, Seoul, Korea
- Nils Stein
  - Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Gatersleben, Germany.
  - Martin Luther University Halle-Wittenberg, Halle, Germany.
9. Balvert M, Cooper-Knock J, Stamp J, Byrne RP, Mourragui S, van Gils J, Benonisdottir S, Schlüter J, Kenna K, Abeln S, Iacoangeli A, Daub JT, Browning BL, Taş G, Hu J, Wang Y, Alhathli E, Harvey C, Pianesi L, Schulte SC, González-Domínguez J, Garrisson E, Snyder MP, Schönhuth A, Sng LMF, Twine NA. Considerations in the search for epistasis. Genome Biol 2024; 25:296. [PMID: 39563431 PMCID: PMC11574992 DOI: 10.1186/s13059-024-03427-z] [Received: 03/01/2024] [Accepted: 10/23/2024] [Indexed: 11/21/2024] Open
Abstract
Epistasis refers to changes in the effect on phenotype of a unit of genetic information, such as a single nucleotide polymorphism or a gene, dependent on the context of other genetic units. Such interactions are both biologically plausible and good candidates to explain observations which are not fully explained by an additive heritability model. However, the search for epistasis has so far largely failed to recover this missing heritability. We identify key challenges and propose that future works need to leverage idealized systems, known biology and even previously identified epistatic interactions, in order to guide the search for new interactions.
Affiliation(s)
- Ross P Byrne
  - Smurfit Institute of Genetics, Trinity College Dublin, Dublin, Ireland
- Juami van Gils
  - Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- Sanne Abeln
  - Utrecht University, Utrecht, The Netherlands
- Alfredo Iacoangeli
  - Department of Biostatistics and Health Informatics, King's College London, London, UK
  - Department of Basic and Clinical Neuroscience, King's College London, London, UK
  - NIHR BRC SLAM NHS Foundation Trust, London, UK
- Gizem Taş
  - Tilburg University, Tilburg, The Netherlands
  - UMC Utrecht, Utrecht, The Netherlands
- Jiajing Hu
  - Department of Biostatistics and Health Informatics, King's College London, London, UK
- Yan Wang
  - UMC Utrecht, Utrecht, The Netherlands
- Sara C Schulte
  - Algorithmic Bioinformatics and Center for Digital Medicine, Heinrich Heine University, Düsseldorf, Germany
- Letitia M F Sng
  - Commonwealth Scientific and Industrial Research Organisation, Westmead, Australia.
- Natalie A Twine
  - Commonwealth Scientific and Industrial Research Organisation, Westmead, Australia.
10. Beer S, Elmenhorst D, Bischof GN, Ramirez A, Bauer A, Drzezga A. Explainable artificial intelligence identifies an AQP4 polymorphism-based risk score associated with brain amyloid burden. Neurobiol Aging 2024; 143:19-29. [PMID: 39208715 DOI: 10.1016/j.neurobiolaging.2024.08.002] [Received: 02/09/2024] [Revised: 08/05/2024] [Accepted: 08/06/2024] [Indexed: 09/04/2024]
Abstract
Aquaporin-4 (AQP4) is hypothesized to be a component of the glymphatic system, a pathway for removing brain interstitial solutes like amyloid-β (Aβ). Evidence exists that genetic variation of AQP4 impacts Aβ clearance, clinical outcome in Alzheimer's disease as well as sleep measures. We examined whether a risk score calculated from several AQP4 single-nucleotide polymorphisms (SNPs) is related to Aβ neuropathology in older cognitively unimpaired white individuals. We used a machine learning approach and explainable artificial intelligence to extract information on synergistic effects of AQP4 SNPs on brain amyloid burden from the ADNI cohort. From this information, we formulated a sex-specific AQP4 SNP-based risk score and evaluated it using data from the screening process of the A4 study. We found in both cohorts significant associations of the risk score with brain amyloid burden. The results support the hypothesis of an involvement of the glymphatic system, and particularly AQP4, in brain amyloid aggregation pathology. They suggest also that different AQP4 SNPs exert a synergistic effect on the build-up of brain amyloid burden.
Affiliation(s)
- Simone Beer
  - Institute of Neuroscience and Medicine (INM-2), Forschungszentrum Jülich, Germany.
- David Elmenhorst
  - Institute of Neuroscience and Medicine (INM-2), Forschungszentrum Jülich, Germany; Department of Nuclear Medicine, Faculty of Medicine and University Hospital Cologne, University of Cologne, Germany
- Gerard N Bischof
  - Institute of Neuroscience and Medicine (INM-2), Forschungszentrum Jülich, Germany; Department of Nuclear Medicine, Faculty of Medicine and University Hospital Cologne, University of Cologne, Germany
- Alfredo Ramirez
  - Division of Neurogenetics and Molecular Psychiatry, Department of Psychiatry and Psychotherapy, Faculty of Medicine and University Hospital Cologne, University of Cologne, Germany; German Center for Neurodegenerative Diseases (DZNE), Bonn-Cologne, Germany; Excellence Cluster on Cellular Stress Responses in Aging-Associated Diseases (CECAD), University of Cologne, Cologne, Germany; Department for Neurodegenerative Diseases and Geriatric Psychiatry, Bonn, Germany; Department of Psychiatry and Glenn Biggs Institute for Alzheimer's and Neurodegenerative Diseases, San Antonio, TX, United States
- Andreas Bauer
  - Institute of Neuroscience and Medicine (INM-2), Forschungszentrum Jülich, Germany
- Alexander Drzezga
  - Institute of Neuroscience and Medicine (INM-2), Forschungszentrum Jülich, Germany; Department of Nuclear Medicine, Faculty of Medicine and University Hospital Cologne, University of Cologne, Germany; German Center for Neurodegenerative Diseases (DZNE), Bonn-Cologne, Germany
11. Vila JA. The origin of mutational epistasis. Eur Biophys J 2024; 53:473-480. [PMID: 39443382 DOI: 10.1007/s00249-024-01725-9] [Received: 06/20/2024] [Revised: 10/03/2024] [Accepted: 10/06/2024] [Indexed: 10/25/2024]
Abstract
The interconnected processes of protein folding, mutations, epistasis, and evolution have all been the subject of extensive analysis throughout the years due to their significance for structural and evolutionary biology. The origin (molecular basis) of epistasis-the non-additive interactions between mutations-is still, nonetheless, unknown. The existence of a new perspective on protein folding, a problem that needs to be conceived as an 'analytic whole', will enable us to shed light on the origin of mutational epistasis at the simplest level-within proteins-while also uncovering the reasons why the genetic background in which they occur, a key component of molecular evolution, could foster changes in epistasis effects. Additionally, because mutations are the source of epistasis, more research is needed to determine the impact of post-translational modifications, which can potentially increase the proteome's diversity by several orders of magnitude, on mutational epistasis and protein evolvability. Finally, a protein evolution thermodynamic-based analysis that does not consider specific mutational steps or epistasis effects will be briefly discussed. Our study explores the complex processes behind the evolution of proteins upon mutations, clearing up some previously unresolved issues, and providing direction for further research.
Affiliation(s)
- Jorge A Vila
  - IMASL-CONICET, Ejército de Los Andes 950, 5700, San Luis, Argentina.
12. Rowan TN. Genetics and Genomics 101. Vet Clin North Am Food Anim Pract 2024; 40:345-355. [PMID: 39181796 DOI: 10.1016/j.cvfa.2024.05.001] [Indexed: 08/27/2024] Open
Abstract
Genetic mutations, both favorable and unfavorable, are the raw material for improvement in livestock populations. The random inheritance of these mutations is essential for generating progenies with genetic potential greater than their parents. These mutations can act either in a simple manner, such that a single alteration disrupts phenotype, or in a complex manner where hundreds or thousands of mutations of small effect create a continuous distribution of phenotypes. Selection tools leverage phenotypic records, pedigrees, and genomics to estimate the genetic potential of individual animals. This more accurate accounting of genetic potential has generated enormous gains in livestock populations.
Affiliation(s)
- Troy N Rowan
- Department of Animal Science, University of Tennessee, 2506 River Drive, Knoxville, TN 37996, USA; Department Large Animal Clinical Sciences, University of Tennessee, Knoxville, TN, USA.
13
Sun L, Bian J, Xin Y, Jiang L, Zheng L. Epi-SSA: A novel epistasis detection method based on a multi-objective sparrow search algorithm. PLoS One 2024; 19:e0311223. [PMID: 39446852 PMCID: PMC11500897 DOI: 10.1371/journal.pone.0311223] [Received: 06/26/2024] [Accepted: 09/16/2024] [Indexed: 10/26/2024] Open
Abstract
Genome-wide association studies typically consider epistatic interactions as a crucial factor in exploring complex diseases. However, the current methods primarily concentrate on the detection of two-order epistatic interactions, with flaws in accuracy. In this work, we introduce a novel method called Epi-SSA, which can be better utilized to detect high-order epistatic interactions. Epi-SSA draws inspiration from the sparrow search algorithm and optimizes the population based on multiple objective functions in each iteration, in order to identify epistatic interactions more precisely. To evaluate its performance, we conducted a comprehensive comparison between Epi-SSA and seven other methods using five simulation datasets: DME 100, DNME 100, DME 1000, DNME 1000 and DNME3 100. The DME 100 dataset encompasses eight second-order epistasis disease models with marginal effects, each comprising 100 simulated data instances, featuring 100 SNPs per instance, alongside 800 case and 800 control samples. The DNME 100 encompasses eight second-order epistasis disease models without marginal effects and retains other properties consistent with DME 100. Experiments on the DME 100 and DNME 100 datasets were designed to evaluate the algorithms' capacity to detect epistasis across varying disease models. The DME 1000 and DNME 1000 datasets extend the complexity with 1000 SNPs per simulated data instance, while retaining other properties consistent with DME 100 and DNME 100. These experiments aimed to gauge the algorithms' adaptability in detecting epistasis as the number of SNPs in the data increases. The DNME3 100 dataset introduces a higher level of complexity with six third-order epistasis disease models, otherwise paralleling the structure of DNME 100, serving to test the algorithms' proficiency in identifying higher-order epistasis.
The highest average F-measures achieved by the seven other existing methods on the five datasets are 0.86, 0.86, 0.41, 0.56, and 0.79 respectively, while the average F-measures of Epi-SSA on the five datasets are 0.92, 0.97, 0.79, 0.86, and 0.97 respectively. The experimental results demonstrate that the Epi-SSA algorithm outperforms other methods in a variety of epistasis detection tasks. As the number of SNPs in the data set increases and the order of epistasis rises, the advantages of the Epi-SSA algorithm become increasingly pronounced. In addition, we applied Epi-SSA to the analysis of the WTCCC dataset, uncovering numerous genes and gene pairs that might play a significant role in the pathogenesis of seven complex diseases. It is worthy of note that some of these genes have been relatedly reported in the Comparative Toxicogenomics Database (CTD). Epi-SSA is a potent tool for detecting epistatic interactions, which aids us in further comprehending the pathogenesis of common and complex diseases. The source code of Epi-SSA can be obtained at https://osf.io/6sqwj/.
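For readers comparing the F-measure figures quoted above, the following is a minimal sketch of the standard F-measure (the harmonic mean of precision and recall). The abstract does not say how true and false positives were counted in the Epi-SSA experiments, so the function name `f_measure` and the example counts are illustrative assumptions, not the authors' evaluation code.

```python
def f_measure(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Standard F-measure (F1): harmonic mean of precision and recall."""
    if true_positives == 0:
        # No correct detections: precision and recall are both zero.
        return 0.0
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, not taken from the paper.
example_score = f_measure(true_positives=90, false_positives=10, false_negatives=10)
```

With equal precision and recall, as in the hypothetical counts above, the F-measure equals that shared value.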
Affiliation(s)
- Liyan Sun
- College of Computer Science and Technology, Changchun University, Changchun City, Jilin Province, China
- Jingwen Bian
- School of Cultural and Media Studies, Changchun University of Science and Technology, Changchun City, Jilin Province, China
- Yi Xin
- College of Computer Science and Technology, Changchun University, Changchun City, Jilin Province, China
- Linqing Jiang
- College of Computer Science and Technology, Changchun University, Changchun City, Jilin Province, China
- Linxuan Zheng
- College of Computer Science and Technology, Changchun University, Changchun City, Jilin Province, China
14
Velásquez-Zapata V, Smith S, Surana P, Chapman AV, Jaiswal N, Helm M, Wise RP. Diverse epistatic effects in barley-powdery mildew interactions localize to host chromosome hotspots. iScience 2024; 27:111013. [PMID: 39445108 PMCID: PMC11497433 DOI: 10.1016/j.isci.2024.111013] [Received: 02/26/2024] [Revised: 06/27/2024] [Accepted: 09/18/2024] [Indexed: 10/25/2024] Open
Abstract
Barley Mildew locus a (Mla) encodes a multi-allelic series of nucleotide-binding leucine-rich repeat (NLR) receptors that specify recognition to diverse cereal diseases. We exploited time-course transcriptome dynamics of barley and derived immune mutants infected with the powdery mildew fungus, Blumeria hordei (Bh), to infer gene effects governed by Mla6 and two other loci significant to disease development, Blufensin1 (Bln1), and Required for Mla6 resistance3 (rar3 = Sgt1 ΔKL308-309). Interactions of Mla6 and Bln1 resulted in diverse epistatic effects on the Bh-induced barley transcriptome, differential immunity to Pseudomonas syringae expressing the effector protease AvrPphB, and reaction to Bh. From a total of 468 barley NLRs, 115 were grouped under different gene effect models; genes classified under these models localized to host chromosome hotspots. The corresponding Bh infection transcriptome was classified into nine co-expressed modules, linking differential expression with pathogen structures, signifying that disease is regulated by an inter-organismal network that diversifies the response.
Affiliation(s)
- Valeria Velásquez-Zapata
- Program in Bioinformatics & Computational Biology, Iowa State University, Ames, IA 50011, USA
- Department of Plant Pathology, Entomology, and Microbiology, Iowa State University, Ames, IA 50011, USA
- Schuyler Smith
- Department of Plant Pathology, Entomology, and Microbiology, Iowa State University, Ames, IA 50011, USA
- Priyanka Surana
- Informatics Infrastructure Team, Tree of Life Programme, Wellcome Sanger Institute, Hinxton, Cambridgeshire CB10 1SA, UK
- Antony V.E. Chapman
- Interdepartmental Genetics & Genomics, Iowa State University, Ames, IA 50011, USA
- Phytoform Labs, Rothamsted Research, Harpenden AL5 2JQ, UK
- Namrata Jaiswal
- USDA-Agricultural Research Service, Crop Production and Pest Control Research Unit, West Lafayette, IN 47907, USA
- Matthew Helm
- USDA-Agricultural Research Service, Crop Production and Pest Control Research Unit, West Lafayette, IN 47907, USA
- Roger P. Wise
- Program in Bioinformatics & Computational Biology, Iowa State University, Ames, IA 50011, USA
- Department of Plant Pathology, Entomology, and Microbiology, Iowa State University, Ames, IA 50011, USA
- Interdepartmental Genetics & Genomics, Iowa State University, Ames, IA 50011, USA
- USDA-Agricultural Research Service, Corn Insects and Crop Genetics Research Unit, Ames, IA 50011, USA
15
Kelly CM, McLaughlin RL. Comparison of machine learning methods for genomic prediction of selected Arabidopsis thaliana traits. PLoS One 2024; 19:e0308962. [PMID: 39196916 PMCID: PMC11355539 DOI: 10.1371/journal.pone.0308962] [Received: 11/29/2023] [Accepted: 08/04/2024] [Indexed: 08/30/2024] Open
Abstract
We present a comparison of machine learning methods for the prediction of four quantitative traits in Arabidopsis thaliana. High prediction accuracies were achieved on individuals grown under standardized laboratory conditions from the 1001 Arabidopsis Genomes Project. An existing body of evidence suggests that linear models may be impeded by their inability to make use of non-additive effects to explain phenotypic variation at the population level. The results presented here use a nested cross-validation approach to confirm that some machine learning methods have the ability to statistically outperform linear prediction models, with the optimal model dependent on availability of training data and genetic architecture of the trait in question. Linear models were competitive in their performance as per previous work, though the neural network class of predictors was observed to be the most accurate and robust for traits with high heritability. The extent to which non-linear models exploit interaction effects will require further investigation of the causal pathways that lie behind their predictions. Future work utilizing more traits and larger sample sizes, combined with an improved understanding of their respective genetic architectures, may lead to improvements in prediction accuracy.
16
Chitra U, Arnold BJ, Raphael BJ. Quantifying higher-order epistasis: beware the chimera. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.07.17.603976. [PMID: 39071303 PMCID: PMC11275791 DOI: 10.1101/2024.07.17.603976] [Indexed: 07/30/2024]
Abstract
Epistasis, or interactions in which alleles at one locus modify the fitness effects of alleles at other loci, plays a fundamental role in genetics, protein evolution, and many other areas of biology. Epistasis is typically quantified by computing the deviation from the expected fitness under an additive or multiplicative model using one of several formulae. However, these formulae are not all equivalent. Importantly, one widely used formula, which we call the chimeric formula, measures deviations from a multiplicative fitness model on an additive scale, thus mixing two measurement scales. We show that for pairwise interactions, the chimeric formula yields a different magnitude, but the same sign (synergistic vs. antagonistic) of epistasis compared to the multiplicative formula that measures both fitness and deviations on a multiplicative scale. However, for higher-order interactions, we show that the chimeric formula can have both different magnitude and sign compared to the multiplicative formula, thus confusing negative epistatic interactions with positive interactions, and vice versa. We resolve these inconsistencies by deriving fundamental connections between the different epistasis formulae and the parameters of the multivariate Bernoulli distribution. Our results demonstrate that the additive and multiplicative epistasis formulae are more mathematically sound than the chimeric formula. Moreover, we demonstrate that the mathematical issues with the chimeric epistasis formula lead to markedly different biological interpretations of real data. Analyzing multi-gene knockout data in yeast, multi-way drug interactions in E. coli, and deep mutational scanning (DMS) of several proteins, we find that 10-60% of higher-order interactions have a change in sign with the multiplicative or additive epistasis formula. These sign changes result in qualitatively different findings on functional divergence in the yeast genome, synergistic vs. antagonistic drug interactions, and epistasis between protein mutations. In particular, in the yeast data, the more appropriate multiplicative formula identifies nearly 500 additional negative three-way interactions, thus extending the trigenic interaction network by 25%.
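The distinction between the three formulae can be illustrated for the pairwise case. The sketch below uses one common way of writing pairwise epistasis for two loci with genotype fitnesses f00 (wild type), f10, f01 (single mutants) and f11 (double mutant); the exact notation in the preprint may differ, and the function names and fitness values are hypothetical.

```python
import math


def additive_epistasis(f00: float, f10: float, f01: float, f11: float) -> float:
    """Deviation from an additive fitness model, measured on an additive scale."""
    return f11 - f10 - f01 + f00


def multiplicative_epistasis(f00: float, f10: float, f01: float, f11: float) -> float:
    """Deviation from a multiplicative model, measured on a multiplicative (log) scale."""
    return math.log((f11 * f00) / (f10 * f01))


def chimeric_epistasis(f00: float, f10: float, f01: float, f11: float) -> float:
    """Deviation from a multiplicative model, but measured on an additive scale."""
    return f11 / f00 - (f10 / f00) * (f01 / f00)


# Hypothetical fitness values with the wild type normalised to 1.0.
f00, f10, f01, f11 = 1.0, 0.8, 0.5, 0.5
eps_add = additive_epistasis(f00, f10, f01, f11)
eps_mult = multiplicative_epistasis(f00, f10, f01, f11)
eps_chim = chimeric_epistasis(f00, f10, f01, f11)

# Pairwise case: chimeric and multiplicative values differ in size but agree in sign.
same_sign = (eps_mult > 0) == (eps_chim > 0)
```

For these values the chimeric and multiplicative measures are both positive but unequal, matching the pairwise behaviour described in the abstract; the sign disagreements it reports arise only for three-way and higher interactions, which this sketch does not cover.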
17
Shen S, Sobczyk MK, Paternoster L, Brown SJ. From GWASs toward Mechanistic Understanding with Case Studies in Dermatogenetics. J Invest Dermatol 2024; 144:1189-1199.e8. [PMID: 38782533 DOI: 10.1016/j.jid.2024.03.013] [Received: 11/21/2023] [Revised: 02/13/2024] [Accepted: 03/06/2024] [Indexed: 05/25/2024]
Abstract
Many human skin diseases result from the complex interplay of genetic and environmental mechanisms that are largely unknown. GWASs have yielded insight into the genetic aspect of complex disease by highlighting regions of the genome or specific genetic variants associated with disease. Leveraging this information to identify causal genes and cell types will provide insight into fundamental biology, inform diagnostics, and aid drug discovery. However, the etiological mechanisms from genetic variant to disease are still unestablished in most cases. There now exists an unprecedented wealth of data and computational methods for variant interpretation in a functional context. It can be challenging to decide where to start owing to a lack of consensus on the best way to identify causal genetic mechanisms. This article highlights 3 key aspects of genetic variant interpretation: prioritizing causal genes, cell types, and pathways. We provide a practical overview of the main methods and datasets, giving examples from recent atopic dermatitis studies to provide a blueprint for variant interpretation. A collection of resources, including brief description and links to the packages and web tools, is provided for researchers looking to start in silico follow-up genetic analysis of associated genetic variants.
Affiliation(s)
- Silvia Shen
- Centre for Genomic & Experimental Medicine, Institute of Genetics and Cancer, The University of Edinburgh, Edinburgh, United Kingdom; Institute for Evolution and Ecology, School of Biological Sciences, The University of Edinburgh, Edinburgh, United Kingdom.
- Maria K Sobczyk
- MRC Integrative Epidemiology Unit, Bristol Medical School, University of Bristol, Bristol, United Kingdom
- Lavinia Paternoster
- MRC Integrative Epidemiology Unit, Bristol Medical School, University of Bristol, Bristol, United Kingdom
- Sara J Brown
- Centre for Genomic & Experimental Medicine, Institute of Genetics and Cancer, The University of Edinburgh, Edinburgh, United Kingdom; Department of Dermatology, NHS Lothian, Edinburgh, United Kingdom
18
Ma J, Li J, Chen Y, Yang Z, He Y. Poor statistical power in population-based association study of gene interaction. BMC Med Genomics 2024; 17:111. [PMID: 38678264 PMCID: PMC11055307 DOI: 10.1186/s12920-024-01884-w] [Received: 07/06/2023] [Accepted: 04/19/2024] [Indexed: 04/29/2024] Open
Abstract
BACKGROUND Statistical epistasis, or "gene-gene interaction" in genetic association studies, means the nonadditive effects between the polymorphic sites on two different genes affecting the same phenotype. In the genetic association analysis of complex traits, nevertheless, researchers have not found much evidence of statistical epistasis so far. METHODS We developed a statistical model where the statistical epistasis was presented as an extra linkage disequilibrium between the polymorphic sites of different risk genes. The power of the statistical test for identifying the gene-gene interaction was calculated and then compared in different hypothesis scenarios. RESULTS Our results show that the statistical power increases with increasing interaction coefficient, relative risk, and linkage disequilibrium with genetic markers. However, the power of interaction discovery is much lower than that of the regular single-site association test. When rigorous criteria were employed in statistical tests, the identification of gene-gene interaction became a very difficult task. When the criterion of significance was set to p-value ≤ 5.0 × 10⁻⁸, the same as that of many genome-wide association studies, there is little chance to identify the gene-gene interaction in all kinds of circumstances. CONCLUSIONS The lack of epistasis tends to be an inevitable result caused by the statistical principles of methods in the genetic association studies and therefore is the inherent characteristic of the research itself.
Affiliation(s)
- Jiarui Ma
- Shanghai Key Laboratory of Medical Epigenetics, International Co-Laboratory of Medical Epigenetics and Metabolism (Ministry of Science and Technology), Institutes of Biomedical Sciences, Fudan University, Shanghai, 200032, China
- Jian Li
- Shanghai Key Laboratory of Medical Epigenetics, International Co-Laboratory of Medical Epigenetics and Metabolism (Ministry of Science and Technology), Institutes of Biomedical Sciences, Fudan University, Shanghai, 200032, China
- Yuqi Chen
- Shanghai Key Laboratory of Medical Epigenetics, International Co-Laboratory of Medical Epigenetics and Metabolism (Ministry of Science and Technology), Institutes of Biomedical Sciences, Fudan University, Shanghai, 200032, China
- Zhen Yang
- Center for Medical Research and Innovation of Pudong Hospital, Intelligent Medicine Institute, Fudan University, Shanghai, 200032, China
- Yungang He
- Shanghai Fifth People's Hospital, Intelligent Medicine Institute, Fudan University, Shanghai, 200032, PR China.
19
Zhu X, Yang Y, Lorincz-Comi N, Li G, Bentley AR, de Vries PS, Brown M, Morrison AC, Rotimi CN, Gauderman WJ, Rao DC, Aschard H. An approach to identify gene-environment interactions and reveal new biological insight in complex traits. Nat Commun 2024; 15:3385. [PMID: 38649715 PMCID: PMC11035594 DOI: 10.1038/s41467-024-47806-3] [Received: 09/22/2023] [Accepted: 04/10/2024] [Indexed: 04/25/2024] Open
Abstract
There is a long-standing debate about the magnitude of the contribution of gene-environment interactions to phenotypic variations of complex traits owing to the low statistical power and few reported interactions to date. To address this issue, the Gene-Lifestyle Interactions Working Group within the Cohorts for Heart and Aging Research in Genetic Epidemiology Consortium has been spearheading efforts to investigate G × E in large and diverse samples through meta-analysis. Here, we present a powerful new approach to screen for interactions across the genome, an approach that shares substantial similarity to the Mendelian randomization framework. We identify and confirm 5 loci (6 independent signals) interacted with either cigarette smoking or alcohol consumption for serum lipids, and empirically demonstrate that interaction and mediation are the major contributors to genetic effect size heterogeneity across populations. The estimated lower bound of the interaction and environmentally mediated heritability is significant (P < 0.02) for low-density lipoprotein cholesterol and triglycerides in Cross-Population data. Our study improves the understanding of the genetic architecture and environmental contributions to complex traits.
Affiliation(s)
- Xiaofeng Zhu
- Department of Population and Quantitative Health Sciences, School of Medicine, Case Western Reserve University, Cleveland, OH, USA.
- Yihe Yang
- Department of Population and Quantitative Health Sciences, School of Medicine, Case Western Reserve University, Cleveland, OH, USA
- Noah Lorincz-Comi
- Department of Population and Quantitative Health Sciences, School of Medicine, Case Western Reserve University, Cleveland, OH, USA
- Gen Li
- Department of Population and Quantitative Health Sciences, School of Medicine, Case Western Reserve University, Cleveland, OH, USA
- Amy R Bentley
- Center for Research on Genomics and Global Health, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Paul S de Vries
- Human Genetics Center, Department of Epidemiology, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
- Michael Brown
- Human Genetics Center, Department of Epidemiology, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
- Alanna C Morrison
- Human Genetics Center, Department of Epidemiology, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
- Charles N Rotimi
- Center for Research on Genomics and Global Health, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- W James Gauderman
- Division of Biostatistics, Department of Population and Public Health Sciences, University of Southern California, Los Angeles, CA, USA
- Dabeeru C Rao
- Center for Biostatistics and Data Science, Institute for Informatics, Data Science and Biostatistics, Washington University School of Medicine, St. Louis, MO, USA
- Hugues Aschard
- Institut Pasteur, Université Paris Cité, Department of Computational Biology, F-75015, Paris, France
- Program in Genetic Epidemiology and Statistical Genetics, Harvard T.H. Chan School of Public Health, Boston, MA, 02115, USA
20
Behr M, Kumbier K, Cordova-Palomera A, Aguirre M, Ronen O, Ye C, Ashley E, Butte AJ, Arnaout R, Brown B, Priest J, Yu B. Learning epistatic polygenic phenotypes with Boolean interactions. PLoS One 2024; 19:e0298906. [PMID: 38625909 PMCID: PMC11020961 DOI: 10.1371/journal.pone.0298906] [Received: 01/26/2023] [Accepted: 01/31/2024] [Indexed: 04/18/2024] Open
Abstract
Detecting epistatic drivers of human phenotypes is a considerable challenge. Traditional approaches use regression to sequentially test multiplicative interaction terms involving pairs of genetic variants. For higher-order interactions and genome-wide large-scale data, this strategy is computationally intractable. Moreover, multiplicative terms used in regression modeling may not capture the form of biological interactions. Building on the Predictability, Computability, Stability (PCS) framework, we introduce the epiTree pipeline to extract higher-order interactions from genomic data using tree-based models. The epiTree pipeline first selects a set of variants derived from tissue-specific estimates of gene expression. Next, it uses iterative random forests (iRF) to search training data for candidate Boolean interactions (pairwise and higher-order). We derive significance tests for interactions, based on a stabilized likelihood ratio test, by simulating Boolean tree-structured null (no epistasis) and alternative (epistasis) distributions on hold-out test data. Finally, our pipeline computes PCS epistasis p-values that probabilistically quantify improvement in prediction accuracy via bootstrap sampling on the test set. We validate the epiTree pipeline in two case studies using data from the UK Biobank: predicting red hair and multiple sclerosis (MS). In the case of predicting red hair, epiTree recovers known epistatic interactions surrounding MC1R and novel interactions, representing non-linearities not captured by logistic regression models. In the case of predicting MS, a more complex phenotype than red hair, epiTree rankings prioritize novel interactions surrounding HLA-DRB1, a variant previously associated with MS in several populations. Taken together, these results highlight the potential for epiTree rankings to help reduce the design space for follow-up experiments.
Affiliation(s)
- Merle Behr
- Faculty of Informatics and Data Science, University of Regensburg, Regensburg, Germany
- Karl Kumbier
- Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, CA, United States of America
- Matthew Aguirre
- Department of Pediatrics, Stanford Medicine, Stanford, CA, United States of America
- Department of Biomedical Data Science, Stanford Medicine, Stanford, CA, United States of America
- Omer Ronen
- Department of Statistics, University of California at Berkeley, Berkeley, CA, United States of America
- Chengzhong Ye
- Department of Statistics, University of California at Berkeley, Berkeley, CA, United States of America
- Euan Ashley
- Division of Cardiovascular Medicine, Stanford Medicine, Stanford, CA, United States of America
- Atul J. Butte
- Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, United States of America
- Rima Arnaout
- Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, United States of America
- Division of Cardiology, Department of Medicine, University of California, San Francisco, San Francisco, CA, United States of America
- Ben Brown
- Department of Statistics, University of California at Berkeley, Berkeley, CA, United States of America
- Biosciences Area, Lawrence Berkeley National Laboratory, Berkeley, CA, United States of America
- James Priest
- Department of Pediatrics, Stanford Medicine, Stanford, CA, United States of America
- Bin Yu
- Department of Statistics, University of California at Berkeley, Berkeley, CA, United States of America
- Department of Electrical Engineering and Computer Sciences and Center for Computational Biology, University of California at Berkeley, Berkeley, CA, United States of America
21
Zhang X, Bell JT. Detecting genetic effects on phenotype variability to capture gene-by-environment interactions: a systematic method comparison. G3 (BETHESDA, MD.) 2024; 14:jkae022. [PMID: 38289865 PMCID: PMC10989912 DOI: 10.1093/g3journal/jkae022] [Received: 12/11/2023] [Revised: 01/16/2024] [Accepted: 01/19/2024] [Indexed: 02/01/2024]
Abstract
Genetically associated phenotypic variability has been widely observed across organisms and traits, including in humans. Both gene-gene and gene-environment interactions can lead to an increase in genetically associated phenotypic variability. Therefore, detecting the underlying genetic variants, or variance Quantitative Trait Loci (vQTLs), can provide novel insights into complex traits. Established approaches to detect vQTLs apply different methodologies from variance-only approaches to mean-variance joint tests, but a comprehensive comparison of these methods is lacking. Here, we review available methods to detect vQTLs in humans, carry out a simulation study to assess their performance under different biological scenarios of gene-environment interactions, and apply the optimal approaches for vQTL identification to gene expression data. Overall, with a minor allele frequency (MAF) of less than 0.2, the squared residual value linear model (SVLM) and the deviation regression model (DRM) are optimal when the data follow normal and non-normal distributions, respectively. In addition, the Brown-Forsythe (BF) test is one of the optimal methods when the MAF is 0.2 or larger, irrespective of phenotype distribution. Additionally, a larger sample size and more balanced sample distribution in different exposure categories increase the power of BF, SVLM, and DRM. Our results highlight vQTL detection methods that perform optimally under realistic simulation settings and show that their relative performance depends on the phenotype distribution, allele frequency, sample size, and the type of exposure in the interaction model underlying the vQTL.
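Of the methods named above, the Brown-Forsythe test is the simplest to sketch: it is a one-way ANOVA applied to absolute deviations from each group's median, with groups here defined by genotype. The code below computes only the test statistic in pure Python; the function name and the phenotype values are hypothetical, and this is not the implementation evaluated in the paper.

```python
from statistics import fmean, median


def brown_forsythe_statistic(groups: list[list[float]]) -> float:
    """One-way ANOVA F statistic on absolute deviations from each group's median."""
    # Transform each observation into its absolute distance from its group median.
    deviations = []
    for group in groups:
        centre = median(group)
        deviations.append([abs(value - centre) for value in group])

    n_total = sum(len(group) for group in deviations)
    n_groups = len(deviations)
    grand_mean = fmean(value for group in deviations for value in group)

    # Between-group and within-group sums of squares of the deviations.
    between = sum(len(group) * (fmean(group) - grand_mean) ** 2 for group in deviations)
    within = sum((value - fmean(group)) ** 2 for group in deviations for value in group)
    return (n_total - n_groups) / (n_groups - 1) * between / within


# Hypothetical phenotype values for two genotype groups.
same_spread = [[1, 2, 3, 4, 5], [11, 12, 13, 14, 15]]    # means differ, variances equal
diff_spread = [[1, 2, 3, 4, 5], [-10, 0, 3, 6, 16]]      # same median, variances differ
w_same = brown_forsythe_statistic(same_spread)
w_diff = brown_forsythe_statistic(diff_spread)
```

A mean shift alone leaves the statistic at zero, while a difference in spread raises it, which is why the test targets variance effects rather than mean effects. In practice a p-value is obtained by comparing the statistic with an F distribution; `scipy.stats.levene` with `center='median'` performs this test.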
Affiliation(s)
- Xiaopu Zhang
- Department of Twin Research and Genetic Epidemiology, King's College London, St Thomas’ Hospital, Westminster Bridge Road, London SE1 7EH, UK
- Jordana T Bell
- Department of Twin Research and Genetic Epidemiology, King's College London, St Thomas’ Hospital, Westminster Bridge Road, London SE1 7EH, UK
22
Martins ARP, Warren NB, McMillan WO, Barrett RDH. Spatiotemporal dynamics in butterfly hybrid zones. INSECT SCIENCE 2024; 31:328-353. [PMID: 37596954 DOI: 10.1111/1744-7917.13262] [Received: 05/02/2023] [Revised: 07/13/2023] [Accepted: 07/21/2023] [Indexed: 08/21/2023]
Abstract
Evaluating whether hybrid zones are stable or mobile can provide novel insights for evolution and conservation biology. Butterflies exhibit high sensitivity to environmental changes and represent an important model system for the study of hybrid zone origins and maintenance. Here, we review the literature exploring butterfly hybrid zones, with a special focus on their spatiotemporal dynamics and the potential mechanisms that could lead to their movement or stability. We then compare different lines of evidence used to investigate hybrid zone dynamics and discuss the strengths and weaknesses of each approach. Our goal with this review is to reveal general conditions associated with the stability or mobility of butterfly hybrid zones by synthesizing evidence obtained using different types of data sampled across multiple regions and spatial scales. Finally, we discuss spatiotemporal dynamics in the context of a speciation/divergence continuum, the relevance of hybrid zones for conservation biology, and recommend key topics for future investigation.
Affiliation(s)
- Ananda R Pereira Martins
- Redpath Museum, McGill University, 859 Sherbrooke Street West, Montreal, Quebec, Canada
- Smithsonian Tropical Research Institute, Gamboa, Panama City, Panama
- Natalie B Warren
- Redpath Museum, McGill University, 859 Sherbrooke Street West, Montreal, Quebec, Canada
- W Owen McMillan
- Smithsonian Tropical Research Institute, Gamboa, Panama City, Panama
- Rowan D H Barrett
- Redpath Museum, McGill University, 859 Sherbrooke Street West, Montreal, Quebec, Canada
23
Zhang Q, Liu J, Liu H, Ao L, Xi Y, Chen D. Genome-wide epistasis analysis reveals gene-gene interaction network on an intermediate endophenotype P-tau/Aβ 42 ratio in ADNI cohort. Sci Rep 2024; 14:3984. [PMID: 38368488 PMCID: PMC10874417 DOI: 10.1038/s41598-024-54541-8] [Received: 10/22/2023] [Accepted: 02/14/2024] [Indexed: 02/19/2024] Open
Abstract
Alzheimer's disease (AD) is a progressive neurodegenerative disorder and the most common cause of dementia in the elderly worldwide. The exact etiology of AD, particularly its genetic mechanisms, remains incompletely understood. Traditional genome-wide association studies (GWAS), which primarily focus on single-nucleotide polymorphisms (SNPs) with main effects, provide limited explanations for the "missing heritability" of AD, while there is growing evidence supporting the important role of epistasis. In this study, we performed a genome-wide SNP-SNP interaction detection using a linear regression model and employed multiple GPUs for parallel computing, significantly enhancing the speed of whole-genome analysis. The cerebrospinal fluid (CSF) phosphorylated tau (P-tau)/amyloid-β42 (Aβ42) ratio was used as a quantitative trait (QT) to enhance statistical power. Age, gender, and clinical diagnosis were included as covariates to control for potential non-genetic factors influencing AD. We identified 961 pairs of statistically significant SNP-SNP interactions, explaining a high-level variance of P-tau/Aβ42 level, all of which exhibited marginal main effects. Additionally, we replicated 432 previously reported AD-related genes and found 11 gene-gene interaction pairs overlapping with the protein-protein interaction (PPI) network. Our findings may contribute to partially explain the "missing heritability" of AD. The identified subnetwork may be associated with synaptic dysfunction, Wnt signaling pathway, oligodendrocytes, inflammation, hippocampus, and neuronal cells.
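The abstract specifies only that a linear regression model with covariates was used. A typical form of such a pairwise interaction model, written here as an assumption rather than the authors' exact specification, regresses the quantitative trait on both SNP genotypes, their product, and the covariates:

```latex
% Assumed notation: y_i is the P-tau/A\beta_{42} ratio of individual i,
% g_{i1}, g_{i2} are genotype codes at the two SNPs, and
% \mathbf{c}_i collects the covariates (age, gender, clinical diagnosis).
y_i = \beta_0 + \beta_1 g_{i1} + \beta_2 g_{i2} + \beta_3\, g_{i1} g_{i2}
      + \boldsymbol{\gamma}^{\top} \mathbf{c}_i + \varepsilon_i ,
\qquad H_0 : \beta_3 = 0 .
```

Under this formulation, a SNP pair is declared interacting when the test of the interaction coefficient rejects the null hypothesis after multiple-testing correction.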
Affiliation(s)
- Qiushi Zhang
  - School of Computer Science, Northeast Electric Power University, 169 Changchun Street, Jilin, 132012, China
- Junfeng Liu
  - School of Computer Science, Northeast Electric Power University, 169 Changchun Street, Jilin, 132012, China
- Hongwei Liu
  - College of Intelligent Systems Science and Engineering, Harbin Engineering University, 145 Nantong Street, Harbin, China
- Lang Ao
  - School of Computer Science, Northeast Electric Power University, 169 Changchun Street, Jilin, 132012, China
- Yang Xi
  - School of Computer Science, Northeast Electric Power University, 169 Changchun Street, Jilin, 132012, China
- Dandan Chen
  - School of Automation Engineering, Northeast Electric Power University, 169 Changchun Street, Jilin, 132012, China
  - College of Intelligent Systems Science and Engineering, Harbin Engineering University, 145 Nantong Street, Harbin, China
|
24
|
Chen X, Zhou Z, Li Y, Wang S, Xue E, Wang X, Peng H, Fan M, Wang M, Qin X, Wu Y, Li J, Zhu H, Chen D, Hu Y, Beaty TH, Wu T. Detecting Gene-Gene Interaction among DNA Repair Genes in Chinese non-Syndromic Cleft lip with or Without Palate Trios. Cleft Palate Craniofac J 2024:10556656241228124. [PMID: 38303570 DOI: 10.1177/10556656241228124] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/03/2024]
Abstract
OBJECTIVE The objective of this study is to investigate the gene-gene interactions associated with NSCL/P among DNA repair genes. DESIGN This study included 806 NSCL/P case-parent trios from China. A quality control process was conducted for genotyped single nucleotide polymorphisms (SNPs) located in six DNA repair genes (ATR, ERCC4, RFC1, TYMS, XRCC1 and XRCC3). We tested gene-gene interactions with Cordell's method using the statistical package TRIO in R software. The Bonferroni-corrected significance level was set as P = 4.24 × 10⁻⁴. We also tested the robustness of the interactions by permutation tests. SETTING Not applicable. PATIENTS/PARTICIPANTS A total of 806 NSCL/P case-parent trios (complete trios: 682, incomplete trios: 124) with Chinese ancestry. INTERVENTIONS Not applicable. MAIN OUTCOME MEASURE(S) Not applicable. RESULTS A total of 118 SNPs were extracted for the interaction tests. Fourteen pairs of significant interactions were identified after Bonferroni correction, which were confirmed in permutation tests. Twelve pairs were between ATR and ERCC4 or XRCC3. The most significant interaction occurred between rs2244500 in TYMS and rs3213403 in XRCC1 (P = 8.16 × 10⁻¹⁵). CONCLUSIONS The current study identified gene-gene interactions among DNA repair genes in 806 Chinese NSCL/P trios, providing additional evidence for the complicated genetic structure underlying NSCL/P. ATR, ERCC4, XRCC3, TYMS and RFC1 were suggested to be possible candidate genes for NSCL/P.
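The Bonferroni threshold quoted in this abstract can be checked with a line of arithmetic. The sketch below assumes a family-wise error rate of 0.05 divided by the 118 SNPs extracted for testing; the abstract does not state the divisor, so that pairing is an inference from the fact that 0.05/118 reproduces the reported value, and the helper name `is_significant` is hypothetical.

```python
# Assumed family-wise error rate and divisor; neither is stated explicitly in the abstract.
alpha = 0.05
n_snps = 118

threshold = alpha / n_snps
print(f"{threshold:.2e}")  # 4.24e-04, matching the reported level


def is_significant(p_value, cutoff=threshold):
    """Return True when a p-value passes the Bonferroni-corrected cutoff."""
    return p_value < cutoff


# The strongest reported interaction (TYMS rs2244500 x XRCC1 rs3213403).
print(is_significant(8.16e-15))  # True
```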
Affiliation(s)
- Xi Chen
  - Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing, China
- Zhibo Zhou
  - Department of Oral and Maxillofacial Surgery, Peking University School and Hospital of Stomatology, Beijing, China
- Yixin Li
  - Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing, China
- Siyue Wang
  - Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing, China
- Enci Xue
  - Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing, China
- Xueheng Wang
  - Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing, China
- Hexiang Peng
  - Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing, China
- Meng Fan
  - Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing, China
- Mengying Wang
  - Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing, China
- Xueying Qin
  - Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing, China
- Yiqun Wu
  - Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing, China
- Jing Li
  - Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing, China
- Hongping Zhu
  - Department of Oral and Maxillofacial Surgery, Peking University School and Hospital of Stomatology, Beijing, China
- Dafang Chen
  - Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing, China
- Yonghua Hu
  - Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing, China
- Terri H Beaty
  - School of Public Health, Johns Hopkins University, Baltimore, Maryland, USA
- Tao Wu
  - Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing, China
  - Key Laboratory of Reproductive Health, Ministry of Health, Beijing, China
|
25
|
Miras K. Exploring the costs of phenotypic plasticity for evolvable digital organisms. Sci Rep 2024; 14:108. [PMID: 38168919 PMCID: PMC10761833 DOI: 10.1038/s41598-023-50683-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/18/2023] [Accepted: 12/22/2023] [Indexed: 01/05/2024]
Abstract
Phenotypic plasticity is usually defined as a property of individual genotypes to produce different phenotypes when exposed to different environmental conditions. While the benefits of plasticity for adaptation are well established, the costs associated with plasticity remain somewhat obscure. Understanding both why and how these costs arise could help us explain and predict the behavior of living creatures as well as allow the design of more adaptable robotic systems. One of the challenges of conducting such investigations concerns the difficulty of isolating the effects of different types of costs and the lack of control over environmental conditions. The present study addresses these challenges by using virtual worlds (software) to investigate the environmentally regulated phenotypic plasticity of digital organisms. The experimental setup guarantees that potential genetic costs of plasticity are isolated from other plasticity-related costs. Multiple populations of organisms endowed with and without phenotypic plasticity in either the body or the brain are evolved in simulation, and organisms must cope with different environmental conditions. The traits and fitness of the emergent organisms are compared, demonstrating cases in which plasticity is beneficial and cases in which it is neutral. The hypothesis put forward here is that the potential benefits of plasticity might be undermined by the genetic costs related to plasticity itself. The results suggest that this hypothesis is true, while further research is needed to guarantee that the observed effects unequivocally derive from genetic costs and not from some other (unforeseen) mechanism related to plasticity.
Affiliation(s)
- Karine Miras
  - Department of Computer Science, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
|
26
|
Schwab B, Yin J. Computational multigene interactions in virus growth and infection spread. Virus Evol 2023; 10:vead082. [PMID: 38361828 PMCID: PMC10868543 DOI: 10.1093/ve/vead082] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2023] [Revised: 11/29/2023] [Accepted: 12/19/2023] [Indexed: 02/17/2024]
Abstract
Viruses persist in nature owing to their extreme genetic heterogeneity and large population sizes, which enable them to evade host immune defenses, escape antiviral drugs, and adapt to new hosts. The persistence of viruses is challenging to study because mutations affect multiple virus genes, interactions among genes in their impacts on virus growth are seldom known, and measures of viral fitness are yet to be standardized. To address these challenges, we employed a data-driven computational model of cell infection by a virus. The infection model accounted for the kinetics of viral gene expression, functional gene-gene interactions, genome replication, and allocation of host cellular resources to produce progeny of vesicular stomatitis virus, a prototype RNA virus. We used this model to computationally probe how interactions among genes carrying up to eleven deleterious mutations affect different measures of virus fitness: single-cycle growth yields and multicycle rates of infection spread. Individual mutations were implemented by perturbing biophysical parameters associated with individual gene functions of the wild-type model. Our analysis revealed synergistic epistasis among deleterious mutations in their effects on virus yield; so adverse effects of single deleterious mutations were amplified by interaction. For the same mutations, multicycle infection spread indicated weak or negligible epistasis, where single mutations act alone in their effects on infection spread. These results were robust to simulation in high- and low-host resource environments. Our work highlights how different types and magnitudes of epistasis can arise for genetically identical virus variants, depending on the fitness measure. More broadly, gene-gene interactions can differently affect how viruses grow and spread.
Affiliation(s)
- Bradley Schwab
  - Wisconsin Institute for Discovery, Chemical and Biological Engineering, University of Wisconsin-Madison, 330 N. Orchard Street, Madison, WI 53715, USA
- John Yin
  - Wisconsin Institute for Discovery, Chemical and Biological Engineering, University of Wisconsin-Madison, 330 N. Orchard Street, Madison, WI 53715, USA
|
27
|
Ren F, Li S, Wen Z, Liu Y, Tang D. The Spherical Evolutionary Multi-Objective (SEMO) Algorithm for Identifying Disease Multi-Locus SNP Interactions. Genes (Basel) 2023; 15:11. [PMID: 38275593 PMCID: PMC10815643 DOI: 10.3390/genes15010011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/23/2023] [Revised: 11/21/2023] [Accepted: 12/18/2023] [Indexed: 01/27/2024]
Abstract
Single-nucleotide polymorphisms (SNPs), as disease-related biogenetic markers, are crucial in elucidating complex disease susceptibility and pathogenesis. Because combinatorial search methods are computationally inefficient, it is difficult to identify high-dimensional SNP interactions with them, so the spherical evolutionary multi-objective (SEMO) algorithm for detecting multi-locus SNP interactions was proposed. The algorithm uses a spherical search factor and a feedback mechanism of excellent individual history memory to enhance the balance between search and acquisition. Moreover, a multi-objective fitness function based on the decomposition idea was used to evaluate the associations by combining two functions, K2-Score and LR-Score, as an objective function for the algorithm's evolutionary iterations. The performance of SEMO was compared with that of six state-of-the-art algorithms on a simulated dataset. The results showed that SEMO outperforms the comparative methods by detecting SNP interactions quickly and accurately with a shorter average run time. The SEMO algorithm was applied to the Wellcome Trust Case Control Consortium (WTCCC) breast cancer dataset and detected two- and three-point SNP interactions that were significantly associated with breast cancer, confirming the effectiveness of the algorithm. New combinations of SNPs associated with breast cancer were also identified, which will provide a new way to detect SNP interactions quickly and accurately.
Affiliation(s)
- Fuxiang Ren
  - College of Medical Information Engineering, Guangdong Pharmaceutical University, Guangzhou 510006, China
- Shiyin Li
  - College of Medical Information Engineering, Guangdong Pharmaceutical University, Guangzhou 510006, China
- Zihao Wen
  - College of Mathematics and Informatics, College of Software Engineering, South China Agricultural University, Guangzhou 510642, China
  - Faculty of Information Technology, Monash University, Melbourne, VIC 3800, Australia
- Yidi Liu
  - College of Medical Information Engineering, Guangdong Pharmaceutical University, Guangzhou 510006, China
- Deyu Tang
  - College of Medical Information Engineering, Guangdong Pharmaceutical University, Guangzhou 510006, China
  - College of Mathematics and Informatics, College of Software Engineering, South China Agricultural University, Guangzhou 510642, China
|
28
|
Galarza-Muñoz G, Soto-Morales SI, Jiao S, Holmgren M, Rosenthal JJC. Molecular determinants for cold adaptation in an Antarctic Na+/K+-ATPase. Proc Natl Acad Sci U S A 2023; 120:e2301207120. [PMID: 37782798 PMCID: PMC10576127 DOI: 10.1073/pnas.2301207120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/21/2023] [Accepted: 07/28/2023] [Indexed: 10/04/2023]
Abstract
Enzymes from ectotherms living in chronically cold environments have evolved structural innovations to overcome the effects of temperature on catalysis. Cold adaptation of soluble enzymes is driven by changes within their primary structure or the aqueous milieu. For membrane-embedded enzymes, like the Na+/K+-ATPase, the situation is different because changes to the lipid bilayer in which they operate may also be relevant. Although much attention has been focused on thermal adaptation within lipid bilayers, relatively little is known about the contribution of structural changes within membrane-bound enzymes themselves. The identification of specific mutations that confer temperature compensation is complicated by the presence of neutral mutations, which can be more numerous. In the present study, we identified specific amino acids in a Na+/K+-ATPase from an Antarctic octopus that underlie cold resistance. Our approach was to generate chimeras between an Antarctic clone and a temperate ortholog and then study their temperature sensitivities in Xenopus oocytes using an electrophysiological approach. We identified 12 positions in the Antarctic Na+/K+-ATPase that, when transferred to the temperate ortholog, were sufficient to confer cold tolerance. Furthermore, although all 12 Antarctic mutations were required for the full phenotype, a single leucine in the third transmembrane segment (M3) imparted most of it. Mutations that confer cold resistance are mostly in transmembrane segments, at positions that face the lipid bilayer. We propose that the interface between a transmembrane enzyme and the lipid bilayer is a critical determinant of temperature sensitivity and, accordingly, has been a prime evolutionary target for thermal adaptation.
Affiliation(s)
- Gaddiel Galarza-Muñoz
  - Institute of Neurobiology, University of Puerto Rico, Medical Sciences Campus, San Juan, PR 00901
- Sonia I. Soto-Morales
  - Institute of Neurobiology, University of Puerto Rico, Medical Sciences Campus, San Juan, PR 00901
- Song Jiao
  - National Institute of Neurological Disorders and Stroke, NIH, Bethesda, MD 20892
- Miguel Holmgren
  - National Institute of Neurological Disorders and Stroke, NIH, Bethesda, MD 20892
- Joshua J. C. Rosenthal
  - Institute of Neurobiology, University of Puerto Rico, Medical Sciences Campus, San Juan, PR 00901
|
29
|
Skodvin SN, Gjessing HK, Jugessur A, Romanowska J, Page CM, Corfield EC, Lee Y, Håberg SE, Gjerdevik M. Statistical methods to detect mother-father genetic interaction effects on risk of infertility: A genome-wide approach. Genet Epidemiol 2023; 47:503-519. [PMID: 37638522 DOI: 10.1002/gepi.22534] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/04/2023] [Revised: 05/25/2023] [Accepted: 08/01/2023] [Indexed: 08/29/2023]
Abstract
Infertility is a heterogeneous phenotype, and for many couples, the causes of fertility problems remain unknown. One understudied hypothesis is that allelic interactions between the genotypes of the two parents may influence the risk of infertility. Our aim was, therefore, to investigate how allelic interactions can be modeled using parental genotype data linked to 15,789 pregnancies selected from the Norwegian Mother, Father, and Child Cohort Study. The newborns in 1304 of these pregnancies were conceived using assisted reproductive technologies (ART), and the remainder were conceived naturally. Treating the use of ART as a proxy for infertility, different parameterizations were implemented in a genome-wide screen for interaction effects between maternal and paternal alleles at the same locus. Some of the models were more similar in the way they were parameterized, and some produced similar results when implemented on a genome-wide scale. The results showed near-significant interaction effects in genes relevant to the phenotype under study, such as Dynein axonemal heavy chain 17 (DNAH17) with a recognized role in male infertility. More generally, the interaction models presented here are readily adaptable to the study of other phenotypes in which maternal and paternal allelic interactions are likely to be involved.
Affiliation(s)
- Siri N Skodvin
  - Centre for Fertility and Health, Norwegian Institute of Public Health, Oslo, Norway
  - Department of Global Public Health and Primary Care, University of Bergen, Bergen, Norway
- Håkon K Gjessing
  - Centre for Fertility and Health, Norwegian Institute of Public Health, Oslo, Norway
  - Department of Global Public Health and Primary Care, University of Bergen, Bergen, Norway
- Astanand Jugessur
  - Centre for Fertility and Health, Norwegian Institute of Public Health, Oslo, Norway
  - Department of Global Public Health and Primary Care, University of Bergen, Bergen, Norway
- Julia Romanowska
  - Centre for Fertility and Health, Norwegian Institute of Public Health, Oslo, Norway
  - Department of Global Public Health and Primary Care, University of Bergen, Bergen, Norway
- Christian M Page
  - Department of Physical Health and Ageing, Division of Mental and Physical Health, Norwegian Institute of Public Health, Oslo, Norway
- Elizabeth C Corfield
  - Department of Mental Disorders, Norwegian Institute of Public Health, Oslo, Norway
  - Nic Waals Institute, Lovisenberg Diaconal Hospital, Oslo, Norway
- Yunsung Lee
  - Centre for Fertility and Health, Norwegian Institute of Public Health, Oslo, Norway
- Siri E Håberg
  - Centre for Fertility and Health, Norwegian Institute of Public Health, Oslo, Norway
- Miriam Gjerdevik
  - Centre for Fertility and Health, Norwegian Institute of Public Health, Oslo, Norway
  - Department of Computer Science, Electrical Engineering and Mathematical Sciences, Western Norway University of Applied Sciences, Bergen, Norway
|
30
|
Fu B, Pazokitoroudi A, Xue A, Anand A, Anand P, Zaitlen N, Sankararaman S. A biobank-scale test of marginal epistasis reveals genome-wide signals of polygenic epistasis. bioRxiv 2023:2023.09.10.557084. [PMID: 37745394 PMCID: PMC10515811 DOI: 10.1101/2023.09.10.557084] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/26/2023]
Abstract
The contribution of epistasis (interactions among genes or genetic variants) to human complex trait variation remains poorly understood. Methods that aim to explicitly identify pairs of genetic variants, usually single nucleotide polymorphisms (SNPs), associated with a trait suffer from low power due to the large number of hypotheses tested while also having to deal with the computational problem of searching over a potentially large number of candidate pairs. An alternate approach involves testing whether a single SNP modulates variation in a trait against a polygenic background. While overcoming the limitation of low power, such tests of polygenic or marginal epistasis (ME) are infeasible on Biobank-scale data where hundreds of thousands of individuals are genotyped over millions of SNPs. We present a method to test for ME of a SNP on a trait that is applicable to biobank-scale data. We performed extensive simulations to show that our method provides calibrated tests of ME. We applied our method to test for ME at SNPs that are associated with 53 quantitative traits across ≈300K unrelated white British individuals in the UK Biobank (UKBB). Testing 15,601 trait-loci associations that were significant in GWAS, we identified 16 trait-loci pairs across 12 traits that demonstrate strong evidence of ME signals (p-value p < 5 × 10⁻⁸/53). We further partitioned the significant ME signals across the genome to identify 6 trait-loci pairs with evidence of local (within-chromosome) ME while 15 show evidence of distal (cross-chromosome) ME. Across the 16 trait-loci pairs, we document that the proportion of trait variance explained by ME is about 12x as large as that explained by the GWAS effects on average (range: 0.59 to 43.89). Our results show, for the first time, evidence of interaction effects between individual genetic variants and overall polygenic background modulating complex trait variation.
Affiliation(s)
- Boyang Fu
  - Department of Computer Science, UCLA, Los Angeles, CA, USA
- Albert Xue
  - Bioinformatics Interdepartmental Program, UCLA, Los Angeles, CA, USA
- Aakarsh Anand
  - Department of Computer Science, UCLA, Los Angeles, CA, USA
- Prateek Anand
  - Department of Computer Science, UCLA, Los Angeles, CA, USA
- Noah Zaitlen
  - Department of Neurology, UCLA, Los Angeles, CA, USA
  - Department of Computational Medicine, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA
- Sriram Sankararaman
  - Department of Computer Science, UCLA, Los Angeles, CA, USA
  - Department of Computational Medicine, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA
  - Department of Human Genetics, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA
|
31
|
Aleknonytė-Resch M, Trinh J, Leonard H, Delcambre S, Leitão E, Lai D, Smajić S, Orr-Urtreger A, Thaler A, Blauwendraat C, Sharma A, Makarious MB, Kim JJ, Lake J, Rahmati P, Freitag-Wolf S, Seibler P, Foroud T, Singleton AB, Grünewald A, Kaiser F, Klein C, Krawczak M, Dempfle A. Genome-wide case-only analysis of gene-gene interactions with known Parkinson's disease risk variants reveals link between LRRK2 and SYT10. NPJ Parkinsons Dis 2023; 9:102. [PMID: 37386035 PMCID: PMC10310744 DOI: 10.1038/s41531-023-00550-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2022] [Accepted: 06/15/2023] [Indexed: 07/01/2023]
Abstract
The effects of one genetic factor upon Parkinson's disease (PD) risk may be modified by other genetic factors. Such gene-gene interaction (G×G) could explain some of the 'missing heritability' of PD and the reduced penetrance of known PD risk variants. Using the largest single nucleotide polymorphism (SNP) genotype data set currently available for PD (18,688 patients), provided by the International Parkinson's Disease Genomics Consortium, we studied G×G with a case-only (CO) design. To this end, we paired each of 90 SNPs previously reported to be associated with PD with one of 7.8 million quality-controlled SNPs from a genome-wide panel. Support of any putative G×G interactions found was sought by the analysis of independent genotype-phenotype and experimental data. A total of 116 significant pairwise SNP genotype associations were identified in PD cases, pointing towards G×G. The most prominent associations involved a region on chromosome 12q containing SNP rs76904798, which is a non-coding variant of the LRRK2 gene. It yielded the lowest interaction p-value overall with SNP rs1007709 in the promoter region of the SYT10 gene (interaction OR = 1.80, 95% CI: 1.65-1.95, p = 2.7 × 10⁻⁴³). SNPs around SYT10 were also associated with the age-at-onset of PD in an independent cohort of carriers of LRRK2 mutation p.G2019S. Moreover, SYT10 gene expression during neuronal development was found to differ between cells from affected and non-affected p.G2019S carriers. G×G interaction on PD risk, involving the LRRK2 and SYT10 gene regions, is biologically plausible owing to the known link between PD and LRRK2, its involvement in neural plasticity, and the contribution of SYT10 to the exocytosis of secretory vesicles in neurons.
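In a case-only design, interaction is inferred from an association between two genotypes among cases alone, on the assumption that the two variants are independent in the source population. A minimal sketch of the simplest version, a 2×2 carrier table with a Woolf-type confidence interval, is given below. The function name and all counts are made up for illustration, and the abstract does not say which statistical model the authors fitted, so this is not their analysis.

```python
import math


def case_only_odds_ratio(both, first_only, second_only, neither, z=1.96):
    """Odds ratio between carrier status at two SNPs, computed among cases only.

    Arguments are counts of cases carrying both variants, only the first,
    only the second, or neither. Returns (odds_ratio, lower, upper), where the
    bounds form an approximate 95% interval on the log scale (Woolf's method).
    """
    odds_ratio = (both * neither) / (first_only * second_only)
    se_log = math.sqrt(1 / both + 1 / first_only + 1 / second_only + 1 / neither)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Hypothetical counts, not taken from the study.
odds_ratio, lower, upper = case_only_odds_ratio(40, 60, 20, 80)
print(f"interaction OR = {odds_ratio:.2f} (95% CI: {lower:.2f}-{upper:.2f})")
```

An odds ratio whose interval excludes 1 points towards interaction only if the independence assumption holds; linkage disequilibrium or population structure can produce the same signal.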
Affiliation(s)
- Milda Aleknonytė-Resch
  - Institute of Medical Informatics and Statistics, Kiel University, Kiel, Germany
  - Department of Computer Science, Kiel University, Kiel, Germany
- Joanne Trinh
  - Institute of Neurogenetics, University of Lübeck, University Medical Center Schleswig-Holstein, Campus Lübeck, Germany
- Hampton Leonard
  - Laboratory of Neurogenetics, National Institute on Aging, National Institutes of Health, Bethesda, MD, USA
  - Data Tecnica International LLC, Glen Echo, MD, USA
  - Center for Alzheimer's and Related Dementias, National Institute on Aging, Bethesda, MD, USA
- Sylvie Delcambre
  - Molecular and Functional Neurobiology Group, Luxembourg Centre for Systems Biomedicine, Esch-sur-Alzette, Luxembourg
- Elsa Leitão
  - Institute of Human Genetics, University Hospital Essen, Essen, Germany
- Dongbing Lai
  - Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, IN, USA
- Semra Smajić
  - Molecular and Functional Neurobiology Group, Luxembourg Centre for Systems Biomedicine, Esch-sur-Alzette, Luxembourg
- Avi Orr-Urtreger
  - Neurological Institute, Tel Aviv Sourasky Medical Center, Sackler Faculty of Medicine and Sagol School of Neuroscience, Tel Aviv University, Tel Aviv, Israel
- Avner Thaler
  - Neurological Institute, Tel Aviv Sourasky Medical Center, Sackler Faculty of Medicine and Sagol School of Neuroscience, Tel Aviv University, Tel Aviv, Israel
- Cornelis Blauwendraat
  - Laboratory of Neurogenetics, National Institute on Aging, National Institutes of Health, Bethesda, MD, USA
  - Center for Alzheimer's and Related Dementias, National Institute on Aging, Bethesda, MD, USA
- Arunabh Sharma
  - Institute of Medical Informatics and Statistics, Kiel University, Kiel, Germany
- Mary B Makarious
  - Laboratory of Neurogenetics, National Institute on Aging, National Institutes of Health, Bethesda, MD, USA
  - Department of Clinical and Movement Neurosciences, University College London Queen Square Institute of Neurology, London, UK
  - UCL Movement Disorders Centre, University College London, London, UK
- Jonggeol Jeff Kim
  - Laboratory of Neurogenetics, National Institute on Aging, National Institutes of Health, Bethesda, MD, USA
- Julie Lake
  - Laboratory of Neurogenetics, National Institute on Aging, National Institutes of Health, Bethesda, MD, USA
- Pegah Rahmati
  - Institute of Medical Informatics and Statistics, Kiel University, Kiel, Germany
- Sandra Freitag-Wolf
  - Institute of Medical Informatics and Statistics, Kiel University, Kiel, Germany
- Philip Seibler
  - Institute of Neurogenetics, University of Lübeck, University Medical Center Schleswig-Holstein, Campus Lübeck, Germany
- Tatiana Foroud
  - Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, IN, USA
- Andrew B Singleton
  - Laboratory of Neurogenetics, National Institute on Aging, National Institutes of Health, Bethesda, MD, USA
  - Center for Alzheimer's and Related Dementias, National Institute on Aging, Bethesda, MD, USA
- Anne Grünewald
  - Institute of Neurogenetics, University of Lübeck, University Medical Center Schleswig-Holstein, Campus Lübeck, Germany
  - Molecular and Functional Neurobiology Group, Luxembourg Centre for Systems Biomedicine, Esch-sur-Alzette, Luxembourg
- Frank Kaiser
  - Institute of Human Genetics, University Hospital Essen, Essen, Germany
- Christine Klein
  - Institute of Neurogenetics, University of Lübeck, University Medical Center Schleswig-Holstein, Campus Lübeck, Germany
- Michael Krawczak
  - Institute of Medical Informatics and Statistics, Kiel University, Kiel, Germany
- Astrid Dempfle
  - Institute of Medical Informatics and Statistics, Kiel University, Kiel, Germany
|
32
|
Montesinos-López OA, Saint Pierre C, Gezan SA, Bentley AR, Mosqueda-González BA, Montesinos-López A, van Eeuwijk F, Beyene Y, Gowda M, Gardner K, Gerard GS, Crespo-Herrera L, Crossa J. Optimizing Sparse Testing for Genomic Prediction of Plant Breeding Crops. Genes (Basel) 2023; 14:genes14040927. [PMID: 37107685 PMCID: PMC10137724 DOI: 10.3390/genes14040927] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 03/01/2023] [Revised: 04/07/2023] [Accepted: 04/13/2023] [Indexed: 04/29/2023]
Abstract
While sparse testing methods have been proposed by researchers to improve the efficiency of genomic selection (GS) in breeding programs, there are several factors that can hinder this. In this research, we evaluated four methods (M1-M4) for sparse testing allocation of lines to environments under multi-environmental trials for genomic prediction of unobserved lines. The sparse testing methods described in this study are applied in a two-stage analysis to build the genomic training and testing sets in a strategy that allows each location or environment to evaluate only a subset of all genotypes rather than all of them. To ensure a valid implementation, the sparse testing methods presented here require BLUEs (or BLUPs) of the lines to be computed at the first stage using an appropriate experimental design and statistical analyses in each location (or environment). The evaluation of the four cultivar allocation methods to environments of the second stage was done with four data sets (two large and two small) under a multi-trait and uni-trait framework. We found that the multi-trait model produced better genomic prediction (GP) accuracy than the uni-trait model and that methods M3 and M4 were slightly better than methods M1 and M2 for the allocation of lines to environments. Some of the most important findings, however, were that even under a scenario where we used a training-testing split of 15%-85%, the prediction accuracy of the four methods barely decreased. This indicates that genomic sparse testing methods for data sets under these scenarios can save considerable operational and financial resources with only a small loss in precision, as shown in our cost-benefit analysis.
Affiliation(s)
- Carolina Saint Pierre
  - International Maize and Wheat Improvement Center (CIMMYT), Km 45, Carretera Mexico-Veracruz, El Batan, Texcoco 56237, Mexico
- Alison R Bentley
  - International Maize and Wheat Improvement Center (CIMMYT), Km 45, Carretera Mexico-Veracruz, El Batan, Texcoco 56237, Mexico
- Brandon A Mosqueda-González
  - Centro de Investigación en Computación (CIC), Instituto Politécnico Nacional (IPN), Mexico City 07738, Mexico
- Abelardo Montesinos-López
  - Centro Universitario de Ciencias Exactas e Ingenierías (CUCEI), Universidad de Guadalajara, Guadalajara 44430, Mexico
- Fred van Eeuwijk
  - Department of Plant Science Mathematical and Statistical Methods-Biometrics, P.O. Box 16, 6700AA Wageningen, The Netherlands
- Yoseph Beyene
  - International Maize and Wheat Improvement Center (CIMMYT), Km 45, Carretera Mexico-Veracruz, El Batan, Texcoco 56237, Mexico
- Manje Gowda
  - International Maize and Wheat Improvement Center (CIMMYT), Km 45, Carretera Mexico-Veracruz, El Batan, Texcoco 56237, Mexico
- Keith Gardner
  - International Maize and Wheat Improvement Center (CIMMYT), Km 45, Carretera Mexico-Veracruz, El Batan, Texcoco 56237, Mexico
- Guillermo S Gerard
  - International Maize and Wheat Improvement Center (CIMMYT), Km 45, Carretera Mexico-Veracruz, El Batan, Texcoco 56237, Mexico
- Leonardo Crespo-Herrera
  - International Maize and Wheat Improvement Center (CIMMYT), Km 45, Carretera Mexico-Veracruz, El Batan, Texcoco 56237, Mexico
- José Crossa
  - International Maize and Wheat Improvement Center (CIMMYT), Km 45, Carretera Mexico-Veracruz, El Batan, Texcoco 56237, Mexico
  - Colegio de Postgraduados, Montecillos 56230, Mexico
|
33
Abstract
BACKGROUND Autoimmune hepatitis has an unknown cause and genetic associations that are not disease-specific or always present. Clarification of its missing causality and heritability could improve prevention and management strategies. AIMS Describe the key epigenetic and genetic mechanisms that could account for missing causality and heritability in autoimmune hepatitis; indicate the prospects of these mechanisms as pivotal factors; and encourage investigations of their pathogenic role and therapeutic potential. METHODS English abstracts were identified in PubMed using multiple key search phrases. Several hundred abstracts and 210 full-length articles were reviewed. RESULTS Environmental induction of epigenetic changes is the prime candidate for explaining the missing causality of autoimmune hepatitis. Environmental factors (diet, toxic exposures) can alter chromatin structure and the production of micro-ribonucleic acids that affect gene expression. Epistatic interaction between unsuspected genes is the prime candidate for explaining the missing heritability. The non-additive, interactive effects of multiple genes could enhance their impact on the propensity and phenotype of autoimmune hepatitis. Transgenerational inheritance of acquired epigenetic marks constitutes another mechanism of transmitting parental adaptations that could affect susceptibility. Management strategies could range from lifestyle adjustments and nutritional supplements to precision editing of the epigenetic landscape. CONCLUSIONS Autoimmune hepatitis has a missing causality that might be explained by epigenetic changes induced by environmental factors and a missing heritability that might reflect epistatic gene interactions or transgenerational transmission of acquired epigenetic marks. These unassessed or under-evaluated areas warrant investigation.
Affiliation(s)
- Albert J Czaja
- Mayo Clinic College of Medicine and Science, Rochester, MN, USA.
- Professor Emeritus of Medicine, Mayo Clinic College of Medicine and Science, 200 First Street SW, Rochester, MN, 55905, USA.
34
Wang S, Shi J, Liu C, Wang P, Wang M, Li W, Zhou R, Zheng H, Jiang J, Li N, Li J, Zhou Z, Zhu H, Wu Y, Jia Z, Wu T, Hu Y, Beaty TH. Evidence of the folate-mediated one-carbon metabolism pathway genes in controlling the non-syndromic oral clefts risks. Oral Dis 2023; 29:1080-1088. [PMID: 34739175 DOI: 10.1111/odi.14068] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/15/2021] [Revised: 10/21/2021] [Accepted: 10/26/2021] [Indexed: 02/05/2023]
Abstract
The folate-mediated one-carbon metabolism pathway is thought to play an important role in the etiology of non-syndromic oral clefts (NSOFC), although none of the genes in this pathway has shown significant signals in genome-wide association studies (GWAS). Recent evidence indicated that enhanced understanding could be gained by aggregating the effects of multiple SNPs simultaneously into a polygenic risk score (PRS) to assess its association with disease risk. This study aimed to assess the association between the genetic effect of the folate-mediated one-carbon metabolism pathway and NSOFC risk using a PRS based on a case-parent trio design. A total of 297 SNPs mapped from 18 genes in the folate-mediated one-carbon metabolism pathway were aggregated from a GWAS of 2458 case-parent trios recruited from an international consortium. We found that a PRS based on the folate-mediated one-carbon metabolism pathway was significant among all NSOFC trios (OR = 1.95, 95% CI: 1.66-2.28, p = 2.39 × 10^-16), as well as in two major subtypes, non-syndromic cleft lip with or without cleft palate (NSCL/P) trios (OR = 1.71, 95% CI: 1.50-1.96, p = 7.66 × 10^-15) and non-syndromic cleft palate only (NSCPO) trios (OR = 1.51, 95% CI: 1.36-1.68, p = 2.1 × 10^-14). Similar results were also observed in further subgroup analyses stratified into Asian and European trios. The average PRS of the folate-mediated one-carbon metabolism pathway differed between the NSOFC case group and its comparison group (p < 0.05), with a higher average PRS in the cases. Moreover, the top 5% pathway PRS group had a 2.25-fold (95% CI: 1.85-2.73) increased NSOFC risk, and 3.09-fold (95% CI: 2.50-3.81) and 2.06-fold (95% CI: 1.39-3.02) increased risks of NSCL/P and NSCPO, respectively, compared to the remainder of the distribution. The results of our study confirmed that the folate-mediated one-carbon metabolism pathway was important in controlling the risk of NSOFC, and this study enhanced the evidence towards understanding the genetic risks of NSOFC.
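Aggregating the effects of several SNPs into one score is commonly done as a weighted sum of risk-allele counts. The sketch below shows that generic form only; it does not reproduce the case-parent trio analysis of this study, and the weights and genotype values are hypothetical.

```python
import math

def polygenic_risk_score(dosages, weights):
    """Weighted sum of risk-allele counts (0, 1 or 2 per SNP).

    Generic PRS form for illustration; the trio-based score used in the
    study is not reproduced here.
    """
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(w * d for w, d in zip(weights, dosages))

# Hypothetical per-SNP weights (log odds ratios) and one individual's genotypes.
weights = [math.log(1.2), math.log(0.9), math.log(1.1)]
dosages = [2, 0, 1]
score = polygenic_risk_score(dosages, weights)
```

With these assumed values the score equals 2·ln(1.2) + ln(1.1); ranking individuals by such a score is what allows a "top 5%" group to be compared with the remainder of the distribution.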
Affiliation(s)
- Siyue Wang
- Peking University Health Science Center, Beijing, China
- Jiayu Shi
- Division of Growth and Development and Section of Orthodontics, School of Dentistry, University of California, Los Angeles, USA
- Ping Wang
- Peking University Health Science Center, Beijing, China
- Mengying Wang
- Peking University Health Science Center, Beijing, China
- Wenyong Li
- Peking University Health Science Center, Beijing, China
- Ren Zhou
- Peking University Health Science Center, Beijing, China
- Jin Jiang
- Peking University Health Science Center, Beijing, China
- Nan Li
- Peking University School of Stomatology, Beijing, China
- Jing Li
- Peking University School of Stomatology, Beijing, China
- Zhibo Zhou
- Peking University School of Stomatology, Beijing, China
- Hongping Zhu
- Peking University School of Stomatology, Beijing, China
- Yiqun Wu
- Peking University Health Science Center, Beijing, China
- Zhonglin Jia
- State Key Laboratory of Oral Diseases and National Clinical Research Center for Oral Diseases and Department of cleft lip and palate, West China Hospital of Stomatology, Sichuan University, Chengdu, China
- Tao Wu
- Peking University Health Science Center, Beijing, China
- Institute of Reproductive and Child Health/Key Laboratory of Reproductive Health, National Health Commission of the People's Republic of China, Beijing, China
- Yonghua Hu
- Peking University Health Science Center, Beijing, China
- Terri H Beaty
- School of Public Health, Johns Hopkins University, Baltimore, Maryland, USA
35
Zhang X, Zhu T, Wang L, Lv X, Yang W, Qu C, Li H, Wang H, Ning Z, Qu L. Genome-Wide Association Study Reveals the Genetic Basis of Duck Plumage Colors. Genes (Basel) 2023; 14:856. [PMID: 37107611 PMCID: PMC10137861 DOI: 10.3390/genes14040856] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 02/12/2023] [Revised: 03/17/2023] [Accepted: 03/30/2023] [Indexed: 04/05/2023] [Open Access]
Abstract
Plumage color is an artificially and naturally selected trait in domestic ducks. Black, white, and spotty are the main feather colors in domestic ducks. Previous studies have shown that black plumage color is caused by MC1R, and white plumage color is caused by MITF. We performed a genome-wide association study (GWAS) to identify candidate genes associated with white, black, and spotty plumage in ducks. Two non-synonymous SNPs in MC1R (c.52G>A and c.376G>A) were significantly related to black plumage, and three SNPs in MITF (chr13:15411658A>G, chr13:15412570T>C and chr13:15412592C>G) were associated with white plumage. We also identified epistatic interactions between the causal loci. Some ducks with white plumage carry c.52G>A and c.376G>A in MC1R, which also compensated for black and spotty plumage color phenotypes, suggesting that MC1R and MITF have an epistatic effect. The MITF locus is presumed to act upstream of MC1R in determining the white, black, and spotty colors. Although the specific mechanism remains to be further clarified, these findings support the importance of epistasis in plumage color variation in ducks.
Affiliation(s)
- Xinye Zhang
- National Engineering Laboratory for Animal Breeding, Department of Animal Genetics and Breeding, College of Animal Science and Technology, China Agricultural University, Yuanmingyuan West Road 2, Beijing 100193, China
- Tao Zhu
- National Engineering Laboratory for Animal Breeding, Department of Animal Genetics and Breeding, College of Animal Science and Technology, China Agricultural University, Yuanmingyuan West Road 2, Beijing 100193, China
- Liang Wang
- Beijing Municipal General Station of Animal Science, Beijing 100107, China
- Xueze Lv
- Beijing Municipal General Station of Animal Science, Beijing 100107, China
- Weifang Yang
- Beijing Municipal General Station of Animal Science, Beijing 100107, China
- Changqing Qu
- Engineering Technology Research Center of Anti-Aging Chinese Herbal Medicine of Anhui Province, Fuyang Normal University, Fuyang 236037, China
- Haiying Li
- College of Animal Science, Xinjiang Agricultural University, Urumchi 830052, China
- Huie Wang
- College of Animal Science, Tarim University, Alar 843300, China
- Zhonghua Ning
- National Engineering Laboratory for Animal Breeding, Department of Animal Genetics and Breeding, College of Animal Science and Technology, China Agricultural University, Yuanmingyuan West Road 2, Beijing 100193, China
- Lujiang Qu
- National Engineering Laboratory for Animal Breeding, Department of Animal Genetics and Breeding, College of Animal Science and Technology, China Agricultural University, Yuanmingyuan West Road 2, Beijing 100193, China
36
Spadafora C. The epigenetic basis of evolution. Progress in Biophysics and Molecular Biology 2023; 178:57-69. [PMID: 36720315 DOI: 10.1016/j.pbiomolbio.2023.01.005] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 11/07/2022] [Revised: 12/17/2022] [Accepted: 01/26/2023] [Indexed: 01/31/2023]
Abstract
An increasing body of data is revealing key roles of epigenetics in evolutionary processes. The scope of this manuscript is to assemble in a coherent frame experimental evidence supporting a role of epigenetic factors and networks, active during embryogenesis, in orchestrating variation-inducing phenomena underlying evolution, seen as a global process. This process unfolds over two crucial levels: i) a flow of RNA-based information - predominantly small regulatory RNAs released from somatic cells exposed to environmental stimuli - taken up by spermatozoa and delivered to oocytes at fertilization and ii) the highly permissive and variation-prone environments offered by zygotes and totipotent early embryos. Totipotent embryos provide a variety of biological tools favouring the emergence of evolutionarily significant phenotypic novelties driven by RNA information. In this light, neither random genomic mutations nor the sieving role of natural selection is required, as the sperm-delivered RNA cargo conveys specific information and acts as a "phenotypic inducer" of defined environmentally acquired traits.
Affiliation(s)
- Corrado Spadafora
- Institute of Translational Pharmacology, National Research Council (CNR), Rome, Italy.
37
Learning high-order interactions for polygenic risk prediction. PLoS One 2023; 18:e0281618. [PMID: 36763605 PMCID: PMC9916647 DOI: 10.1371/journal.pone.0281618] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/29/2022] [Accepted: 01/27/2023] [Indexed: 02/11/2023] [Open Access]
Abstract
Within the framework of precision medicine, the stratification of individual genetic susceptibility based on inherited DNA variation has paramount relevance. However, one of the most relevant pitfalls of traditional Polygenic Risk Score (PRS) approaches is their inability to model complex high-order non-linear SNP-SNP interactions and their effect on the phenotype (e.g. epistasis). Indeed, they face a computational challenge, as the number of possible interactions grows exponentially with the number of SNPs considered, which also affects the statistical reliability of the model parameters. In this work, we address this issue by proposing a novel PRS approach, called High-order Interactions-aware Polygenic Risk Score (hiPRS), that incorporates high-order interactions in modeling polygenic risk. The approach combines an interaction search routine based on frequent itemsets mining and a novel interaction selection algorithm based on Mutual Information, to construct a simple and interpretable weighted model of user-specified dimensionality that can predict a given binary phenotype. Compared to traditional PRS methods, hiPRS does not rely on GWAS summary statistics or any external information. Moreover, hiPRS differs from Machine Learning-based approaches that can include complex interactions in that it provides a readable and interpretable model and is able to control overfitting, even on small samples. In the present work we demonstrate through a comprehensive simulation study the superior performance of hiPRS with respect to state-of-the-art methods, both in terms of scoring performance and interpretability of the resulting model. We also test hiPRS against small sample size, class imbalance and the presence of noise, showcasing its robustness to extreme experimental settings. Finally, we apply hiPRS to a case study on real data from the DACHS cohort, defining an interaction-aware scoring model to predict mortality of stage II-III Colon-Rectal Cancer patients treated with oxaliplatin.
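Why an interaction term can carry more information about a binary phenotype than a single SNP can be seen from a small mutual information calculation. The sketch below is only an illustration of that quantity; it is not the hiPRS selection algorithm, and the genotype indicators and phenotype vector are made-up toy data.

```python
import math
from collections import Counter

def mutual_information(x, y):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(x)
    joint = Counter(zip(x, y))
    px = Counter(x)
    py = Counter(y)
    mi = 0.0
    for (a, b), c in joint.items():
        # p(a,b) * log( p(a,b) / (p(a) * p(b)) ), written with raw counts
        mi += (c / n) * math.log(c * n / (px[a] * py[b]))
    return mi

# Toy data (hypothetical): 1 means the individual carries the risk genotype.
snp1 = [1, 1, 0, 0, 1, 0, 1, 0]
snp2 = [1, 0, 1, 0, 1, 0, 1, 1]
phenotype = [1, 0, 0, 0, 1, 0, 1, 0]

# Candidate second-order interaction: both risk genotypes present together.
interaction = [a & b for a, b in zip(snp1, snp2)]

mi_snp1 = mutual_information(snp1, phenotype)
mi_interaction = mutual_information(interaction, phenotype)
```

In this toy data the interaction term reproduces the phenotype exactly, so its mutual information equals the phenotype entropy and exceeds that of `snp1` alone; a selection step could rank candidate interactions by such a score.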
38
Jeon D, Kang Y, Lee S, Choi S, Sung Y, Lee TH, Kim C. Digitalizing breeding in plants: A new trend of next-generation breeding based on genomic prediction. Frontiers in Plant Science 2023; 14:1092584. [PMID: 36743488 PMCID: PMC9892199 DOI: 10.3389/fpls.2023.1092584] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 11/08/2022] [Accepted: 01/05/2023] [Indexed: 06/18/2023]
Abstract
As the world's population grows and food needs diversify, the demand for cereals and horticultural crops with beneficial traits increases. In order to meet a variety of demands, suitable cultivars and innovative breeding methods need to be developed. Breeding methods have changed over time following advances in genetics. With the advent of new sequencing technology in the early 21st century, predictive breeding, such as genomic selection (GS), emerged when large-scale genomic information became available. GS shows good predictive ability for the selection of individuals with traits of interest, even for quantitative traits, by using various types of whole-genome-scanning markers, breaking away from the limitations of marker-assisted selection (MAS). In the current review, we briefly describe the history of breeding techniques, each breeding method, various statistical models applied to GS and methods to increase GS efficiency. Through this review, we intend to propose and define the term digital breeding. Digital breeding develops predictive breeding methods such as GS to a higher level, aiming to minimize human intervention by automatically carrying out breeding design, propagating breeding populations, and making selections in consideration of various environments, climates, and topography during the breeding process. We also classified the phases of digital breeding based on the technologies and methods applied to each phase. This review paper will provide an understanding and a direction for the final evolution of plant breeding in the future.
Affiliation(s)
- Donghyun Jeon
- Plant Computational Genomics Laboratory, Department of Science in Smart Agriculture Systems, Chungnam National University, Daejeon, Republic of Korea
- Yuna Kang
- Plant Computational Genomics Laboratory, Department of Crop Science, Chungnam National University, Daejeon, Republic of Korea
- Solji Lee
- Plant Computational Genomics Laboratory, Department of Crop Science, Chungnam National University, Daejeon, Republic of Korea
- Sehyun Choi
- Plant Computational Genomics Laboratory, Department of Crop Science, Chungnam National University, Daejeon, Republic of Korea
- Yeonjun Sung
- Plant Computational Genomics Laboratory, Department of Science in Smart Agriculture Systems, Chungnam National University, Daejeon, Republic of Korea
- Tae-Ho Lee
- Genomics Division, National Institute of Agricultural Sciences, Jeonju, Republic of Korea
- Changsoo Kim
- Plant Computational Genomics Laboratory, Department of Science in Smart Agriculture Systems, Chungnam National University, Daejeon, Republic of Korea
- Plant Computational Genomics Laboratory, Department of Crop Science, Chungnam National University, Daejeon, Republic of Korea
39
Ma N, Jin A, Sun Y, Jin Y, Sun Y, Xiao Q, Sha X, Yu F, Yang L, Liu W, Gao X, Zhang X, Li L. Comprehensive investigating of MMR gene in hepatocellular carcinoma with chronic hepatitis B virus infection in Han Chinese population. Front Oncol 2023; 13:1124459. [PMID: 37035153 PMCID: PMC10079871 DOI: 10.3389/fonc.2023.1124459] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/03/2023] [Accepted: 03/09/2023] [Indexed: 04/11/2023] [Open Access]
Abstract
Hepatocellular carcinoma associated with chronic hepatitis B virus infection seriously affects human health. Present studies suggest that genetic susceptibility plays an important role in the mechanism of cancer development. Therefore, this study focused on single nucleotide polymorphisms (SNPs) of MMR genes associated with HBV-HCC. Five groups of participants were included in this study: a healthy control group (HC), a spontaneous clearance group (SC), a chronic hepatitis B group (CHB), an HBV-related liver cirrhosis group (LC) and an HBV-related hepatocellular carcinoma group (HBV-HCC). A total of 3128 participants met the inclusion and exclusion criteria for this study. Twenty polymorphic loci on MSH2, MSH3 and MSH6 were selected for genotyping. There were four case-control comparisons: HC vs. HCC, SC vs. HCC, CHB vs. HCC and LC vs. HCC. We used the Hardy-Weinberg equilibrium test, unconditional logistic regression, haplotype analysis, and gene-gene interaction analysis for genetic analysis. Ultimately, after accounting for confounding factors such as age, gender, smoking and drinking, 12 polymorphisms were found to be associated with genetic susceptibility to HCC. Haplotype analysis showed the risk haplotype GTTT (rs1805355_G, rs3776968_T, rs1428030_C, rs181747_C) was more frequent in the HCC group compared with the HC group. The GMDR analysis showed that the best interaction model was the three-factor model of MSH2-rs1981928, MSH3-rs26779 and MSH6-rs2348244 in the SC vs. HCC comparison (P=0.001). In addition, we found multiplicative or additive interactions between genes in our selected SNPs. We have attempted to explain the molecular mechanisms by which certain SNPs (MSH2-rs4952887, MSH3-rs26779, MSH3-rs181747 and MSH3-rs32950) affect genetic susceptibility to HCC from the perspectives of eQTL, TFBS, cell cycle and so on. We also explained the results of haplotypes and gene-gene interactions. These findings provide new ideas to further explore the etiology and pathogenesis of HCC.
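The Hardy-Weinberg equilibrium test mentioned among the methods is typically a one-degree-of-freedom chi-square comparison of observed and expected genotype counts at a biallelic locus. The sketch below shows that standard textbook form; the study's exact procedure is not given in the abstract, and the function name and genotype counts are hypothetical.

```python
import math

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test (1 df) for Hardy-Weinberg equilibrium at a biallelic SNP.

    Takes the observed counts of the three genotypes and returns
    (chi-square statistic, p-value).
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        raise ValueError("locus is monomorphic; the test is undefined")
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical genotype counts in a control group.
chi2_ok, p_ok = hwe_chi_square(25, 50, 25)    # matches HWE expectations exactly
chi2_bad, p_bad = hwe_chi_square(50, 0, 50)   # no heterozygotes at all
```

A locus that departs strongly from equilibrium in controls, as in the second call, would usually be flagged for genotyping problems before any association analysis.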
Affiliation(s)
- Ning Ma
- Hebei Key Laboratory of Environment and Human Health, Department of Social Medicine and Health Care Management, School of Public Health, Hebei Medical University, Shijiazhuang, China
- Ao Jin
- Hebei Key Laboratory of Environment and Human Health, Department of Epidemiology and Statistics, School of Public Health, Hebei Medical University, Shijiazhuang, China
- Yitong Sun
- Hebei Key Laboratory of Environment and Human Health, Department of Epidemiology and Statistics, School of Public Health, Hebei Medical University, Shijiazhuang, China
- Yiyao Jin
- Hebei Key Laboratory of Environment and Human Health, Department of Epidemiology and Statistics, School of Public Health, Hebei Medical University, Shijiazhuang, China
- Yucheng Sun
- Hebei Key Laboratory of Environment and Human Health, Department of Epidemiology and Statistics, School of Public Health, Hebei Medical University, Shijiazhuang, China
- Qian Xiao
- Hebei Key Laboratory of Environment and Human Health, Department of Epidemiology and Statistics, School of Public Health, Hebei Medical University, Shijiazhuang, China
- XuanYi Sha
- Hebei Key Laboratory of Environment and Human Health, School of Basic Medicine, Hebei Medical University, Shijiazhuang, China
- Fengxue Yu
- The Hebei Key Laboratory of Gastroenterology, The Second Hospital of Hebei Medical University, Shijiazhuang, China
- Lei Yang
- Hebei Key Laboratory of Environment and Human Health, Department of Epidemiology and Statistics, School of Public Health, Hebei Medical University, Shijiazhuang, China
- Wenxuan Liu
- Hebei Key Laboratory of Environment and Human Health, Department of Epidemiology and Statistics, School of Public Health, Hebei Medical University, Shijiazhuang, China
- Xia Gao
- Hebei Key Laboratory of Environment and Human Health, Department of Epidemiology and Statistics, School of Public Health, Hebei Medical University, Shijiazhuang, China
- Xiaolin Zhang
- Hebei Key Laboratory of Environment and Human Health, Department of Epidemiology and Statistics, School of Public Health, Hebei Medical University, Shijiazhuang, China
- *Correspondence: Xiaolin Zhang; Lu Li
- Lu Li
- Hebei Key Laboratory of Environment and Human Health, Department of Social Medicine and Health Care Management, School of Public Health, Hebei Medical University, Shijiazhuang, China
- *Correspondence: Xiaolin Zhang; Lu Li
40
Cahill ME, Montgomery RR. Analytical Approaches to Uncover Genetic Associations for Rare Outcomes: Lessons from West Nile Neuroinvasive Disease. Methods Mol Biol 2023; 2585:193-203. [PMID: 36331775 PMCID: PMC9867870 DOI: 10.1007/978-1-0716-2760-0_17] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]
Abstract
West Nile viral infection causes severe neuroinvasive disease in less than 1% of infected humans. There are no targeted therapeutics for this serious and potentially fatal disease, and to date no vaccine has been approved for humans. With climate change expected to result in rising incidence of West Nile and other related vector-borne viral infections, there is an increasing need to identify those at risk for serious disease and potential leads for therapeutic and vaccine development. Genetic variation, particularly in genes whose products are either directly or indirectly connected to immune response to infections, is a critical avenue of investigation to identify those at higher risk of clinically apparent West Nile infection. Given the small percent of infections that progress to severe disease and the relatively low numbers of reported infections, it is challenging to conduct well-powered studies to identify genetic factors associated with more severe outcomes. In this chapter, we outline several approaches with the objective to take full advantage of all available data in order to identify genetic factors which lead to increased risk of severe West Nile neuroinvasive disease. These methods are generalizable to other conditions with limited cohort size and rare outcomes.
Affiliation(s)
- Megan E Cahill
- Department of Chronic Disease Epidemiology and the Center for Perinatal, Pediatric and Environmental Epidemiology, Yale School of Public Health, New Haven, CT, USA
- Ruth R Montgomery
- Department of Internal Medicine, Yale School of Medicine, New Haven, CT, USA.
41
Montesinos-López OA, Carter AH, Bernal-Sandoval DA, Cano-Paez B, Montesinos-López A, Crossa J. A Comparison between Three Tuning Strategies for Gaussian Kernels in the Context of Univariate Genomic Prediction. Genes (Basel) 2022; 13:2282. [PMID: 36553547 PMCID: PMC9778581 DOI: 10.3390/genes13122282] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/07/2022] [Revised: 11/15/2022] [Accepted: 11/29/2022] [Indexed: 12/07/2022] [Open Access]
Abstract
Genomic prediction is revolutionizing plant breeding since candidate genotypes can be selected without the need to measure their trait in the field. When a reference population contains both phenotypic and genotypic information, it is used to train a statistical machine learning method that subsequently predicts the breeding or phenotypic values of candidate genotypes that were only genotyped. Nevertheless, the successful implementation of the genomic selection (GS) methodology depends on many factors. One key factor is the type of statistical machine learning method used, since some are unable to capture nonlinear patterns available in the data. While kernel methods are powerful statistical machine learning algorithms that capture complex nonlinear patterns in the data, their successful implementation strongly depends on careful tuning of the hyperparameters involved. As such, in this paper we compare three methods of tuning (manual tuning, grid search, and Bayesian optimization) for the Gaussian kernel under a Bayesian best linear unbiased predictor model. We used six real datasets of wheat (Triticum aestivum L.) to compare the three strategies of tuning. We found that to obtain the major benefits of using Gaussian kernels, it is very important to perform a careful tuning process. The best prediction performance was observed when the tuning process was performed with grid search and Bayesian optimization. However, we did not observe relevant differences between the grid search and Bayesian optimization approaches. The observed gains in terms of prediction performance were between 2.1% and 27.8% across the six datasets under study.
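Tuning a Gaussian kernel by grid search can be sketched in a few lines. The example below swaps the Bayesian best linear unbiased predictor for a much simpler kernel-weighted average (a Nadaraya-Watson smoother) scored by leave-one-out error, so it illustrates the tuning loop only, not the model of the paper. The kernel parametrization `exp(-bandwidth * squared distance)`, the grid values and the toy data are assumptions.

```python
import math

def gaussian_kernel(x, z, bandwidth):
    """Gaussian kernel exp(-bandwidth * ||x - z||^2) (assumed parametrization)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-bandwidth * sq_dist)

def loo_mse(X, y, bandwidth):
    """Leave-one-out mean squared error of a kernel-weighted average predictor.

    Stand-in model for illustration; NOT the Bayesian BLUP used in the study.
    """
    total = 0.0
    for i in range(len(X)):
        others = [j for j in range(len(X)) if j != i]
        weights = [gaussian_kernel(X[i], X[j], bandwidth) for j in others]
        denom = sum(weights)
        if denom == 0.0:
            # All weights underflowed: fall back to the plain mean.
            pred = sum(y[j] for j in others) / len(others)
        else:
            pred = sum(w * y[j] for w, j in zip(weights, others)) / denom
        total += (pred - y[i]) ** 2
    return total / len(X)

def grid_search(X, y, grid):
    """Return the bandwidth in `grid` with the lowest leave-one-out error."""
    return min(grid, key=lambda h: loo_mse(X, y, h))

# Toy data (hypothetical): one predictor and a nonlinear response.
X = [[i / 10] for i in range(20)]
y = [math.sin(row[0]) for row in X]
grid = [0.01, 0.1, 1.0, 10.0, 100.0]
best = grid_search(X, y, grid)
```

Manual tuning would correspond to fixing one value of `bandwidth` in advance, while Bayesian optimization would replace the exhaustive loop over `grid` with a sequential search over the same objective.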
Affiliation(s)
- Arron H. Carter
- Department of Crop and Soil Sciences, Washington State University, Pullman, WA 99164, USA
- Bernabe Cano-Paez
- Facultad de Ciencias, Universidad Nacional Autónoma de México (UNAM), Mexico City 04510, Mexico
- Abelardo Montesinos-López
- Centro Universitario de Ciencias Exactas e Ingenierías (CUCEI), Universidad de Guadalajara, Guadalajara 44430, Mexico
- Correspondence: A.M.-L.; J.C.
- José Crossa
- International Maize and Wheat Improvement Center (CIMMYT), El Batan, Texcoco 56237, Mexico
- Hidrociencias, Colegio de Postgraduados, Campus Montecillos, Carretera México-Texcoco Km. 36.5, Montecillo 56230, Mexico
- Correspondence: A.M.-L.; J.C.
42
Kismiantini, Montesinos-López A, Cano-Páez B, Montesinos-López JC, Chavira-Flores M, Montesinos-López OA, Crossa J. A Multi-Trait Gaussian Kernel Genomic Prediction Model under Three Tunning Strategies. Genes (Basel) 2022; 13:2279. [PMID: 36553548 PMCID: PMC9778253 DOI: 10.3390/genes13122279] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2022] [Revised: 11/27/2022] [Accepted: 12/01/2022] [Indexed: 12/12/2022] [Open Access]
Abstract
While genomic selection (GS) began revolutionizing plant breeding when it was proposed around 20 years ago, its practical implementation is still challenging as many factors affect its accuracy. One such factor is the choice of the statistical machine learning method. For this reason, we explore the tuning process under a multi-trait framework using the Gaussian kernel with a multi-trait Bayesian Best Linear Unbiased Predictor (GBLUP) model. We explored three methods of tuning (manual, grid search and Bayesian optimization) using 5 real datasets from breeding programs. We found that grid search and Bayesian optimization improved prediction accuracy by between 1.9 and 6.8% relative to manual tuning. While the improvement in prediction accuracy can be marginal in some cases, it is very important to carry out the tuning process carefully to improve the accuracy of the GS methodology, even though this entails greater computational resources.
Affiliation(s)
- Kismiantini
- Statistics Study Program, Universitas Negeri Yogyakarta, Yogyakarta 55281, Indonesia
- Abelardo Montesinos-López
- Centro Universitario de Ciencias Exactas e Ingenierías (CUCEI), Universidad de Guadalajara, Guadalajara 44430, Jalisco, Mexico
- Bernabe Cano-Páez
- Facultad de Ciencias, Universidad Nacional Autónoma de México (UNAM), México City 04510, Mexico
- Moisés Chavira-Flores
- Instituto de Investigaciones en Matemáticas Aplicadas y Sistemas (IIMAS), Universidad Nacional Autónoma de México (UNAM), México City 04510, Mexico
- José Crossa
- International Maize and Wheat Improvement Center (CIMMYT), Km 45, Carretera Mexico, Veracruz 52640, Edo. de México, Mexico
- Colegio de Postgraduados, Montecillos 56230, Edo. de México, Mexico
43
Papadimitriou S, Gravel B, Nachtegael C, De Baere E, Loeys B, Vikkula M, Smits G, Lenaerts T. Toward reporting standards for the pathogenicity of variant combinations involved in multilocus/oligogenic diseases. HGG Advances 2022; 4:100165. [PMID: 36578772 PMCID: PMC9791921 DOI: 10.1016/j.xhgg.2022.100165] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 12/03/2022] [Open Access]
Abstract
Although standards and guidelines for the interpretation of variants identified in genes that cause Mendelian disorders have been developed, this is not the case for more complex genetic models including variant combinations in multiple genes. During a large curation process conducted on 318 research articles presenting oligogenic variant combinations, we encountered several recurring issues concerning their proper reporting and pathogenicity assessment. These mainly concern the absence of strong evidence that refutes a monogenic model and the lack of a proper genetic and functional assessment of the joint effect of the involved variants. With the increasing accumulation of such cases, it has become essential to develop standards and guidelines on how these oligogenic/multilocus variant combinations should be interpreted, validated, and reported in order to provide high-quality data and supporting evidence to the scientific community.
Affiliation(s)
- Sofia Papadimitriou
- Interuniversity Institute of Bioinformatics in Brussels, Université Libre de Bruxelles-Vrije Universiteit Brussel, 1050 Brussels, Belgium; Machine Learning Group, Université Libre de Bruxelles, 1050 Brussels, Belgium; Artificial Intelligence Laboratory, Vrije Universiteit Brussel, 1050 Brussels, Belgium; Corresponding author
- Barbara Gravel
- Interuniversity Institute of Bioinformatics in Brussels, Université Libre de Bruxelles-Vrije Universiteit Brussel, 1050 Brussels, Belgium; Machine Learning Group, Université Libre de Bruxelles, 1050 Brussels, Belgium; Artificial Intelligence Laboratory, Vrije Universiteit Brussel, 1050 Brussels, Belgium
- Charlotte Nachtegael
- Interuniversity Institute of Bioinformatics in Brussels, Université Libre de Bruxelles-Vrije Universiteit Brussel, 1050 Brussels, Belgium; Machine Learning Group, Université Libre de Bruxelles, 1050 Brussels, Belgium
- Elfride De Baere
- Center for Medical Genetics, Ghent University Hospital, Department of Biomolecular Medicine, Ghent University, 9000 Ghent, Belgium
- Bart Loeys
- Center for Medical Genetics, Antwerp University Hospital/University of Antwerp, 2650 Antwerp, Belgium
- Miikka Vikkula
- Human Molecular Genetics, de Duve Institute, UCLouvain, Brussels, Belgium
- Guillaume Smits
- Interuniversity Institute of Bioinformatics in Brussels, Université Libre de Bruxelles-Vrije Universiteit Brussel, 1050 Brussels, Belgium; Center of Human Genetics, Hôpital Erasme, Université Libre de Bruxelles, 1070 Brussels, Belgium; Hôpital Universitaire des Enfants Reine Fabiola, Université Libre de Bruxelles, 1020 Brussels, Belgium
- Tom Lenaerts
- Interuniversity Institute of Bioinformatics in Brussels, Université Libre de Bruxelles-Vrije Universiteit Brussel, 1050 Brussels, Belgium; Machine Learning Group, Université Libre de Bruxelles, 1050 Brussels, Belgium; Artificial Intelligence Laboratory, Vrije Universiteit Brussel, 1050 Brussels, Belgium; Corresponding author
|
44
|
Abd El Hamid MM, Omar YM, Shaheen M, Mabrouk MS. Discovering epistasis interactions in Alzheimer's disease using deep learning model. GENE REPORTS 2022. [DOI: 10.1016/j.genrep.2022.101673] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/15/2022]
|
45
|
Woodward AA, Urbanowicz RJ, Naj AC, Moore JH. Genetic heterogeneity: Challenges, impacts, and methods through an associative lens. Genet Epidemiol 2022; 46:555-571. [PMID: 35924480 PMCID: PMC9669229 DOI: 10.1002/gepi.22497] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 05/18/2022] [Revised: 07/06/2022] [Accepted: 07/19/2022] [Indexed: 01/07/2023]
Abstract
Genetic heterogeneity describes the occurrence of the same or similar phenotypes through different genetic mechanisms in different individuals. Robustly characterizing and accounting for genetic heterogeneity is crucial to pursuing the goals of precision medicine, for discovering novel disease biomarkers, and for identifying targets for treatments. Failure to account for genetic heterogeneity may lead to missed associations and incorrect inferences. Thus, it is critical to review the impact of genetic heterogeneity on the design and analysis of population level genetic studies, aspects that are often overlooked in the literature. In this review, we first contextualize our approach to genetic heterogeneity by proposing a high-level categorization of heterogeneity into "feature," "outcome," and "associative" heterogeneity, drawing on perspectives from epidemiology and machine learning to illustrate distinctions between them. We highlight the unique nature of genetic heterogeneity as a heterogeneous pattern of association that warrants specific methodological considerations. We then focus on the challenges that preclude effective detection and characterization of genetic heterogeneity across a variety of epidemiological contexts. Finally, we discuss systems heterogeneity as an integrated approach to using genetic and other high-dimensional multi-omic data in complex disease research.
Affiliation(s)
- Alexa A. Woodward
- Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Ryan J. Urbanowicz
- Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, California, USA
- Adam C. Naj
- Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Jason H. Moore
- Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, California, USA
|
46
|
Cui T, El Mekkaoui K, Reinvall J, Havulinna AS, Marttinen P, Kaski S. Gene-gene interaction detection with deep learning. Commun Biol 2022; 5:1238. [PMID: 36371468 PMCID: PMC9653457 DOI: 10.1038/s42003-022-04186-y] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 01/02/2022] [Accepted: 10/27/2022] [Indexed: 11/13/2022] Open
Abstract
The extent to which genetic interactions affect observed phenotypes is generally unknown because current interaction detection approaches only consider simple interactions between top SNPs of genes. We introduce an open-source framework for increasing the power of interaction detection by considering all SNPs within a selected set of genes and complex interactions between them, beyond only the currently considered multiplicative relationships. In brief, the relation between SNPs and a phenotype is captured by a neural network, and the interactions are quantified by Shapley scores between hidden nodes, which are gene representations that optimally combine information from the corresponding SNPs. Additionally, we design a permutation procedure tailored for neural networks to assess the significance of interactions, which outperformed existing alternatives on simulated datasets with complex interactions, and in a cholesterol study on the UK Biobank it detected nine interactions which replicated on an independent FINRISK dataset.
Affiliation(s)
- Tianyu Cui
- Department of Computer Science, Aalto University, Espoo, Finland
- Jaakko Reinvall
- Department of Computer Science, Aalto University, Espoo, Finland
- Aki S Havulinna
- Finnish Institute for Health and Welfare (THL), Helsinki, Finland
- Institute for Molecular Medicine Finland, FIMM-HiLIFE, Helsinki, Finland
- Pekka Marttinen
- Department of Computer Science, Aalto University, Espoo, Finland
- Finnish Institute for Health and Welfare (THL), Helsinki, Finland
- Samuel Kaski
- Department of Computer Science, Aalto University, Espoo, Finland
- Department of Computer Science, University of Manchester, Manchester, UK
|
47
|
Saha S, Perrin L, Röder L, Brun C, Spinelli L. Epi-MEIF: detecting higher order epistatic interactions for complex traits using mixed effect conditional inference forests. Nucleic Acids Res 2022; 50:e114. [PMID: 36107776 PMCID: PMC9639209 DOI: 10.1093/nar/gkac715] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/06/2022] [Revised: 07/29/2022] [Accepted: 09/12/2022] [Indexed: 12/04/2022] Open
Abstract
Understanding the relationship between genetic variations and variations in complex and quantitative phenotypes remains an ongoing challenge. While Genome-wide association studies (GWAS) have become a vital tool for identifying single-locus associations, we lack methods for identifying epistatic interactions. In this article, we propose a novel method for higher-order epistasis detection using mixed effect conditional inference forest (epiMEIF). The proposed method is fitted on a group of single nucleotide polymorphisms (SNPs) potentially associated with the phenotype and the tree structure in the forest facilitates the identification of n-way interactions between the SNPs. Additional testing strategies further improve the robustness of the method. We demonstrate its ability to detect true n-way interactions via extensive simulations in both cross-sectional and longitudinal synthetic datasets. This is further illustrated in an application to reveal epistatic interactions from natural variations of cardiac traits in flies (Drosophila). Overall, the method provides a generalized way to identify higher-order interactions from any GWAS data, thereby greatly improving the detection of the genetic architecture underlying complex phenotypes.
Affiliation(s)
- Saswati Saha
- Aix Marseille Univ, INSERM, TAGC (UMR1090), Turing Centre for Living systems, Marseille, France
- Laurent Perrin
- Aix Marseille Univ, INSERM, TAGC (UMR1090), Turing Centre for Living systems, Marseille, France
- CNRS, Marseille, France
- Laurence Röder
- Aix Marseille Univ, INSERM, TAGC (UMR1090), Turing Centre for Living systems, Marseille, France
- Christine Brun
- Aix Marseille Univ, INSERM, TAGC (UMR1090), Turing Centre for Living systems, Marseille, France
- CNRS, Marseille, France
- Lionel Spinelli
- Aix Marseille Univ, INSERM, TAGC (UMR1090), Turing Centre for Living systems, Marseille, France
|
48
|
Da Y, Liang Z, Prakapenka D. Multifactorial methods integrating haplotype and epistasis effects for genomic estimation and prediction of quantitative traits. Front Genet 2022; 13:922369. [PMID: 36313431 PMCID: PMC9614238 DOI: 10.3389/fgene.2022.922369] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 04/17/2022] [Accepted: 09/12/2022] [Indexed: 11/19/2022] Open
Abstract
The rapid growth in genomic selection data provides unprecedented opportunities to discover and utilize complex genetic effects for improving phenotypes, but the methodology is lacking. Epistasis effects are interaction effects, and haplotype effects may contain local high-order epistasis effects. Multifactorial methods with SNP, haplotype, and epistasis effects up to the third-order are developed to investigate the contributions of global low-order and local high-order epistasis effects to the phenotypic variance and the accuracy of genomic prediction of quantitative traits. These methods include genomic best linear unbiased prediction (GBLUP) with associated reliability for individuals with and without phenotypic observations, including a computationally efficient GBLUP method for large validation populations, and genomic restricted maximum likelihood estimation (GREML) of the variance and associated heritability using a combination of EM-REML and AI-REML iterative algorithms. These methods were developed for two models, Model-I with 10 effect types and Model-II with 13 effect types, including intra- and inter-chromosome pairwise epistasis effects that replace the pairwise epistasis effects of Model-I. GREML heritability estimate and GBLUP effect estimate for each effect of an effect type are derived, except for third-order epistasis effects. The multifactorial models evaluate each effect type based on the phenotypic values adjusted for the remaining effect types and can use more effect types than separate models of SNP, haplotype, and epistasis effects, providing a methodological capability to evaluate the contributions of complex genetic effects to the phenotypic variance and prediction accuracy and to discover and utilize complex genetic effects for improving the phenotypes of quantitative traits.
Affiliation(s)
- Yang Da
- Department of Animal Science, University of Minnesota, Saint Paul, MN, United States
|
49
|
Mapping the genetic architecture of cortical morphology through neuroimaging: progress and perspectives. Transl Psychiatry 2022; 12:447. [PMID: 36241627 PMCID: PMC9568576 DOI: 10.1038/s41398-022-02193-5] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 04/14/2022] [Revised: 09/06/2022] [Accepted: 09/20/2022] [Indexed: 11/26/2022] Open
Abstract
Cortical morphology is a key determinant of cognitive ability and mental health. Its development is a highly intricate process spanning decades, involving the coordinated, localized expression of thousands of genes. We are now beginning to unravel the genetic architecture of cortical morphology, thanks to the recent availability of large-scale neuroimaging and genomic data and the development of powerful biostatistical tools. Here, we review the progress made in this field, providing an overview of the lessons learned from genetic studies of cortical volume, thickness, surface area, and folding as captured by neuroimaging. It is now clear that morphology is shaped by thousands of genetic variants, with effects that are region- and time-dependent, thereby challenging conventional study approaches. The most recent genome-wide association studies have started discovering common genetic variants influencing cortical thickness and surface area, yet together these explain only a fraction of the high heritability of these measures. Further, the impact of rare variants and non-additive effects remains elusive. There are indications that the quickly increasing availability of data from whole-genome sequencing and large, deeply phenotyped population cohorts across the lifespan will enable us to uncover much of the missing heritability in the upcoming years. Novel approaches leveraging shared information across measures will accelerate this process by providing substantial increases in statistical power, together with more accurate mapping of genetic relationships. Important challenges remain, including better representation of understudied demographic groups, integration of other 'omics data, and mapping of effects from gene to brain to behavior across the lifespan.
|
50
|
Nucleotide-based genetic networks: Methods and applications. J Biosci 2022. [PMID: 36226367 PMCID: PMC9554864 DOI: 10.1007/s12038-022-00290-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/21/2022]
Abstract
Genomic variations have been acclaimed as among the key players in understanding the biological mechanisms behind migration, evolution, and adaptation to extreme conditions. Due to stochastic evolutionary forces, the frequency of polymorphisms is affected by changes in the frequency of nearby polymorphisms in the same DNA sample, making them connected in terms of evolution. This article presents all the ingredients to understand the cumulative effects and complex behaviors of genetic variations in the human mitochondrial genome by analyzing co-occurrence networks of nucleotides, and shows key results obtained from such analyses. The article emphasizes recent investigations of these co-occurrence networks, describing the role of interactions between nucleotides in fundamental processes of human migration and viral evolution. The corresponding co-mutation-based genetic networks revealed genetic signatures of human adaptation in extreme environments. This article provides the methods of constructing such networks in detail, along with their graph-theoretical properties, and applications of the genomic networks in understanding the role of nucleotide co-evolution in evolution of the whole genome.
|