1
Ebrahimie E, Rahimirad S, Tahsili M, Mohammadi-Dehcheshmeh M. Alternative RNA splicing in stem cells and cancer stem cells: Importance of transcript-based expression analysis. World J Stem Cells 2021; 13:1394-1416. [PMID: 34786151 PMCID: PMC8567453 DOI: 10.4252/wjsc.v13.i10.1394] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 03/18/2021] [Revised: 06/21/2021] [Accepted: 09/14/2021] [Indexed: 02/06/2023] Open
Abstract
Alternative ribonucleic acid (RNA) splicing can lead to the assembly of different protein isoforms with distinctive functions. The outcome of alternative splicing (AS) can be a complete loss of function or the acquisition of new functions. There is a gap in knowledge of the abnormal RNA splice variants that promote cancer stem cells (CSCs) and of their prospective contribution to cancer progression. AS directly regulates the self-renewal features of stem cells (SCs) and stem-like cancer cells. Notably, the octamer-binding transcription factor 4A splice variant of octamer-binding transcription factor 4 contributes to maintaining stemness properties in both SCs and CSCs. The epithelial to mesenchymal transition pathway regulates the AS events in CSCs to maintain stemness. The alternatively spliced variants of CSC markers, including cluster of differentiation 44, aldehyde dehydrogenase, doublecortin-like kinase, and α6β1 integrin, have pivotal roles in increasing self-renewal properties and maintaining the pluripotency of CSCs. Various splicing analysis tools are considered in this study. LeafCutter software can be considered as the best tool for differential splicing analysis and identification of the type of splicing events. Additionally, LeafCutter can be used for efficient mapping of splicing quantitative trait loci. Altogether, the accumulating evidence reinforces the fact that gene and protein expression need to be investigated in parallel with alternative splice variants.
Affiliation(s)
- Esmaeil Ebrahimie
- School of Animal and Veterinary Sciences, The University of Adelaide, Adelaide 5005, South Australia, Australia
- La Trobe Genomics Research Platform, School of Life Sciences, College of Science, Health and Engineering, La Trobe University, Melbourne 3086, Australia
- School of Biosciences, The University of Melbourne, Melbourne 3010, Australia
- Samira Rahimirad
- Department of Medical Genetics, National Institute of Genetic Engineering and Biotechnology, Tehran 1497716316, Iran
- Division of Urology, Department of Surgery, McGill University and the Research Institute of the McGill University Health Centre, Montreal H4A 3J1, Quebec, Canada
2
Trushina NI, Mulkidjanian AY, Brandt R. The microtubule skeleton and the evolution of neuronal complexity in vertebrates. Biol Chem 2020; 400:1163-1179. [PMID: 31116700 DOI: 10.1515/hsz-2019-0149] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 02/04/2019] [Accepted: 04/17/2019] [Indexed: 12/21/2022]
Abstract
The evolution of a highly developed nervous system is mirrored by the ability of individual neurons to develop increased morphological complexity. As microtubules (MTs) are crucially involved in neuronal development, we tested the hypothesis that the evolution of complexity is driven by an increasing capacity of the MT system for regulated molecular interactions as it may be implemented by a higher number of molecular players and a greater ability of the individual molecules to interact. We performed bioinformatics analysis on different classes of components of the vertebrate neuronal MT cytoskeleton. We show that the number of orthologs of tubulin structure proteins, MT-binding proteins and tubulin-sequestering proteins expanded during vertebrate evolution. We observed that protein diversity of MT-binding and tubulin-sequestering proteins increased by alternative splicing. In addition, we found that regions of the MT-binding protein tau and MAP6 displayed a clear increase in disorder extent during evolution. The data provide evidence that vertebrate evolution is paralleled by gene expansions, changes in alternative splicing and evolution of coding sequences of components of the MT system. The results suggest that in particular evolutionary changes in tubulin-structure proteins, MT-binding proteins and tubulin-sequestering proteins were prominent drivers for the development of increased neuronal complexity.
Affiliation(s)
- Nataliya I Trushina
- Department of Neurobiology, University of Osnabrück, Barbarastraße 11, D-49076 Osnabrück, Germany
- Armen Y Mulkidjanian
- Department of Physics, University of Osnabrück, Barbarastraße 7, D-49076 Osnabrück, Germany; A.N. Belozersky Institute of Physico-Chemical Biology and School of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Moscow 119991, Russia
- Roland Brandt
- Department of Neurobiology, University of Osnabrück, Barbarastraße 11, D-49076 Osnabrück, Germany; Center for Cellular Nanoanalytics, University of Osnabrück, Barbarastraße 11, D-49076 Osnabrück, Germany; Institute of Cognitive Science, University of Osnabrück, Barbarastraße 11, D-49076 Osnabrück, Germany
3
Xu J, Lu Z, Xu M, Rossi GC, Kest B, Waxman AR, Pasternak GW, Pan YX. Differential expressions of the alternatively spliced variant mRNAs of the µ opioid receptor gene, OPRM1, in brain regions of four inbred mouse strains. PLoS One 2014; 9:e111267. [PMID: 25343478 PMCID: PMC4208855 DOI: 10.1371/journal.pone.0111267] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.7] [Received: 07/25/2014] [Accepted: 09/19/2014] [Indexed: 01/20/2023] Open
Abstract
The µ opioid receptor gene, OPRM1, undergoes extensive alternative pre-mRNA splicing in rodents and humans, with dozens of alternatively spliced variants of the OPRM1 gene. The present studies establish a SYBR green quantitative PCR (qPCR) assay to more accurately quantify mouse OPRM1 splice variant mRNAs. Using these qPCR assays, we examined the expression of OPRM1 splice variant mRNAs in selected brain regions of four inbred mouse strains displaying differences in µ opioid-induced tolerance and physical dependence: C57BL/6J, 129P3/J, SJL/J and SWR/J. The complete mRNA expression profiles of the OPRM1 splice variants reveal marked differences of the variant mRNA expression among the brain regions in each mouse strain, suggesting region-specific alternative splicing of the OPRM1 gene. The expression of many variants was also strain-specific, implying a genetic influence on OPRM1 alternative splicing. The expression levels of a number of the variant mRNAs in certain brain regions appear to correlate with strain sensitivities to morphine analgesia, tolerance and physical dependence in four mouse strains.
Affiliation(s)
- Jin Xu
- Department of Neurology and Molecular Pharmacology and Chemistry Program, Memorial Sloan Kettering Cancer Center, New York, New York, United States of America
- Zhigang Lu
- Department of Neurology and Molecular Pharmacology and Chemistry Program, Memorial Sloan Kettering Cancer Center, New York, New York, United States of America
- Mingming Xu
- Department of Neurology and Molecular Pharmacology and Chemistry Program, Memorial Sloan Kettering Cancer Center, New York, New York, United States of America
- Grace C. Rossi
- Department of Psychology, Long Island University, Post Campus, Brookville, New York, United States of America
- Benjamin Kest
- Department of Psychology and Center for Developmental Neuroscience, City University of New York, Staten Island, New York, United States of America
- Amanda R. Waxman
- Department of Psychology and Center for Developmental Neuroscience, City University of New York, Staten Island, New York, United States of America
- Gavril W. Pasternak
- Department of Neurology and Molecular Pharmacology and Chemistry Program, Memorial Sloan Kettering Cancer Center, New York, New York, United States of America
- Ying-Xian Pan
- Department of Neurology and Molecular Pharmacology and Chemistry Program, Memorial Sloan Kettering Cancer Center, New York, New York, United States of America
4
Hess JL, Glatt SJ. How might ZNF804A variants influence risk for schizophrenia and bipolar disorder? A literature review, synthesis, and bioinformatic analysis. Am J Med Genet B Neuropsychiatr Genet 2014; 165B:28-40. [PMID: 24123948 DOI: 10.1002/ajmg.b.32207] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.4] [Received: 05/13/2013] [Accepted: 09/12/2013] [Indexed: 01/16/2023]
Abstract
The gene that encodes zinc finger protein 804A (ZNF804A) became a candidate risk gene for schizophrenia (SZ) after surpassing genome-wide significance thresholds in replicated genome-wide association scans and meta-analyses. Much remains unknown about this reported gene expression regulator; however, preliminary work has yielded insights into functional and biological effects of ZNF804A by targeting its regulatory activities in vitro and by characterizing allele-specific interactions with its risk-conferring single nucleotide polymorphisms (SNPs). There is now strong epidemiologic evidence for a role of ZNF804A polymorphisms in both SZ and bipolar disorder (BD); however, functional links between implicated variants and susceptible biological states have not been solidified. Here we briefly review the genetic evidence implicating ZNF804A polymorphisms as genetic risk factors for both SZ and BD, and discuss the potential functional consequences of these variants on the regulation of ZNF804A and its downstream targets. Empirical work and predictive bioinformatic analyses of the alternate alleles of the two most strongly implicated ZNF804A polymorphisms suggest they might alter the affinity of the gene sequence for DNA- and/or RNA-binding proteins, which might in turn alter expression levels of the gene or particular ZNF804A isoforms. Future work should focus on clarifying the critical periods and cofactors regulating these genetic influences on ZNF804A expression, as well as the downstream biological consequences of an imbalance in the expression of ZNF804A and its various mRNA isoforms.
Affiliation(s)
- Jonathan L Hess
- Psychiatric Genetic Epidemiology & Neurobiology Laboratory (PsychGENe Lab), Departments of Psychiatry and Behavioral Sciences and Neuroscience and Physiology, SUNY Upstate Medical University, Syracuse, New York
5
TIPMaP: a web server to establish transcript isoform profiles from reliable microarray probes. BMC Genomics 2013; 14:922. [PMID: 24373374 PMCID: PMC3884118 DOI: 10.1186/1471-2164-14-922] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 10/05/2013] [Accepted: 12/23/2013] [Indexed: 01/22/2023] Open
Abstract
Background Standard 3′ Affymetrix gene expression arrays have contributed a significantly higher volume of existing gene expression data than other microarray platforms. These arrays were designed to identify differentially expressed genes, but not their alternatively spliced transcript forms. No resource can currently identify expression pattern of specific mRNA forms using these microarray data, even though it is possible to do this. Results We report a web server for expression profiling of alternatively spliced transcripts using microarray data sets from 31 standard 3′ Affymetrix arrays for human, mouse and rat species. The tool has been experimentally validated for mRNAs transcribed or not-detected in a human disease condition (non-obstructive azoospermia, a male infertility condition). About 4000 gene expression datasets were downloaded from a public repository. ‘Good probes’ with complete coverage and identity to latest reference transcript sequences were first identified. Using them, ‘Transcript specific probe-clusters’ were derived for each platform and used to identify expression status of possible transcripts. The web server can lead the user to datasets corresponding to specific tissues, conditions via identifiers of the microarray studies or hybridizations, keywords, official gene symbols or reference transcript identifiers. It can identify, in the tissues and conditions of interest, about 40% of known transcripts as ‘transcribed’, ‘not-detected’ or ‘differentially regulated’. Corresponding additional information for probes, genes, transcripts and proteins can be viewed too. We identified the expression of transcripts in a specific clinical condition and validated a few of these transcripts by experiments (using reverse transcription followed by polymerase chain reaction). The experimental observations indicated higher agreements with the web server results, than contradictions. The tool is accessible at http://resource.ibab.ac.in/TIPMaP. 
Conclusion The newly developed online tool forms a reliable means for identification of alternatively spliced transcript-isoforms that may be differentially expressed in various tissues, cell types or physiological conditions. Thus, by making better use of existing data, TIPMaP avoids the dependence on precious tissue-samples, in experiments with a goal to establish expression profiles of alternative splice forms – at least in some cases.
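The probe-filtering step described in the Results can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that a probe counts as transcript-specific when its sequence matches exactly one reference transcript; the abstract does not state TIPMaP's actual rule, and the function name and input layout below are hypothetical.

```python
from collections import defaultdict


def transcript_specific_clusters(probe_hits):
    """Group probes that match exactly one transcript (assumed rule, not TIPMaP's).

    probe_hits maps a probe identifier to the set of transcript
    identifiers whose sequence the probe matches.
    """
    clusters = defaultdict(list)
    for probe, transcripts in probe_hits.items():
        if len(transcripts) == 1:
            # Unpack the single transcript this probe can report on.
            (transcript,) = transcripts
            clusters[transcript].append(probe)
    return dict(clusters)


# Hypothetical probe-to-transcript matches for one gene with two isoforms.
hits = {
    "p1": {"T1"},
    "p2": {"T1", "T2"},  # shared probe, cannot distinguish the isoforms
    "p3": {"T2"},
    "p4": {"T1"},
}
clusters = transcript_specific_clusters(hits)
```

In this toy input the shared probe p2 is dropped, so only probes that can tell the two isoforms apart contribute to each cluster.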
6
Roy B, Haupt LM, Griffiths LR. Review: Alternative Splicing (AS) of Genes As An Approach for Generating Protein Complexity. Curr Genomics 2013; 14:182-94. [PMID: 24179441 PMCID: PMC3664468 DOI: 10.2174/1389202911314030004] [Citation(s) in RCA: 70] [Impact Index Per Article: 5.8] [Received: 10/23/2012] [Revised: 02/08/2013] [Accepted: 02/25/2013] [Indexed: 12/22/2022] Open
Abstract
Prior to the completion of the human genome project, the human genome was thought to have a greater number of genes as it seemed structurally and functionally more complex than other simpler organisms. This along with the belief of “one gene, one protein”, were demonstrated to be incorrect. The inequality in the ratio of gene to protein formation gave rise to the theory of alternative splicing (AS). AS is a mechanism by which one gene gives rise to multiple protein products. Numerous databases and online bioinformatic tools are available for the detection and analysis of AS. Bioinformatics provides an important approach to study mRNA and protein diversity by various tools such as expressed sequence tag (EST) sequences obtained from completely processed mRNA. Microarrays and deep sequencing approaches also aid in the detection of splicing events. Initially it was postulated that AS occurred only in about 5% of all genes but was later found to be more abundant. Using bioinformatic approaches, the level of AS in human genes was found to be fairly high with 35-59% of genes having at least one AS form. Our ability to determine and predict AS is important as disorders in splicing patterns may lead to abnormal splice variants resulting in genetic diseases. In addition, the diversity of proteins produced by AS poses a challenge for successful drug discovery and therefore a greater understanding of AS would be beneficial.
Affiliation(s)
- Bishakha Roy
- Genomics Research Centre, Griffith Health Institute, Griffith University Gold Coast, Queensland 4222, Australia
7
Ringwald M, Wu C, Su AI. BioGPS and GXD: mouse gene expression data-the benefits and challenges of data integration. Mamm Genome 2012; 23:550-8. [PMID: 22847375 DOI: 10.1007/s00335-012-9408-0] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 04/09/2012] [Accepted: 06/21/2012] [Indexed: 01/30/2023]
Abstract
Mouse gene expression data are complex and voluminous. To maximize the utility of these data, they must be made readily accessible through databases, and those resources need to place the expression data in the larger biological context. Here we describe two community resources that approach these problems in different but complementary ways: BioGPS and the Mouse Gene Expression Database (GXD). BioGPS connects its large and homogeneous microarray gene expression reference data sets via plugins with a heterogeneous collection of external gene-centric resources, thus casting a wide but loose net. GXD acquires different types of expression data from many sources and integrates these data tightly with other types of data in the Mouse Genome Informatics (MGI) resource, with a strong emphasis on consistency checks and manual curation. We describe and contrast the "loose" and "tight" data integration strategies employed by BioGPS and GXD, respectively, and discuss the challenges and benefits of data integration. BioGPS is freely available at http://biogps.org. GXD is freely available through the MGI web site (www.informatics.jax.org) or directly at www.informatics.jax.org/expression.shtml.
Affiliation(s)
- Martin Ringwald
- The Jackson Laboratory, 600 Main Street, Bar Harbor, ME 04609, USA.
8
Chen L. Statistical and Computational Methods for High-Throughput Sequencing Data Analysis of Alternative Splicing. STATISTICS IN BIOSCIENCES 2012; 5:138-155. [PMID: 24058384 DOI: 10.1007/s12561-012-9064-7] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Indexed: 12/31/2022]
Abstract
The burgeoning field of high-throughput sequencing significantly improves our ability to understand the complexity of transcriptomes. Alternative splicing, as one of the most important driving forces for transcriptome diversity, can now be studied at an unprecedented resolution. Efficient and powerful computational and statistical methods are urgently needed to facilitate the characterization and quantification of alternative splicing events. Here we discuss methods in splice junction read mapping, and methods in exon-centric or isoform-centric quantification of alternative splicing. In addition, we discuss HITS-CLIP and splicing QTL analyses, which are novel high-throughput sequencing based approaches in the dissection of splicing regulation.
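As a rough illustration of exon-centric quantification, the Python sketch below computes a percent-spliced-in (PSI) value for a single cassette exon from junction-spanning read counts. PSI is not named in the abstract; it is used here as one commonly reported exon-centric measure, and halving the inclusion count (because two junctions support inclusion and one supports skipping) is a simplifying assumption rather than a method taken from the paper.

```python
def percent_spliced_in(inclusion_reads, exclusion_reads):
    """Estimate PSI for a cassette exon from junction read counts.

    inclusion_reads: reads spanning either of the two junctions that
        join the cassette exon to its neighbours.
    exclusion_reads: reads spanning the single junction that skips it.
    """
    # Two junctions report inclusion but only one reports skipping,
    # so the inclusion count is halved to make the two comparable.
    inclusion = inclusion_reads / 2.0
    total = inclusion + exclusion_reads
    if total == 0:
        raise ValueError("no junction reads cover this event")
    return inclusion / total


psi_balanced = percent_spliced_in(40, 20)
psi_skipped = percent_spliced_in(0, 10)
```

With 40 inclusion-junction reads and 20 skipping-junction reads, the exon is estimated to be included in half of the transcripts.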
Affiliation(s)
- Liang Chen
- Molecular and Computational Biology, Department of Biological Sciences, University of Southern California, Los Angeles, CA 90089, USA
9
Pirola Y, Rizzi R, Picardi E, Pesole G, Della Vedova G, Bonizzoni P. PIntron: a fast method for detecting the gene structure due to alternative splicing via maximal pairings of a pattern and a text. BMC Bioinformatics 2012; 13 Suppl 5:S2. [PMID: 22537006 PMCID: PMC3358663 DOI: 10.1186/1471-2105-13-s5-s2] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Indexed: 12/24/2022] Open
Abstract
BACKGROUND A challenging issue in designing computational methods for predicting the gene structure into exons and introns from a cluster of transcript (EST, mRNA) sequences, is guaranteeing accuracy as well as efficiency in time and space, when large clusters of more than 20,000 ESTs and genes longer than 1 Mb are processed. Traditionally, the problem has been faced by combining different tools, not specifically designed for this task. RESULTS We propose a fast method based on ad hoc procedures for solving the problem. Our method combines two ideas: a novel algorithm of proved small time complexity for computing spliced alignments of a transcript against a genome, and an efficient algorithm that exploits the inherent redundancy of information in a cluster of transcripts to select, among all possible factorizations of EST sequences, those allowing to infer splice site junctions that are largely confirmed by the input data. The EST alignment procedure is based on the construction of maximal embeddings, that are sequences obtained from paths of a graph structure, called embedding graph, whose vertices are the maximal pairings of a genomic sequence T and an EST P. The procedure runs in time linear in the length of P and T and in the size of the output. The method was implemented into the PIntron package. PIntron requires as input a genomic sequence or region and a set of EST and/or mRNA sequences. Besides the prediction of the full-length transcript isoforms potentially expressed by the gene, the PIntron package includes a module for the CDS annotation of the predicted transcripts. CONCLUSIONS PIntron, the software tool implementing our methodology, is available at http://www.algolab.eu/PIntron under GNU AGPL. PIntron has been shown to outperform state-of-the-art methods, and to quickly process some critical genes. At the same time, PIntron exhibits high accuracy (sensitivity and specificity) when benchmarked with ENCODE annotations.
Affiliation(s)
- Yuri Pirola
- Dipartimento di Informatica Sistemistica e Comunicazione, Univ. degli Studi di Milano-Bicocca, Milano, 20126, Italy
10
Buendia P, Tyree J, Loredo R, Hsu SN. Identification of conserved splicing motifs in mutually exclusive exons of 15 insect species. BMC Genomics 2012; 13 Suppl 2:S1. [PMID: 22537296 PMCID: PMC3303723 DOI: 10.1186/1471-2164-13-s2-s1] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 12/31/2022] Open
Abstract
Background During alternative splicing, the inclusion of an exon in the final mRNA molecule is determined by nuclear proteins that bind cis-regulatory sequences in a target pre-mRNA molecule. A recent study suggested that the regulatory codes of individual RNA-binding proteins may be nearly immutable between very diverse species such as mammals and insects. The model system Drosophila melanogaster therefore presents an excellent opportunity for the study of alternative splicing due to the availability of quality EST annotations in FlyBase. Methods In this paper, we describe an in silico analysis pipeline to extract putative exonic splicing regulatory sequences from a multiple alignment of 15 species of insects. Our method, ESTs-to-ESRs (E2E), uses graph analysis of EST splicing graphs to identify mutually exclusive (ME) exons and combines phylogenetic measures, a sliding window approach along the multiple alignment and the Welch's t statistic to extract conserved ESR motifs. Results The most frequent 100% conserved word of length 5 bp in different insect exons was "ATGGA". We identified 799 statistically significant "spike" hexamers, 218 motifs with either a left or right FDR corrected spike magnitude p-value < 0.05 and 83 with both left and right uncorrected p < 0.01. 11 genes were identified with highly significant motifs in one ME exon but not in the other, suggesting regulation of ME exon splicing through these highly conserved hexamers. The majority of these genes have been shown to have regulated spatiotemporal expression. 10 elements were found to match three mammalian splicing regulator databases. A putative ESR motif, GATGCAG, was identified in the ME-13b but not in the ME-13a of Drosophila N-Cadherin, a gene that has been shown to have a distinct spatiotemporal expression pattern of spliced isoforms in a recent study. 
Conclusions Analysis of phylogenetic relationships and variability of sequence conservation as implemented in the E2E spikes method may lead to improved identification of ESRs. We found that approximately half of the putative ESRs in common between insects and mammals have a high statistical support (p < 0.01). Several Drosophila genes with spatiotemporal expression patterns were identified to contain putative ESRs located in one exon of the ME exon pairs but not in the other.
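The combination of a sliding window with Welch's t statistic mentioned in the Methods can be sketched in Python as follows. This is not the E2E implementation: the per-column conservation scores, the window width of 6 (chosen to echo the hexamers in the Results) and the comparison against an equally wide left flank are all assumptions made for illustration.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two samples with possibly unequal variances.

    Each sample needs at least two values, and the two samples must
    not both have zero variance.
    """
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


def window_vs_left_flank(scores, start, width=6):
    """Compare a window of conservation scores with the flank to its left."""
    if start < width or start + width > len(scores):
        raise ValueError("window or flank falls outside the alignment")
    window = scores[start:start + width]
    flank = scores[start - width:start]
    return welch_t(window, flank)


# Hypothetical per-column conservation scores along a multiple alignment.
scores = [0.1, 0.2, 0.1, 0.3, 0.2, 0.1, 0.9, 0.8, 0.9, 1.0, 0.9, 0.8]
left_spike = window_vs_left_flank(scores, start=6)
```

A large positive value marks a window that is markedly more conserved than the sequence immediately upstream; a matching comparison against the right flank would give the other side of a spike.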
11
Zou X, Jiang Y, Zheng Y, Zhang M, Zhang Z. Prolyl 4-hydroxylase genes are subjected to alternative splicing in roots of maize seedlings under waterlogging. ANNALS OF BOTANY 2011; 108:1323-35. [PMID: 21969257 PMCID: PMC3197451 DOI: 10.1093/aob/mcr223] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 05/18/2023]
12
Laaser I, Theis FJ, de Angelis MH, Kolb HJ, Adamski J. Huge splicing frequency in human Y chromosomal UTY gene. OMICS-A JOURNAL OF INTEGRATIVE BIOLOGY 2011; 15:141-54. [PMID: 21329462 DOI: 10.1089/omi.2010.0107] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Indexed: 02/04/2023]
Abstract
Over 90% of human genes produce more than one mRNA by alternative splicing (AS). Human UTY (ubiquitously transcribed tetratricopeptide repeat protein on the chromosome Y) has six mRNA-transcripts. UTY is subject to interdisciplinary approaches such as Y chromosomal genetics or development of leukemia immunotherapy based on UTY-specific peptides. Investigating UTY expression in a normal and leukemic setting we discovered an exceptional splicing phenomenon fostering huge transcript diversity. Transcript sequencing identified 90 novel AS-events being almost randomly combined in 284 new transcripts. We uncovered a novel system of transcript architecture and genomic organization in UTY. On a basis of a new UTY-splicing multigraph including a mathematical model we calculated the theoretical yield to exceed 1.3 billion distinct transcripts. To our knowledge, this is the greatest estimated transcript diversity by AS. On protein level we demonstrated interaction of AS-derived proteins with new interactors by yeast-two-hybrid assay. For translational research we predicted new UTY-peptide candidates for leukemia therapy development. Our study provides new insights into the complexity of human alternative splicing and its potential contribution to the transcript diversity of the transcriptome.
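To show how a splicing multigraph can yield a theoretical transcript count, the Python sketch below counts the distinct source-to-sink paths in a small directed acyclic multigraph. It is a toy under stated assumptions, not the authors' mathematical model: the graph, the node labels and the function name are hypothetical, and parallel edges stand for alternative junctions between the same pair of exons.

```python
from functools import lru_cache


def count_transcripts(edges, source, sink):
    """Count distinct paths from source to sink in an acyclic multigraph.

    edges maps a node to a tuple of successor nodes; a successor listed
    twice represents two alternative junctions to the same exon.
    """
    @lru_cache(maxsize=None)
    def paths_from(node):
        if node == sink:
            return 1
        # Every outgoing edge contributes all paths that continue from its target.
        return sum(paths_from(successor) for successor in edges.get(node, ()))

    return paths_from(source)


# Toy gene with four exons; exon 1 reaches exon 2 through two alternative junctions.
toy_graph = {1: (2, 2, 3), 2: (3, 4), 3: (4,)}
n_transcripts = count_transcripts(toy_graph, source=1, sink=4)
```

Memoizing the count per node keeps the work proportional to the number of edges even though the number of paths itself can grow exponentially with the number of alternative events.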
Affiliation(s)
- Ingeborg Laaser
- Helmholtz Zentrum München, German Research Center for Environmental Health, Institute of Experimental Genetics, Genome Analysis Center, Neuherberg, Germany
13
Hsiao TH, Lin CH, Lee TT, Cheng JY, Wei PK, Chuang EY, Peck K. Verifying expressed transcript variants by detecting and assembling stretches of consecutive exons. Nucleic Acids Res 2010; 38:e187. [PMID: 20798177 PMCID: PMC2978383 DOI: 10.1093/nar/gkq754] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 12/22/2022] Open
Abstract
We herein describe an integrated system for the high-throughput analysis of splicing events and the identification of transcript variants. The system resolves individual splicing events and elucidates transcript variants via a pipeline that combines aspects such as bioinformatic analysis, high-throughput transcript variant amplification, and high-resolution capillary electrophoresis. For the 14 369 human genes known to have transcript variants, minimal primer sets were designed to amplify all transcript variants and examine all splicing events; these have been archived in the ASprimerDB database, which is newly described herein. A high-throughput thermocycler, dubbed GenTank, was developed to simultaneously perform thousands of PCR amplifications. Following the resolution of the various amplicons by capillary gel electrophoresis, two new computer programs, AmpliconViewer and VariantAssembler, may be used to analyze the splicing events, assemble the consecutive exons embodied by the PCR amplicons, and distinguish expressed versus putative transcript variants. This novel system not only facilitates the validation of putative transcript variants and the detection of novel transcript variants, it also semi-quantitatively measures the transcript variant expression levels of each gene. To demonstrate the system’s capability, we used it to resolve transcript variants yielded by single and multiple splicing events, and to decipher the exon connectivity of long transcripts.
Affiliation(s)
- Tzu-Hung Hsiao
- Department of Electrical Engineering, National Taiwan University, Taipei, Taiwan 106, ROC
14
Barbazuk WB. A conserved alternative splicing event in plants reveals an ancient exonization of 5S rRNA that regulates TFIIIA. RNA Biol 2010; 7:397-402. [PMID: 20699638 DOI: 10.4161/rna.7.4.12684] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Indexed: 11/19/2022] Open
Abstract
Uncovering conserved alternative splicing (AS) events can identify AS events that perform important functions. This is especially useful for identifying premature stop codon containing (PTC) AS isoforms that may regulate protein expression by being targets for nonsense mediated decay. This report discusses the identification of a PTC containing splice isoform of the TFIIIA gene that is highly conserved in land plants. TFIIIA is essential for RNA Polymerase III-based transcription of 5S rRNA in eukaryotes. Two independent groups have determined that the PTC containing alternative exon is ultraconserved and is coupled with nonsense-mediated mRNA decay. The alternative exon appears to have been derived by the exonization of 5S ribosomal RNA (5S rRNA) within the gene of its own transcription regulator, TFIIIA. This provides the first evidence of ancient exaptation of 5S rRNA in plants, suggesting a novel gene regulation model mediated by the AS of an anciently exonized non-coding element.
Affiliation(s)
- W Brad Barbazuk
- Department of Biology and the Florida Genetics Institute, University of Florida, Gainesville, FL USA.
15
Chang KY, Georgianna DR, Heber S, Payne GA, Muddiman DC. Detection of alternative splice variants at the proteome level in Aspergillus flavus. J Proteome Res 2010; 9:1209-17. [PMID: 20047314 DOI: 10.1021/pr900602d] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.7] [Indexed: 11/29/2022]
Abstract
Identification of proteins from proteolytic peptides or intact proteins plays an essential role in proteomics. Researchers use search engines to match the acquired peptide sequences to the target proteins. However, search engines depend on protein databases to provide candidates for consideration. Alternative splicing (AS), the mechanism where the exons of pre-mRNAs can be spliced and rearranged to generate distinct mRNA and therefore protein variants, enables higher eukaryotic organisms, with only a limited number of genes, to have the requisite complexity and diversity at the proteome level. Multiple alternative isoforms from one gene often share common segments of sequences. However, many protein databases only include a limited number of isoforms to keep minimal redundancy. As a result, the database search might not identify a target protein even with high quality tandem MS data and accurate intact precursor ion mass. We computationally predicted an exhaustive list of putative isoforms of Aspergillus flavus proteins from 20 371 expressed sequence tags to investigate whether an alternative splicing protein database can assign a greater proportion of mass spectrometry data. The newly constructed AS database provided 9807 new alternatively spliced variants in addition to 12 832 previously annotated proteins. The searches of the existing tandem MS spectra data set using the AS database identified 29 new proteins encoded by 26 genes. Nine fungal genes appeared to have multiple protein isoforms. In addition to the discovery of splice variants, the AS database also showed potential to improve genome annotation. In summary, the introduction of an alternative splicing database helps identify more proteins and unveils more information about a proteome.
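Why a database with too few isoforms can miss a protein is easy to show with a small Python sketch. It assumes tryptic digestion under the usual simplified rule (cleave after K or R unless the next residue is P); the abstract speaks only of proteolytic peptides, so the choice of protease, the toy sequences and the function names are all illustrative assumptions.

```python
import re


def tryptic_peptides(protein):
    """Split a protein after K or R, except when the next residue is P.

    Splitting on a zero-width pattern requires Python 3.7 or later.
    """
    return [p for p in re.split(r"(?<=[KR])(?!P)", protein) if p]


def peptides_missing_from_database(sample_protein, database_proteins):
    """Return peptides of the sample protein that no database entry can explain."""
    known = set()
    for entry in database_proteins:
        known.update(tryptic_peptides(entry))
    return [p for p in tryptic_peptides(sample_protein) if p not in known]


# Hypothetical annotated isoform and an unannotated splice variant of the same gene.
annotated_isoform = "MKAARPLKGG"
expressed_variant = "MKAARPLKTTEK"
unexplained = peptides_missing_from_database(expressed_variant, [annotated_isoform])
```

The peptide that spans the variant-specific sequence has no counterpart in the database, so a search restricted to annotated isoforms could not assign its spectrum.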
Affiliation(s)
- Kung-Yen Chang
- Bioinformatics Research Center, Center for Integrated Fungal Research, and W.M. Keck FT-ICR-MS Laboratory, Department of Chemistry, North Carolina State University, Raleigh, North Carolina 27695, USA
|
16
|
Alternative splicing and gene duplication differentially shaped the regulation of isochorismate synthase in Populus and Arabidopsis. Proc Natl Acad Sci U S A 2009; 106:22020-5. [PMID: 19996170 DOI: 10.1073/pnas.0906869106] [Citation(s) in RCA: 54] [Impact Index Per Article: 3.4] [Indexed: 11/18/2022] Open
Abstract
Isochorismate synthase (ICS) converts chorismate to isochorismate for the biosynthesis of phylloquinone, an essential cofactor for photosynthetic electron transport. ICS is also required for salicylic acid (SA) synthesis during Arabidopsis defense. In several other species, including Populus, SA is derived primarily from the phenylpropanoid pathway. We therefore sought to investigate ICS regulation in Populus to learn the extent of ICS involvement in SA synthesis and defense. Arabidopsis harbors duplicated AtICS genes that differ in their exon-intron structure, basal expression, and stress inducibility. In contrast, we found a single ICS gene in Populus and six other sequenced plant genomes, pointing to the AtICS duplication as a lineage-specific event. The Populus ICS encodes a functional plastidic enzyme, and was not responsive to stresses that stimulated phenylpropanoid accumulation. Populus ICS underwent extensive alternative splicing that was rare for the duplicated AtICSs. Sequencing of 184 RT-PCR Populus clones revealed 37 alternative splice variants, with normal transcripts representing approximately 50% of the population. When expressed in Arabidopsis, Populus ICS again underwent alternative splicing, but did not produce normal transcripts to complement AtICS1 function. The splice-site sequences of Populus ICS are unusual, suggesting a causal link between junction sequence, alternative splicing, and ICS function. We propose that gene duplication and alternative splicing of ICS evolved independently in Arabidopsis and Populus in accordance with their distinct defense strategies. AtICS1 represents a divergent isoform for inducible SA synthesis during defense. Populus ICS primarily functions in phylloquinone biosynthesis, a process that can be sustained at low ICS transcript levels.
|
17
|
Chacko E, Ranganathan S. Genome-wide analysis of alternative splicing in cow: implications in bovine as a model for human diseases. BMC Genomics 2009; 10 Suppl 3:S11. [PMID: 19958474 PMCID: PMC2788363 DOI: 10.1186/1471-2164-10-s3-s11] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.5] [Indexed: 01/31/2023] Open
Abstract
BACKGROUND Alternative splicing (AS) is a primary mechanism of functional regulation in the human genome, with 60% to 80% of human genes being alternatively spliced. As part of the bovine genome annotation team, we have analysed 4567 bovine AS genes, compared to 16715 human and 16491 mouse AS genes, along with Gene Ontology (GO) analysis. We also analysed the two most important events, cassette exons and intron retention, in 94 human disease genes and mapped them to the bovine orthologous genes. Of the 94 human inherited disease genes, a protein domain analysis was carried out for the transcript sequences of 12 human genes that have orthologous genes and have been characterised in cow. RESULTS Of the 21,755 bovine genes, 4,567 genes (21%) are alternatively spliced, compared to 16,715 (68%) in human and 16,491 (57%) in mouse. Gene-level analysis of the orthologous set suggested that bovine genes show fewer AS events compared to human and mouse genes. A detailed examination of cassette exons across human and cow for 94 human disease genes suggested that a majority of cassette exons in human were present and constitutive in bovine, as opposed to intron retention, which exhibited 50% of the exons as present and 50% as absent in cow. We observed that AS plays a major role in disease implications in human through manipulations of essential/functional protein domains. It was also evident that the majority of these 12 genes had all essential domains conserved in their bovine orthologous counterparts. CONCLUSION While alternative splicing has the potential to create many mRNA isoforms from a single gene, in cow the majority of genes generate two to three isoforms, compared to six in human and four in mouse. Our analyses demonstrated that a smaller number of bovine genes show greater transcript diversity. GO definitions for bovine AS genes provided 38% more functional information than currently available in the sequence database. Our protein domain analysis helped us verify the suitability of using bovine as a model for human diseases and also recognize the contribution of AS towards the disease phenotypes.
Affiliation(s)
- Elsa Chacko
- Department of Chemistry and Biomolecular Sciences and ARC Centre of Excellence in Bioinformatics, Macquarie University, Sydney, NSW 2109, Australia.
|
18
|
Hsu SN, Hertel KJ. Spliceosomes walk the line: splicing errors and their impact on cellular function. RNA Biol 2009; 6:526-30. [PMID: 19829058 DOI: 10.4161/rna.6.5.9860] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.4] [Indexed: 12/30/2022] Open
Abstract
The splicing of nuclear pre-mRNAs is a fundamental process required for the expression of most metazoan genes. The majority of the approximately 25,000 genes encoded by the human genome have been shown to produce more than one kind of transcript through alternative splicing. Alternative splicing of pre-mRNAs can lead to the production of multiple protein isoforms from a single gene, significantly enriching the proteomic diversity of higher eukaryotic organisms. Because regulation of this process determines when and where a particular protein isoform is produced, changes in alternative splicing patterns have the potential to modulate many cellular activities. Consequently, pre-mRNA splicing must occur with a high degree of specificity and fidelity to ensure the appropriate expression of functional mRNAs. Here we review recent progress made in understanding the extent of alternative splicing within the human genome, with particular emphasis on splicing fidelity.
Affiliation(s)
- Shu-Ning Hsu
- Department of Microbiology & Molecular Genetics, University of California, Irvine, CA, USA
|
19
|
Chacko E, Ranganathan S. Comprehensive splicing graph analysis of alternative splicing patterns in chicken, compared to human and mouse. BMC Genomics 2009; 10 Suppl 1:S5. [PMID: 19594882 PMCID: PMC2709266 DOI: 10.1186/1471-2164-10-s1-s5] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.4] [Indexed: 11/28/2022] Open
Abstract
Background Alternative transcript diversity manifests itself as a prime cause of complexity in higher eukaryotes. Recently, transcript diversity studies have suggested that 60–80% of human genes are alternatively spliced. We have used a splicing pattern approach for the bioinformatics analysis of Alternative Splicing (AS) in chicken, human and mouse. Exons involved in splicing are subdivided into distinct and variant exons, based on the prevalence of the exons across the transcripts. Four possible permutations of these two different groups of exons were categorised as class I (distinct-distinct), class II (distinct-variant), class III (variant-distinct) and class IV (variant-variant). This classification quantifies the variation in transcript diversity in the three species. Results In all, 3901 chicken AS genes have been compared with 16,715 human and 16,491 mouse AS genes, with 23% of chicken genes being alternatively spliced, compared to 68% in humans and 57% in mice. To minimize any gene structure bias in the input data, comparative genome analysis has been carried out on the orthologous subset of AS genes for the three species. Gene-level analysis suggested that chicken genes show fewer AS events compared to human and mouse. An event-level analysis showed that the percentage of AS events in chicken is similar to that of human, which implies that a smaller number of chicken genes show greater transcript diversity. Overall, chicken genes were found to have fewer transcripts per gene and shorter introns than human and mouse genes. Conclusion In chicken, the majority of genes generate only two or three isoforms, compared to almost eight in human and six in mouse. We observed that intron definition is expressed strongly when compared to exon definition for the chicken genome, based on 3% intron retention in chicken, compared to 2% in human and mouse. Splicing patterns with variant exons account for 33% of AS chicken orthologous genes compared to 24% in human and 27% in mouse, providing a novel measure to describe the species-wise complexity due to alternative transcript diversity.
Affiliation(s)
- Elsa Chacko
- Department of Chemistry and Biomolecular Sciences, Macquarie University, NSW, Australia.
|
20
|
Wong TKF, Lam TW, Yang W, Yiu SM. Finding alternative splicing patterns with strong support from expressed sequences on individual exons/introns. J Bioinform Comput Biol 2009; 6:1021-33. [PMID: 18942164 DOI: 10.1142/s0219720008003825] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 10/10/2007] [Revised: 02/27/2008] [Accepted: 03/22/2008] [Indexed: 11/18/2022]
Abstract
We consider the problem of predicting alternative splicing patterns from a set of expressed sequences (cDNAs and ESTs). Some of these expressed sequences may be erroneous, thus forming incorrect exons/introns. These incorrect exons/introns may cause many false positives. For example, we examined a popular alternative splicing database, ECgene, which predicts alternative splicing patterns from expressed sequences. The result shows that about 81.3%-81.6% (sensitivity) of known patterns are found, but the specificity can be as low as 5.9%. Based on the idea that erroneous sequences are usually not consistent with other sequences, in this paper we provide an alternative approach for finding alternative splicing patterns which ensures that individual exons/introns of the reported patterns have enough support from the expressed sequences. On the same dataset, our approach can achieve a much higher specificity and a slight increase in sensitivity (38.9% and 84.9%, respectively). Our approach also gives better results compared with popular alternative splicing databases (ASD, ECgene, SpliceNest) and the software ClusterMerge.
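The support idea can be sketched as a simple filter. This is an illustration only, not the authors' algorithm: it assumes each expressed sequence has already been aligned and reduced to a list of (donor, acceptor) introns, and the `min_support` threshold and function names are hypothetical.

```python
from collections import Counter


def supported_introns(alignments, min_support=2):
    """Return introns seen in at least min_support expressed sequences.

    alignments: one list of (donor, acceptor) introns per expressed sequence.
    """
    counts = Counter(
        intron for introns in alignments for intron in set(introns)
    )
    return {intron for intron, n in counts.items() if n >= min_support}


def filter_patterns(patterns, alignments, min_support=2):
    """Keep only splicing patterns whose every intron is well supported."""
    ok = supported_introns(alignments, min_support)
    return [p for p in patterns if all(intron in ok for intron in p)]


# Toy data: the intron (310, 400) appears in a single sequence only,
# so the pattern that uses it is discarded.
alignments = [
    [(100, 200), (300, 400)],
    [(100, 200), (300, 400)],
    [(100, 200), (310, 400)],
]
patterns = [
    [(100, 200), (300, 400)],
    [(100, 200), (310, 400)],
]
print(filter_patterns(patterns, alignments))
```

Raising the threshold removes more patterns built on isolated, possibly erroneous sequences, at the risk of discarding rare but genuine variants.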
Affiliation(s)
- Thomas K F Wong
- Department of Computer Science, The University of Hong Kong, Hong Kong.
|
21
|
Bonizzoni P, Mauri G, Pesole G, Picardi E, Pirola Y, Rizzi R. Detecting Alternative Gene Structures from Spliced ESTs: A Computational Approach. J Comput Biol 2009; 16:43-66. [DOI: 10.1089/cmb.2008.0028] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.3] [Indexed: 11/12/2022] Open
Affiliation(s)
- Paola Bonizzoni
- Dipartimento di Informatica Sistemistica e Comunicazione, Università degli Studi di Milano–Bicocca, Milano, Italy
- Giancarlo Mauri
- Dipartimento di Informatica Sistemistica e Comunicazione, Università degli Studi di Milano–Bicocca, Milano, Italy
- Graziano Pesole
- Dipartimento di Biochimica e Biologia Molecolare, Università degli Studi di Bari, Bari, Italy
- Ernesto Picardi
- Dipartimento di Biochimica e Biologia Molecolare, Università degli Studi di Bari, Bari, Italy
- Yuri Pirola
- Dipartimento di Informatica Sistemistica e Comunicazione, Università degli Studi di Milano–Bicocca, Milano, Italy
- Raffaella Rizzi
- Dipartimento di Informatica Sistemistica e Comunicazione, Università degli Studi di Milano–Bicocca, Milano, Italy
|
22
|
Bonizzoni P, Della Vedova G, Dondi R, Pirola Y, Rizzi R. Minimum Factorization Agreement of Spliced ESTs. LECTURE NOTES IN COMPUTER SCIENCE 2009. [DOI: 10.1007/978-3-642-04241-6_1] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 02/04/2023]
|
23
|
Width of gene expression profile drives alternative splicing. PLoS One 2008; 3:e3587. [PMID: 18974852 PMCID: PMC2575406 DOI: 10.1371/journal.pone.0003587] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Received: 07/04/2008] [Accepted: 10/09/2008] [Indexed: 01/26/2023] Open
Abstract
Alternative splicing generates an enormous amount of functional and proteomic diversity in metazoan organisms. This process is probably central to the macromolecular and cellular complexity of higher eukaryotes. While most studies have focused on the molecular mechanism triggering and controlling alternative splicing, as well as on its incidence in different species, its maintenance and evolution within populations have been little investigated. Here, we propose to address these questions by comparing the structural characteristics as well as the functional and transcriptional profiles of genes with monomorphic or polymorphic splicing, referred to as MS and PS genes, respectively. We find that MS and PS genes differ particularly in the number of tissues and cell types where they are expressed. We find a striking deficit of PS genes on the sex chromosomes, particularly on the Y chromosome, where it is shown not to be due to the observed lower breadth of expression of genes on that chromosome. The development of a simple model of evolution of cis-regulated alternative splicing leads to predictions in agreement with these observations. It further predicts the conditions for the emergence and the maintenance of cis-regulated alternative splicing, which are both favored by the tissue-specific expression of splicing variants. We finally propose that the width of the gene expression profile is an essential factor for the acquisition of new transcript isoforms that could later be maintained by a new form of balancing selection.
|
24
|
The carnitine acetyltransferase gene (CRAT): a characterization of porcine transcripts with insights into the 5'-end variants of mammalian transcripts and their possible sub-cellular localization. Cell Mol Biol Lett 2008; 14:90-9. [PMID: 18839069 PMCID: PMC6275765 DOI: 10.2478/s11658-008-0036-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 04/04/2008] [Accepted: 07/11/2008] [Indexed: 11/20/2022] Open
Abstract
Carnitine acetyltransferase (CRAT) is an important enzyme for energy homeostasis and fat metabolism. We characterized the predicted full-length cDNA sequence of the porcine CRAT gene. Its structure is very similar to that in humans with respect to the size and organization of the 14 exons. We demonstrated the existence of a porcine alternative transcript resulting from a partial intron retention at the 5’ end of exon 2. To compare the 5’ end variants of the mammalian CRAT gene, we analyzed the GenBank data, and here we propose a new 5’ variant for dog, rat and mouse. In contrast to other mammals, where this variant encodes a shorter protein (−21 aa in human, mouse and rat, and −14 aa in dog), the pig variant encodes a longer protein (+18 aa). In all mammalian species, variant 1 has a high probability of a preferential mitochondrial sub-cellular localization. Nevertheless, it is not evident, in particular in porcine and dog species, that the second variant is associated with a different sub-cellular specificity.
|
25
|
Barbazuk WB, Fu Y, McGinnis KM. Genome-wide analyses of alternative splicing in plants: opportunities and challenges. Genome Res 2008; 18:1381-92. [PMID: 18669480 DOI: 10.1101/gr.053678.106] [Citation(s) in RCA: 261] [Impact Index Per Article: 15.4] [Indexed: 11/25/2022]
Abstract
Alternative splicing (AS) creates multiple mRNA transcripts from a single gene. While AS is known to contribute to gene regulation and proteome diversity in animals, the study of its importance in plants is in its early stages. However, recently available plant genome and transcript sequence data sets are enabling a global analysis of AS in many plant species. Results of genome analysis have revealed differences between animals and plants in the frequency of alternative splicing. The proportion of plant genes that have one or more alternative transcript isoforms is approximately 20%, indicating that AS in plants is not rare, although this rate is approximately one-third of that observed in human. The majority of plant AS events have not been functionally characterized, but evidence suggests that AS participates in important plant functions, including stress response, and may impact domestication and trait selection. The increasing availability of plant genome sequence data will enable larger comparative analyses that will identify functionally important plant AS events based on their evolutionary conservation, determine the influence of genome duplication on the evolution of AS, and discover plant-specific cis-elements that regulate AS. This review summarizes recent analyses of AS in plants, discusses the importance of further analysis, and suggests directions for future efforts.
Affiliation(s)
- W Brad Barbazuk
- Donald Danforth Plant Science Center, St. Louis, Missouri 63132, USA.
|
26
|
Abstract
Most alternative splicing events in human and other eukaryotic genomes are detected using sequence fragments produced by high throughput genomic technologies, such as EST sequencing and oligonucleotide microarrays. Reconstructing full-length transcript isoforms from such sequence fragments is a major interest and challenge for computational analyses of pre-mRNA alternative splicing. This chapter describes a general graph-based approach for computational inference of full-length isoforms.
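A generic splicing-graph sketch of this kind of inference is shown below. It is not the chapter's method, whose details the abstract does not give. It assumes each sequence fragment has already been reduced to an ordered chain of exon identifiers and that exons are ordered along the gene, so the graph has no cycles. All names are hypothetical.

```python
from collections import defaultdict


def build_splice_graph(fragments):
    """Nodes are exons; an edge joins two exons observed consecutively."""
    nodes, edges = set(), defaultdict(set)
    for frag in fragments:
        nodes.update(frag)
        for upstream, downstream in zip(frag, frag[1:]):
            edges[upstream].add(downstream)
    return nodes, edges


def enumerate_isoforms(fragments):
    """List every path from an exon with no predecessor to one with no successor."""
    nodes, edges = build_splice_graph(fragments)
    has_predecessor = {d for succ in edges.values() for d in succ}
    paths = []

    def walk(node, path):
        successors = sorted(edges.get(node, ()))
        if not successors:
            paths.append(path)
            return
        for nxt in successors:
            walk(nxt, path + [nxt])

    for source in sorted(nodes - has_predecessor):
        walk(source, [source])
    return paths


# Toy fragments: exon E2 is skipped in one fragment.
fragments = [["E1", "E2", "E3"], ["E1", "E3"], ["E3", "E4"]]
for isoform in enumerate_isoforms(fragments):
    print("-".join(isoform))
```

Exhaustive path enumeration can report combinations that no single fragment supports and grows quickly with the number of alternative events, so a practical method would need to rank or prune the candidate paths.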
Affiliation(s)
- Yi Xing
- Department of Internal Medicine, Carver College of Medicine, University of Iowa, Iowa City, IA, USA
|
27
|
Abstract
The sequencing of the human genome and the ensuing wave of data generation have shed new light on the extent and importance of alternative splicing as an RNA regulatory mechanism. Alternative splicing could potentially explain the complexity of the protein repertoire during evolution, and defects in the splicing mechanism are responsible for diseases as complex as cancer. Among the challenges that arise in light of these discoveries are cataloguing splice variation in the human and other eukaryotic genomes, and identifying and characterizing the splicing regulatory elements that control their expression. Bioinformatics efforts tackling these two questions are just at the beginning. This article is a survey of these methods.
Affiliation(s)
- Liliana Florea
- Department of Computer Science, George Washington University, Academic Center-Rm 714, Washington DC 20052, USA.
|
28
|
Haas BJ. Analysis of alternative splicing in plants with bioinformatics tools. Curr Top Microbiol Immunol 2008; 326:17-37. [PMID: 18630745 DOI: 10.1007/978-3-540-76776-3_2] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Indexed: 12/12/2022]
Abstract
Alternative splicing is a molecular mechanism utilized by a broad range of eukaryotes to extend the repertoire of functions encoded by single genes and to posttranscriptionally regulate gene expression. Recent analyses of expressed transcript sequences aligned to the complete genomes of Arabidopsis and rice indicate that alternative splicing in plants is prevalent and exhibits several features similar to other higher eukaryotes including mouse and human. This chapter reviews the computational strategies employed to study alternative splicing with bioinformatics tools and the recent findings from analyses performed on plants by applying such methods.
Affiliation(s)
- B J Haas
- Broad Institute, 7 Cambridge Center, Cambridge, MA 02142, USA.
|
29
|
Liu F, Xu W, Tan L, Xue Y, Sun C, Su Z. Case study for identification of potentially indel-caused alternative expression isoforms in the rice subspecies japonica and indica by integrative genome analysis. Genomics 2007; 91:186-94. [PMID: 18037265 DOI: 10.1016/j.ygeno.2007.10.001] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Received: 07/25/2007] [Revised: 09/27/2007] [Accepted: 10/03/2007] [Indexed: 11/30/2022]
Abstract
Alternative splicing (AS) is one of the most significant components of the functional complexity of the eukaryote genome, increasing protein diversity, creating isoforms, and affecting mRNA stability. Recently, whole genome sequences and large microarray data sets have become available, making data integration feasible and allowing the study of the possible regulatory mechanism of AS in rice (Oryza sativa) by formulating and testing hypotheses before doing bench studies. We have developed a new strategy and have identified 215 rice genes with alternative expression isoforms related to insertions and deletions (indels) between subspecies indica and subspecies japonica. We did a case study of alternative expression isoforms of the rice peroxidase gene LOC_Os06g48030 to investigate possible mechanisms by which indels caused alternative splicing between the indica and the japonica varieties, mining array data together with validation by RT-PCR and genome sequencing analysis. Multiple poly(A) signals were detected in the specific indel region for LOC_Os06g48030. We present a new methodology to promote more discoveries of potentially indel-caused AS genes in rice, which may serve as the foundation for research into the regulatory mechanism of alternative expression isoforms between subspecies.
Affiliation(s)
- Fengxia Liu
- State Key Laboratory of Plant Physiology and Biochemistry, College of Biological Sciences, China Agricultural University, Beijing 100094, China
|
30
|
Foissac S, Sammeth M. ASTALAVISTA: dynamic and flexible analysis of alternative splicing events in custom gene datasets. Nucleic Acids Res 2007; 35:W297-9. [PMID: 17485470 PMCID: PMC1933205 DOI: 10.1093/nar/gkm311] [Citation(s) in RCA: 246] [Impact Index Per Article: 13.7] [Indexed: 12/28/2022] Open
Abstract
In the process of establishing more and more complete annotations of eukaryotic genomes, a constantly growing number of alternative splicing (AS) events has been reported over the last decade. Consequently, the increasing transcript coverage also revealed the real complexity of some variations in the exon–intron structure between transcript variants and the need for computational tools to address ‘complex’ AS events. ASTALAVISTA (alternative splicing transcriptional landscape visualization tool) employs an intuitive and complete notation system to univocally identify such events. The method extracts AS events dynamically from custom gene annotations, classifies them into groups of common types and visualizes a comprehensive picture of the resulting AS landscape. Thus, ASTALAVISTA can characterize AS for whole transcriptome data from reference annotations (GENCODE, REFSEQ, ENSEMBL) as well as for genes selected by the user according to common functional/structural attributes of interest: http://genome.imim.es/astalavista
|
31
|
Tanner S, Shen Z, Ng J, Florea L, Guigó R, Briggs SP, Bafna V. Improving gene annotation using peptide mass spectrometry. Genome Res 2007; 17:231-9. [PMID: 17189379 PMCID: PMC1781355 DOI: 10.1101/gr.5646507] [Citation(s) in RCA: 148] [Impact Index Per Article: 8.2] [Received: 06/15/2006] [Accepted: 11/09/2006] [Indexed: 11/24/2022]
Abstract
Annotation of protein-coding genes is a key goal of genome sequencing projects. In spite of tremendous recent advances in computational gene finding, comprehensive annotation remains a challenge. Peptide mass spectrometry is a powerful tool for researching the dynamic proteome and suggests an attractive approach to discover and validate protein-coding genes. We present algorithms to construct and efficiently search spectra against a genomic database, with no prior knowledge of encoded proteins. By searching a corpus of 18.5 million tandem mass spectra (MS/MS) from human proteomic samples, we validate 39,000 exons and 11,000 introns at the level of translation. We present translation-level evidence for novel or extended exons in 16 genes, confirm translation of 224 hypothetical proteins, and discover or confirm over 40 alternative splicing events. Polymorphisms are efficiently encoded in our database, allowing us to observe variant alleles for 308 coding SNPs. Finally, we demonstrate the use of mass spectrometry to improve automated gene prediction, adding 800 correct exons to our predictions using a simple rescoring strategy. Our results demonstrate that proteomic profiling should play a role in any genome sequencing project.
Affiliation(s)
- Stephen Tanner
- Bioinformatics Program, University of California, San Diego, La Jolla, California 92093-0419, USA.
|
32
|
Frank A, Pevzner P. PepNovo: de novo peptide sequencing via probabilistic network modeling. Anal Chem 2005; 77:964-73. [PMID: 15858974 DOI: 10.1021/ac048788h] [Citation(s) in RCA: 440] [Impact Index Per Article: 24.4] [Indexed: 11/28/2022]
Abstract
We present a novel scoring method for de novo interpretation of peptides from tandem mass spectrometry data. Our scoring method uses a probabilistic network whose structure reflects the chemical and physical rules that govern the peptide fragmentation. We use a likelihood ratio hypothesis test to determine whether the peaks observed in the mass spectrum are more likely to have been produced under our fragmentation model than under a model that treats peaks as random events. We tested our de novo algorithm PepNovo on ion trap data and achieved results that are superior to popular de novo peptide sequencing algorithms. PepNovo can be accessed via the URL http://www-cse.ucsd.edu/groups/bioinformatics/software.html.
Affiliation(s)
- Ari Frank
- Department of Computer Science & Engineering, University of California, San Diego, La Jolla, California 92093-0114, USA.
|
33
|
Lehtonen HJ, Ylisaukko-oja SK, Kiuru M, Karhu A, Lehtonen R, Vanharanta S, Jalanko A, Aaltonen LA, Launonen V. Stress-induced expression of a novel variant of human fumarate hydratase (FH). Gene Expr 2007; 14:59-69. [PMID: 18257390 PMCID: PMC6042040 DOI: 10.3727/105221607783417592] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/24/2022]
Abstract
Fumarate hydratase (FH) is an enzyme of the mitochondrial tricarboxylic acid cycle (TCAC). Here we report the characterization of a novel FH variant (FHv) that contains an alternative exon 1b, thus lacking the mitochondrial signal sequence. Distinct from mitochondrial FH, FHv localized to cytosol and nucleus and lacked FH enzyme activity. FHv was expressed ubiquitously in human fetal and adult tissues. Heat shock and prolonged hypoxia increased FHv expression in a cell line (HTB 115) by nine- and fourfold, respectively. These results suggest that FHv has an alternative function outside the TCAC related to cellular stress response.
Affiliation(s)
- Heli J. Lehtonen
- Department of Medical Genetics, Biomedicum Helsinki, University of Helsinki, Helsinki, Finland
- Sanna K. Ylisaukko-oja
- Department of Medical Genetics, Biomedicum Helsinki, University of Helsinki, Helsinki, Finland
- Maija Kiuru
- Department of Medical Genetics, Biomedicum Helsinki, University of Helsinki, Helsinki, Finland
- Auli Karhu
- Department of Medical Genetics, Biomedicum Helsinki, University of Helsinki, Helsinki, Finland
- Rainer Lehtonen
- Department of Medical Genetics, Biomedicum Helsinki, University of Helsinki, Helsinki, Finland
- Sakari Vanharanta
- Department of Medical Genetics, Biomedicum Helsinki, University of Helsinki, Helsinki, Finland
- Anu Jalanko
- National Public Health Institute, Department of Molecular Medicine, Biomedicum Helsinki, Helsinki, Finland
- Lauri A. Aaltonen
- Department of Medical Genetics, Biomedicum Helsinki, University of Helsinki, Helsinki, Finland
- Virpi Launonen
- Department of Medical Genetics, Biomedicum Helsinki, University of Helsinki, Helsinki, Finland
|
34
|
Comprehensive analysis of alternative splicing in rice and comparative analyses with Arabidopsis. BMC Genomics 2006; 7:327. [PMID: 17194304 PMCID: PMC1769492 DOI: 10.1186/1471-2164-7-327] [Citation(s) in RCA: 326] [Impact Index Per Article: 17.2] [Received: 06/13/2006] [Accepted: 12/28/2006] [Indexed: 11/29/2022] Open
Abstract
Background: Recently, genomic sequencing efforts were finished for Oryza sativa (cultivated rice) and Arabidopsis thaliana (Arabidopsis). Additionally, these two plant species have extensive cDNA and expressed sequence tag (EST) libraries. We employed the Program to Assemble Spliced Alignments (PASA) to identify and analyze alternatively spliced isoforms in both species.
Results: A comprehensive analysis of alternative splicing was performed in rice that started with >1.1 million publicly available spliced ESTs and over 30,000 full length cDNAs in conjunction with the newly enhanced PASA software. A parallel analysis was performed with Arabidopsis to compare and ascertain potential differences between monocots and dicots. Alternative splicing is a widespread phenomenon (observed in greater than 30% of the loci with transcript support) and we have described nine alternative splicing variations. While alternative splicing has the potential to create many RNA isoforms from a single locus, the majority of loci generate only two or three isoforms and transcript support indicates that these isoforms are generally not rare events. For the alternate donor (AD) and acceptor (AA) classes, the distance between the splice sites for the majority of events was found to be less than 50 basepairs (bp). In both species, the most frequent distance between AA is 3 bp, consistent with reports in mammalian systems. Conversely, the most frequent distance between AD is 4 bp in both plant species, as previously observed in mouse. Most alternative splicing variations are localized to the protein coding sequence and are predicted to significantly alter the coding sequence.
Conclusion: Alternative splicing is widespread in both rice and Arabidopsis and these species share many common features. Interestingly, alternative splicing may play a role beyond creating novel combinations of transcripts that expand the proteome. Many isoforms will presumably have negative consequences for protein structure and function, suggesting that their biological role involves post-transcriptional regulation of gene expression.
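The reported distances (3 bp between alternate acceptors, 4 bp between alternate donors) have different consequences for the reading frame. The snippet below is a minimal sketch of that arithmetic, assuming both splice sites fall inside the coding sequence; the function name `frame_effect` is hypothetical and is not part of PASA.

```python
def frame_effect(distance_bp: int) -> str:
    """Classify the reading-frame effect of moving a splice site by distance_bp.

    Assumption: the shifted site lies within the coding sequence, so only the
    distance modulo 3 matters for the downstream frame.
    """
    if distance_bp <= 0:
        raise ValueError("distance_bp must be a positive integer")
    return "in-frame" if distance_bp % 3 == 0 else "frameshift"


# The two most frequent distances named in the abstract.
for event_class, distance in [("alternate acceptor (AA)", 3),
                              ("alternate donor (AD)", 4)]:
    print(f"{event_class}: {distance} bp -> {frame_effect(distance)}")
```

Under this assumption a 3 bp acceptor shift adds or removes a single codon, whereas a 4 bp donor shift changes the downstream frame.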
|
35
|
Lee Y, Lee Y, Kim B, Shin Y, Nam S, Kim P, Kim N, Chung WH, Kim J, Lee S. ECgene: an alternative splicing database update. Nucleic Acids Res 2006; 35:D99-103. [PMID: 17132829 PMCID: PMC1716719 DOI: 10.1093/nar/gkl992] [Citation(s) in RCA: 33] [Impact Index Per Article: 1.7] [Indexed: 11/13/2022] Open
Abstract
ECgene was developed to provide functional annotation for alternatively spliced genes. The applications encompass genome-based transcript modeling for alternative splicing (AS), domain analysis with Gene Ontology (GO) annotation, and expression analysis based on EST and SAGE data. We have expanded ECgene's AS modeling and EST clustering to nine organisms for which sufficient EST data are available in GenBank. For the human genome, we have also introduced several new applications to analyze differential expression. ECprofiler is an ontology-based candidate gene search system that allows users to select an arbitrary combination of gene expression pattern and GO functional categories. DEGEST is a database of differentially expressed genes and isoforms based on the EST information. Importantly, gene expression is analyzed at three distinct levels: gene, isoform and exon. The user interfaces for functional and expression analyses have been substantially improved. ASviewer is a dedicated Java application that visualizes the transcript structure and functional features of alternatively spliced variants. The SAGE part of the expression module provides many additional features including SNP, differential expression and alternative tag positions.
Affiliation(s)
- Yeunsook Lee
- Division of Molecular Life Sciences, Ewha Womans University, Seoul 120-750, Korea
- Younghee Lee
- Division of Molecular Life Sciences, Ewha Womans University, Seoul 120-750, Korea
- Bumjin Kim
- Division of Molecular Life Sciences, Ewha Womans University, Seoul 120-750, Korea
- Youngah Shin
- Division of Molecular Life Sciences, Ewha Womans University, Seoul 120-750, Korea
- Seungyoon Nam
- Division of Molecular Life Sciences, Ewha Womans University, Seoul 120-750, Korea
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul 151-742, Korea
- Pora Kim
- Bioinformatics Team, Electronics and Telecommunications Research Institute (ETRI), Gajeong-Dong, Yuseong-Gu, Daejeon 305-350, Korea
- Namshin Kim
- Department of Chemistry and Biochemistry, Center for Computational Biology, Institute for Genomics and Proteomics, Molecular Biology Institute, University of California Los Angeles, Los Angeles, CA 90095-1570, USA
- Won-Hyong Chung
- Korean Bioinformation Center, Korea Research Institute of Bioscience & Biotechnology, 52 Eoeun, Yuseong, Daejeon 305-333, Korea
- Jaesang Kim
- Division of Molecular Life Sciences, Ewha Womans University, Seoul 120-750, Korea
- Sanghyuk Lee
- Division of Molecular Life Sciences, Ewha Womans University, Seoul 120-750, Korea
- To whom correspondence should be addressed. Tel: +82 2 3277 2888; Fax: +82 2 3277 3760;
|
36
|
Heintz D, Erxleben A, High AA, Wurtz V, Reski R, Van Dorsselaer A, Sarnighausen E. Rapid alteration of the phosphoproteome in the moss Physcomitrella patens after cytokinin treatment. J Proteome Res 2006; 5:2283-93. [PMID: 16944940 DOI: 10.1021/pr060152e] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.4] [Indexed: 01/09/2023]
Abstract
Cytokinin hormones are crucial regulators of a large number of processes in plant development. Recently, significant progress has been made toward the elucidation of the molecular details of cytokinin that has led to a model for signal transduction involving a phosphorylation cascade. However, current knowledge of cytokinin action remains limited and does not explain the different roles of this hormone. To gain further insights into this aspect of cytokinin action and the inducible phosphorelay, we have produced the first large-scale map of a phosphoproteome in the moss Physcomitrella patens. Using a protocol that we recently published (Heintz, D.; et al. Electrophoresis 2004, 25, 1149-1159) that combines IMAC, MALDI-TOF-MS, and LC-MS/MS, a total of 172 phosphopeptide sequences were obtained by a peptide de novo sequencing strategy. Specific P. patens EST and raw genomic databases were interrogated, and protein homology searches resulted in the identification of 112 proteins that were then classified into functional categories. In addition, the temporal dynamics of the phosphoproteome in response to cytokinin stimulation were studied at 2, 4, 6, and 15 min after hormone addition. We identified 13 proteins that were not previously known targets of cytokinin action. Among the responsive proteins, some were involved in metabolism, and several proteins of unknown function were also identified. We have mapped the time course of their activation in response to cytokinin and discussed their hypothetical biological significance. Deciphering these early induced phosphorylation events has shown that the cytokinin effect can be rapid (within a few minutes), and the duration of this effect can be variable. Phosphorylation events can also be differentially regulated. Taken together, our proteomic study provides an enriched view of the multistep phosphorelay system mediating cytokinin response and suggests the existence of a multidirectional interaction between cytokinin and numerous other pathways.
Affiliation(s)
- Dimitri Heintz
- Laboratoire de Spectrométrie de Masse Bio-Organique, CNRS, ECPM, Université Louis Pasteur, 25 rue Becquerel F67087, Strasbourg, Cedex 2, France.
|
37
|
Noh SJ, Lee K, Paik H, Hur CG. TISA: tissue-specific alternative splicing in human and mouse genes. DNA Res 2006; 13:229-43. [PMID: 17107969 DOI: 10.1093/dnares/dsl011] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.4] [Indexed: 11/12/2022] Open
Abstract
Alternative splicing (AS) is a mechanism by which multiple transcripts are produced from a single gene and is thought to be an important mechanism for tissue-specific expression of transcript isoforms. Here, we report a novel graphing method for transcript reconstruction and statistical prediction of tissue-specific AS. We applied three selection steps to generate the splice graph and predict the transcript isoforms: (i) a custom scoring rule for exon/intron sets, (ii) binomial statistics for selecting valid alternative splicing with a frequency of at least 1% for the predominant form and (iii) evaluation of transcript structure. We obtained 97 286 and 66 022 valid transcripts from 26 143 human and 27 741 mouse genes, respectively. In addition, we discovered 33 481 AS events for nine types of AS patterns in human. The statistical significance of tissue specificity for each gene, transcript and AS event was assessed based on EST tissue information, followed by a multiple testing correction procedure. In human, 12 711 genes, 16 016 transcripts and 1035 AS events were predicted to be tissue-specific (false discovery rate <0.01). This information on genes, transcript structures, AS events and their tissue specificities in human and mouse is freely accessible on the TISA website (http://tisa.kribb.re.kr/AGC/).
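The abstract describes assessing tissue specificity from EST counts and then correcting for multiple testing, without naming the exact test or correction. As a rough illustration only, the sketch below assumes a one-sided binomial tail test per event and the Benjamini-Hochberg adjustment; the function names and the example counts are hypothetical and are not taken from TISA.

```python
from math import comb


def binom_upper_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p).

    Here k is the number of ESTs supporting an event in one tissue, n the
    total ESTs supporting the event, and p the background share of that
    tissue in the EST library (all assumed inputs).
    """
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def benjamini_hochberg(pvalues: list[float]) -> list[float]:
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical (k, n, p) triples for three AS events.
events = [(8, 10, 0.2), (3, 12, 0.2), (5, 6, 0.1)]
raw = [binom_upper_tail(k, n, p) for k, n, p in events]
adj = benjamini_hochberg(raw)
tissue_specific = [q < 0.01 for q in adj]  # threshold quoted in the abstract
print(tissue_specific)
```

The 0.01 cutoff mirrors the false discovery rate quoted in the abstract; everything else is an assumption made for the sketch.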
Affiliation(s)
- Seung-Jae Noh
- Bioinformatics Lab. Plant genomics center KRIBB, 52 Eoeun-dong, Yuseong-gu, Daejon, 305-333 Korea
|
38
|
Lamont RJ, Meila M, Xia Q, Hackett M. Mass spectrometry-based proteomics and its application to studies of Porphyromonas gingivalis invasion and pathogenicity. Infect Disord Drug Targets 2006; 6:311-25. [PMID: 16918489 PMCID: PMC2666350 DOI: 10.2174/187152606778249935] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.8] [Indexed: 05/11/2023]
Abstract
Porphyromonas gingivalis is a Gram-negative anaerobe that populates the subgingival crevice of the mouth. It is known to undergo a transition from its commensal status in healthy individuals to a highly invasive intracellular pathogen in human patients suffering from periodontal disease, where it is often the dominant species of pathogenic bacteria. The application of mass spectrometry-based proteomics to the study of P. gingivalis interactions with model host cell systems, invasion and pathogenicity is reviewed. These studies have evolved from qualitative identifications of small numbers of secreted proteins, using traditional gel-based methods, to quantitative whole cell proteomic studies using multiple dimension capillary HPLC coupled with linear ion trap mass spectrometry. It has become possible to generate a differential readout of protein expression change over the entire P. gingivalis proteome, in a manner analogous to whole genome mRNA arrays. Different strategies have been employed for generating protein level expression ratios from mass spectrometry data, including stable isotope metabolic labeling and most recently, spectral counting methods. A global view of changes in protein modification status remains elusive due to the limitations of existing computational tools for database searching and data mining. Such a view would be desirable for purposes of making global assessments of changes in gene regulation in response to host interactions during the course of adhesion, invasion and internalization. With a complete data matrix consisting of changes in transcription, protein abundance and protein modification during the course of invasion, the search for new protein drug targets would benefit from a more comprehensive understanding of these processes than what could be achieved prior to the advent of systems biology.
Affiliation(s)
- Richard J. Lamont
- Department of Oral Biology, University of Florida, Gainesville, Florida, USA
- Marina Meila
- Department of Statistics, University of Washington, Seattle, Washington, USA
- Qiangwei Xia
- Department of Chemical Engineering, University of Washington, Seattle, Washington, USA
- Department of Microbiology, University of Washington, Seattle, Washington, USA
- Murray Hackett
- Department of Chemical Engineering, University of Washington, Seattle, Washington, USA
- Address correspondence to this author at the Department of Chemical Engineering, Box 355014, University of Washington, Seattle, Washington 98195; Telephone: (206) 616 8071; E-mail
|
39
|
Granum S, Sundvold-Gjerstad V, Dai KZ, Kolltveit KM, Hildebrand K, Huitfeldt HS, Lea T, Spurkland A. Structure function analysis of SH2D2A isoforms expressed in T cells reveals a crucial role for the proline rich region encoded by SH2D2A exon 7. BMC Immunol 2006; 7:15. [PMID: 16839418 PMCID: PMC1553471 DOI: 10.1186/1471-2172-7-15] [Citation(s) in RCA: 18] [Impact Index Per Article: 0.9] [Received: 04/18/2006] [Accepted: 07/13/2006] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: The activation-induced T cell-specific adapter protein (TSAd), encoded by SH2D2A, interacts with and modulates Lck activity. Several transcript variants of TSAd mRNA exist, but their biological significance remains unknown. Here we examined expression of SH2D2A transcripts in activated CD4+ T cells and used the SH2D2A variants as tools to identify functionally important regions of TSAd.
RESULTS: TSAd was found to interact with Lck in human CD4+ T cells ex vivo. Three interaction modes of TSAd with Lck were identified. TSAd aa239-256 conferred binding to the Lck-SH3 domain, whereas one or more of the four tyrosines within aa239-334 encoded by SH2D2A exon 7 was found to confer interaction with the Lck-SH2 domain. Finally, the TSAd-SH2 domain was found to interact with Lck. The SH2D2A exon 7 encoding TSAd aa239-334 was found to harbour information essential not only for TSAd interaction with Lck, but also for TSAd modulation of Lck activity and translocation of TSAd to the nucleus. All five SH2D2A transcripts were found to be expressed in CD3-stimulated CD4+ T cells.
CONCLUSION: These data show that TSAd and Lck may interact through several different domains and that Lck-TSAd interaction occurs in CD4+ T cells ex vivo. Alternative splicing of exon 7 encoding aa239-334 results in loss of the majority of protein interaction motifs of TSAd and yields truncated TSAd molecules with altered ability to modulate Lck activity. Whether TSAd is regulated through differential alternative splicing of the SH2D2A transcript remains to be determined.
Affiliation(s)
- Stine Granum
- Department of Anatomy, Institute of Basic Medical Sciences, Box 1105, Blindern, N-0317 Oslo, Norway
- Vibeke Sundvold-Gjerstad
- Department of Anatomy, Institute of Basic Medical Sciences, Box 1105, Blindern, N-0317 Oslo, Norway
- Ke-Zheng Dai
- Department of Anatomy, Institute of Basic Medical Sciences, Box 1105, Blindern, N-0317 Oslo, Norway
- Kjersti Hildebrand
- Department of Anatomy, Institute of Basic Medical Sciences, Box 1105, Blindern, N-0317 Oslo, Norway
- Henrik S Huitfeldt
- Institute of Pathology, Rikshospitalet University Hospital, N-0027, Norway
- Tor Lea
- Institute of Immunology, Rikshospitalet University Hospital, N-0027, Norway
- Anne Spurkland
- Department of Anatomy, Institute of Basic Medical Sciences, Box 1105, Blindern, N-0317 Oslo, Norway
|
40
|
Bollina D, Lee BTK, Tan TW, Ranganathan S. ASGS: an alternative splicing graph web service. Nucleic Acids Res 2006; 34:W444-7. [PMID: 16845045 PMCID: PMC1538904 DOI: 10.1093/nar/gkl268] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.7] [Received: 02/14/2006] [Revised: 03/01/2006] [Accepted: 03/31/2006] [Indexed: 11/13/2022] Open
Abstract
Alternative transcript diversity manifests itself as a prime cause of complexity in higher eukaryotes. The Alternative Splicing Graph Server (ASGS) is a web service facilitating the systematic study of alternatively spliced genes of higher eukaryotes by generating splicing graphs for the compact visual representation of transcript diversity from a single gene. Taking a set of transcripts in General Feature Format as input, ASGS identifies distinct reference and variable exons, generates a transcript splicing graph, an exon summary, splicing events classification and a single line graph to facilitate experimental analysis. This freely available web service can be accessed at http://asgs.biolinfo.org.
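The idea of a splicing graph can be illustrated with a small sketch: exons become nodes, and an edge joins two exons that are adjacent in at least one transcript. The code below is not the ASGS implementation. It assumes transcripts are already parsed into lists of (start, end) exon coordinates on one strand, and it treats an exon as "shared" when every transcript contains it, which may differ from how ASGS defines reference exons. The function name `build_splicing_graph` is hypothetical.

```python
from collections import defaultdict

Exon = tuple[int, int]  # (start, end), assumed 1-based inclusive


def build_splicing_graph(transcripts: dict[str, list[Exon]]):
    """Build exon-adjacency edges and split exons into shared and variable sets."""
    edges: dict[Exon, set[Exon]] = defaultdict(set)
    usage: dict[Exon, set[str]] = defaultdict(set)
    for name, exons in transcripts.items():
        ordered = sorted(exons)
        for exon in ordered:
            usage[exon].add(name)
        # Consecutive exons in a transcript are joined by a splice junction.
        for upstream, downstream in zip(ordered, ordered[1:]):
            edges[upstream].add(downstream)
    all_names = set(transcripts)
    shared = {exon for exon, names in usage.items() if names == all_names}
    variable = set(usage) - shared
    return dict(edges), shared, variable


# Two hypothetical transcripts of one gene; t2 skips the middle exon.
example = {
    "t1": [(1, 100), (201, 300), (401, 500)],
    "t2": [(1, 100), (401, 500)],
}
graph, shared_exons, variable_exons = build_splicing_graph(example)
print(sorted(variable_exons))
```

In this toy input the first exon has two outgoing edges, which is how an exon-skipping event shows up in the graph.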
Affiliation(s)
- Durgaprasad Bollina
- Department of Chemistry and Biomolecular Sciences and Biotechnology Research Institute, Macquarie University, Sydney, NSW 2109, Australia
- Bernett T. K. Lee
- Department of Biochemistry, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, 119260
- Tin Wee Tan
- Department of Biochemistry, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, 119260
- Shoba Ranganathan
- Department of Chemistry and Biomolecular Sciences and Biotechnology Research Institute, Macquarie University, Sydney, NSW 2109, Australia
- Department of Biochemistry, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, 119260
|
41
|
Xing Y, Yu T, Wu YN, Roy M, Kim J, Lee C. An expectation-maximization algorithm for probabilistic reconstructions of full-length isoforms from splice graphs. Nucleic Acids Res 2006; 34:3150-60. [PMID: 16757580 PMCID: PMC1475746 DOI: 10.1093/nar/gkl396] [Citation(s) in RCA: 122] [Impact Index Per Article: 6.4] [Received: 01/29/2006] [Revised: 04/13/2006] [Accepted: 05/10/2006] [Indexed: 11/13/2022] Open
Abstract
Reconstructing full-length transcript isoforms from sequence fragments (such as ESTs) is a major interest and challenge for bioinformatic analysis of pre-mRNA alternative splicing. This problem has been formulated as finding traversals across the splice graph, which is a directed acyclic graph (DAG) representation of gene structure and alternative splicing. In this manuscript we introduce a probabilistic formulation of the isoform reconstruction problem, and provide an expectation-maximization (EM) algorithm for its maximum likelihood solution. Using a series of simulated data and expressed sequences from real human genes, we demonstrate that our EM algorithm can correctly handle various situations of fragmentation and coupling in the input data. Our work establishes a general probabilistic framework for splice graph-based reconstructions of full-length isoforms.
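To make the expectation-maximization idea concrete, here is a deliberately simplified sketch rather than the authors' model. It assumes the candidate isoforms are already enumerated and that each observed fragment is summarized only by the set of isoforms it is compatible with; it then estimates isoform proportions by maximum likelihood. The function name and the toy data are hypothetical, and the paper's treatment of fragmentation and coupling is not reproduced.

```python
def em_isoform_proportions(compat, n_isoforms, n_iter=200, tol=1e-12):
    """Estimate isoform proportions from fragment/isoform compatibility sets.

    compat: list of sets; compat[i] holds the indices of the isoforms that
            fragment i could have come from (assumed non-empty).
    """
    if not compat:
        raise ValueError("at least one fragment is required")
    theta = [1.0 / n_isoforms] * n_isoforms
    for _ in range(n_iter):
        # E-step: split each fragment across its compatible isoforms
        # in proportion to the current estimates.
        expected = [0.0] * n_isoforms
        for isoforms in compat:
            total = sum(theta[j] for j in isoforms)
            if total == 0.0:
                continue
            for j in isoforms:
                expected[j] += theta[j] / total
        # M-step: renormalize expected counts into proportions.
        norm = sum(expected)
        updated = [c / norm for c in expected]
        converged = max(abs(a - b) for a, b in zip(theta, updated)) < tol
        theta = updated
        if converged:
            break
    return theta


# Toy data: 6 fragments unique to isoform 0, 2 unique to isoform 1,
# and 4 fragments compatible with both.
fragments = [{0}] * 6 + [{1}] * 2 + [{0, 1}] * 4
print(em_isoform_proportions(fragments, n_isoforms=2))
```

Ambiguous fragments carry no information about the split in this toy setting, so the estimate is driven by the uniquely assigned fragments.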
Affiliation(s)
- Yi Xing
- Molecular Biology Institute, Center for Computational Biology, Department of Chemistry and Biochemistry, University of California, Los Angeles, USA
- Tianwei Yu
- Department of Statistics, University of California, Los Angeles, USA
- Dental Research Institute, School of Dentistry, University of California, Los Angeles, USA
- Ying Nian Wu
- Department of Statistics, University of California, Los Angeles, USA
- Meenakshi Roy
- Molecular Biology Institute, Center for Computational Biology, Department of Chemistry and Biochemistry, University of California, Los Angeles, USA
- Joseph Kim
- Molecular Biology Institute, Center for Computational Biology, Department of Chemistry and Biochemistry, University of California, Los Angeles, USA
- Christopher Lee
- Molecular Biology Institute, Center for Computational Biology, Department of Chemistry and Biochemistry, University of California, Los Angeles, USA
|
42
|
Romero PR, Zaidi S, Fang YY, Uversky VN, Radivojac P, Oldfield CJ, Cortese MS, Sickmeier M, LeGall T, Obradovic Z, Dunker AK. Alternative splicing in concert with protein intrinsic disorder enables increased functional diversity in multicellular organisms. Proc Natl Acad Sci U S A 2006; 103:8390-5. [PMID: 16717195 PMCID: PMC1482503 DOI: 10.1073/pnas.0507916103] [Citation(s) in RCA: 351] [Impact Index Per Article: 18.5] [Indexed: 11/18/2022] Open
Abstract
Alternative splicing of pre-mRNA generates two or more protein isoforms from a single gene, thereby contributing to protein diversity. Despite intensive efforts, an understanding of the protein structure-function implications of alternative splicing is still lacking. Intrinsic disorder, which is a lack of equilibrium 3D structure under physiological conditions, may provide this understanding. Intrinsic disorder is a common phenomenon, particularly in multicellular eukaryotes, and is responsible for important protein functions including regulation and signaling. We hypothesize that polypeptide segments affected by alternative splicing are most often intrinsically disordered such that alternative splicing enables functional and regulatory diversity while avoiding structural complications. We analyzed a set of 46 differentially spliced genes encoding experimentally characterized human proteins containing both structured and intrinsically disordered amino acid segments. We show that 81% of 75 alternatively spliced fragments in these proteins were associated with fully (57%) or partially (24%) disordered protein regions. Regions affected by alternative splicing were significantly biased toward encoding disordered residues, with a vanishingly small P value. A larger data set composed of 558 SwissProt proteins with known isoforms produced by 1,266 alternatively spliced fragments was characterized by applying the PONDR VSL1 disorder predictor. Results from prediction data are consistent with those obtained from experimental data, further supporting the proposed hypothesis. Associating alternative splicing with protein disorder enables the time- and tissue-specific modulation of protein function needed for cell differentiation and the evolution of multicellular organisms.
Affiliation(s)
- Pedro R. Romero
- *School of Informatics, Indiana University–Purdue University Indianapolis, 535 West Michigan Street, IT475, Indianapolis, IN 46202
- Department of Biochemistry and Molecular Biology and Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, 714 North Senate Avenue, Suite 250, Indianapolis, IN 46202
- Saima Zaidi
- *School of Informatics, Indiana University–Purdue University Indianapolis, 535 West Michigan Street, IT475, Indianapolis, IN 46202
- Ya Yin Fang
- Department of Biochemistry and Molecular Biology and Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, 714 North Senate Avenue, Suite 250, Indianapolis, IN 46202
- Vladimir N. Uversky
- Department of Biochemistry and Molecular Biology and Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, 714 North Senate Avenue, Suite 250, Indianapolis, IN 46202
- Predrag Radivojac
- School of Informatics, Indiana University, Eigenmann Hall 1005, 1900 East 10th Street, Bloomington, IN 47406; and
- Christopher J. Oldfield
- Department of Biochemistry and Molecular Biology and Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, 714 North Senate Avenue, Suite 250, Indianapolis, IN 46202
- Marc S. Cortese
- Department of Biochemistry and Molecular Biology and Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, 714 North Senate Avenue, Suite 250, Indianapolis, IN 46202
- Megan Sickmeier
- Department of Biochemistry and Molecular Biology and Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, 714 North Senate Avenue, Suite 250, Indianapolis, IN 46202
- Tanguy LeGall
- Department of Biochemistry and Molecular Biology and Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, 714 North Senate Avenue, Suite 250, Indianapolis, IN 46202
- Zoran Obradovic
- Center for Information Science and Technology, Temple University, 303 Wachman Hall (038-24), 1805 North Broad Street, Philadelphia, PA 19122
- A. Keith Dunker
- *School of Informatics, Indiana University–Purdue University Indianapolis, 535 West Michigan Street, IT475, Indianapolis, IN 46202
- Department of Biochemistry and Molecular Biology and Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, 714 North Senate Avenue, Suite 250, Indianapolis, IN 46202
- To whom correspondence should be addressed. E-mail:
|
43
|
Ro S, Kang SH, Farrelly AM, Ordog T, Partain R, Fleming N, Sanders KM, Kenyon JL, Keef KD. Template switching within exons 3 and 4 of KV11.1 (HERG) gives rise to a 5' truncated cDNA. Biochem Biophys Res Commun 2006; 345:1342-9. [PMID: 16723117 DOI: 10.1016/j.bbrc.2006.05.032] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 04/26/2006] [Accepted: 05/02/2006] [Indexed: 10/24/2022]
Abstract
K(V)11.1 (HERG) channels contribute to membrane potential in a number of excitable cell types. We cloned a variant of K(V)11.1 from human jejunum containing a 171 bp deletion spanning exons 3 and 4. Expression of a full-length cDNA clone containing this deletion gave rise to protein that trafficked to the cell membrane and generated robust currents. The deletion occurred in a G/C-rich region and identical sequence elements of UGGUGG were located at the deletion boundaries. In recent studies these features have been implicated in causing deletions via template switching during cDNA synthesis. To examine this possibility we compared cDNAs from human brain, heart, and jejunum synthesized at lower (42 °C) and higher (70 °C) temperatures. The 171 bp deletion was absent at the higher temperature. Our results suggest that the sequence and secondary structure of mRNA in the G/C-rich region lead to template switching, producing a cDNA product with a 171 bp deletion.
Affiliation(s)
- S Ro
- Department of Physiology and Cell Biology, University of Nevada School of Medicine, Reno, 89557, USA
|
44
|
Le Texier V, Riethoven JJ, Kumanduri V, Gopalakrishnan C, Lopez F, Gautheret D, Thanaraj TA. AltTrans: transcript pattern variants annotated for both alternative splicing and alternative polyadenylation. BMC Bioinformatics 2006; 7:169. [PMID: 16556303 PMCID: PMC1435940 DOI: 10.1186/1471-2105-7-169] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.4] [Received: 12/14/2005] [Accepted: 03/23/2006] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: The three major mechanisms that regulate transcript formation involve the selection of alternative sites for transcription start (TS), splicing, and polyadenylation. Currently there are efforts that collect data and annotation individually for each of these variants. It is important to take an integrated view of these data sets and to derive a data set of alternate transcripts along with consolidated annotation. We have previously developed computational pipelines that generate value-added data at genome scale on individual variant types; these include AltSplice on splicing and AltPAS on polyadenylation. We now extend these pipelines and integrate the resultant data sets to facilitate an integrated view of the contributions from splicing and polyadenylation in the formation of transcript variants.
DESCRIPTION: The AltSplice pipeline examines gene-transcript alignments and delineates alternative splice events and splice patterns; this pipeline is extended as AltTrans to delineate isoform transcript patterns, for each of which both the introns/exons and the 'terminating' polyA site are delineated; EST/mRNA sequences that qualify the transcript pattern confirm both the underlying splicing and polyadenylation. The AltPAS pipeline examines gene-transcript alignments and delineates all potential polyA sites irrespective of underlying splicing patterns. Resultant polyA sites from both AltTrans and AltPAS are merged. The generated database reports data on alternative splicing, alternative polyadenylation and the resultant alternate transcript patterns; the basal data are annotated for various biological features. The data (named integrated AltTrans data) generated for both human and mouse are made available through the Alternate Transcript Diversity web site at http://www.ebi.ac.uk/atd/.
CONCLUSION: The reported data set presents alternate transcript patterns that are annotated for both alternative splicing and alternative polyadenylation. Results based on current transcriptome data indicate that the contribution of alternative splicing is larger than that of alternative polyadenylation.
Affiliation(s)
- Vincent Le Texier
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Jean-Jack Riethoven
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- 18 Crispin Close, Haverhill, Suffolk, CB9 9PT, UK
- Vasudev Kumanduri
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Chellappa Gopalakrishnan
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Fabrice Lopez
- INSERM ERM206, Université de la Méditerranée, Luminy case 928 – 13 288 Marseille Cedex 09, France
- Daniel Gautheret
- INSERM ERM206, Université de la Méditerranée, Luminy case 928 – 13 288 Marseille Cedex 09, France
- Thangavel Alphonse Thanaraj
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- 4 Copperfields, Saffron Walden, Essex, CB11 4FG, UK
|
45
|
Stamm S, Riethoven JJ, Le Texier V, Gopalakrishnan C, Kumanduri V, Tang Y, Barbosa-Morais NL, Thanaraj TA. ASD: a bioinformatics resource on alternative splicing. Nucleic Acids Res 2006; 34:D46-55. [PMID: 16381912 PMCID: PMC1347394 DOI: 10.1093/nar/gkj031] [Citation(s) in RCA: 190] [Impact Index Per Article: 10.0] [Received: 08/23/2005] [Revised: 09/22/2005] [Accepted: 09/22/2005] [Indexed: 01/08/2023] Open
Abstract
Alternative splicing is an important regulatory mechanism of mammalian gene expression. The alternative splicing database (ASD) consortium is systematically collecting and annotating data on alternative splicing. We present the continuation and upgrade of the ASD [T. A. Thanaraj, S. Stamm, F. Clark, J. J. Riethoven, V. Le Texier, J. Muilu (2004) Nucleic Acids Res. 32, D64-D69] that consists of computationally and manually generated data. Its largest part is AltSplice, a value-added database of computationally delineated alternative splicing events. Its data include alternatively spliced introns/exons, events, isoform splicing patterns and isoform peptide sequences. AltSplice data are generated by examining gene-transcript alignments. The data are annotated for various biological features including splicing signals, expression states, single nucleotide polymorphism (SNP)-mediated splicing and cross-species conservation. AEdb forms the manually curated component of ASD. It is a literature-based data set containing sequence and properties of alternatively spliced exons, functional enumeration of observed splicing events, characterization of observed splicing regulatory elements, and a collection of experimentally clarified minigene constructs. ASD includes a workbench, which is an analysis tool that enables users to carry out splicing-related analysis such as characterization of introns for various splicing signals, identification of splicing regulatory elements on a given RNA sequence, prediction of putative exons and prediction of putative translation start codons. The different ASD modules are integrated and can be accessed through user-friendly interfaces and visualization tools. ASD data have been integrated with the Ensembl genome annotation project as a Distributed Annotation System (DAS) resource and can be viewed in the Ensembl genome browser. The ASD resource is available at http://www.ebi.ac.uk/asd.
Collapse
Affiliation(s)
- Stefan Stamm
- University of Erlangen, Institute for Biochemistry, Fahrstrasse 17, 91054 Erlangen, Germany
- Jean-Jack Riethoven
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- University of Erlangen, Institute for Biochemistry, Fahrstrasse 17, 91054 Erlangen, Germany
- Faculty of Medicine, Institute of Molecular Medicine, University of Lisbon, 1649-028 Lisbon, Portugal
- Vincent Le Texier
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- University of Erlangen, Institute for Biochemistry, Fahrstrasse 17, 91054 Erlangen, Germany
- Faculty of Medicine, Institute of Molecular Medicine, University of Lisbon, 1649-028 Lisbon, Portugal
- Chellappa Gopalakrishnan
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- University of Erlangen, Institute for Biochemistry, Fahrstrasse 17, 91054 Erlangen, Germany
- Faculty of Medicine, Institute of Molecular Medicine, University of Lisbon, 1649-028 Lisbon, Portugal
- Vasudev Kumanduri
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- University of Erlangen, Institute for Biochemistry, Fahrstrasse 17, 91054 Erlangen, Germany
- Faculty of Medicine, Institute of Molecular Medicine, University of Lisbon, 1649-028 Lisbon, Portugal
- Yesheng Tang
- University of Erlangen, Institute for Biochemistry, Fahrstrasse 17, 91054 Erlangen, Germany
- Nuno L. Barbosa-Morais
- Faculty of Medicine, Institute of Molecular Medicine, University of Lisbon, 1649-028 Lisbon, Portugal
|
46
|
Wilusz JE, Devanney SC, Caputi M. Chimeric peptide nucleic acid compounds modulate splicing of the bcl-x gene in vitro and in vivo. Nucleic Acids Res 2005; 33:6547-54. [PMID: 16299354 PMCID: PMC1289079 DOI: 10.1093/nar/gki960] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.8] [Indexed: 12/21/2022]
Abstract
Alternative splicing of the bcl-x gene generates two transcripts: the anti-apoptotic bcl-xL isoform and the pro-apoptotic bcl-xS isoform. The ratio between the two isoforms is a key factor in development and in cancer progression. Here, we show that a short antisense chimeric peptide nucleic acid (PNA) oligonucleotide conjugated to a polypeptide containing eight Ser-Arg repeats (SR)8 can modulate splicing of bcl-x both in vitro and in vivo and induces apoptosis in HeLa cells. The PNA-SR oligo was targeted to a region of bcl-x that does not contain splicing regulatory sequences and was able to override the complex network of splicing enhancers and silencers that regulates the ratio between the two bcl-x isoforms. Thus, PNA-SR oligos are powerful tools that can potentially modulate splice site choice in endogenous genes independent of the presence of other splicing regulatory mechanisms on the target gene.
Affiliation(s)
- Sean C. Devanney
- Biomedical Science Department, Florida Atlantic University, Boca Raton, FL 33431, USA
- Massimo Caputi
- Biomedical Science Department, Florida Atlantic University, Boca Raton, FL 33431, USA
- To whom correspondence should be addressed. Tel: +1 561 297 0627; Fax: +1 561 297 2221;
|
47
|
Pan YX. Diversity and Complexity of the Mu Opioid Receptor Gene: Alternative Pre-mRNA Splicing and Promoters. DNA Cell Biol 2005; 24:736-50. [PMID: 16274294 DOI: 10.1089/dna.2005.24.736] [Citation(s) in RCA: 100] [Impact Index Per Article: 5.0] [Indexed: 11/12/2022]
Abstract
Mu opioid receptors play an important role in mediating the actions of a class of opioids including morphine and heroin. Binding and pharmacological studies have proposed several mu opioid receptor subtypes: mu(1), mu(2), and morphine-6beta-glucuronide (M6G). The cloning of a mu opioid receptor, MOR-1, has provided an invaluable tool to explore pharmacological and physiological functions of mu opioid receptors at the molecular level. However, only one mu opioid receptor (Oprm) gene has been isolated. Alternative pre-mRNA splicing has been proposed as a molecular explanation for the existence of pharmacologically identified subtypes. In recent years, we have extensively investigated alternative splicing of the Oprm gene, particularly of the mouse Oprm gene. So far we have identified 25 splice variants from the mouse Oprm gene, which are controlled by two diverse promoters, eight splice variants from the rat Oprm gene, and 11 splice variants from the human Oprm gene. Diversity and complexity of the Oprm gene was further demonstrated by functional differences in agonist-induced G protein activation, adenylyl cyclase activity, and receptor internalization among carboxyl terminal variants. This review summarizes these recent results and provides a new perspective on understanding and exploring complex opioid actions in animals and humans.
Affiliation(s)
- Ying-Xian Pan
- Department of Neurology, Memorial Sloan-Kettering Cancer Center, New York, New York 10021, USA.
|
48
|
Fox-Walsh KL, Dou Y, Lam BJ, Hung SP, Baldi PF, Hertel KJ. The architecture of pre-mRNAs affects mechanisms of splice-site pairing. Proc Natl Acad Sci U S A 2005; 102:16176-81. [PMID: 16260721 PMCID: PMC1283478 DOI: 10.1073/pnas.0508489102] [Citation(s) in RCA: 185] [Impact Index Per Article: 9.3] [Indexed: 11/18/2022]
Abstract
The exon/intron architecture of genes determines whether components of the spliceosome recognize splice sites across the intron or across the exon. Using in vitro splicing assays, we demonstrate that splice-site recognition across introns ceases when intron size is between 200 and 250 nucleotides. Beyond this threshold, splice sites are recognized across the exon. Splice-site recognition across the intron is significantly more efficient than splice-site recognition across the exon, resulting in enhanced inclusion of exons with weak splice sites. Thus, intron size can profoundly influence the likelihood that an exon is constitutively or alternatively spliced. An EST-based alternative-splicing database was used to determine whether the exon/intron architecture influences the probability of alternative splicing in the Drosophila and human genomes. Drosophila exons flanked by long introns display an up to 90-fold-higher probability of being alternatively spliced compared with exons flanked by two short introns, demonstrating that the exon/intron architecture in Drosophila is a major determinant in governing the frequency of alternative splicing. Exon skipping is also more likely to occur when exons are flanked by long introns in the human genome. Interestingly, experimental and computational analyses show that the length of the upstream intron is more influential in inducing alternative splicing than is the length of the downstream intron. We conclude that the size and location of the flanking introns control the mechanism of splice-site recognition and influence the frequency and the type of alternative splicing that a pre-mRNA transcript undergoes.
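The size threshold described in this abstract can be illustrated with a minimal sketch. It assumes only what the abstract states: recognition across the intron stops somewhere between 200 and 250 nucleotides, and longer introns are handled across the exon. The function name `recognition_mode`, its parameters, and the "transition" label for the 200 to 250 nucleotide window are hypothetical choices made for illustration and do not come from the paper.

```python
def recognition_mode(intron_length_nt: int,
                     lower_nt: int = 200,
                     upper_nt: int = 250) -> str:
    """Classify how splice sites flanking an intron are likely recognized.

    The 200 and 250 nt bounds come from the abstract; treating the window
    between them as a separate "transition" class is an assumption.
    """
    if intron_length_nt <= 0:
        raise ValueError("intron length must be a positive number of nucleotides")
    if intron_length_nt < lower_nt:
        return "across intron"   # short intron: sites paired across the intron
    if intron_length_nt > upper_nt:
        return "across exon"     # long intron: sites paired across the exon
    return "transition"          # within the reported 200-250 nt window


if __name__ == "__main__":
    for length in (80, 225, 3000):
        print(length, recognition_mode(length))
```

The bounds are exposed as parameters because the abstract reports a range from in vitro assays, not a single cutoff.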
Affiliation(s)
- Kristi L Fox-Walsh
- Department of Microbiology and Molecular Genetics, University of California, Irvine, CA 92697-4025, USA
|
49
|
Bonizzoni P, Rizzi R, Pesole G. ASPIC: a novel method to predict the exon-intron structure of a gene that is optimally compatible to a set of transcript sequences. BMC Bioinformatics 2005; 6:244. [PMID: 16207377 PMCID: PMC1276783 DOI: 10.1186/1471-2105-6-244] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.4] [Received: 05/26/2005] [Accepted: 10/05/2005] [Indexed: 01/02/2023]
Abstract
Background: Currently available methods to predict splice sites are mainly based on the independent and progressive alignment of transcript data (mostly ESTs) to the genomic sequence. Apart from often being computationally expensive, this approach is vulnerable to several problems – hence the need to develop novel strategies. Results: We propose a method, based on a novel multiple genome-EST alignment algorithm, for the detection of splice sites. To avoid limitations of splice sites prediction (mainly, over-predictions) due to independent single EST alignments to the genomic sequence our approach performs a multiple alignment of transcript data to the genomic sequence based on the combined analysis of all available data. We recast the problem of predicting constitutive and alternative splicing as an optimization problem, where the optimal multiple transcript alignment minimizes the number of exons and hence of splice site observations. We have implemented a splice site predictor based on this algorithm in the software tool ASPIC (Alternative Splicing PredICtion). It is distinguished from other methods based on BLAST-like tools by the incorporation of entirely new ad hoc procedures for accurate and computationally efficient transcript alignment and adopts dynamic programming for the refinement of intron boundaries. ASPIC also provides the minimal set of non-mergeable transcript isoforms compatible with the detected splicing events. The ASPIC web resource is dynamically interconnected with the Ensembl and Unigene databases and also implements an upload facility. Conclusion: Extensive bench marking shows that ASPIC outperforms other existing methods in the detection of novel splicing isoforms and in the minimization of over-predictions. ASPIC also requires a lower computation time for processing a single gene and an EST cluster. The ASPIC web resource is available at .
Affiliation(s)
- Paola Bonizzoni
- DISCo, University of Milan Bicocca, via Bicocca degli Arcimboldi, 8, Milan, 20135, Italy
- Raffaella Rizzi
- DISCo, University of Milan Bicocca, via Bicocca degli Arcimboldi, 8, Milan, 20135, Italy
- Graziano Pesole
- Dipartimento di Scienze Biomolecolari e Biotecnologie, University of Milan, via Celoria, 26, Milan, 20133, Italy
|
50
|
Magen A, Ast G. The importance of being divisible by three in alternative splicing. Nucleic Acids Res 2005; 33:5574-82. [PMID: 16192573 PMCID: PMC1236976 DOI: 10.1093/nar/gki858] [Citation(s) in RCA: 50] [Impact Index Per Article: 2.5] [Received: 05/22/2005] [Revised: 08/10/2005] [Accepted: 09/07/2005] [Indexed: 11/13/2022]
Abstract
Alternative splicing events that are conserved in orthologous genes in different species are commonly viewed as reliable evidence of authentic, functionally significant alternative splicing events. Several recent bioinformatic analyses have shown that conserved alternative exons possess several features that distinguish them from alternative exons that are species-specific. One of the most striking differences between conserved and species-specific alternative exons is the high percentage of exons that preserve the reading frame (exons whose length is an exact multiple of 3, termed symmetrical exons) among the conserved alternative exons. Here, we examined conserved alternative exons and found several features that differentiate between symmetrical and non-symmetrical alternative exons. We show that symmetrical alternative exons have a strong tendency not to disrupt protein domain structures, whereas the tendency of non-symmetrical alternative exons to overlap with different fractions of protein domains is similar to that of constitutive exons. Additionally, skipping isoforms of non-symmetrical alternative exons are strongly underrepresented, compared with their including isoforms, suggesting that skipping of a large fraction of non-symmetrical alternative exons produces transcripts that are degraded by the nonsense-mediated mRNA decay mechanism. Non-symmetrical alternative exons also show a tendency to reside in the 5' half of the CDS. These findings suggest that alternative splicing of symmetrical and non-symmetrical exons is governed by different selective pressures and serves different purposes.
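The definition of a symmetrical exon used in this abstract (a length that is an exact multiple of 3) can be checked with simple arithmetic. The sketch below is only an illustration of that definition; the function names `is_symmetrical` and `skipping_effect` are hypothetical, and the example lengths are made up.

```python
def is_symmetrical(exon_length_nt: int) -> bool:
    """Return True when the exon length is an exact multiple of 3."""
    if exon_length_nt <= 0:
        raise ValueError("exon length must be a positive number of nucleotides")
    return exon_length_nt % 3 == 0


def skipping_effect(exon_length_nt: int) -> str:
    """Describe what skipping the exon does to the downstream reading frame."""
    if is_symmetrical(exon_length_nt):
        return "frame-preserving"  # whole codons removed, frame unchanged
    return "frame-shifting"        # 1 or 2 nt left over, frame shifts


if __name__ == "__main__":
    for length in (96, 100, 101):  # made-up exon lengths in nucleotides
        print(length, skipping_effect(length))
```

A frame shift usually introduces a premature stop codon downstream, which fits the abstract's suggestion that skipping isoforms of non-symmetrical exons are often removed by nonsense-mediated mRNA decay.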
Affiliation(s)
- Alon Magen
- Department of Human Genetics and Molecular Medicine, Sackler Faculty of Medicine, Tel Aviv University, Ramat Aviv 69978, Israel
- Gil Ast
- Department of Human Genetics and Molecular Medicine, Sackler Faculty of Medicine, Tel Aviv University, Ramat Aviv 69978, Israel
|