Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Louie B, Mork P, Martin-Sanchez F, Halevy A, Tarczy-Hornoch P. Data integration and genomic medicine. J Biomed Inform. 2007;40:5-16. [PMID: 16574494 DOI: 10.1016/j.jbi.2006.02.007] [Citation(s) in RCA: 81] [Impact Index Per Article: 4.3] [Received: 12/08/2005] [Accepted: 02/05/2006] [Indexed: 10/25/2022]

Cited by Other Article(s)

Akki AJ, Patil SA, Hungund S, Sahana R, Patil MM, Kulkarni RV, Raghava Reddy K, Zameer F, Raghu AV. Advances in Parkinson's disease research - A computational network pharmacological approach. Int Immunopharmacol 2024;139:112758. [PMID: 39067399 DOI: 10.1016/j.intimp.2024.112758] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/17/2024] [Revised: 07/22/2024] [Accepted: 07/22/2024] [Indexed: 07/30/2024]

Cheng C, Messerschmidt L, Bravo I, Waldbauer M, Bhavikatti R, Schenk C, Grujic V, Model T, Kubinec R, Barceló J. A General Primer for Data Harmonization. Sci Data 2024;11:152. [PMID: 38297013 PMCID: PMC10831085 DOI: 10.1038/s41597-024-02956-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2023] [Accepted: 01/11/2024] [Indexed: 02/02/2024] Open

Kalpana S, Lin WY, Wang YC, Fu Y, Wang HY. Alternate Antimicrobial Therapies and Their Companion Tests. Diagnostics (Basel) 2023;13:2490. [PMID: 37568853 PMCID: PMC10417861 DOI: 10.3390/diagnostics13152490] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2023] [Accepted: 07/14/2023] [Indexed: 08/13/2023] Open

Tran L, He K, Wang D, Jiang H. A cross-validation statistical framework for asymmetric data integration. Biometrics 2023;79:1280-1292. [PMID: 35524490 PMCID: PMC9637892 DOI: 10.1111/biom.13685] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/09/2021] [Accepted: 04/19/2022] [Indexed: 11/26/2022]

Asif M, Martiniano HFMC, Lamurias A, Kausar S, Couto FM. DGH-GO: dissecting the genetic heterogeneity of complex diseases using gene ontology. BMC Bioinformatics 2023;24:171. [PMID: 37101154 PMCID: PMC10134522 DOI: 10.1186/s12859-023-05290-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/03/2023] [Accepted: 04/14/2023] [Indexed: 04/28/2023] Open

Abstract

BACKGROUND

Complex diseases such as neurodevelopmental disorders (NDDs) exhibit multiple etiologies. The multi-etiological nature of complex diseases emerges from distinct but functionally similar groups of genes. Different diseases sharing genes of such groups show related clinical outcomes, which further restricts our understanding of disease mechanisms and thus limits the application of personalized medicine approaches to complex genetic disorders.

RESULTS

Here, we present an interactive and user-friendly application called DGH-GO. DGH-GO allows biologists to dissect the genetic heterogeneity of complex diseases by stratifying the putative disease-causing genes into clusters that may contribute to the development of distinct disease outcomes. It can also be used to study the shared etiology of complex diseases. DGH-GO creates a semantic similarity matrix for the input genes by using Gene Ontology (GO). The resultant matrix can be visualized in 2D plots using different dimension reduction methods (t-SNE, principal component analysis, UMAP and principal coordinate analysis). In the next step, clusters of functionally similar genes are identified from the genes' functional similarities assessed through GO. This is achieved by employing four different clustering methods (K-means, hierarchical, fuzzy and PAM). The user may change the clustering parameters and explore their effect on stratification immediately. DGH-GO was applied to genes disrupted by rare genetic variants in Autism Spectrum Disorder (ASD) patients. The analysis confirmed the multi-etiological nature of ASD by identifying four clusters of genes that were enriched for distinct biological mechanisms and clinical outcomes. In the second case study, the analysis of genes shared by different NDDs showed that genes causing multiple disorders tend to aggregate in similar clusters, indicating a possible shared etiology.

CONCLUSION

DGH-GO is a user-friendly application that allows biologists to study the multi-etiological nature of complex diseases by dissecting their genetic heterogeneity. In summary, functional similarities, dimension reduction and clustering methods, coupled with interactive visualization and control over the analysis, allow biologists to explore and analyze their datasets without requiring expert knowledge of these methods. The source code of the proposed application is available at https://github.com/Muh-Asif/DGH-GO.
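
The clustering step described in this abstract can be illustrated with a minimal sketch. It is not DGH-GO's code: the gene names and similarity values are hypothetical, similarity is converted to a distance as 1 - similarity (an assumption), and an exhaustive search over medoid sets stands in for the PAM swap heuristic, which is only practical here because the input is tiny.

```python
from itertools import combinations

# Hypothetical toy data: pairwise functional similarity in [0, 1] for five
# made-up genes. A real analysis would derive these values from GO annotations.
GENES = ["g1", "g2", "g3", "g4", "g5"]
SIMILARITY = {
    ("g1", "g2"): 0.9, ("g1", "g3"): 0.8, ("g2", "g3"): 0.85,
    ("g4", "g5"): 0.9,
    ("g1", "g4"): 0.1, ("g1", "g5"): 0.2, ("g2", "g4"): 0.15,
    ("g2", "g5"): 0.1, ("g3", "g4"): 0.2, ("g3", "g5"): 0.1,
}


def distance(a, b):
    """Turn similarity into a distance; a gene is at distance 0 from itself."""
    if a == b:
        return 0.0
    sim = SIMILARITY.get((a, b), SIMILARITY.get((b, a)))
    return 1.0 - sim


def total_cost(medoids):
    """Sum of every gene's distance to its nearest medoid."""
    return sum(min(distance(g, m) for m in medoids) for g in GENES)


def k_medoids(k):
    """Exhaustive k-medoids: try every set of k medoids, keep the cheapest.

    Assumption: this brute-force search replaces the PAM swap heuristic and
    is only feasible for very small inputs.
    """
    best = min(combinations(GENES, k), key=total_cost)
    clusters = {m: [] for m in best}
    for g in GENES:
        nearest = min(best, key=lambda m: distance(g, m))
        clusters[nearest].append(g)
    return clusters


clusters = k_medoids(2)
for medoid, members in clusters.items():
    print(medoid, members)
```

With these toy values the three mutually similar genes fall into one cluster and the remaining two into another. Changing k corresponds to the kind of parameter exploration the abstract mentions.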

Husereau D, Steuten L, Muthu V, Thomas DM, Spinner DS, Ivany C, Mengel M, Sheffield B, Yip S, Jacobs P, Sullivan T. Effective and Efficient Delivery of Genome-Based Testing-What Conditions Are Necessary for Health System Readiness? Healthcare (Basel) 2022;10:2086. [PMID: 36292532 PMCID: PMC9602865 DOI: 10.3390/healthcare10102086] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/22/2022] [Revised: 10/09/2022] [Accepted: 10/12/2022] [Indexed: 01/09/2023] Open

Płuciennik A, Płaczek A, Wilk A, Student S, Oczko-Wojciechowska M, Fujarewicz K. Data Integration–Possibilities of Molecular and Clinical Data Fusion on the Example of Thyroid Cancer Diagnostics. Int J Mol Sci 2022;23:11880. [PMID: 36233181 PMCID: PMC9569592 DOI: 10.3390/ijms231911880] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/03/2022] [Revised: 09/24/2022] [Accepted: 09/28/2022] [Indexed: 11/23/2022] Open

Dall'Alba G, Casa PL, Abreu FPD, Notari DL, de Avila E Silva S. A Survey of Biological Data in a Big Data Perspective. BIG DATA 2022;10:279-297. [PMID: 35394342 DOI: 10.1089/big.2020.0383] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 06/14/2023]

Youn J, Rai N, Tagkopoulos I. Knowledge integration and decision support for accelerated discovery of antibiotic resistance genes. Nat Commun 2022;13:2360. [PMID: 35487919 PMCID: PMC9055065 DOI: 10.1038/s41467-022-29993-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/14/2021] [Accepted: 03/04/2022] [Indexed: 11/09/2022] Open

Das S, Mukhopadhyay I. TiMEG: an integrative statistical method for partially missing multi-omics data. Sci Rep 2021;11:24077. [PMID: 34911979 PMCID: PMC8674330 DOI: 10.1038/s41598-021-03034-z] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 08/01/2021] [Accepted: 11/24/2021] [Indexed: 11/25/2022] Open

Kovanda A, Zimani AN, Peterlin B. How to design a national genomic project-a systematic review of active projects. Hum Genomics 2021;15:20. [PMID: 33761998 PMCID: PMC7988644 DOI: 10.1186/s40246-021-00315-6] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 12/15/2020] [Accepted: 02/23/2021] [Indexed: 01/18/2023] Open

Irshad O, Ghani Khan MU. Formalization and Semantic Integration of Heterogeneous Omics Annotations for Exploratory Searches. Curr Bioinform 2021. [DOI: 10.2174/1574893615666200127122818] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022]

Abstract

Aim

To help researchers and practitioners uncover the functional aspects of the human cellular system through exploratory searches over semantically integrated, heterogeneous and geographically dispersed omics annotations.

Background

Improving health standards is one of the motives that continuously drives researchers and practitioners to uncover the unknown aspects of the human cellular system. Inferring new knowledge from known facts requires a reasonably large amount of data in a well-structured, integrated and unified form. With the advent of high-throughput and sensor technologies in particular, biological data are growing in heterogeneity and geographic dispersion at an astronomical rate. Several data integration systems have been deployed to cope with data heterogeneity and global dispersion. Systems based on semantic data integration models are more flexible and expandable than syntax-based ones but still lack aspect-based data integration, persistence and querying. Furthermore, these systems do not fully support warehousing biological entities in the form of semantic associations as naturally possessed by the human cell.

Objective

To develop an aspect-oriented formal data integration model for semantically integrating heterogeneous and geographically dispersed omics annotations and for providing exploratory querying on the integrated data.

Method

We propose an aspect-oriented formal data integration model that uses web semantics standards to formally specify each of its constructs. The proposed model supports aspect-oriented representation of biological entities while addressing the issues of data heterogeneity and global dispersion. It associates and warehouses biological entities in the way they relate with [...]

Result

To show the significance of the proposed model, we developed a data warehouse and information retrieval system based on a multi-layered and multi-modular software architecture compliant with the proposed model. Results show that our model supports gathering, associating, integrating, persisting and querying each entity with respect to all of its possible aspects within or across the various associated omics layers.

Conclusion

Formal specifications help address data integration issues by providing formal means for understanding omics data based on meaning instead of syntax.

Samra H, Li A, Soh B. GENE2D: A NoSQL Integrated Data Repository of Genetic Disorders Data. Healthcare (Basel) 2020;8:257. [PMID: 32781728 PMCID: PMC7551627 DOI: 10.3390/healthcare8030257] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 06/04/2020] [Revised: 07/27/2020] [Accepted: 08/04/2020] [Indexed: 11/16/2022] Open

Irshad O, Khan MUG. Integration and Querying of Heterogeneous Omics Semantic Annotations for Biomedical and Biomolecular Knowledge Discovery. Curr Bioinform 2020. [DOI: 10.2174/1574893614666190409112025] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 12/16/2022]

Mihaylov I, Kańduła M, Krachunov M, Vassilev D. A novel framework for horizontal and vertical data integration in cancer studies with application to survival time prediction models. Biol Direct 2019;14:22. [PMID: 31752974 PMCID: PMC6868770 DOI: 10.1186/s13062-019-0249-6] [Citation(s) in RCA: 31] [Impact Index Per Article: 5.2] [Received: 10/23/2018] [Accepted: 09/20/2019] [Indexed: 12/17/2022] Open

Abstract

Background

Recently, high-throughput technologies have been used extensively alongside clinical tests to study various types of cancer. Data generated in such large-scale studies are heterogeneous, of different types and formats. Given the lack of effective integration strategies, novel models are necessary for efficient and operative data integration, where both clinical and molecular information can be effectively joined for storage, access and ease of use. Such models, combined with machine learning methods for accurate prediction of survival time in cancer studies, can yield novel insights into disease development and lead to precise personalized therapies.

Results

We developed an approach for intelligent data integration of two cancer datasets (breast cancer and neuroblastoma), provided in the CAMDA 2018 ‘Cancer Data Integration Challenge’, and compared models for prediction of survival time. We developed a novel semantic network-based data integration framework that utilizes NoSQL databases, where we combined clinical and expression profile data, using both raw data records and external knowledge sources. Utilizing the integrated data, we introduced the Tumor Integrated Clinical Feature (TICF), a new feature for accurate prediction of patient survival time. Finally, we applied and validated several machine learning models for survival time prediction.

Conclusion

We developed a framework for semantic integration of clinical and omics data that can borrow information across multiple cancer studies. By linking data with external domain knowledge sources our approach facilitates enrichment of the studied data by discovery of internal relations. The proposed and validated machine learning models for survival time prediction yielded accurate results.

Reviewers

This article was reviewed by Eran Elhaik, Wenzhong Xiao and Carlos Loucera.

Emam I, Elyasigomari V, Matthews A, Pavlidis S, Rocca-Serra P, Guitton F, Verbeeck D, Grainger L, Borgogni E, Del Giudice G, Saqi M, Houston P, Guo Y. PlatformTM, a standards-based data custodianship platform for translational medicine research. Sci Data 2019;6:149. [PMID: 31409798 PMCID: PMC6692384 DOI: 10.1038/s41597-019-0156-9] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 10/23/2018] [Accepted: 07/25/2019] [Indexed: 12/20/2022] Open

Ethier J, McGilchrist M, Barton A, Cloutier A, Curcin V, Delaney BC, Burgun A. The TRANSFoRm project: Experience and lessons learned regarding functional and interoperability requirements to support primary care. Learn Health Syst 2018;2:e10037. [PMID: 31245579 PMCID: PMC6508823 DOI: 10.1002/lrh2.10037] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 11/08/2016] [Revised: 07/05/2017] [Accepted: 07/12/2017] [Indexed: 01/02/2023] Open

Chen X, Gururaj AE, Ozyurt B, Liu R, Soysal E, Cohen T, Tiryaki F, Li Y, Zong N, Jiang M, Rogith D, Salimi M, Kim HE, Rocca-Serra P, Gonzalez-Beltran A, Farcas C, Johnson T, Margolis R, Alter G, Sansone SA, Fore IM, Ohno-Machado L, Grethe JS, Xu H. DataMed - an open source discovery index for finding biomedical datasets. J Am Med Inform Assoc 2018;25:300-308. [PMID: 29346583 PMCID: PMC7378878 DOI: 10.1093/jamia/ocx121] [Citation(s) in RCA: 37] [Impact Index Per Article: 5.3] [Received: 05/30/2017] [Revised: 09/20/2017] [Accepted: 09/28/2017] [Indexed: 12/17/2022] Open

Affiliation(s)

Xiaoling Chen: School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA
Anupama E Gururaj: School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA
Burak Ozyurt: Center for Research in Biological Systems
Ruiling Liu: School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA
Ergin Soysal: School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA
Trevor Cohen: School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA
Firat Tiryaki: School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA
Yueling Li: Center for Research in Biological Systems
Nansu Zong: Department of Biomedical Informatics, University of California San Diego, La Jolla, CA, USA
Min Jiang: School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA
Deevakar Rogith: School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA
Mandana Salimi: School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA
Hyeon-Eui Kim: Department of Biomedical Informatics, University of California San Diego, La Jolla, CA, USA
Philippe Rocca-Serra: e-Research Centre, University of Oxford, Oxford, UK
Alejandra Gonzalez-Beltran: e-Research Centre, University of Oxford, Oxford, UK
Claudiu Farcas: Department of Biomedical Informatics, University of California San Diego, La Jolla, CA, USA
Todd Johnson: School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA
Ron Margolis: National Institutes of Health, Bethesda, MD, USA
George Alter: University of Michigan, Ann Arbor, MI, USA
Susanna-Assunta Sansone: e-Research Centre, University of Oxford, Oxford, UK
Ian M Fore: National Institutes of Health, Bethesda, MD, USA
Lucila Ohno-Machado: Department of Biomedical Informatics, University of California San Diego, La Jolla, CA, USA
Jeffrey S Grethe: Center for Research in Biological Systems
Hua Xu: School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA

Al Kawam A, Sen A, Datta A, Dickey N. Understanding the Bioinformatics Challenges of Integrating Genomics into Healthcare. IEEE J Biomed Health Inform 2017;22:1672-1683. [PMID: 29990071 DOI: 10.1109/jbhi.2017.2778263] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 11/09/2022]

Conceptual Modeling for Genomics: Building an Integrated Repository of Open Data. CONCEPTUAL MODELING 2017. [DOI: 10.1007/978-3-319-69904-2_26] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.1] [Indexed: 01/03/2023]

Savonnet M, Leclercq E, Naubourg P. eClims: An Extensible and Dynamic Integration Framework for Biomedical Information Systems. IEEE J Biomed Health Inform 2016;20:1640-1649. [DOI: 10.1109/jbhi.2015.2464353] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 11/07/2022]

Warner JL, Jain SK, Levy MA. Integrating cancer genomic data into electronic health records. Genome Med 2016;8:113. [PMID: 27784327 PMCID: PMC5081968 DOI: 10.1186/s13073-016-0371-3] [Citation(s) in RCA: 47] [Impact Index Per Article: 5.2] [Indexed: 02/07/2023] Open

Liu Y, Chiaromonte F, Li B. Structured Ordinary Least Squares: A Sufficient Dimension Reduction approach for regressions with partitioned predictors and heterogeneous units. Biometrics 2016;73:529-539. [PMID: 27649087 DOI: 10.1111/biom.12579] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 09/01/2015] [Revised: 07/01/2016] [Accepted: 07/01/2016] [Indexed: 11/29/2022]

Myneni S, Patel VL, Bova GS, Wang J, Ackerman CF, Berlinicke CA, Chen SH, Lindvall M, Zack DJ. Resolving complex research data management issues in biomedical laboratories: Qualitative study of an industry-academia collaboration. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2016;126:160-70. [PMID: 26652980 PMCID: PMC4778387 DOI: 10.1016/j.cmpb.2015.11.001] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 06/26/2015] [Revised: 10/21/2015] [Accepted: 11/03/2015] [Indexed: 06/05/2023]

Masseroli M, Canakoglu A, Ceri S. Integration and Querying of Genomic and Proteomic Semantic Annotations for Biomedical Knowledge Extraction. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2016;13:209-219. [PMID: 27045824 DOI: 10.1109/tcbb.2015.2453944] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Indexed: 06/05/2023]

Herr TM, Bielinski SJ, Bottinger E, Brautbar A, Brilliant M, Chute CG, Cobb BL, Denny JC, Hakonarson H, Hartzler AL, Hripcsak G, Kannry J, Kohane IS, Kullo IJ, Lin S, Manzi S, Marsolo K, Overby CL, Pathak J, Peissig P, Pulley J, Ralston J, Rasmussen L, Roden DM, Tromp G, Uphoff T, Weng C, Wolf W, Williams MS, Starren J. Practical considerations in genomic decision support: The eMERGE experience. J Pathol Inform 2015;6:50. [PMID: 26605115 PMCID: PMC4629307 DOI: 10.4103/2153-3539.165999] [Citation(s) in RCA: 42] [Impact Index Per Article: 4.2] [Received: 03/31/2015] [Accepted: 07/23/2015] [Indexed: 11/04/2022] Open

Abstract

BACKGROUND

Genomic medicine has the potential to improve care by tailoring treatments to the individual. There is consensus in the literature that pharmacogenomics (PGx) may be an ideal starting point for real-world implementation, due to the presence of well-characterized drug-gene interactions. Clinical Decision Support (CDS) is an ideal avenue by which to implement PGx at the bedside. Previous literature has established theoretical models for PGx CDS implementation and discussed a number of anticipated real-world challenges. However, work detailing actual PGx CDS implementation experiences has been limited. Anticipated challenges include data storage and management, system integration, physician acceptance, and more.

METHODS

In this study, we analyzed the experiences of ten members of the Electronic Medical Records and Genomics (eMERGE) Network, and one affiliate, in their attempts to implement PGx CDS. We examined the resulting PGx CDS system characteristics and conducted a survey to understand the unanticipated implementation challenges sites encountered.

RESULTS

Ten sites have successfully implemented at least one PGx CDS rule in the clinical setting. The majority of sites elected to create an Omic Ancillary System (OAS) to manage genetic and genomic data. All sites were able to adapt their existing CDS tools for PGx knowledge. The most common and impactful delays were not PGx-specific issues. Instead, they were general IT implementation problems, with top challenges including team coordination/communication and staffing. The challenges encountered caused a median total delay in system go-live of approximately two months.

CONCLUSIONS

These results suggest that barriers to PGx CDS implementations are generally surmountable. Moreover, PGx CDS implementation may not be any more difficult than other healthcare IT projects of similar scope, as the most significant delays encountered were not unique to genomic medicine. These are encouraging results for any institution considering implementing a PGx CDS tool, and for the advancement of genomic medicine.

Affiliation(s)

Timothy M Herr: Department of Preventive Medicine, Division of Health and Biomedical Informatics, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
Suzette J Bielinski: Department of Health Sciences Research, Mayo Clinic, Rochester, MN, USA
Erwin Bottinger: The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine, Mount Sinai, New York, USA
Ariel Brautbar: Division of Genetics and Endocrinology, Cook Children's Medical Center, Fort Worth, Texas, USA
Murray Brilliant: Center for Human Genetics, Marshfield Clinic Research Foundation, Marshfield, Wisconsin, USA
Christopher G Chute: Division of General Internal Medicine, Johns Hopkins University, Baltimore, Maryland, USA
Beth L Cobb: Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio, USA
Joshua C Denny: Department of Biomedical Informatics, Vanderbilt University, Baltimore, MD, USA
Hakon Hakonarson: Department of Pediatrics, The Children's Hospital of Philadelphia, The Perelman School of Medicine, University of Pennsylvania, Philadelphia, USA
Andrea L Hartzler: Group Health Research Institute, Seattle, Washington, USA
George Hripcsak: Department of Biomedical Informatics, Columbia University Medical Center, New York, USA
Joseph Kannry: Icahn School of Medicine, Mount Sinai, New York, USA
Isaac S Kohane: Center for Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA
Iftikhar J Kullo: Division of Cardiovascular Diseases, Mayo Clinic, Rochester, MN, USA
Simon Lin: Nationwide Children's Hospital, Columbus, Ohio, USA
Shannon Manzi: Department of Pharmacy, Division of Genetics and Genomics, Boston Children's Hospital, Boston, Massachusetts, USA
Keith Marsolo: Department of Pediatrics, University of Cincinnati College of Medicine, Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio, USA
Casey Lynnette Overby: University of Maryland School of Medicine, Baltimore, Maryland, USA
Jyotishman Pathak: Department of Health Sciences Research, Mayo Clinic, Rochester, MN, USA
Peggy Peissig: Biomedical Informatics Research Center, Marshfield Clinic Research Foundation, Marshfield, Wisconsin, USA
Jill Pulley: Vanderbilt University School of Medicine, Nashville, Tennessee, USA
James Ralston: Group Health Research Institute, Seattle, Washington, USA
Luke Rasmussen: Department of Preventive Medicine, Division of Health and Biomedical Informatics, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
Dan M Roden: Vanderbilt University School of Medicine, Nashville, Tennessee, USA
Gerard Tromp: Weis Center for Research, Geisinger Clinic, Danville, Pennsylvania, USA
Timothy Uphoff: Molecular Pathology, Marshfield Labs, Marshfield, Wisconsin, USA
Chunhua Weng: Department of Biomedical Informatics, Columbia University, New York, USA
Wendy Wolf: Department of Pediatrics, Harvard Medical School, Division of Genetics and Genomics, Boston Children's Hospital, Boston, Massachusetts, USA
Marc S Williams: Genomic Medicine Institute, Geisinger Health System, Danville, Pennsylvania, USA
Justin Starren: Department of Preventive Medicine, Division of Health and Biomedical Informatics, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA

Orechia J, Pathak A, Shi Y, Nawani A, Belozerov A, Fontes C, Lakhiani C, Jawale C, Patel C, Quinn D, Botvinnik D, Mei E, Cotter E, Byleckie J, Ullman-Cullere M, Chhetri P, Chalasani P, Karnam P, Beaudoin R, Sahu S, Belozerova Y, Mathew JP. OncDRS: An integrative clinical and genomic data platform for enabling translational research and precision medicine. Appl Transl Genom 2015;6:18-25. [PMID: 27054074 PMCID: PMC4803771 DOI: 10.1016/j.atg.2015.08.005] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Received: 07/29/2015] [Accepted: 08/05/2015] [Indexed: 02/01/2023]

Affiliation(s)

John Orechia: Dana-Farber Cancer Institute, 450 Brookline Ave., Boston, MA-02215, United States
Ameet Pathak: Dana-Farber Cancer Institute, 450 Brookline Ave., Boston, MA-02215, United States
Yunling Shi: Dana-Farber Cancer Institute, 450 Brookline Ave., Boston, MA-02215, United States
Aniket Nawani: Dana-Farber Cancer Institute, 450 Brookline Ave., Boston, MA-02215, United States
Andrey Belozerov: Dana-Farber Cancer Institute, 450 Brookline Ave., Boston, MA-02215, United States
Caitlin Fontes: Dana-Farber Cancer Institute, 450 Brookline Ave., Boston, MA-02215, United States
Camille Lakhiani: Dana-Farber Cancer Institute, 450 Brookline Ave., Boston, MA-02215, United States
Chetan Jawale: Dana-Farber Cancer Institute, 450 Brookline Ave., Boston, MA-02215, United States
Chetansharan Patel: Dana-Farber Cancer Institute, 450 Brookline Ave., Boston, MA-02215, United States
Daniel Quinn: Dana-Farber Cancer Institute, 450 Brookline Ave., Boston, MA-02215, United States
Dmitry Botvinnik: Dana-Farber Cancer Institute, 450 Brookline Ave., Boston, MA-02215, United States
Eddie Mei: Dana-Farber Cancer Institute, 450 Brookline Ave., Boston, MA-02215, United States
Elizabeth Cotter: Dana-Farber Cancer Institute, 450 Brookline Ave., Boston, MA-02215, United States
James Byleckie: Dana-Farber Cancer Institute, 450 Brookline Ave., Boston, MA-02215, United States
Mollie Ullman-Cullere: Dana-Farber Cancer Institute, 450 Brookline Ave., Boston, MA-02215, United States
Padam Chhetri: Dana-Farber Cancer Institute, 450 Brookline Ave., Boston, MA-02215, United States
Poornima Chalasani: Dana-Farber Cancer Institute, 450 Brookline Ave., Boston, MA-02215, United States
Purushotham Karnam: Dana-Farber Cancer Institute, 450 Brookline Ave., Boston, MA-02215, United States
Ronald Beaudoin: Dana-Farber Cancer Institute, 450 Brookline Ave., Boston, MA-02215, United States
Sandeep Sahu: Dana-Farber Cancer Institute, 450 Brookline Ave., Boston, MA-02215, United States
Yelena Belozerova: Dana-Farber Cancer Institute, 450 Brookline Ave., Boston, MA-02215, United States
Jomol P Mathew: Dana-Farber Cancer Institute, 450 Brookline Ave., Boston, MA-02215, United States

Roman S, Panduro A. Genomic medicine in gastroenterology: A new approach or a new specialty? World J Gastroenterol 2015;21:8227-8237. [PMID: 26217074 PMCID: PMC4507092 DOI: 10.3748/wjg.v21.i27.8227] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.6] [Received: 01/27/2015] [Revised: 03/24/2015] [Accepted: 05/04/2015] [Indexed: 02/06/2023] Open

Livingston KM, Bada M, Baumgartner WA, Hunter LE. KaBOB: ontology-based semantic integration of biomedical databases. BMC Bioinformatics 2015;16:126. [PMID: 25903923 PMCID: PMC4448321 DOI: 10.1186/s12859-015-0559-3] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.4] [Received: 10/23/2014] [Accepted: 03/30/2015] [Indexed: 04/04/2023] Open

Abstract

Background

The ability to query many independent biological databases using a common ontology-based semantic model would facilitate deeper integration and more effective utilization of these diverse and rapidly growing resources. Despite ongoing work moving toward shared data formats and linked identifiers, significant problems persist in semantic data integration in order to establish shared identity and shared meaning across heterogeneous biomedical data sources.

Results

We present five processes for semantic data integration that, when applied collectively, solve seven key problems. These processes include making explicit the differences between biomedical concepts and database records, aggregating sets of identifiers denoting the same biomedical concepts across data sources, and using declaratively represented forward-chaining rules to take information that is variably represented in source databases and integrating it into a consistent biomedical representation. We demonstrate these processes and solutions by presenting KaBOB (the Knowledge Base Of Biomedicine), a knowledge base of semantically integrated data from 18 prominent biomedical databases using common representations grounded in Open Biomedical Ontologies. An instance of KaBOB with data about humans and seven major model organisms can be built using on the order of 500 million RDF triples. All source code for building KaBOB is available under an open-source license.

Conclusions

KaBOB is an integrated knowledge base of biomedical data representationally based in prominent, actively maintained Open Biomedical Ontologies, thus enabling queries of the underlying data in terms of biomedical concepts (e.g., genes and gene products, interactions and processes) rather than features of source-specific data schemas or file formats. KaBOB resolves many of the issues that routinely plague biomedical researchers intending to work with data from multiple data sources and provides a platform for ongoing data integration and development and for formal reasoning over a wealth of integrated biomedical data.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-015-0559-3) contains supplementary material, which is available to authorized users.
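
As a rough illustration of the identifier-aggregation process named in the Results above (grouping identifiers that denote the same biomedical concept across sources), the sketch below merges pairwise equivalence assertions with a union-find structure. This is not KaBOB's implementation, which the abstract describes as rule-based over RDF; the function name `aggregate_identifiers` and the sample identifiers are hypothetical.

```python
from collections import defaultdict


def aggregate_identifiers(equivalences):
    """Group identifiers linked by pairwise 'denotes the same concept' assertions.

    `equivalences` is an iterable of (id_a, id_b) pairs. Returns a sorted list
    of sorted identifier groups, one group per inferred concept.
    """
    parent = {}

    def find(x):
        # Register unseen identifiers as their own root, then walk to the root
        # while shortening the path (path halving).
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in equivalences:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two identifier sets

    groups = defaultdict(set)
    for ident in list(parent):
        groups[find(ident)].add(ident)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical cross-references between three made-up sources.
pairs = [("srcA:1", "srcB:x"), ("srcB:x", "srcC:7"), ("srcA:2", "srcC:9")]
concept_sets = aggregate_identifiers(pairs)
print(concept_sets)
```

Transitive links matter here: `srcA:1` and `srcC:7` are never paired directly, yet they end up in the same set because both are linked to `srcB:x`.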

Ashish N, Toga AW. Medical data transformation using rewriting. Front Neuroinform 2015;9:1. [PMID: 25750622 PMCID: PMC4335467 DOI: 10.3389/fninf.2015.00001] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 08/26/2014] [Accepted: 01/29/2015] [Indexed: 11/13/2022] Open

Machado CM, Rebholz-Schuhmann D, Freitas AT, Couto FM. The semantic web in translational medicine: current applications and future directions. Brief Bioinform 2015;16:89-103. [PMID: 24197933 PMCID: PMC4293377 DOI: 10.1093/bib/bbt079] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Received: 07/22/2013] [Accepted: 10/08/2013] [Indexed: 11/14/2022] Open

Wade TD, Zelarney PT, Hum RC, McGee S, Batson DH. Using patient lists to add value to integrated data repositories. J Biomed Inform 2014;52:72-7. [PMID: 24534444 DOI: 10.1016/j.jbi.2014.02.010] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 06/17/2013] [Revised: 12/20/2013] [Accepted: 02/04/2014] [Indexed: 01/16/2023]

Alawieh A, Sabra Z, Nokkari A, El-Assaad A, Mondello S, Zaraket F, Fadlallah B, Kobeissy FH. Bioinformatics approach to understanding interacting pathways in neuropsychiatric disorders. Methods Mol Biol 2014;1168:157-172. [PMID: 24870135 DOI: 10.1007/978-1-4939-0847-9_9] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 06/03/2023]

Keator DB, Helmer K, Steffener J, Turner JA, Van Erp TGM, Gadde S, Ashish N, Burns GA, Nichols BN. Towards structured sharing of raw and derived neuroimaging data across existing resources. Neuroimage 2013;82:647-61. [PMID: 23727024 PMCID: PMC4028152 DOI: 10.1016/j.neuroimage.2013.05.094] [Citation(s) in RCA: 55] [Impact Index Per Article: 4.6] [Received: 08/01/2012] [Revised: 05/11/2013] [Accepted: 05/18/2013] [Indexed: 10/26/2022] Open

Lengauer T. Stellenwert der Bioinformatik für die personalisierte Medizin. Bundesgesundheitsblatt Gesundheitsforschung Gesundheitsschutz 2013;56:1489-94. [DOI: 10.1007/s00103-013-1819-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]

Devine EB, Capurro D, van Eaton E, Alfonso-Cristancho R, Devlin A, Yanez ND, Yetisgen-Yildiz M, Flum DR, Tarczy-Hornoch P. Preparing Electronic Clinical Data for Quality Improvement and Comparative Effectiveness Research: The SCOAP CERTAIN Automation and Validation Project. EGEMS 2013;1:1025. [PMID: 25848565 PMCID: PMC4371452 DOI: 10.13063/2327-9214.1025] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Indexed: 11/24/2022]

Abstract

Background:

The field of clinical research informatics includes creation of clinical data repositories (CDRs) used to conduct quality improvement (QI) activities and comparative effectiveness research (CER). Ideally, CDR data are accurately and directly abstracted from disparate electronic health records (EHRs), across diverse health-systems.

Objective:

Investigators from Washington State’s Surgical Care Outcomes and Assessment Program (SCOAP) Comparative Effectiveness Research Translation Network (CERTAIN) are creating such a CDR. This manuscript describes the automation and validation methods used to create this digital infrastructure.

Methods:

SCOAP is a QI benchmarking initiative. Data are manually abstracted from EHRs and entered into a data management system. CERTAIN investigators are now deploying Caradigm’s Amalga™ tool to facilitate automated abstraction of data from multiple, disparate EHRs. Concordance is calculated to compare automatically abstracted data with manually abstracted data. Performance measures are calculated between Amalga and each parent EHR. Validation takes place in repeated loops, with improvements made over time. When automated abstraction reaches the current benchmark for abstraction accuracy (95%), it will ‘go-live’ at each site.

Progress to Date:

A technical analysis was completed at 14 sites. Five sites are contributing; the remaining sites prioritized meeting Meaningful Use criteria. Participating sites are contributing 15–18 unique data feeds, totaling 13 surgical registry use cases. Common feeds are registration, laboratory, transcription/dictation, radiology, and medications. Approximately 50% of 1,320 designated data elements are being automatically abstracted—25% from structured data; 25% from text mining.

Conclusion:

In semi-automating data abstraction and conducting a rigorous validation, CERTAIN investigators will semi-automate data collection to conduct QI and CER, while advancing the Learning Healthcare System.
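
The Methods above describe comparing automated abstraction with manual abstraction against a 95% accuracy benchmark. A minimal sketch of such a check follows, assuming concordance means simple percent agreement over paired data elements (the abstract does not define the measure). The function name, field names, and values are invented for illustration.

```python
BENCHMARK = 0.95  # abstraction accuracy benchmark quoted in the abstract


def concordance(manual, automated):
    """Fraction of manually abstracted elements that the automated feed matches.

    Both arguments map a data-element name to its abstracted value. An element
    missing from `automated` counts as a mismatch.
    """
    if not manual:
        raise ValueError("no data elements to compare")
    matches = sum(
        1 for name, value in manual.items()
        if name in automated and automated[name] == value
    )
    return matches / len(manual)


# Hypothetical record: four elements, one disagreement.
manual = {"age": 54, "asa_class": 2, "procedure": "colectomy", "smoker": False}
automated = {"age": 54, "asa_class": 2, "procedure": "colectomy", "smoker": True}

score = concordance(manual, automated)
ready_for_go_live = score >= BENCHMARK
print(f"concordance={score:.2f}, go-live={ready_for_go_live}")
```

In a validation loop of the kind described, a score like this would be recomputed after each round of fixes until it reaches the benchmark.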

Using Commercially Available Tools for Multifaceted Health Assessment. Comput Inform Nurs 2013;31:329-34. [DOI: 10.1097/nxn.0b013e318295e58f] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/27/2022]

Shoenbill K, Fost N, Tachinardi U, Mendonca EA. Genetic data and electronic health records: a discussion of ethical, logistical and technological considerations. J Am Med Inform Assoc 2013;21:171-80. [PMID: 23771953 PMCID: PMC3912723 DOI: 10.1136/amiajnl-2013-001694] [Citation(s) in RCA: 49] [Impact Index Per Article: 4.1] [Indexed: 01/11/2023] Open

Harrow I, Filsell W, Woollard P, Dix I, Braxenthaler M, Gedye R, Hoole D, Kidd R, Wilson J, Rebholz-Schuhmann D. Towards Virtual Knowledge Broker services for semantic integration of life science literature and data sources. Drug Discov Today 2013;18:428-34. [DOI: 10.1016/j.drudis.2012.11.012] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 07/18/2012] [Revised: 11/09/2012] [Accepted: 11/22/2012] [Indexed: 10/27/2022]

Lonergan DF, Ehrenfeld JM. Advancement of information technology in outpatient and perioperative settings to support patient care and translational research. Pain Manag 2012;2:445-9. [DOI: 10.2217/pmt.12.43] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/21/2022] Open

Kounelakis MG, Zervakis ME, Giakos GC, Postma GJ, Buydens LMC, Kotsiakis X. On the relevance of glycolysis process on brain gliomas. IEEE J Biomed Health Inform 2012;17:128-35. [PMID: 22614725 DOI: 10.1109/titb.2012.2199128] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Indexed: 11/08/2022]

Palma JP, Benitz WE, Tarczy-Hornoch P, Butte AJ, Longhurst CA. Neonatal Informatics: Transforming Neonatal Care Through Translational Bioinformatics. Neoreviews 2012;13:e281-e284. [PMID: 22924023 DOI: 10.1542/neo.13-5-e281] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/24/2022]

Bauer-Mehren A, van Mullingen EM, Avillach P, Carrascosa MDC, Garcia-Serna R, Piñero J, Singh B, Lopes P, Oliveira JL, Diallo G, Ahlberg Helgee E, Boyer S, Mestres J, Sanz F, Kors JA, Furlong LI. Automatic filtering and substantiation of drug safety signals. PLoS Comput Biol 2012;8:e1002457. [PMID: 22496632 PMCID: PMC3320573 DOI: 10.1371/journal.pcbi.1002457] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.9] [Received: 08/11/2011] [Accepted: 02/20/2012] [Indexed: 02/02/2023] Open

Abstract

Drug safety issues pose serious health threats to the population and constitute a major cause of mortality worldwide. Due to the prominent implications to both public health and the pharmaceutical industry, it is of great importance to unravel the molecular mechanisms by which an adverse drug reaction can be potentially elicited. These mechanisms can be investigated by placing the pharmaco-epidemiologically detected adverse drug reaction in an information-rich context and by exploiting all currently available biomedical knowledge to substantiate it. We present a computational framework for the biological annotation of potential adverse drug reactions. First, the proposed framework investigates previous evidences on the drug-event association in the context of biomedical literature (signal filtering). Then, it seeks to provide a biological explanation (signal substantiation) by exploring mechanistic connections that might explain why a drug produces a specific adverse reaction. The mechanistic connections include the activity of the drug, related compounds and drug metabolites on protein targets, the association of protein targets to clinical events, and the annotation of proteins (both protein targets and proteins associated with clinical events) to biological pathways. Hence, the workflows for signal filtering and substantiation integrate modules for literature and database mining, in silico drug-target profiling, and analyses based on gene-disease networks and biological pathways. Application examples of these workflows carried out on selected cases of drug safety signals are discussed. The methodology and workflows presented offer a novel approach to explore the molecular mechanisms underlying adverse drug reactions.

Adverse drug reactions (ADRs) constitute a major cause of morbidity and mortality worldwide. Due to the relevance of ADRs for both public health and pharmaceutical industry, it is important to develop efficient ways to monitor ADRs in the population. In addition, it is also essential to comprehend why a drug produces an adverse effect. To unravel the molecular mechanisms of ADRs, it is necessary to consider the ADR in the context of current biomedical knowledge that might explain it. Nowadays there are plenty of information sources that can be exploited in order to accomplish this goal. Nevertheless, the fragmentation of information and, more importantly, the diverse knowledge domains that need to be traversed, pose challenges to the task of exploring the molecular mechanisms of ADRs. We present a novel computational framework to aid in the collection and exploration of evidences that support the causal inference of ADRs detected by mining clinical records. This framework was implemented as publicly available tools integrating state-of-the-art bioinformatics methods for the analysis of drugs, targets, biological processes and clinical events. The availability of such tools for in silico experiments will facilitate research on the mechanisms that underlie ADR, contributing to the development of safer drugs.

Target discovery from data mining approaches. Drug Discov Today 2012;17 Suppl:S16-23. [DOI: 10.1016/j.drudis.2011.12.006] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.4] [Indexed: 11/19/2022]

Müller H, Freytag JC, Leser U. Improving data quality by source analysis. ACM JOURNAL OF DATA AND INFORMATION QUALITY 2012. [DOI: 10.1145/2107536.2107538] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 10/28/2022]

Abstract

In many domains, data cleaning is hampered by our limited ability to specify a comprehensive set of integrity constraints to assist in identification of erroneous data. An alternative approach to improve data quality is to exploit different data sources that contain information about the same set of objects. Such overlapping sources highlight hot-spots of poor data quality through conflicting data values and immediately provide alternative values for conflict resolution. In order to derive a dataset of high quality, we can merge the overlapping sources based on a quality assessment of the conflicting values. The quality of the resulting dataset, however, is highly dependent on our ability to assess the quality of conflicting values effectively. The main objective of this article is to introduce methods that aid the developer of an integrated system over overlapping, but contradicting sources in the task of improving the quality of data. Value conflicts between contradicting sources are often systematic, caused by some characteristic of the different sources. Our goal is to identify such systematic differences and outline data patterns that occur in conjunction with them. Evaluated by an expert user, the regularities discovered provide insights into possible conflict reasons and help to assess the quality of inconsistent values. The contributions of this article are two concepts of systematic conflicts: contradiction patterns and minimal update sequences. Contradiction patterns resemble a special form of association rules that summarize characteristic data properties for conflict occurrence. We adapt existing association rule mining algorithms for mining contradiction patterns. Contradiction patterns, however, view each class of conflicts in isolation, sometimes leading to largely overlapping patterns.
Sequences of set-oriented update operations that transform one data source into the other are compact descriptions for all regular differences among the sources. We consider minimal update sequences as the most likely explanation for observed differences between overlapping data sources. Furthermore, the order of operations within the sequences points out potential dependencies between systematic differences. Finding minimal update sequences, however, is beyond reach in practice. We show that the problem already is NP-complete for a restricted set of operations. In the light of this intractability result, we present heuristics that lead to convincing results for all examples we considered.
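
The abstract above starts from value conflicts between two sources that describe the same objects. The sketch below shows only that first step, locating conflicting attribute values for objects present in both sources; it does not implement contradiction patterns or minimal update sequences. The function name `find_conflicts` and the sample records are hypothetical.

```python
def find_conflicts(source_a, source_b):
    """Return {object_id: {attribute: (value_a, value_b)}} for disagreeing values.

    Each source maps an object identifier to a dict of attribute values.
    Only objects and attributes present in both sources are compared.
    """
    conflicts = {}
    for obj_id in source_a.keys() & source_b.keys():
        rec_a, rec_b = source_a[obj_id], source_b[obj_id]
        diff = {
            attr: (rec_a[attr], rec_b[attr])
            for attr in rec_a.keys() & rec_b.keys()
            if rec_a[attr] != rec_b[attr]
        }
        if diff:
            conflicts[obj_id] = diff
    return conflicts


# Two made-up overlapping sources; only object "p1" appears in both.
source_a = {
    "p1": {"organism": "human", "length": 120},
    "p2": {"organism": "mouse", "length": 88},
}
source_b = {
    "p1": {"organism": "human", "length": 121},
    "p3": {"organism": "rat", "length": 40},
}
conflicts = find_conflicts(source_a, source_b)
print(conflicts)
```

A table of conflicts like this would be the input to any later analysis of which data properties tend to co-occur with disagreement.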

Greene CS, Troyanskaya OG. Accurate evaluation and analysis of functional genomics data and methods. Ann N Y Acad Sci 2012;1260:95-100. [PMID: 22268703 DOI: 10.1111/j.1749-6632.2011.06383.x] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.5] [Indexed: 11/27/2022]

Wade TD, Hum RC, Murphy JR. A Dimensional Bus model for integrating clinical and research data. J Am Med Inform Assoc 2011;18 Suppl 1:i96-102. [PMID: 21856687 PMCID: PMC3241170 DOI: 10.1136/amiajnl-2011-000339] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.1] [Received: 04/27/2011] [Accepted: 07/11/2011] [Indexed: 11/04/2022] Open

Malin B, Loukides G, Benitez K, Clayton EW. Identifiability in biobanks: models, measures, and mitigation strategies. Hum Genet 2011;130:383-92. [PMID: 21739176 PMCID: PMC3621020 DOI: 10.1007/s00439-011-1042-5] [Citation(s) in RCA: 58] [Impact Index Per Article: 4.1] [Received: 04/25/2011] [Accepted: 06/12/2011] [Indexed: 12/29/2022]

Sarkar IN, Butte AJ, Lussier YA, Tarczy-Hornoch P, Ohno-Machado L. Translational bioinformatics: linking knowledge across biological and clinical realms. J Am Med Inform Assoc 2011;18:354-7. [PMID: 21561873 PMCID: PMC3128415 DOI: 10.1136/amiajnl-2011-000245] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.8] [Received: 03/14/2011] [Accepted: 04/19/2011] [Indexed: 11/30/2022] Open

Kim SS, Bhak J. Post-GWAS Strategies. Genomics Inform 2011. [DOI: 10.5808/gi.2011.9.1.001] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 12/31/2022] Open