1
Ziegler J, Erpenbeck MP, Fuchs T, Saibold A, Volkmer PC, Schmidt G, Eicher J, Pallaoro P, De Souza Falguera R, Aubele F, Hagedorn M, Vansovich E, Raffler J, Ringshandl S, Kerscher A, Maurer JK, Kühnel B, Schenkirsch G, Kampf M, Kapsner LA, Ghanbarian H, Spengler H, Soto-Rey I, Albashiti F, Hellwig D, Ertl M, Fette G, Kraska D, Boeker M, Prokosch HU, Gulden C. Bridging Data Silos in Oncology with Modular Software for Federated Analysis on Fast Healthcare Interoperability Resources: Multisite Implementation Study. J Med Internet Res 2025; 27:e65681. [PMID: 40233352 PMCID: PMC12041822 DOI: 10.2196/65681] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/22/2024] [Revised: 11/24/2024] [Accepted: 12/18/2024] [Indexed: 04/17/2025] Open
Abstract
BACKGROUND Real-world data (RWD) from sources like administrative claims, electronic health records, and cancer registries offer insights into patient populations beyond the tightly regulated environment of randomized controlled trials. To leverage this and to advance cancer research, 6 university hospitals in Bavaria have established a joint research IT infrastructure. OBJECTIVE This study aimed to outline the design, implementation, and deployment of a modular data transformation pipeline that transforms oncological RWD into a Health Level 7 (HL7) Fast Healthcare Interoperability Resources (FHIR) format and then into a tabular format in preparation for a federated analysis (FA) across the 6 Bavarian Cancer Research Center university hospitals. METHODS To harness RWD effectively, we designed a pipeline to convert the oncological basic dataset (oBDS) into HL7 FHIR format and prepare it for FA. The pipeline handles diverse IT infrastructures and systems while maintaining privacy by keeping data decentralized for analysis. To assess the functionality and validity of our implementation, we defined a cohort to address two specific medical research questions. We evaluated our findings by comparing the results of the FA with reports from the Bavarian Cancer Registry and the original data from local tumor documentation systems. RESULTS We conducted an FA of 17,885 cancer cases from 2021/2022. Breast cancer was the most common diagnosis at 3 sites, prostate cancer ranked in the top 2 at 4 sites, and malignant melanoma was notably prevalent. Gender-specific trends showed larynx and esophagus cancers were more common in males, while breast and thyroid cancers were more frequent in females. 
Discrepancies between the Bavarian Cancer Registry and our data, such as a higher share of malignant melanoma in our data (registry: 3400/63,771, 5.3% vs our data: 1921/17,885, 10.7%) and a lower share of colorectal cancers (registry: 8100/63,771, 12.7% vs our data: 1187/17,885, 6.6%), likely result from differences in the time periods analyzed (2019 vs 2021/2022) and the scope of data sources used. The Bavarian Cancer Registry reports approximately 3 times more cancer cases than the 6 university hospitals alone. CONCLUSIONS The modular pipeline successfully transformed oncological RWD across 6 hospitals, and the federated approach preserved privacy while enabling comprehensive analysis. Future work will add support for recent oBDS versions, automate data quality checks, and integrate additional clinical data. Our findings highlight the potential of federated health data networks and lay the groundwork for future research that can leverage high-quality RWD, aiming to contribute valuable knowledge to the field of cancer research.
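The percentages quoted in this abstract can be rechecked from the counts it gives. The following is a minimal sketch in Python, not part of the study's pipeline; the dictionary layout and the names `share`, `shares`, and `ratio` are illustrative assumptions, and only the case counts come from the abstract.

```python
# Case counts quoted in the abstract; everything else here is illustrative.
REGISTRY_TOTAL = 63_771   # Bavarian Cancer Registry (2019)
FEDERATED_TOTAL = 17_885  # federated analysis, 6 university hospitals (2021/2022)

counts = {
    "malignant melanoma": {"registry": 3400, "federated": 1921},
    "colorectal cancer": {"registry": 8100, "federated": 1187},
}


def share(cases: int, total: int) -> float:
    """Return the percentage share of `cases` in `total`, to one decimal place."""
    return round(100 * cases / total, 1)


# Percentage share of each diagnosis in each data source.
shares = {
    diagnosis: {
        "registry": share(c["registry"], REGISTRY_TOTAL),
        "federated": share(c["federated"], FEDERATED_TOTAL),
    }
    for diagnosis, c in counts.items()
}

# How many registry cases there are per case in the federated analysis.
ratio = round(REGISTRY_TOTAL / FEDERATED_TOTAL, 2)

for diagnosis, s in shares.items():
    print(f"{diagnosis}: registry {s['registry']}% vs federated {s['federated']}%")
print(f"registry / federated case ratio: {ratio}")
```

The computed shares match the quoted 5.3%, 10.7%, 12.7%, and 6.6%. If the two totals are the basis of the "approximately 3 times" statement, the ratio works out to about 3.6.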
Affiliation(s)
- Jasmin Ziegler
- Medical Center for Information and Communication Technology, Universitätsklinikum Erlangen, Erlangen, Germany
- Bavarian Cancer Research Center (BZKF), Erlangen, Germany
- Medical Informatics, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- Marcel Pascal Erpenbeck
- Medical Center for Information and Communication Technology, Universitätsklinikum Erlangen, Erlangen, Germany
- Timo Fuchs
- Bavarian Cancer Research Center (BZKF), Erlangen, Germany
- Department of Nuclear Medicine, University Hospital Regensburg, Regensburg, Germany
- Medical Data Integration Center, University Hospital Regensburg, Regensburg, Germany
- Anna Saibold
- Bavarian Cancer Research Center (BZKF), Erlangen, Germany
- Department of Information Technology, University Hospital Regensburg, Regensburg, Germany
- Paul-Christian Volkmer
- Bavarian Cancer Research Center (BZKF), Erlangen, Germany
- Comprehensive Cancer Center Mainfranken, University Hospital Würzburg, Würzburg, Germany
- Guenter Schmidt
- Bavarian Cancer Research Center (BZKF), Erlangen, Germany
- Data Integration Center, University Hospital Würzburg, Würzburg, Germany
- Johanna Eicher
- Institute for Artificial Intelligence and Informatics in Medicine, Klinikum rechts der Isar, School of Medicine and Health, Technical University of Munich, Munich, Germany
- Data Integration Center, Klinikum rechts der Isar, School of Medicine and Health, Technical University of Munich, Munich, Germany
- Peter Pallaoro
- Bavarian Cancer Research Center (BZKF), Erlangen, Germany
- Institute for Artificial Intelligence and Informatics in Medicine, Klinikum rechts der Isar, School of Medicine and Health, Technical University of Munich, Munich, Germany
- Data Integration Center, Klinikum rechts der Isar, School of Medicine and Health, Technical University of Munich, Munich, Germany
- Renata De Souza Falguera
- Institute for Artificial Intelligence and Informatics in Medicine, Klinikum rechts der Isar, School of Medicine and Health, Technical University of Munich, Munich, Germany
- Section of Precision Psychiatry, Clinic for Psychiatry and Psychotherapy, Ludwig-Maximilians-Universität München, Munich, Germany
- Fabio Aubele
- Medical Data Integration Center, LMU University Hospital, Ludwig-Maximilians-Universität München, Munich, Germany
- Marlien Hagedorn
- Medical Data Integration Center, LMU University Hospital, Ludwig-Maximilians-Universität München, Munich, Germany
- Ekaterina Vansovich
- Bavarian Cancer Research Center (BZKF), Erlangen, Germany
- Digital Medicine, University Hospital of Augsburg, Augsburg, Germany
- Johannes Raffler
- Bavarian Cancer Research Center (BZKF), Erlangen, Germany
- Digital Medicine, University Hospital of Augsburg, Augsburg, Germany
- Stephan Ringshandl
- Department of Medicine, Data Integration Center, Philipps-University Marburg, Marburg, Germany
- Alexander Kerscher
- Bavarian Cancer Research Center (BZKF), Erlangen, Germany
- Medical Informatics, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- Comprehensive Cancer Center Mainfranken, University Hospital Würzburg, Würzburg, Germany
- Julia Karolin Maurer
- Bavarian Cancer Research Center (BZKF), Erlangen, Germany
- University Cancer Center Regensburg, University Hospital Regensburg, Regensburg, Germany
- Brigitte Kühnel
- Bavarian Cancer Research Center (BZKF), Erlangen, Germany
- Comprehensive Cancer Center Munich, Klinikum rechts der Isar, Technical University of Munich, Munich, Germany
- Gerhard Schenkirsch
- Bavarian Cancer Research Center (BZKF), Erlangen, Germany
- Comprehensive Cancer Center Augsburg, University Hospital of Augsburg, Augsburg, Germany
- Marvin Kampf
- Medical Center for Information and Communication Technology, Universitätsklinikum Erlangen, Erlangen, Germany
- Lorenz A Kapsner
- Medical Informatics, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- Institute of Radiology, Uniklinikum Erlangen, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- Hadieh Ghanbarian
- Medical Informatics, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- Helmut Spengler
- Bavarian Cancer Research Center (BZKF), Erlangen, Germany
- Data Integration Center, Klinikum rechts der Isar, School of Medicine and Health, Technical University of Munich, Munich, Germany
- Iñaki Soto-Rey
- Bavarian Cancer Research Center (BZKF), Erlangen, Germany
- Digital Medicine, University Hospital of Augsburg, Augsburg, Germany
- Fady Albashiti
- Bavarian Cancer Research Center (BZKF), Erlangen, Germany
- Medical Data Integration Center, LMU University Hospital, Ludwig-Maximilians-Universität München, Munich, Germany
- Dirk Hellwig
- Bavarian Cancer Research Center (BZKF), Erlangen, Germany
- Department of Nuclear Medicine, University Hospital Regensburg, Regensburg, Germany
- Medical Data Integration Center, University Hospital Regensburg, Regensburg, Germany
- Maximilian Ertl
- Data Integration Center, University Hospital Würzburg, Würzburg, Germany
- Georg Fette
- Data Integration Center, University Hospital Würzburg, Würzburg, Germany
- Detlef Kraska
- Medical Center for Information and Communication Technology, Universitätsklinikum Erlangen, Erlangen, Germany
- Martin Boeker
- Bavarian Cancer Research Center (BZKF), Erlangen, Germany
- Institute for Artificial Intelligence and Informatics in Medicine, Klinikum rechts der Isar, School of Medicine and Health, Technical University of Munich, Munich, Germany
- Hans-Ulrich Prokosch
- Medical Center for Information and Communication Technology, Universitätsklinikum Erlangen, Erlangen, Germany
- Bavarian Cancer Research Center (BZKF), Erlangen, Germany
- Medical Informatics, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- Christian Gulden
- Bavarian Cancer Research Center (BZKF), Erlangen, Germany
- Medical Informatics, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
2
Oomens JE, Moonen JEF, Vos SJB, Beran M, Mateus P, De Deyn PP, van der Flier WM, Geerlings MI, Huisman MA, Ikram MA, Schram MT, Slagboom PE, Verschuren WMM, Beekman M, Bermejo I, Birhanu M, Bron EE, Dekker A, Frentz I, Garst SJF, Jaarsma E, Kok AAL, Marcolini S, Mei L, van Charante EPM, Richard E, Schalkwijk CG, van Sloten TT, Teunissen CE, Twait EL, Verberk IMW, Vonk JMJ, van de Waarenburg MPH, Wolters FJ, Jansen WJ, Visser PJ. Identifying pathways to the prevention of dementia: the Netherlands consortium of dementia cohorts. BMC Neurol 2025; 25:59. [PMID: 39939930 PMCID: PMC11816548 DOI: 10.1186/s12883-024-03995-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/25/2024] [Accepted: 12/11/2024] [Indexed: 02/14/2025] Open
Abstract
BACKGROUND Aggregation of cohort data increases precision for studying neurodegenerative disease pathways, but efforts to combine data and expertise are often hampered by infrastructural, ethical and legal considerations. We aimed to unite various cohort studies in the Netherlands to enhance research infrastructure and facilitate research on dementia etiology and its public health implications. METHODS The Netherlands Consortium of Dementia Cohorts (NCDC) includes participants with initially no established cognitive impairment from 9 Dutch cohorts: the Amsterdam Dementia Cohort (ADC), Doetinchem Cohort Study (DCS), European Medical Information Framework for Alzheimer's Disease (EMIF-AD), Longitudinal Aging Study Amsterdam (LASA), the Leiden Longevity Study (LLS), The Maastricht Study, the Memolife substudy of the Lifelines cohort, Rotterdam Study and Second Manifestations of ARTerial disease-Magnetic Resonance (SMART-MR) study. The objectives of NCDC are to improve data infrastructure and access to cohorts related to aging and dementia, investigate the role of Alzheimer's disease and vascular pathology in the development of dementia and estimate the public health impact of established dementia risk factors by assessing their relative contribution to the population burden of dementia. RESULTS We increased the findability, accessibility, interoperability and reusability (FAIR) status of the cohorts through harmonization of data across cohorts, implementation of medical imaging repositories for scan management, implementation of the Personal Health Train infrastructure and provision of meta-data in existing cohort catalogues. We established the ethical and legal frameworks required for federated and pooled analyses and performed the first remote federated data analyses using the Personal Health Train infrastructure. To determine biomarkers of Alzheimer's disease, endothelial dysfunction and inflammation, 2554 plasma samples were analyzed centrally. 
Federated, pooled, and coordinated meta-analyses have led to multiple publications in the context of NCDC. CONCLUSION The combination of population-based and clinical cohorts, the coordinated assessment of plasma markers in previously collected samples and implementation and use of the Personal Health Train infrastructure for federated analysis are both feasible and promising for future collaborative efforts.
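The idea behind a federated analysis, in which each cohort shares only aggregate summaries and never participant-level records, can be illustrated with a small sketch. The abstract does not say which statistics NCDC computed or how the Personal Health Train was invoked, so everything below is a hypothetical example: the names `SiteSummary`, `summarize`, and `pooled_mean_sd` and the toy values are assumptions, and no real Personal Health Train API is used.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass(frozen=True)
class SiteSummary:
    """Aggregates that leave a site; no individual values are included."""
    n: int
    total: float
    total_sq: float


def summarize(values: list[float]) -> SiteSummary:
    """Run locally at each site: reduce raw values to three aggregates."""
    return SiteSummary(len(values), sum(values), sum(v * v for v in values))


def pooled_mean_sd(summaries: list[SiteSummary]) -> tuple[float, float]:
    """Run centrally: combine site aggregates into a pooled mean and sample SD."""
    n = sum(s.n for s in summaries)
    if n < 2:
        raise ValueError("need at least two observations in total")
    total = sum(s.total for s in summaries)
    total_sq = sum(s.total_sq for s in summaries)
    mean = total / n
    # Sample variance from sums: (sum of squares - n * mean^2) / (n - 1).
    variance = (total_sq - n * mean * mean) / (n - 1)
    return mean, sqrt(max(variance, 0.0))


# Toy data for two hypothetical sites; the raw lists never leave `summarize`.
site_a = summarize([1.0, 2.0, 3.0])
site_b = summarize([4.0, 5.0])
mean, sd = pooled_mean_sd([site_a, site_b])
print(f"pooled mean = {mean}, pooled SD = {sd:.3f}")
```

The pooled result equals what a direct computation on the combined values would give, which is the property that makes this kind of summary exchange attractive. The sum-of-squares formula can lose precision on large values, so a production implementation would likely use a more stable combination rule.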
Affiliation(s)
- Julie E Oomens
- Department of Psychiatry and Neuropsychology, School for Mental Health and Neuroscience, Alzheimer Centrum Limburg, Maastricht University, P.O. Box 616, Maastricht, 6200 MD, The Netherlands.
- Justine E F Moonen
- Department of Neurology, Alzheimer Centre Amsterdam, Amsterdam Neuroscience, Amsterdam UMC location VUmc, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- Stephanie J B Vos
- Department of Psychiatry and Neuropsychology, School for Mental Health and Neuroscience, Alzheimer Centrum Limburg, Maastricht University, P.O. Box 616, Maastricht, 6200 MD, The Netherlands
- Magdalena Beran
- Department of Internal Medicine, Cardiovascular Research Institute Maastricht (CARIM), Maastricht University, Maastricht, The Netherlands
- Department of Epidemiology and Global Health, Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht and Utrecht University, Utrecht, The Netherlands
- Pedro Mateus
- Department of Radiation Oncology (Maastro), GROW School for Oncology and Reproduction, Maastricht University Medical Centre+, Maastricht, The Netherlands
- Peter P De Deyn
- Department of Neurology and Alzheimer Center, University Medical Center Groningen, Groningen, The Netherlands
- Laboratory of Neurochemistry and Behavior, Experimental Neurobiology Unit, University of Antwerp, Wilrijk, Antwerp, Belgium
- Wiesje M van der Flier
- Department of Neurology, Alzheimer Centre Amsterdam, Amsterdam Neuroscience, Amsterdam UMC location VUmc, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- Epidemiology and Data Science, Vrije Universiteit Amsterdam, Amsterdam UMC location VUmc, Amsterdam, The Netherlands
- Mirjam I Geerlings
- Department of Epidemiology and Global Health, Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht and Utrecht University, Utrecht, The Netherlands
- Aging & Later life, and Personalized Medicine, Amsterdam Public Health, Amsterdam, The Netherlands
- Amsterdam Neuroscience, Neurodegeneration, and Mood, Psychosis, Stress, and Sleep, Amsterdam, The Netherlands
- Department of General Practice, University of Amsterdam, Amsterdam UMC, location, Amsterdam, The Netherlands
- Martijn A Huisman
- Epidemiology and Data Science, Vrije Universiteit Amsterdam, Amsterdam UMC location VUmc, Amsterdam, The Netherlands
- Aging & Later life, and Personalized Medicine, Amsterdam Public Health, Amsterdam, The Netherlands
- Department of Sociology, Faculty of Social Sciences, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- M Arfan Ikram
- Department of Epidemiology, Erasmus University Medical Center, Rotterdam, The Netherlands
- Miranda T Schram
- Department of Internal Medicine, Cardiovascular Research Institute Maastricht (CARIM), Maastricht University, Maastricht, The Netherlands
- Department of Epidemiology, Erasmus University Medical Center, Rotterdam, The Netherlands
- MHeNS School for Mental Health and Neuroscience, Maastricht University, Maastricht, Netherlands
- Heart and Vascular Center, Maastricht University Medical Center+, Maastricht, Netherlands
- P Eline Slagboom
- Molecular Epidemiology, Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, The Netherlands
- W M Monique Verschuren
- Department of Epidemiology and Global Health, Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht and Utrecht University, Utrecht, The Netherlands
- National Institute for Public Health and the Environment, Bilthoven, The Netherlands
- Marian Beekman
- Molecular Epidemiology, Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, The Netherlands
- Iñigo Bermejo
- Department of Radiation Oncology (Maastro), GROW School for Oncology and Reproduction, Maastricht University Medical Centre+, Maastricht, The Netherlands
- Mahlet Birhanu
- Biomedical Imaging Group Rotterdam, Dept. Radiology & Nuclear Medicine, Erasmus MC - University Medical Center Rotterdam, Rotterdam, The Netherlands
- Esther E Bron
- Biomedical Imaging Group Rotterdam, Dept. Radiology & Nuclear Medicine, Erasmus MC - University Medical Center Rotterdam, Rotterdam, The Netherlands
- Andre Dekker
- Department of Radiation Oncology (Maastro), GROW School for Oncology and Reproduction, Maastricht University Medical Centre+, Maastricht, The Netherlands
- Ingeborg Frentz
- Department of Neurology and Alzheimer Center, University Medical Center Groningen, Groningen, The Netherlands
- Department of Epidemiology, Erasmus University Medical Center, Rotterdam, The Netherlands
- Eva Jaarsma
- Epidemiology and Data Science, Vrije Universiteit Amsterdam, Amsterdam UMC location VUmc, Amsterdam, The Netherlands
- Center for Nutrition, Prevention, and Health Services, National Institute for Public Health and the Environment (RIVM), Bilthoven, The Netherlands
- Almar A L Kok
- Epidemiology and Data Science, Vrije Universiteit Amsterdam, Amsterdam UMC location VUmc, Amsterdam, The Netherlands
- Aging & Later life, and Personalized Medicine, Amsterdam Public Health, Amsterdam, The Netherlands
- Sofia Marcolini
- Department of Neurology and Alzheimer Center, University Medical Center Groningen, Groningen, The Netherlands
- Leon Mei
- Sequencing Analysis Support Core, Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, The Netherlands
- Eric P Moll van Charante
- Department of General Practice, Amsterdam Public Health, Amsterdam University Medical Centers location AMC, University of Amsterdam, Amsterdam, The Netherlands
- Department of Public & Occupational Health, Research Institute, Amsterdam UMC, Amsterdam Public Health, University of Amsterdam, Amsterdam, The Netherlands
- Edo Richard
- Department of Neurology, Donders Institute for Brain, Cognition and Behaviour, Center of Expertise for Parkinson & Movement Disorders, Radboud University Medical Center, Nijmegen, The Netherlands
- Casper G Schalkwijk
- Department of Internal Medicine, Cardiovascular Research Institute Maastricht (CARIM), Maastricht University, Maastricht, The Netherlands
- Thomas T van Sloten
- Department of Vascular Medicine, University Medical Center Utrecht, Utrecht, The Netherlands
- Charlotte E Teunissen
- Neurochemistry Lab, Department of Laboratory Medicine, Amsterdam Neuroscience, Vrije Universiteit Amsterdam, Amsterdam UMC location VUmc, Amsterdam, The Netherlands
- Emma L Twait
- Department of Epidemiology and Global Health, Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht and Utrecht University, Utrecht, The Netherlands
- Aging & Later life, and Personalized Medicine, Amsterdam Public Health, Amsterdam, The Netherlands
- Department of General Practice, Amsterdam Public Health, Amsterdam University Medical Centers location AMC, University of Amsterdam, Amsterdam, The Netherlands
- Inge M W Verberk
- Neurochemistry Lab, Department of Laboratory Medicine, Amsterdam Neuroscience, Vrije Universiteit Amsterdam, Amsterdam UMC location VUmc, Amsterdam, The Netherlands
- Jet M J Vonk
- Department of Epidemiology and Global Health, Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht and Utrecht University, Utrecht, The Netherlands
- Memory and Aging Center, Department of Neurology, University of California San Francisco, San Francisco, CA, USA
- Marjo P H van de Waarenburg
- Department of Internal Medicine, Cardiovascular Research Institute Maastricht (CARIM), Maastricht University, Maastricht, The Netherlands
- Frank J Wolters
- Department of Epidemiology, Erasmus University Medical Center, Rotterdam, The Netherlands
- Department of Radiology & Nuclear Medicine, Erasmus MC University Medical Center, Rotterdam, The Netherlands
- Willemijn J Jansen
- Department of Psychiatry and Neuropsychology, School for Mental Health and Neuroscience, Alzheimer Centrum Limburg, Maastricht University, P.O. Box 616, Maastricht, 6200 MD, The Netherlands
- Pieter Jelle Visser
- Department of Psychiatry and Neuropsychology, School for Mental Health and Neuroscience, Alzheimer Centrum Limburg, Maastricht University, P.O. Box 616, Maastricht, 6200 MD, The Netherlands
- Department of Neurology, Alzheimer Centre Amsterdam, Amsterdam Neuroscience, Amsterdam UMC location VUmc, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- Department of Neurobiology, Care Sciences and Society, Division of Neurogeriatrics, Karolinska Institutet, Stockholm, Sweden
3
Choudhury A, Volmer L, Martin F, Fijten R, Wee L, Dekker A, Soest JV. Advancing Privacy-Preserving Health Care Analytics and Implementation of the Personal Health Train: Federated Deep Learning Study. JMIR AI 2025; 4:e60847. [PMID: 39912580 PMCID: PMC11843053 DOI: 10.2196/60847] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/23/2024] [Revised: 10/01/2024] [Accepted: 10/17/2024] [Indexed: 02/07/2025]
Abstract
BACKGROUND The rapid advancement of deep learning in health care presents significant opportunities for automating complex medical tasks and improving clinical workflows. However, widespread adoption is impeded by data privacy concerns and the necessity for large, diverse datasets across multiple institutions. Federated learning (FL) has emerged as a viable solution, enabling collaborative artificial intelligence model development without sharing individual patient data. To effectively implement FL in health care, robust and secure infrastructures are essential. Developing such federated deep learning frameworks is crucial to harnessing the full potential of artificial intelligence while ensuring patient data privacy and regulatory compliance. OBJECTIVE The objective is to introduce an innovative FL infrastructure called the Personal Health Train (PHT) that includes the procedural, technical, and governance components needed to implement FL on real-world health care data, including training deep learning neural networks. The study aims to apply this federated deep learning infrastructure to the use case of gross tumor volume segmentation on chest computed tomography images of patients with lung cancer and present the results from a proof-of-concept experiment. METHODS The PHT framework addresses the challenges of data privacy when sharing data, by keeping data close to the source and instead bringing the analysis to the data. Technologically, PHT requires 3 interdependent components: "tracks" (protected communication channels), "trains" (containerized software apps), and "stations" (institutional data repositories), which are supported by the open source "Vantage6" software. 
The study applies this federated deep learning infrastructure to the use case of gross tumor volume segmentation on chest computed tomography images of patients with lung cancer, with the introduction of an additional component called the secure aggregation server, where the model averaging is done in a trusted and inaccessible environment. RESULTS We demonstrated the feasibility of executing deep learning algorithms in a federated manner using PHT and presented the results from a proof-of-concept study. The infrastructure linked 12 hospitals across 8 nations, covering 4 continents, demonstrating the scalability and global reach of the proposed approach. During the execution and training of the deep learning algorithm, no data were shared outside the hospital. CONCLUSIONS The findings of the proof-of-concept study, as well as the implications and limitations of the infrastructure and the results, are discussed. The application of federated deep learning to unstructured medical imaging data, facilitated by the PHT framework and Vantage6 platform, represents a significant advancement in the field. The proposed infrastructure addresses the challenges of data privacy and enables collaborative model development, paving the way for the widespread adoption of deep learning-based tools in the medical domain and beyond. The introduction of the secure aggregation server implied that data leakage problems in FL can be prevented by careful design decisions of the infrastructure. TRIAL REGISTRATION ClinicalTrials.gov NCT05775068; https://clinicaltrials.gov/study/NCT05775068.
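The model averaging step that the abstract assigns to the secure aggregation server can be sketched as a weighted average of the parameter vectors returned by each station. This is a generic FedAvg-style illustration, not the study's or Vantage6's implementation: the abstract does not state how the averaging is weighted, so weighting by local sample count is an assumption, as are the function name `federated_average` and the toy numbers.

```python
def federated_average(updates: list[tuple[int, list[float]]]) -> list[float]:
    """Average model parameters across stations, weighted by local sample count.

    `updates` holds one (n_samples, parameters) pair per station. Only these
    pairs reach the aggregator; the training data stay at the stations.
    """
    if not updates:
        raise ValueError("no station updates received")
    total_samples = sum(n for n, _ in updates)
    if total_samples <= 0:
        raise ValueError("total sample count must be positive")

    dim = len(updates[0][1])
    averaged = [0.0] * dim
    for n_samples, params in updates:
        if len(params) != dim:
            raise ValueError("all stations must send vectors of the same length")
        weight = n_samples / total_samples
        for i, value in enumerate(params):
            averaged[i] += weight * value
    return averaged


# Two hypothetical stations with 100 and 300 local training samples.
global_params = federated_average([
    (100, [1.0, 0.0]),
    (300, [0.0, 1.0]),
])
print(global_params)  # [0.25, 0.75]
```

In a training loop, the averaged parameters would be sent back to the stations as the starting point for the next round. Running this step in an environment that no participant can inspect is what the abstract describes as the role of the secure aggregation server.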
Affiliation(s)
- Ananya Choudhury
- GROW Research Institute for Oncology and Reproduction, Maastricht University Medical Center+, Maastricht, Netherlands
- Clinical Data Science, Maastricht University, Maastricht, Netherlands
- Leroy Volmer
- GROW Research Institute for Oncology and Reproduction, Maastricht University Medical Center+, Maastricht, Netherlands
- Clinical Data Science, Maastricht University, Maastricht, Netherlands
- Frank Martin
- Netherlands Comprehensive Cancer Organization (IKNL), Eindhoven, Netherlands
- Rianne Fijten
- GROW Research Institute for Oncology and Reproduction, Maastricht University Medical Center+, Maastricht, Netherlands
- Clinical Data Science, Maastricht University, Maastricht, Netherlands
- Leonard Wee
- GROW Research Institute for Oncology and Reproduction, Maastricht University Medical Center+, Maastricht, Netherlands
- Clinical Data Science, Maastricht University, Maastricht, Netherlands
- Andre Dekker
- GROW Research Institute for Oncology and Reproduction, Maastricht University Medical Center+, Maastricht, Netherlands
- Clinical Data Science, Maastricht University, Maastricht, Netherlands
- Brightlands Institute for Smart Society (BISS), Faculty of Science and Engineering (FSE), Maastricht University, Heerlen, Netherlands
- Johan van Soest
- GROW Research Institute for Oncology and Reproduction, Maastricht University Medical Center+, Maastricht, Netherlands
- Clinical Data Science, Maastricht University, Maastricht, Netherlands
- Brightlands Institute for Smart Society (BISS), Faculty of Science and Engineering (FSE), Maastricht University, Heerlen, Netherlands
4
van Timmeren JE, Bussink J, Koopmans P, Smeenk RJ, Monshouwer R. Longitudinal Image Data for Outcome Modeling. Clin Oncol (R Coll Radiol) 2025; 38:103610. [PMID: 39003124 DOI: 10.1016/j.clon.2024.06.053] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/23/2023] [Revised: 04/15/2024] [Accepted: 06/24/2024] [Indexed: 07/15/2024]
Abstract
In oncology, medical imaging is crucial for diagnosis, treatment planning and therapy execution. Treatment responses can be complex and varied and are known to involve factors of treatment, patient characteristics and tumor microenvironment. Longitudinal image analysis is able to track temporal changes, aiding in disease monitoring, treatment evaluation, and outcome prediction. This allows for the enhancement of personalized medicine. However, analyzing longitudinal 2D and 3D images presents unique challenges, including image registration, reliable segmentation, dealing with variable imaging intervals, and sparse data. This review presents an overview of techniques and methodologies in longitudinal image analysis, with a primary focus on outcome modeling in radiation oncology.
Affiliation(s)
- J E van Timmeren
- Department of Radiation Oncology, Radboud University Medical Center, Nijmegen, the Netherlands.
- J Bussink
- Department of Radiation Oncology, Radboud University Medical Center, Nijmegen, the Netherlands.
- P Koopmans
- Department of Radiation Oncology, Radboud University Medical Center, Nijmegen, the Netherlands.
- R J Smeenk
- Department of Radiation Oncology, Radboud University Medical Center, Nijmegen, the Netherlands.
- R Monshouwer
- Department of Radiation Oncology, Radboud University Medical Center, Nijmegen, the Netherlands.
5
Price G, Peek N, Eleftheriou I, Spencer K, Paley L, Hogenboom J, van Soest J, Dekker A, van Herk M, Faivre-Finn C. An Overview of Real-World Data Infrastructure for Cancer Research. Clin Oncol (R Coll Radiol) 2025; 38:103545. [PMID: 38631976 DOI: 10.1016/j.clon.2024.03.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2023] [Revised: 02/27/2024] [Accepted: 03/13/2024] [Indexed: 04/19/2024]
Abstract
AIMS There is increasing interest in the opportunities offered by Real World Data (RWD) to provide evidence where clinical trial data does not exist, but access to appropriate data sources is frequently cited as a barrier to RWD research. This paper discusses current RWD resources and how they can be accessed for cancer research. MATERIALS AND METHODS There has been significant progress on facilitating RWD access in the last few years across a range of scales, from local hospital research databases, through regional care records and national repositories, to the impact of federated learning approaches on internationally collaborative studies. We use a series of case studies, principally from the UK, to illustrate how RWD can be accessed for research and healthcare improvement at each of these scales. RESULTS For each example we discuss infrastructure and governance requirements with the aim of encouraging further work in this space that will help to fill evidence gaps in oncology. CONCLUSION There are challenges, but real-world data research across a range of scales is already a reality. Taking advantage of the current generation of data sources requires researchers to carefully define their research question and the scale at which it would be best addressed.
Affiliation(s)
- G Price
- Division of Cancer Sciences, University of Manchester, Manchester, UK; The Christie NHS Foundation Trust, Manchester, UK.
- N Peek
- Division of Informatics, Imaging and Data Sciences, University of Manchester, Manchester, UK; The Healthcare Improvement Studies Institute (THIS Institute), University of Cambridge, Cambridge, UK
- I Eleftheriou
- Division of Informatics, Imaging and Data Sciences, University of Manchester, Manchester, UK
- K Spencer
- Leeds Institute of Health Sciences, University of Leeds, Leeds, UK; Leeds Teaching Hospitals NHS Trust, Leeds, UK; National Disease Registration Service, NHS England, UK
- L Paley
- National Disease Registration Service, NHS England, UK
- J Hogenboom
- Department of Radiation Oncology (Maastro), GROW-School for Oncology and Reproduction, Maastricht University Medical Centre+, Maastricht, The Netherlands
- J van Soest
- Department of Radiation Oncology (Maastro), GROW-School for Oncology and Reproduction, Maastricht University Medical Centre+, Maastricht, The Netherlands; Brightlands Institute for Smart Society (BISS), Faculty of Science and Engineering, Maastricht University, Maastricht, The Netherlands
| | - A Dekker
- Department of Radiation Oncology (Maastro), GROW-School for Oncology and Reproduction, Maastricht University Medical Centre+, Maastricht, The Netherlands
| | - M van Herk
- Division of Cancer Sciences, University of Manchester, Manchester, UK; The Christie NHS Foundation Trust, Manchester, UK
| | - C Faivre-Finn
- Division of Cancer Sciences, University of Manchester, Manchester, UK; The Christie NHS Foundation Trust, Manchester, UK
| |
Collapse
|
6
|
Bujotzek MR, Akünal Ü, Denner S, Neher P, Zenk M, Frodl E, Jaiswal A, Kim M, Krekiehn NR, Nickel M, Ruppel R, Both M, Döllinger F, Opitz M, Persigehl T, Kleesiek J, Penzkofer T, Maier-Hein K, Bucher A, Braren R. Real-world federated learning in radiology: hurdles to overcome and benefits to gain. J Am Med Inform Assoc 2025; 32:193-205. [PMID: 39455061 DOI: 10.1093/jamia/ocae259] [Received: 06/28/2024] [Revised: 09/24/2024] [Accepted: 09/27/2024] [Indexed: 10/28/2024] Open
Abstract
OBJECTIVE Federated Learning (FL) enables collaborative model training while keeping data local. Currently, most FL studies in radiology are conducted in simulated environments due to numerous hurdles impeding its translation into practice. The few existing real-world FL initiatives rarely communicate specific measures taken to overcome these hurdles. To bridge this significant knowledge gap, we propose a comprehensive guide for real-world FL in radiology. Given the effort required to implement real-world FL, there is a lack of comprehensive assessments comparing FL to less complex alternatives in challenging real-world settings, which we address through extensive benchmarking. MATERIALS AND METHODS We developed our own FL infrastructure within the German Radiological Cooperative Network (RACOON) and demonstrated its functionality by training FL models on lung pathology segmentation tasks across six university hospitals. Insights gained while establishing our FL initiative and running the extensive benchmark experiments were compiled and categorized into the guide. RESULTS The proposed guide outlines essential steps, identified hurdles, and implemented solutions for establishing successful FL initiatives conducting real-world experiments. Our experimental results prove the practical relevance of our guide and show that FL outperforms less complex alternatives in all evaluation scenarios. DISCUSSION AND CONCLUSION Our findings justify the efforts required to translate FL into real-world applications by demonstrating advantageous performance over alternative approaches. Additionally, they emphasize the importance of strategic organization and robust management of distributed data and infrastructure in real-world settings. With the proposed guide, we aim to help future FL researchers circumvent pitfalls and accelerate the translation of FL into radiological applications.
Affiliation(s)
- Markus Ralf Bujotzek
  - Division of Medical Image Computing, German Cancer Research Center Heidelberg, Heidelberg, 69120, Germany
  - Medical Faculty Heidelberg, University of Heidelberg, Heidelberg, 69120, Germany
- Ünal Akünal
  - Division of Medical Image Computing, German Cancer Research Center Heidelberg, Heidelberg, 69120, Germany
- Stefan Denner
  - Division of Medical Image Computing, German Cancer Research Center Heidelberg, Heidelberg, 69120, Germany
  - Faculty of Mathematics and Computer Science, Heidelberg University, Heidelberg, 69120, Germany
- Peter Neher
  - Division of Medical Image Computing, German Cancer Research Center Heidelberg, Heidelberg, 69120, Germany
  - Pattern Analysis and Learning Group, Department of Radiation Oncology, Heidelberg University Hospital, Heidelberg, 69120, Germany
  - German Cancer Consortium (DKTK), Partner Site Heidelberg, Heidelberg, 69120, Germany
- Maximilian Zenk
  - Division of Medical Image Computing, German Cancer Research Center Heidelberg, Heidelberg, 69120, Germany
  - Medical Faculty Heidelberg, University of Heidelberg, Heidelberg, 69120, Germany
- Eric Frodl
  - Institute for Diagnostic and Interventional Radiology, University Hospital Frankfurt, Frankfurt (Main), 60590, Germany
  - Goethe University Frankfurt, Frankfurt, 60590, Germany
- Astha Jaiswal
  - Institute for Diagnostic and Interventional Radiology, Faculty of Medicine, University Hospital Cologne, University of Cologne, Cologne, 50937, Germany
- Moon Kim
  - Institute for AI in Medicine (IKIM), University Hospital Essen (AöR), Essen, 45131, Germany
- Nicolai R Krekiehn
  - Intelligent Imaging Lab@Section Biomedical Imaging, Department of Radiology and Neuroradiology, University Medical Center Schleswig-Holstein (UKSH), Kiel, 24118, Germany
- Manuel Nickel
  - Institute for AI in Medicine, Technical University of Munich, Munich, 81675, Germany
- Richard Ruppel
  - Department of Radiology, Charité-Universitätsmedizin Berlin, Berlin, 10117, Germany
- Marcus Both
  - Department of Radiology and Neuroradiology, University Medical Centers Schleswig-Holstein, Kiel, 24105, Germany
- Felix Döllinger
  - Department of Radiology, Charité-Universitätsmedizin Berlin, Berlin, 10117, Germany
- Marcel Opitz
  - Institute for Diagnostic and Interventional Radiology and Neuroradiology, University Hospital Essen (AöR), Essen, 45131, Germany
- Thorsten Persigehl
  - Institute for Diagnostic and Interventional Radiology, Faculty of Medicine, University Hospital Cologne, University of Cologne, Cologne, 50937, Germany
- Jens Kleesiek
  - Institute for AI in Medicine (IKIM), University Hospital Essen (AöR), Essen, 45131, Germany
- Tobias Penzkofer
  - Department of Radiology, Charité-Universitätsmedizin Berlin, Berlin, 10117, Germany
  - Berlin Institute of Health, Berlin, 10178, Germany
- Klaus Maier-Hein
  - Division of Medical Image Computing, German Cancer Research Center Heidelberg, Heidelberg, 69120, Germany
  - Pattern Analysis and Learning Group, Department of Radiation Oncology, Heidelberg University Hospital, Heidelberg, 69120, Germany
  - German Cancer Consortium (DKTK), Partner Site Heidelberg, Heidelberg, 69120, Germany
  - National Center for Tumor Diseases (NCT), NCT Heidelberg, A Partnership Between DKFZ and The University Medical Center Heidelberg, Heidelberg, 69120, Germany
- Andreas Bucher
  - Institute for Diagnostic and Interventional Radiology, University Hospital Frankfurt, Frankfurt (Main), 60590, Germany
  - Goethe University Frankfurt, Frankfurt, 60590, Germany
- Rickmer Braren
  - Institute for Diagnostic and Interventional Radiology, Klinikum rechts der Isar, Technical University of Munich, Munich, 81675, Germany

7
|
Conroy L, Winter J, Khalifa A, Tsui G, Berlin A, Purdie TG. Artificial Intelligence for Radiation Treatment Planning: Bridging Gaps From Retrospective Promise to Clinical Reality. Clin Oncol (R Coll Radiol) 2025; 37:103630. [PMID: 39531894 DOI: 10.1016/j.clon.2024.08.005] [Received: 12/06/2023] [Revised: 07/31/2024] [Accepted: 08/08/2024] [Indexed: 11/16/2024]
Abstract
Artificial intelligence (AI) radiation therapy (RT) planning holds promise for enhancing the consistency and efficiency of the RT planning process. Despite technical advancements, the widespread integration of AI into RT treatment planning faces challenges. The transition from controlled retrospective environments to real-world clinical settings introduces heightened scrutiny from clinical end users, potentially leading to decreased clinical acceptance. Key considerations for implementing AI RT planning include ensuring the AI model performance aligns with clinical standards, using high-quality training data, and incorporating sufficient data variation through meticulous curation by clinical experts. Beyond technical aspects, factors such as potential biases and the level of trust clinical end users place in AI may present unforeseen obstacles for real-world clinical use. Addressing these challenges requires bridging education and expertise gaps among clinical end users, enabling them to confidently embrace and utilize AI for routine RT planning. By fostering a better understanding of AI capabilities, building trust, and providing comprehensive training, the promises of AI RT planning can be a reality in the clinical setting. This article assesses the current clinical use of AI RT planning and explores challenges and considerations for bridging gaps in knowledge and expertise for AI operationalization, with focus on training data curation, workflow integration, explainability, bias, and domain knowledge. Remaining challenges in clinical implementation of AI RT treatment planning are examined in the context of trust building approaches.
Affiliation(s)
- L Conroy
  - Radiation Medicine Program, Princess Margaret Cancer Centre, 610 University Avenue, Toronto, Ontario, M5G 2M9, Canada; Techna Institute, University Health Network, 190 Elizabeth St, Toronto, Ontario, M5G 2C4, Canada; Department of Radiation Oncology, University of Toronto, 149 College Street - Stewart Building Suite 504, Toronto, Ontario, M5T 1P5, Canada.
- J Winter
  - Radiation Medicine Program, Princess Margaret Cancer Centre, 610 University Avenue, Toronto, Ontario, M5G 2M9, Canada; Techna Institute, University Health Network, 190 Elizabeth St, Toronto, Ontario, M5G 2C4, Canada; Department of Radiation Oncology, University of Toronto, 149 College Street - Stewart Building Suite 504, Toronto, Ontario, M5T 1P5, Canada.
- A Khalifa
  - Techna Institute, University Health Network, 190 Elizabeth St, Toronto, Ontario, M5G 2C4, Canada; Department of Medical Biophysics, University of Toronto, Princess Margaret Cancer Research Tower, MaRS Centre, 101 College Street, Room 15-701, Toronto, Ontario, M5G 1L7, Canada.
- G Tsui
  - Radiation Medicine Program, Princess Margaret Cancer Centre, 610 University Avenue, Toronto, Ontario, M5G 2M9, Canada.
- A Berlin
  - Radiation Medicine Program, Princess Margaret Cancer Centre, 610 University Avenue, Toronto, Ontario, M5G 2M9, Canada; Techna Institute, University Health Network, 190 Elizabeth St, Toronto, Ontario, M5G 2C4, Canada; Department of Radiation Oncology, University of Toronto, 149 College Street - Stewart Building Suite 504, Toronto, Ontario, M5T 1P5, Canada.
- T G Purdie
  - Radiation Medicine Program, Princess Margaret Cancer Centre, 610 University Avenue, Toronto, Ontario, M5G 2M9, Canada; Techna Institute, University Health Network, 190 Elizabeth St, Toronto, Ontario, M5G 2C4, Canada; Department of Radiation Oncology, University of Toronto, 149 College Street - Stewart Building Suite 504, Toronto, Ontario, M5T 1P5, Canada; Department of Medical Biophysics, University of Toronto, Princess Margaret Cancer Research Tower, MaRS Centre, 101 College Street, Room 15-701, Toronto, Ontario, M5G 1L7, Canada.

8
|
Bregonzio M, Bernasconi A, Pinoli P. Advancing healthcare through data: the BETTER project's vision for distributed analytics. Front Med (Lausanne) 2024; 11:1473874. [PMID: 39416867 PMCID: PMC11480012 DOI: 10.3389/fmed.2024.1473874] [Received: 07/31/2024] [Accepted: 09/12/2024] [Indexed: 10/19/2024] Open
Abstract
Introduction Data-driven medicine is essential for enhancing the accessibility and quality of the healthcare system. The availability of data plays a crucial role in achieving this goal. Methods We propose implementing a robust data infrastructure of FAIRification and data fusion for clinical, genomic, and imaging data. This will be embedded within the framework of a distributed analytics platform for healthcare data analysis, utilizing the Personal Health Train paradigm. Results This infrastructure will ensure the findability, accessibility, interoperability, and reusability of data, metadata, and results among multiple medical centers participating in the BETTER Horizon Europe project. The project focuses on studying rare diseases, such as intellectual disability and inherited retinal dystrophies. Conclusion The anticipated impacts will benefit a wide range of healthcare practitioners and potentially influence health policymakers.
Affiliation(s)
- Anna Bernasconi
  - Department of Information, Electronics, and Bioengineering, Politecnico di Milano, Milan, Italy
- Pietro Pinoli
  - Department of Information, Electronics, and Bioengineering, Politecnico di Milano, Milan, Italy

9
|
Field M, Vinod S, Delaney GP, Aherne N, Bailey M, Carolan M, Dekker A, Greenham S, Hau E, Lehmann J, Ludbrook J, Miller A, Rezo A, Selvaraj J, Sykes J, Thwaites D, Holloway L. Federated Learning Survival Model and Potential Radiotherapy Decision Support Impact Assessment for Non-small Cell Lung Cancer Using Real-World Data. Clin Oncol (R Coll Radiol) 2024; 36:e197-e208. [PMID: 38631978 DOI: 10.1016/j.clon.2024.03.008] [Received: 03/06/2023] [Revised: 02/07/2024] [Accepted: 03/11/2024] [Indexed: 04/19/2024]
Abstract
AIMS The objective of this study was to develop a two-year overall survival model for inoperable stage I-III non-small cell lung cancer (NSCLC) patients using routine radiation oncology data over a federated (distributed) learning network and evaluate the potential of decision support for curative versus palliative radiotherapy. METHODS A federated infrastructure of data extraction, de-identification, standardisation, image analysis, and modelling was installed for seven clinics to obtain clinical and imaging features and survival information for patients treated in 2011-2019. A logistic regression model was trained for the 2011-2016 curative patient cohort and validated for the 2017-2019 cohort. Features were selected with univariate and model-based analysis and optimised using bootstrapping. System performance was assessed by the receiver operating characteristic (ROC) and corresponding area under curve (AUC), C-index, calibration metrics and Kaplan-Meier survival curves, with risk groups defined by model probability quartiles. Decision support was evaluated using a case-control analysis using propensity matching between treatment groups. RESULTS 1655 patient datasets were included. The overall model AUC was 0.68. Fifty-eight percent of patients treated with palliative radiotherapy had a low-to-moderate risk prediction according to the model, with survival times not significantly different (p = 0.87 and 0.061) from patients treated with curative radiotherapy classified as high-risk by the model. When survival was simulated by risk group and model-indicated treatment, there was an estimated 11% increase in survival rate at two years (p < 0.01). CONCLUSION Federated learning over multiple institution data can be used to develop and validate decision support systems for lung cancer while quantifying the potential impact of their use in practice. 
This paves the way for personalised medicine, where decisions can be based more closely on individual patient details from routine care.
Affiliation(s)
- M Field
  - South Western Sydney Clinical Campus, School of Clinical Medicine, UNSW, Sydney, New South Wales, Australia; Ingham Institute for Applied Medical Research, Liverpool, New South Wales, Australia; South Western Sydney Cancer Services, NSW Health, Sydney, New South Wales, Australia
- S Vinod
  - South Western Sydney Clinical Campus, School of Clinical Medicine, UNSW, Sydney, New South Wales, Australia; Ingham Institute for Applied Medical Research, Liverpool, New South Wales, Australia; South Western Sydney Cancer Services, NSW Health, Sydney, New South Wales, Australia
- G P Delaney
  - South Western Sydney Clinical Campus, School of Clinical Medicine, UNSW, Sydney, New South Wales, Australia; Ingham Institute for Applied Medical Research, Liverpool, New South Wales, Australia; South Western Sydney Cancer Services, NSW Health, Sydney, New South Wales, Australia
- N Aherne
  - Mid North Coast Cancer Institute, Coffs Harbour, New South Wales, Australia; Rural Clinical School, Faculty of Medicine, University of New South Wales, Sydney, New South Wales, Australia
- M Bailey
  - Illawarra Cancer Care Centre, Wollongong, New South Wales, Australia
- M Carolan
  - Illawarra Cancer Care Centre, Wollongong, New South Wales, Australia
- A Dekker
  - Department of Radiation Oncology (MAASTRO), GROW School for Oncology and Developmental Biology, Maastricht University, Maastricht, The Netherlands
- S Greenham
  - Mid North Coast Cancer Institute, Coffs Harbour, New South Wales, Australia
- E Hau
  - Sydney West Radiation Oncology Network, Sydney, Australia; Westmead Clinical School, University of Sydney, Sydney, New South Wales, Australia
- J Lehmann
  - School of Mathematical and Physical Sciences, University of Newcastle, Newcastle, New South Wales, Australia; Department of Radiation Oncology, Calvary Mater, Newcastle, New South Wales, Australia; Institute of Medical Physics, School of Physics, University of Sydney, Sydney, New South Wales, Australia
- J Ludbrook
  - Department of Radiation Oncology, Calvary Mater, Newcastle, New South Wales, Australia
- A Miller
  - Illawarra Cancer Care Centre, Wollongong, New South Wales, Australia
- A Rezo
  - Canberra Health Services, Canberra, Australian Capital Territory, Australia
- J Selvaraj
  - South Western Sydney Clinical Campus, School of Clinical Medicine, UNSW, Sydney, New South Wales, Australia; Canberra Health Services, Canberra, Australian Capital Territory, Australia
- J Sykes
  - Sydney West Radiation Oncology Network, Sydney, Australia; Institute of Medical Physics, School of Physics, University of Sydney, Sydney, New South Wales, Australia
- D Thwaites
  - Institute of Medical Physics, School of Physics, University of Sydney, Sydney, New South Wales, Australia; Radiotherapy Research Group, Leeds Institute for Medical Research, St James's Hospital and the University of Leeds, Leeds, UK
- L Holloway
  - South Western Sydney Clinical Campus, School of Clinical Medicine, UNSW, Sydney, New South Wales, Australia; Ingham Institute for Applied Medical Research, Liverpool, New South Wales, Australia; South Western Sydney Cancer Services, NSW Health, Sydney, New South Wales, Australia; Institute of Medical Physics, School of Physics, University of Sydney, Sydney, New South Wales, Australia

10
|
Welten S, de Arruda Botelho Herr M, Hempel L, Hieber D, Placzek P, Graf M, Weber S, Neumann L, Jugl M, Tirpitz L, Kindermann K, Geisler S, Bonino da Silva Santos LO, Decker S, Pfeifer N, Kohlbacher O, Kirsten T. A study on interoperability between two Personal Health Train infrastructures in leukodystrophy data analysis. Sci Data 2024; 11:663. [PMID: 38909050 PMCID: PMC11193731 DOI: 10.1038/s41597-024-03450-6] [Received: 01/04/2024] [Accepted: 05/31/2024] [Indexed: 06/24/2024] Open
Abstract
The development of platforms for distributed analytics has been driven by a growing need to comply with various governance-related or legal constraints. Among these platforms, the so-called Personal Health Train (PHT) is one representative that has emerged over the recent years. However, in projects that require data from sites featuring different PHT infrastructures, institutions are facing challenges emerging from the combination of multiple PHT ecosystems, including data governance, regulatory compliance, or the modification of existing workflows. In these scenarios, the interoperability of the platforms is preferable. In this work, we introduce a conceptual framework for the technical interoperability of the PHT covering five essential requirements: Data integration, unified station identifiers, mutual metadata, aligned security protocols, and business logic. We evaluated our concept in a feasibility study that involves two distinct PHT infrastructures: PHT-meDIC and PADME. We analyzed data on leukodystrophy from patients in the University Hospitals of Tübingen and Leipzig, and patients with differential diagnoses at the University Hospital Aachen. The results of our study demonstrate the technical interoperability between these two PHT infrastructures, allowing researchers to perform analyses across the participating institutions. Our method is more space-efficient compared to the multi-homing strategy, and it shows only a minimal time overhead.
Affiliation(s)
- Sascha Welten
  - RWTH Aachen University, Chair of Computer Science 5, Aachen, 52074, Germany
- Marius de Arruda Botelho Herr
  - University Hospital Tübingen, Institute for Translational Bioinformatics, Tübingen, 72072, Germany
  - Methods in Medical Informatics, University Tübingen, Tübingen, 72076, Germany
- Lars Hempel
  - Mittweida University of Applied Sciences, Faculty Applied Computer and Bio Sciences, Mittweida, 09644, Germany
  - Leipzig University Medical Center, Dept. Medical Data Science, Leipzig, 04107, Germany
  - Leipzig University, Institute for Medical Informatics, Statistics and Epidemiology, Leipzig, 04107, Germany
- David Hieber
  - University Hospital Tübingen, Institute for Translational Bioinformatics, Tübingen, 72072, Germany
- Peter Placzek
  - University Hospital Tübingen, Institute for Translational Bioinformatics, Tübingen, 72072, Germany
- Michael Graf
  - University Hospital Tübingen, Institute for Translational Bioinformatics, Tübingen, 72072, Germany
- Sven Weber
  - RWTH Aachen University, Chair of Computer Science 5, Aachen, 52074, Germany
- Laurenz Neumann
  - RWTH Aachen University, Chair of Computer Science 5, Aachen, 52074, Germany
- Maximilian Jugl
  - Mittweida University of Applied Sciences, Faculty Applied Computer and Bio Sciences, Mittweida, 09644, Germany
  - Leipzig University Medical Center, Dept. Medical Data Science, Leipzig, 04107, Germany
  - Leipzig University, Institute for Medical Informatics, Statistics and Epidemiology, Leipzig, 04107, Germany
- Liam Tirpitz
  - RWTH Aachen University, Data Stream Management and Analysis, Aachen, 52074, Germany
- Karl Kindermann
  - RWTH Aachen University, Chair of Computer Science 5, Aachen, 52074, Germany
- Sandra Geisler
  - RWTH Aachen University, Data Stream Management and Analysis, Aachen, 52074, Germany
  - Fraunhofer Institute for Applied Information Technology FIT, Sankt Augustin, 53757, Germany
- Luiz Olavo Bonino da Silva Santos
  - University of Twente - Enschede, Services and Cybersecurity Group, Faculty of Electrical Engineering, Mathematics and Computer Science, 7513 GB, Enschede, the Netherlands
- Stefan Decker
  - RWTH Aachen University, Chair of Computer Science 5, Aachen, 52074, Germany
  - Fraunhofer Institute for Applied Information Technology FIT, Sankt Augustin, 53757, Germany
- Nico Pfeifer
  - Methods in Medical Informatics, University Tübingen, Tübingen, 72076, Germany
- Oliver Kohlbacher
  - University Hospital Tübingen, Institute for Translational Bioinformatics, Tübingen, 72072, Germany
- Toralf Kirsten
  - Mittweida University of Applied Sciences, Faculty Applied Computer and Bio Sciences, Mittweida, 09644, Germany
  - Leipzig University Medical Center, Dept. Medical Data Science, Leipzig, 04107, Germany
  - RWTH Aachen University, Data Stream Management and Analysis, Aachen, 52074, Germany
  - Leipzig University, Center for Scalable Data Analytics and Artificial Intelligence, Leipzig, 04107, Germany

11
|
Choudhury A, Janssen E, Bongers BC, van Meeteren NLU, Dekker A, van Soest J. Colorectal cancer health and care quality indicators in a federated setting using the Personal Health Train. BMC Med Inform Decis Mak 2024; 24:121. [PMID: 38724966 PMCID: PMC11080148 DOI: 10.1186/s12911-024-02526-y] [Received: 03/17/2023] [Accepted: 05/02/2024] [Indexed: 05/13/2024] Open
Abstract
OBJECTIVE Hospitals and healthcare providers should assess and compare the quality of care given to patients and use the results to improve that care. In the Netherlands, hospitals provide data to national quality registries, which in return provide annual quality indicators. However, this process is time-consuming and resource-intensive, and it puts patient privacy and confidentiality at risk. In this paper, we presented a multicentric 'Proof of Principle' study for federated calculation of quality indicators in patients with colorectal cancer. The findings suggest that the proposed approach is highly time-efficient and consumes significantly fewer resources. MATERIALS AND METHODS Two quality indicators are calculated in an efficient and privacy-preserving federated manner by i) applying the Findable, Accessible, Interoperable, and Reusable (FAIR) data principles and ii) using the Personal Health Train (PHT) infrastructure. Instead of sharing data with a centralized registry, PHT enables analysis by sending algorithms to the data and sharing only the insights derived from it. RESULTS An ETL process extracted data from the Electronic Health Record systems of the hospitals, converted them to FAIR data, and hosted them in RDF endpoints within each hospital. Finally, quality indicators from each center were calculated using PHT, and the mean result was plotted along with the individual results. DISCUSSION AND CONCLUSION PHT and FAIR data principles can efficiently calculate quality indicators in a privacy-preserving federated approach, and the work can be scaled up both nationally and internationally. Despite this, application of the methodology was largely hampered by ethical, legal, and social implications (ELSI) issues. However, the lessons learned from this study can help other hospitals and researchers adapt to the process easily and take effective measures in building quality-of-care infrastructures.
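The federated calculation described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation or the PHT API: the record layout (dicts with boolean `eligible` and `met` keys) and the names `SiteSummary`, `local_indicator`, and `aggregate` are assumptions made for the example. The point it shows is that each hospital returns only aggregate counts, and the coordinator derives per-site rates and their mean.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class SiteSummary:
    """Aggregate counts that leave a hospital; no patient-level rows."""
    site: str
    numerator: int    # eligible patients who met the indicator
    denominator: int  # all eligible patients


def local_indicator(site, records):
    """Runs inside one hospital and returns only aggregate counts.

    `records` is a hypothetical list of dicts with boolean keys
    "eligible" and "met"; a real site would query its own data store.
    """
    eligible = [r for r in records if r["eligible"]]
    met = sum(1 for r in eligible if r["met"])
    return SiteSummary(site, met, len(eligible))


def aggregate(summaries):
    """Runs at the coordinator: per-site rates plus their unweighted mean."""
    per_site = {
        s.site: s.numerator / s.denominator
        for s in summaries
        if s.denominator > 0  # skip sites with no eligible patients
    }
    if not per_site:
        raise ValueError("no site reported eligible patients")
    return per_site, mean(list(per_site.values()))
```

An unweighted mean of site rates treats every hospital equally regardless of size; pooling numerators and denominators instead would weight by patient volume, and which of the two is appropriate depends on the indicator definition.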
Affiliation(s)
- Ananya Choudhury
  - Department of Radiation Oncology (Maastro), GROW Research Institute for Oncology and Reproduction, Maastricht University Medical Centre+, Maastricht, The Netherlands
  - Clinical Data Science Group, Faculty of Health Medicine and Life Sciences, Maastricht University Medical Center+, Paul-Henri Spaaklaan 1, Maastricht, 6229 GT, Netherlands
- Esther Janssen
  - Department of Orthopaedics, Maastricht University Medical Center+, Maastricht, The Netherlands
  - Department of Orthopaedic Surgery, VieCuri Medical Center, Venlo, The Netherlands
- Bart C Bongers
  - Department of Nutrition and Movement Sciences, Faculty of Health, Medicine and Life Sciences, School of Nutrition and Translational Research in Metabolism (NUTRIM), Maastricht University, Maastricht, the Netherlands
  - Department of Epidemiology, Faculty of Health, Medicine and Life Sciences, Care and Public Health Research Institute (CAPHRI), Maastricht University, Maastricht, the Netherlands
- Nico L U van Meeteren
  - Top Sector Life Sciences and Health (Health~Holland), the Hague, the Netherlands
  - Department of Anesthesiology, Erasmus Medical Center, Rotterdam, the Netherlands
  - Topcare, Leiden, the Netherlands
- Andre Dekker
  - Department of Radiation Oncology (Maastro), GROW Research Institute for Oncology and Reproduction, Maastricht University Medical Centre+, Maastricht, The Netherlands
- Johan van Soest
  - Department of Radiation Oncology (Maastro), GROW Research Institute for Oncology and Reproduction, Maastricht University Medical Centre+, Maastricht, The Netherlands
  - Brightlands Institute for Smart Society (BISS), Faculty of Science and Engineering (FSE), Maastricht University, Heerlen, the Netherlands

12
|
Buosi S, Timilsina M, Torrente M, Provencio M, Fey D, Nováček V. Boosting predictive models and augmenting patient data with relevant genomic and pathway information. Comput Biol Med 2024; 174:108398. [PMID: 38608322 DOI: 10.1016/j.compbiomed.2024.108398] [Received: 01/19/2024] [Revised: 03/07/2024] [Accepted: 04/01/2024] [Indexed: 04/14/2024]
Abstract
The recurrence of low-stage lung cancer poses a challenge due to its unpredictable nature and diverse patient responses to treatments. Personalized care and patient outcomes heavily rely on early relapse identification, yet current predictive models, despite their potential, lack comprehensive genetic data. This inadequacy fuels our research focus: integrating specific genetic information, such as pathway scores, into clinical data. Our aim is to refine machine learning models for more precise relapse prediction in early-stage non-small cell lung cancer. To address the scarcity of genetic data, we employ imputation techniques, leveraging publicly available datasets such as The Cancer Genome Atlas (TCGA), integrating pathway scores into our patient cohort from the Cancer Long Survivor Artificial Intelligence Follow-up (CLARIFY) project. Through the integration of imputed pathway scores from the TCGA dataset with clinical data, our approach achieves notable strides in predicting relapse among a held-out test set of 200 patients. By training machine learning models on enriched knowledge graph data, inclusive of triples derived from pathway score imputation, we achieve a promising precision of 82% and specificity of 91%. These outcomes highlight the potential of our models as supplementary tools within tumour, node, and metastasis (TNM) classification systems, offering improved prognostic capabilities for lung cancer patients. In summary, our research underscores the significance of refining machine learning models for relapse prediction in early-stage non-small cell lung cancer. Our approach, centered on imputing pathway scores and integrating them with clinical data, not only enhances predictive performance but also demonstrates the promising role of machine learning in anticipating relapse and ultimately elevating patient outcomes.
Affiliation(s)
- Samuele Buosi
  - Data Science Institute, University of Galway, University Road, H91 TK33, Co. Galway, Ireland
- Mohan Timilsina
  - Data Science Institute, University of Galway, University Road, H91 TK33, Co. Galway, Ireland
- Maria Torrente
  - Medical Oncology Department, Hospital Universitario Puerta de Hierro Majadahonda, C. Joaquín Rodrigo, 1, Majadahonda, Madrid, 28222, Spain
- Mariano Provencio
  - Medical Oncology Department, Hospital Universitario Puerta de Hierro Majadahonda, C. Joaquín Rodrigo, 1, Majadahonda, Madrid, 28222, Spain
- Dirk Fey
  - Systems Biology Ireland, University College Dublin, Co. Dublin, Ireland
- Vít Nováček
  - Data Science Institute, University of Galway, University Road, H91 TK33, Co. Galway, Ireland; Faculty of Informatics, Masaryk University, Botanická 68a, 60200, Czech Republic; Masaryk Memorial Cancer Institute, Žlutý kopec 7, 65653, Czech Republic

13
|
Gottardelli B, Gouthamchand V, Masciocchi C, Boldrini L, Martino A, Mazzarella C, Massaccesi M, Monshouwer R, Findhammer J, Wee L, Dekker A, Gambacorta MA, Damiani A. A distributed feature selection pipeline for survival analysis using radiomics in non-small cell lung cancer patients. Sci Rep 2024; 14:7814. [PMID: 38570606 PMCID: PMC10991291 DOI: 10.1038/s41598-024-58241-1] [Received: 12/12/2023] [Accepted: 03/27/2024] [Indexed: 04/05/2024] Open
Abstract
Predictive modelling of cancer outcomes using radiomics faces dimensionality problems and data limitations, as radiomics features often number in the hundreds, and multi-institutional data sharing is often unfeasible. Federated learning (FL) and feature selection (FS) techniques combined can help overcome these issues, as one provides the means of training models without exchanging sensitive data, while the other identifies the most informative features, reduces overfitting, and improves model interpretability. Our proposed FS pipeline based on FL principles targets data-driven radiomics FS in a multivariate survival study of non-small cell lung cancer patients. The pipeline was run across datasets from three institutions without patient-level data exchange. It includes two FS techniques, Correlation-based Feature Selection and LASSO regularization, and Cox Proportional-Hazard regression with Overall Survival as endpoint. Trained and validated on 828 patients overall, our pipeline yielded a radiomic signature comprising "intensity-based energy" and "mean discretised intensity". Validation resulted in a mean Harrell C-index of 0.59, showcasing fair efficacy in risk stratification. In conclusion, we suggest a distributed radiomics approach that incorporates preliminary feature selection to systematically decrease the feature set based on data-driven considerations. This aims to address dimensionality challenges beyond those associated with data constraints and interpretability concerns.
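The Harrell C-index reported in this abstract can be illustrated with a short sketch. This is a simplified version written for this listing under stated assumptions (pairs with tied survival times are skipped, tied risk scores count as half-concordant); the function name and toy values are hypothetical and are not the authors' pipeline.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Fraction of comparable patient pairs in which the patient with the
    higher predicted risk also has the shorter observed survival time."""
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Let a be the patient with the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Simplification: skip tied times, and skip pairs where the
        # shorter time is censored (the ordering is then unknown).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: months of follow-up, event flag (1 = death), risk.
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
risks = [0.9, 0.4, 0.5, 0.1]
c_index = harrell_c_index(times, events, risks)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which gives context for the reported mean of 0.59.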
Affiliation(s)
- Benedetta Gottardelli
- Department of Diagnostica per Immagini, Radioterapia Oncologica ed Ematologia, Università Cattolica del Sacro Cuore, Rome, Italy
| | - Varsha Gouthamchand
- Clinical Data Science, GROW School of Oncology and Reproduction, Maastricht University, Maastricht, The Netherlands
| | - Carlotta Masciocchi
- Real World Data Facility, Gemelli Generator, Fondazione Policlinico Universitario Agostino Gemelli IRCCS, Rome, Italy.
| | - Luca Boldrini
- Department of Diagnostica per Immagini, Radioterapia Oncologica ed Ematologia, Fondazione Policlinico Universitario Agostino Gemelli IRCCS, Rome, Italy
| | - Antonella Martino
- Department of Diagnostica per Immagini, Radioterapia Oncologica ed Ematologia, Fondazione Policlinico Universitario Agostino Gemelli IRCCS, Rome, Italy
| | - Ciro Mazzarella
- Department of Diagnostica per Immagini, Radioterapia Oncologica ed Ematologia, Fondazione Policlinico Universitario Agostino Gemelli IRCCS, Rome, Italy
| | - Mariangela Massaccesi
- Department of Diagnostica per Immagini, Radioterapia Oncologica ed Ematologia, Fondazione Policlinico Universitario Agostino Gemelli IRCCS, Rome, Italy
| | - René Monshouwer
- Department of Radiation Oncology, Radboud University Medical Center, Nijmegen, The Netherlands
| | - Jeroen Findhammer
- Department of Radiation Oncology, Radboud University Medical Center, Nijmegen, The Netherlands
| | - Leonard Wee
- Department of Radiation Oncology (Maastro), GROW-School for Oncology and Reproduction, Maastricht University Medical Centre+, Maastricht, The Netherlands
| | - Andre Dekker
- Department of Radiation Oncology (Maastro), GROW-School for Oncology and Reproduction, Maastricht University Medical Centre+, Maastricht, The Netherlands
| | - Maria Antonietta Gambacorta
- Department of Diagnostica per Immagini, Radioterapia Oncologica ed Ematologia, Fondazione Policlinico Universitario Agostino Gemelli IRCCS, Rome, Italy
| | - Andrea Damiani
- Real World Data Facility, Gemelli Generator, Fondazione Policlinico Universitario Agostino Gemelli IRCCS, Rome, Italy
|
14
|
Mullie L, Afilalo J, Archambault P, Bouchakri R, Brown K, Buckeridge DL, Cavayas YA, Turgeon AF, Martineau D, Lamontagne F, Lebrasseur M, Lemieux R, Li J, Sauthier M, St-Onge P, Tang A, Witteman W, Chassé M. CODA: an open-source platform for federated analysis and machine learning on distributed healthcare data. J Am Med Inform Assoc 2024; 31:651-665. [PMID: 38128123 PMCID: PMC10873779 DOI: 10.1093/jamia/ocad235] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 06/26/2023] [Revised: 10/28/2023] [Accepted: 12/02/2023] [Indexed: 12/23/2023] Open
Abstract
OBJECTIVES Distributed computations facilitate multi-institutional data analysis while avoiding the costs and complexity of data pooling. Existing approaches lack crucial features, such as built-in medical standards and terminologies, no-code data visualizations, explicit disclosure control mechanisms, and support for basic statistical computations, in addition to gradient-based optimization capabilities. MATERIALS AND METHODS We describe the development of the Collaborative Data Analysis (CODA) platform, and the design choices undertaken to address the key needs identified during our survey of stakeholders. We use a public dataset (MIMIC-IV) to demonstrate end-to-end multi-modal FL using CODA. We assessed the technical feasibility of deploying the CODA platform at 9 hospitals in Canada, describe implementation challenges, and evaluate its scalability on large patient populations. RESULTS The CODA platform was designed, developed, and deployed between January 2020 and January 2023. Software code, documentation, and technical documents were released under an open-source license. Multi-modal federated averaging is illustrated using the MIMIC-IV and MIMIC-CXR datasets. To date, 8 out of the 9 participating sites have successfully deployed the platform, with a total enrolment of >1M patients. Mapping data from legacy systems to FHIR was the biggest barrier to implementation. DISCUSSION AND CONCLUSION The CODA platform was developed and successfully deployed in a public healthcare setting in Canada, with heterogeneous information technology systems and capabilities. Ongoing efforts will use the platform to develop and prospectively validate models for risk assessment, proactive monitoring, and resource usage. Further work will also make tools available to facilitate migration from legacy formats to FHIR and DICOM.
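The federated averaging step mentioned in this abstract can be sketched in a few lines. This is a generic illustration of sample-size-weighted parameter averaging, not the CODA platform's API; the function name, the flat list representation of model weights, and the site sizes are assumptions made for the example.

```python
def federated_average(site_updates):
    """Combine per-site model weights into one global weight vector.

    site_updates: list of (n_samples, weights) tuples, where weights is a
    list of floats with the same length at every site. Each site is
    weighted by its share of the total sample count.
    """
    total = sum(n for n, _ in site_updates)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(site_updates[0][1])
    averaged = [0.0] * dim
    for n, weights in site_updates:
        if len(weights) != dim:
            raise ValueError("all sites must report the same number of weights")
        for k, value in enumerate(weights):
            averaged[k] += (n / total) * value
    return averaged


# Hypothetical example: two sites with 100 and 300 patients.
global_weights = federated_average([
    (100, [1.0, 2.0]),
    (300, [3.0, 6.0]),
])
```

Only the weight vectors and sample counts leave each site in this scheme; patient-level records stay local, which is the property the abstract relies on.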
Affiliation(s)
- Louis Mullie
- Department of Medicine, Centre Hospitalier de l'Université de Montréal, Montréal, H2X 3E4, Canada
- Faculty of Medicine, Université de Montréal, Montréal, H3C 3J7, Canada
- Mila Quebec Artificial Intelligence Institute, Montréal, H2S 3H1, Canada
| | - Jonathan Afilalo
- Department of Medicine, Jewish General Hospital, Montréal, H3T 1E4, Canada
| | - Patrick Archambault
- Department of Emergency Medicine and Family Medicine, Université Laval, Québec, G1V 0A6, Canada
- Department of Anesthesiology and Critical Care Medicine, Université Laval, Québec, G1V 0A6, Canada
- Centre de Recherche Intégré pour un Système Apprenant en santé et Services Sociaux, Centre intégré de santé et de Services Sociaux de Chaudière-Appalaches, Lévis, G6V 3Z1, Canada
| | - Rima Bouchakri
- Centre de Recherche du Centre Hospitalier de l'Université de Montréal, Université de Montréal, Montréal, H2X 0A9, Canada
| | - Kip Brown
- Centre de Recherche du Centre Hospitalier de l'Université de Montréal, Université de Montréal, Montréal, H2X 0A9, Canada
| | - David L Buckeridge
- Mila Quebec Artificial Intelligence Institute, Montréal, H2S 3H1, Canada
- Department of Epidemiology and Biostatistics, School of Population and Global Health, McGill University Health Centre, Montréal, H3A 1G1, Canada
| | | | - Alexis F Turgeon
- Department of Anesthesiology and Critical Care Medicine, Université Laval, Québec, G1V 0A6, Canada
- Centre de recherche du CHU de Québec-Université Laval, Université Laval, Québec, G1V 4G2, Canada
| | - Denis Martineau
- Centre de recherche du CHU de Québec-Université Laval, Université Laval, Québec, G1V 4G2, Canada
| | - François Lamontagne
- Centre de recherche du CHUS, Centre Hospitalier Universitaire de Sherbrooke, Sherbrooke, J1G 2E8, Canada
| | - Martine Lebrasseur
- Centre de Recherche du Centre Hospitalier de l'Université de Montréal, Université de Montréal, Montréal, H2X 0A9, Canada
| | - Renald Lemieux
- Centre de recherche du CHUS, Centre Hospitalier Universitaire de Sherbrooke, Sherbrooke, J1G 2E8, Canada
| | - Jeffrey Li
- Centre de Recherche du Centre Hospitalier de l'Université de Montréal, Université de Montréal, Montréal, H2X 0A9, Canada
| | - Michaël Sauthier
- Faculty of Medicine, Université de Montréal, Montréal, H3C 3J7, Canada
- Department of Pediatrics, Université de Montréal and CHU Sainte-Justine Research Centre, Montréal, H3C 3J7, Canada
| | - Pascal St-Onge
- Centre de Recherche du Centre Hospitalier de l'Université de Montréal, Université de Montréal, Montréal, H2X 0A9, Canada
| | - An Tang
- Faculty of Medicine, Université de Montréal, Montréal, H3C 3J7, Canada
- Department of Radiology, Centre Hospitalier de l’Université de Montréal, Montréal, H2X 3E4, Canada
| | - William Witteman
- Centre de Recherche Intégré pour un Système Apprenant en santé et Services Sociaux, Centre intégré de santé et de Services Sociaux de Chaudière-Appalaches, Lévis, G6V 3Z1, Canada
| | - Michaël Chassé
- Department of Medicine, Centre Hospitalier de l'Université de Montréal, Montréal, H2X 3E4, Canada
- Faculty of Medicine, Université de Montréal, Montréal, H3C 3J7, Canada
|
15
|
Yan B, Cao D, Jiang X, Chen Y, Dai W, Dong F, Huang W, Zhang T, Gao C, Chen Q, Yan Z, Wang Z. FedEYE: A scalable and flexible end-to-end federated learning platform for ophthalmology. Patterns (New York, N.Y.) 2024; 5:100928. [PMID: 38370128 PMCID: PMC10873155 DOI: 10.1016/j.patter.2024.100928] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2023] [Revised: 10/03/2023] [Accepted: 01/11/2024] [Indexed: 02/20/2024]
Abstract
Data-driven machine learning, as a promising approach, possesses the capability to build high-quality, exact, and robust models from ophthalmic medical data. Ophthalmic medical data, however, presently exist across disparate data silos with privacy limitations, making centralized training challenging. While ophthalmologists may not specialize in machine learning and artificial intelligence (AI), considerable impediments arise in the associated realm of research. To address these issues, we design and develop FedEYE, a scalable and flexible end-to-end ophthalmic federated learning platform. During FedEYE design, we adhere to four fundamental design principles, ensuring that ophthalmologists can effortlessly create independent and federated AI research tasks. Benefiting from the design principles and architecture of FedEYE, it encloses numerous key features, including rich and customizable capabilities, separation of concerns, scalability, and flexible deployment. We also validated the applicability of FedEYE by employing several prevalent neural networks on ophthalmic disease image classification tasks.
Affiliation(s)
- Bingjie Yan
- Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
- Beijing Key Laboratory of Mobile Computing and Pervasive Device, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Danmin Cao
- Aier Eye Hospital of Wuhan University, Wuhan, China
| | - Xinlong Jiang
- Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
- Beijing Key Laboratory of Mobile Computing and Pervasive Device, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Yiqiang Chen
- Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
- Beijing Key Laboratory of Mobile Computing and Pervasive Device, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
- Peng Cheng Laboratory, Shenzhen, Guangdong, China
| | - Weiwei Dai
- Institute of Digital Ophthalmology and Visual Science, Changsha Aier Eye Hospital, Hunan, China
- AnHui Aier Eye Hospital, Anhui Medical University, Anhui, China
| | - Fan Dong
- Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
- Beijing Key Laboratory of Mobile Computing and Pervasive Device, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Wuliang Huang
- Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
- Beijing Key Laboratory of Mobile Computing and Pervasive Device, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Teng Zhang
- Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
- Beijing Key Laboratory of Mobile Computing and Pervasive Device, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Chenlong Gao
- Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
- Beijing Key Laboratory of Mobile Computing and Pervasive Device, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Qian Chen
- Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
- Beijing Key Laboratory of Mobile Computing and Pervasive Device, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Zhen Yan
- Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
- Beijing Key Laboratory of Mobile Computing and Pervasive Device, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Zhirui Wang
- Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
- Beijing Key Laboratory of Mobile Computing and Pervasive Device, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
|
16
|
Mahon P, Chatzitheofilou I, Dekker A, Fernández X, Hall G, Helland A, Traverso A, Van Marcke C, Vehreschild J, Ciliberto G, Tonon G. A federated learning system for precision oncology in Europe: DigiONE. Nat Med 2024; 30:334-337. [PMID: 38195748 DOI: 10.1038/s41591-023-02715-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/11/2024]
Affiliation(s)
- Piers Mahon
- Digital Institute for Cancer Outcomes Research E.E.I.G, Brussels, Belgium.
- IQVIA Cancer Research BV, Zaventem, Belgium.
| | | | - Andre Dekker
- Department of Radiation Oncology (Maastro), GROW School for Oncology and Reproduction, Maastricht University Medical Centre, Maastricht, the Netherlands
| | | | - Geoff Hall
- Leeds Teaching Hospital NHS Trust, Leeds, UK
- DATA-CAN, the Health Data Research UK Hub for Cancer, Leeds, UK
| | - Aslaug Helland
- Division for Cancer Medicine, Oslo University Hospital, Oslo, Norway
| | - Alberto Traverso
- Digital Institute for Cancer Outcomes Research E.E.I.G, Brussels, Belgium
- Department of Radiation Oncology (Maastro), GROW School for Oncology and Reproduction, Maastricht University Medical Centre, Maastricht, the Netherlands
- IRCCS Ospedale San Raffaele, Milan, Italy
| | - Cedric Van Marcke
- Department of Medical Oncology, Institut Roi Albert II, Cliniques Universitaires Saint-Luc, Brussels, Belgium
- Pôle Oncologie, Institut de Recherche Clinique et Expérimentale, UCLouvain, Brussels, Belgium
| | - Janne Vehreschild
- Goethe University Frankfurt, University Hospital, Center for Internal Medicine, Medical Department II, Frankfurt, Germany
| | | | - Giovanni Tonon
- IRCCS Ospedale San Raffaele, Milan, Italy
- Università Vita-Salute San Raffaele, Milan, Italy
|
17
|
Shin H, Ryu K, Kim JY, Lee S. Application of privacy protection technology to healthcare big data. Digit Health 2024; 10:20552076241282242. [PMID: 39502481 PMCID: PMC11536567 DOI: 10.1177/20552076241282242] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2023] [Accepted: 08/23/2024] [Indexed: 11/08/2024] Open
Abstract
With the advent of the big data era, data security issues are becoming more common. Healthcare organizations have more data to use for analysis, but they lose money every year due to their inability to prevent data leakage. To overcome these challenges, research on the use of data protection technologies in healthcare is actively underway, particularly research on state-of-the-art technologies, such as federated learning announced by Google and blockchain technology, which has recently attracted attention. To learn about these research efforts, we explored the research, methods, and limitations of the most widely used privacy technologies. After investigating related papers published between 2017 and 2023 and identifying the latest technology trends, we selected related papers and reviewed related technologies. In the process, four technologies were the focus of this study: blockchain, federated learning, isomorphic encryption, and differential privacy. Overall, our analysis provides researchers with insight into privacy technology research by suggesting the limitations of current privacy technologies and suggesting future research directions.
Affiliation(s)
- Hyunah Shin
- Department of Healthcare Data Science Center, Konyang University Hospital, Daejeon, Republic of Korea
| | - Kyeongmin Ryu
- Department of Healthcare Data Science Center, Konyang University Hospital, Daejeon, Republic of Korea
| | - Jong-Yeup Kim
- Department of Healthcare Data Science Center, Konyang University Hospital, Daejeon, Republic of Korea
- Department of Otorhinolaryngology—Head and Neck Surgery, Konyang University College of Medicine, Daejeon, Republic of Korea
- Department of Biomedical Informatics, Konyang University College of Medicine, Daejeon, Republic of Korea
| | - Suehyun Lee
- College of IT Convergence, Gachon University, Seongnam, Republic of Korea
|
18
|
Pirmani A, De Brouwer E, Geys L, Parciak T, Moreau Y, Peeters LM. The Journey of Data Within a Global Data Sharing Initiative: A Federated 3-Layer Data Analysis Pipeline to Scale Up Multiple Sclerosis Research. JMIR Med Inform 2023; 11:e48030. [PMID: 37943585 PMCID: PMC10667980 DOI: 10.2196/48030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/20/2023] [Revised: 08/25/2023] [Accepted: 09/30/2023] [Indexed: 11/10/2023] Open
Abstract
BACKGROUND Investigating low-prevalence diseases such as multiple sclerosis is challenging because of the rather small number of individuals affected by this disease and the scattering of real-world data across numerous data sources. These obstacles impair data integration, standardization, and analysis, which negatively impact the generation of significant meaningful clinical evidence. OBJECTIVE This study aims to present a comprehensive, research question-agnostic, multistakeholder-driven end-to-end data analysis pipeline that accommodates 3 prevalent data-sharing streams: individual data sharing, core data set sharing, and federated model sharing. METHODS A demand-driven methodology is employed for standardization, followed by 3 streams of data acquisition, a data quality enhancement process, a data integration procedure, and a concluding analysis stage to fulfill real-world data-sharing requirements. This pipeline's effectiveness was demonstrated through its successful implementation in the COVID-19 and multiple sclerosis global data sharing initiative. RESULTS The global data sharing initiative yielded multiple scientific publications and provided extensive worldwide guidance for the community with multiple sclerosis. The pipeline facilitated gathering pertinent data from various sources, accommodating distinct sharing streams and assimilating them into a unified data set for subsequent statistical analysis or secure data examination. This pipeline contributed to the assembly of the largest data set of people with multiple sclerosis infected with COVID-19. CONCLUSIONS The proposed data analysis pipeline exemplifies the potential of global stakeholder collaboration and underlines the significance of evidence-based decision-making. It serves as a paradigm for how data sharing initiatives can propel advancements in health care, emphasizing its adaptability and capacity to address diverse research inquiries.
Affiliation(s)
- Ashkan Pirmani
- ESAT, STADIUS, KU Leuven, Leuven, Belgium
- Biomedical Research Institute, Hasselt University, Diepenbeek, Belgium
- Data Science Institute, Hasselt University, Diepenbeek, Belgium
- University Multiple Sclerosis Center, Hasselt University, Diepenbeek, Belgium
| | | | - Lotte Geys
- Biomedical Research Institute, Hasselt University, Diepenbeek, Belgium
- Data Science Institute, Hasselt University, Diepenbeek, Belgium
- University Multiple Sclerosis Center, Hasselt University, Diepenbeek, Belgium
| | - Tina Parciak
- Biomedical Research Institute, Hasselt University, Diepenbeek, Belgium
- Data Science Institute, Hasselt University, Diepenbeek, Belgium
- University Multiple Sclerosis Center, Hasselt University, Diepenbeek, Belgium
| | | | - Liesbet M Peeters
- Biomedical Research Institute, Hasselt University, Diepenbeek, Belgium
- Data Science Institute, Hasselt University, Diepenbeek, Belgium
- University Multiple Sclerosis Center, Hasselt University, Diepenbeek, Belgium
|
19
|
Gavai A, Bouzembrak Y, Mu W, Martin F, Kaliyaperumal R, van Soest J, Choudhury A, Heringa J, Dekker A, Marvin HJP. Applying federated learning to combat food fraud in food supply chains. NPJ Sci Food 2023; 7:46. [PMID: 37658060 PMCID: PMC10474077 DOI: 10.1038/s41538-023-00220-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/09/2023] [Accepted: 08/16/2023] [Indexed: 09/03/2023] Open
Abstract
Ensuring safe and healthy food is a big challenge due to the complexity of food supply chains and their vulnerability to many internal and external factors, including food fraud. Recent research has shown that Artificial Intelligence (AI) based algorithms, in particular data-driven Bayesian Network (BN) models, are very suitable as a tool to predict future food fraud and hence allow food producers to take proper action to prevent such problems. Such models become even more powerful when data can be used from all actors in the supply chain, but data sharing is hampered by different interests, data security and data privacy. Federated learning (FL) may circumvent these issues, as demonstrated in various areas of the life sciences. In this research, we demonstrate the potential of FL technology for food fraud using a data-driven BN, integrating data from different data owners without the data leaving the database of the data owners. To this end, a framework was constructed consisting of three geographically different data stations hosting different datasets on food fraud. Using this framework, a BN algorithm was implemented that was trained on the data of different data stations while the data remained at its physical location, abiding by privacy principles. We demonstrated the applicability of the federated BN in food fraud and anticipate that such a framework may support stakeholders in the food supply chain in better decision-making regarding food fraud control while still preserving the privacy and confidential nature of these data.
Affiliation(s)
- Anand Gavai
- Industrial Engineering & Business Information Systems, University of Twente, Enschede, The Netherlands
- Wageningen Food Safety Research, Akkermaalsbos 2, 6708 WB, Wageningen, The Netherlands
| | - Yamine Bouzembrak
- Wageningen Food Safety Research, Akkermaalsbos 2, 6708 WB, Wageningen, The Netherlands.
- Information Technology Group, Wageningen University and Research, Wageningen, The Netherlands.
| | - Wenjuan Mu
- Wageningen Food Safety Research, Akkermaalsbos 2, 6708 WB, Wageningen, The Netherlands
| | - Frank Martin
- Netherlands Comprehensive Cancer Organization (IKNL), Eindhoven, The Netherlands
| | - Rajaram Kaliyaperumal
- Department of Human Genetics, Leiden University Medical Center, Leiden, The Netherlands
| | - Johan van Soest
- Brightlands Institute for Smart Society, Faculty of Science and Engineering, Maastricht University, Heerlen, The Netherlands
- Department of Radiation Oncology (Maastro), GROW School for Oncology and Reproduction, Maastricht University Medical Centre, Maastricht, The Netherlands
| | - Ananya Choudhury
- Department of Radiation Oncology (Maastro), GROW School for Oncology and Reproduction, Maastricht University Medical Centre, Maastricht, The Netherlands
| | - Jaap Heringa
- Centre for Integrative Bioinformatics (IBIVU), VU University Amsterdam, Amsterdam, The Netherlands
| | - Andre Dekker
- Department of Radiation Oncology (Maastro), GROW School for Oncology and Reproduction, Maastricht University Medical Centre, Maastricht, The Netherlands
| | - Hans J P Marvin
- Wageningen Food Safety Research, Akkermaalsbos 2, 6708 WB, Wageningen, The Netherlands
- Department of Research, Hayan Group, Rhenen, The Netherlands
|
20
|
Timilsina M, Fey D, Buosi S, Janik A, Costabello L, Carcereny E, Abreu DR, Cobo M, Castro RL, Bernabé R, Minervini P, Torrente M, Provencio M, Nováček V. Synergy between imputed genetic pathway and clinical information for predicting recurrence in early stage non-small cell lung cancer. J Biomed Inform 2023; 144:104424. [PMID: 37352900 DOI: 10.1016/j.jbi.2023.104424] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/15/2022] [Revised: 06/06/2023] [Accepted: 06/11/2023] [Indexed: 06/25/2023]
Abstract
OBJECTIVE Lung cancer exhibits unpredictable recurrence in low-stage tumors and variable responses to different therapeutic interventions. Predicting relapse in early-stage lung cancer can facilitate precision medicine and improve patient survivability. While existing machine learning models rely on clinical data, incorporating genomic information could enhance their efficiency. This study aims to impute and integrate specific types of genomic data with clinical data to improve the accuracy of machine learning models for predicting relapse in early-stage, non-small cell lung cancer patients. METHODS The study utilized a publicly available TCGA lung cancer cohort and imputed genetic pathway scores into the Spanish Lung Cancer Group (SLCG) data, specifically in 1348 early-stage patients. Initially, tumor recurrence was predicted without imputed pathway scores. Subsequently, the SLCG data were augmented with pathway scores imputed from TCGA. The integrative approach aimed to enhance relapse risk prediction performance. RESULTS The integrative approach achieved improved relapse risk prediction with the following evaluation metrics: an area under the precision-recall curve (PR-AUC) score of 0.75, an area under the ROC (ROC-AUC) score of 0.80, an F1 score of 0.61, and a Precision of 0.80. The prediction explanation model SHAP (SHapley Additive exPlanations) was employed to explain the machine learning model's predictions. CONCLUSION We conclude that our explainable predictive model is a promising tool for oncologists that addresses an unmet clinical need of post-treatment patient stratification based on the relapse risk while also improving the predictive power by incorporating proxy genomic data not available for specific patients.
Affiliation(s)
- Mohan Timilsina
- Data Science Institute, Insight Centre for Data Analytics, University of Galway, Ireland.
| | - Dirk Fey
- Systems Biology Ireland, University College Dublin, Ireland.
| | - Samuele Buosi
- Data Science Institute, Insight Centre for Data Analytics, University of Galway, Ireland.
| | | | | | - Enric Carcereny
- Catalan Institute of Oncology, Hospital Universitari Germans Trias i Pujol, B-ARGO, IGTP, Badalona, Spain.
| | | | - Manuel Cobo
- Medical Oncology Intercenter Unit. Regional and Virgen de la Victoria University Hospitals. IBIMA. Málaga., Spain.
| | | | - Reyes Bernabé
- Hospital Universitario Virgen del Rocio, Sevilla, Spain.
| | | | - Maria Torrente
- Medical Oncology Department, Hospital Universitario Puerta de Hierro Majadahonda, Madrid, Spain.
| | - Mariano Provencio
- Medical Oncology Department, Hospital Universitario Puerta de Hierro Majadahonda, Madrid, Spain.
| | - Vít Nováček
- Data Science Institute, Insight Centre for Data Analytics, University of Galway, Ireland; Faculty of Informatics, Masaryk University Brno, Czech Republic; Masaryk Memorial Cancer Institute, Brno, Czech Republic.
|
21
|
Bon JJ, Bretherton A, Buchhorn K, Cramb S, Drovandi C, Hassan C, Jenner AL, Mayfield HJ, McGree JM, Mengersen K, Price A, Salomone R, Santos-Fernandez E, Vercelloni J, Wang X. Being Bayesian in the 2020s: opportunities and challenges in the practice of modern applied Bayesian statistics. Philosophical Transactions. Series A, Mathematical, Physical, and Engineering Sciences 2023; 381:20220156. [PMID: 36970822 PMCID: PMC10041356 DOI: 10.1098/rsta.2022.0156] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 08/22/2022] [Accepted: 01/06/2023] [Indexed: 06/18/2023]
Abstract
Building on a strong foundation of philosophy, theory, methods and computation over the past three decades, Bayesian approaches are now an integral part of the toolkit for most statisticians and data scientists. Whether they are dedicated Bayesians or opportunistic users, applied professionals can now reap many of the benefits afforded by the Bayesian paradigm. In this paper, we touch on six modern opportunities and challenges in applied Bayesian statistics: intelligent data collection, new data sources, federated analysis, inference for implicit models, model transfer and purposeful software products. This article is part of the theme issue 'Bayesian inference: challenges, perspectives, and prospects'.
Affiliation(s)
- Joshua J. Bon
- Centre for Data Science, Queensland University of Technology, Brisbane, Queensland, Australia
- School of Mathematical Sciences, Queensland University of Technology, Brisbane, Queensland, Australia
| | - Adam Bretherton
- Centre for Data Science, Queensland University of Technology, Brisbane, Queensland, Australia
- School of Mathematical Sciences, Queensland University of Technology, Brisbane, Queensland, Australia
| | - Katie Buchhorn
- Centre for Data Science, Queensland University of Technology, Brisbane, Queensland, Australia
- School of Mathematical Sciences, Queensland University of Technology, Brisbane, Queensland, Australia
| | - Susanna Cramb
- Centre for Data Science, Queensland University of Technology, Brisbane, Queensland, Australia
- School of Public Health and Social Work, Queensland University of Technology, Brisbane, Queensland, Australia
| | - Christopher Drovandi
- Centre for Data Science, Queensland University of Technology, Brisbane, Queensland, Australia
- School of Mathematical Sciences, Queensland University of Technology, Brisbane, Queensland, Australia
| | - Conor Hassan
- Centre for Data Science, Queensland University of Technology, Brisbane, Queensland, Australia
- School of Mathematical Sciences, Queensland University of Technology, Brisbane, Queensland, Australia
| | - Adrianne L. Jenner
- Centre for Data Science, Queensland University of Technology, Brisbane, Queensland, Australia
- School of Mathematical Sciences, Queensland University of Technology, Brisbane, Queensland, Australia
| | - Helen J. Mayfield
- Centre for Data Science, Queensland University of Technology, Brisbane, Queensland, Australia
- School of Public Health, The University of Queensland, Saint Lucia, Queensland, Australia
| | - James M. McGree
- Centre for Data Science, Queensland University of Technology, Brisbane, Queensland, Australia
- School of Mathematical Sciences, Queensland University of Technology, Brisbane, Queensland, Australia
| | - Kerrie Mengersen
- Centre for Data Science, Queensland University of Technology, Brisbane, Queensland, Australia
- School of Mathematical Sciences, Queensland University of Technology, Brisbane, Queensland, Australia
| | - Aiden Price
- Centre for Data Science, Queensland University of Technology, Brisbane, Queensland, Australia
- School of Mathematical Sciences, Queensland University of Technology, Brisbane, Queensland, Australia
| | - Robert Salomone
- Centre for Data Science, Queensland University of Technology, Brisbane, Queensland, Australia
- School of Computer Science, Queensland University of Technology, Brisbane, Queensland, Australia
| | - Edgar Santos-Fernandez
- Centre for Data Science, Queensland University of Technology, Brisbane, Queensland, Australia
- School of Mathematical Sciences, Queensland University of Technology, Brisbane, Queensland, Australia
| | - Julie Vercelloni
- Centre for Data Science, Queensland University of Technology, Brisbane, Queensland, Australia
- School of Mathematical Sciences, Queensland University of Technology, Brisbane, Queensland, Australia
| | - Xiaoyu Wang
- Centre for Data Science, Queensland University of Technology, Brisbane, Queensland, Australia
- School of Mathematical Sciences, Queensland University of Technology, Brisbane, Queensland, Australia
|
22
|
Wenzel HHB, Hardie AN, Moncada-Torres A, Høgdall CK, Bekkers RLM, Falconer H, Jensen PT, Nijman HW, van der Aa MA, Martin F, van Gestel AJ, Lemmens VEPP, Dahm-Kähler P, Alfonzo E, Persson J, Ekdahl L, Salehi S, Frøding LP, Markauskas A, Fuglsang K, Schnack TH. A federated approach to identify women with early-stage cervical cancer at low risk of lymph node metastases. Eur J Cancer 2023; 185:61-68. [PMID: 36965329 DOI: 10.1016/j.ejca.2023.02.021] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 12/22/2022] [Revised: 02/22/2023] [Accepted: 02/22/2023] [Indexed: 02/27/2023]
Abstract
OBJECTIVE Lymph node metastases (pN+) in presumed early-stage cervical cancer negatively impact prognosis. Using federated learning, we aimed to develop a tool to identify a group of women at low risk of pN+, to guide the shared decision-making process concerning the extent of lymph node dissection. METHODS Women diagnosed with cervical cancer between 2005 and 2020 were identified retrospectively from population-based registries: the Danish Gynaecological Cancer Database, Swedish Quality Registry for Gynaecologic Cancer and Netherlands Cancer Registry. Inclusion criteria were: squamous cell carcinoma, adenocarcinoma or adenosquamous carcinoma; International Federation of Gynecology and Obstetrics (FIGO) 2009 stage IA2, IB1 or IIA1; treatment with radical hysterectomy and pelvic lymph node assessment. We applied privacy-preserving federated logistic regression to identify risk factors of pN+. Significant factors were used to stratify the risk of pN+. RESULTS We included 3606 women (pN+ 11%). The most important risk factors of pN+ were lymphovascular space invasion (LVSI) (odds ratio [OR] 5.16, 95% confidence interval [CI], 4.59-5.79), tumour size 21-40 mm (OR 2.14, 95% CI, 1.89-2.43) and depth of invasion >10 mm (OR 1.81, 95% CI, 1.59-2.08). A group of 1469 women (41%), with tumours without LVSI, tumour size ≤20 mm and depth of invasion ≤10 mm, had a very low risk of pN+ (2.4%, 95% CI, 1.7-3.3%). CONCLUSION Early-stage cervical cancer without LVSI, with a tumour size ≤20 mm and depth of invasion ≤10 mm, confers a low risk of pN+. Based on an international privacy-preserving analysis, we developed a useful tool to guide the shared decision-making process regarding lymph node dissection.
Affiliation(s)
- Hans H B Wenzel
- Department of Research & Development, Netherlands Comprehensive Cancer Organisation, Utrecht, the Netherlands; Department of Obstetrics and Gynaecology, University Medical Centre Groningen, University of Groningen, Groningen, the Netherlands.
| | - Anna N Hardie
- Department of Pelvic Cancer, Karolinska University Hospital and Department of Women's and Children's Health, Karolinska Institutet, Stockholm, Sweden
| | - Arturo Moncada-Torres
- Department of Research & Development, Netherlands Comprehensive Cancer Organisation, Utrecht, the Netherlands
| | - Claus K Høgdall
- Department of Gynaecology, Juliane Marie Centre, Rigshospitalet, University of Copenhagen, Copenhagen, Denmark
| | - Ruud L M Bekkers
- Department of Obstetrics and Gynaecology, GROW School for Oncology and Developmental Biology, Maastricht University Medical Centre+, Maastricht, the Netherlands; Department of Obstetrics and Gynaecology, Catharina Hospital, Eindhoven, the Netherlands
| | - Henrik Falconer
- Department of Pelvic Cancer, Karolinska University Hospital and Department of Women's and Children's Health, Karolinska Institutet, Stockholm, Sweden
| | - Pernille T Jensen
- Department of Gynaecology and Obstetrics, Aarhus University Hospital, Aarhus, Denmark; Department of Clinical Medicine, Faculty of Health, Aarhus University, Aarhus, Denmark; Department of Clinical Research, University of Southern Denmark, Odense, Denmark
| | - Hans W Nijman
- Department of Obstetrics and Gynaecology, University Medical Centre Groningen, University of Groningen, Groningen, the Netherlands
| | - Maaike A van der Aa
- Department of Research & Development, Netherlands Comprehensive Cancer Organisation, Utrecht, the Netherlands
| | - Frank Martin
- Department of Research & Development, Netherlands Comprehensive Cancer Organisation, Utrecht, the Netherlands
| | - Anna J van Gestel
- Department of Research & Development, Netherlands Comprehensive Cancer Organisation, Utrecht, the Netherlands
| | - Valery E P P Lemmens
- Department of Research & Development, Netherlands Comprehensive Cancer Organisation, Utrecht, the Netherlands; Department of Public Health, Erasmus MC University Medical Centre, Rotterdam, the Netherlands
| | - Pernilla Dahm-Kähler
- Department of Obstetrics and Gynaecology, Sahlgrenska University Hospital and Department of Obstetrics and Gynaecology, Institute of Clinical Sciences, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
| | - Emilia Alfonzo
- Department of Obstetrics and Gynaecology, Sahlgrenska University Hospital and Department of Obstetrics and Gynaecology, Institute of Clinical Sciences, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
| | - Jan Persson
- Department of Obstetrics and Gynaecology, Division of Gynaecologic Oncology, Skåne University Hospital and Lund University Faculty of Medicine, Department of Clinical Sciences, Obstetrics and Gynaecology, Lund, Sweden
| | - Linnea Ekdahl
- Department of Obstetrics and Gynaecology, Division of Gynaecologic Oncology, Skåne University Hospital and Lund University Faculty of Medicine, Department of Clinical Sciences, Obstetrics and Gynaecology, Lund, Sweden
| | - Sahar Salehi
- Department of Pelvic Cancer, Karolinska University Hospital and Department of Women's and Children's Health, Karolinska Institutet, Stockholm, Sweden
| | - Ligita P Frøding
- Department of Gynaecology, Copenhagen University Hospital, Copenhagen, Denmark
| | - Katrine Fuglsang
- Department of Gynaecology and Obstetrics, Aarhus University Hospital, Aarhus, Denmark
| | - Tine H Schnack
- Department of Gynaecology, Juliane Marie Centre, Rigshospitalet, University of Copenhagen, Copenhagen, Denmark; Department of Gynaecology, Odense University Hospital, Odense, Denmark
|
23
|
Hulsen T, Friedecký D, Renz H, Melis E, Vermeersch P, Fernandez-Calle P. From big data to better patient outcomes. Clin Chem Lab Med 2023; 61:580-586. [PMID: 36539928 DOI: 10.1515/cclm-2022-1096] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 10/30/2022] [Accepted: 12/12/2022] [Indexed: 12/24/2022]
Abstract
Among medical specialties, laboratory medicine is the largest producer of structured data and must play a crucial role in the efficient and safe implementation of big data and artificial intelligence in healthcare. The era of personalized therapies and precision medicine has now arrived, with huge data sets used not only for experimental and research approaches, but also in the "real world". Analysis of real-world data requires the development of legal, procedural and technical infrastructure. The integration of all clinical data sets for any given patient is important and necessary in order to develop a patient-centered treatment approach. Data-driven research comes with its own challenges and solutions. The Findability, Accessibility, Interoperability, and Reusability (FAIR) Guiding Principles provide guidelines to make data findable, accessible, interoperable and reusable to the research community. Federated learning, standards and ontologies are useful to improve the robustness of artificial intelligence algorithms working on big data and to increase trust in these algorithms. When dealing with big data, the univariate statistical approach gives way to multivariate statistical methods, significantly shifting the potential of big data. Combining multiple omics gives previously unsuspected information and provides understanding of scientific questions, an approach which is also called the systems biology approach. Big data and artificial intelligence also offer opportunities for laboratories and the In Vitro Diagnostic industry to optimize the productivity of the laboratory, the quality of laboratory results and ultimately patient outcomes, through tools such as predictive maintenance and "moving average" based on the aggregate of patient results.
Affiliation(s)
- Tim Hulsen
- Department of Hospital Services & Informatics, Philips Research, Eindhoven, The Netherlands
| | - David Friedecký
- Department of Clinical Biochemistry, Laboratory for Inherited Metabolic Disorders, University Hospital Olomouc and Faculty of Medicine and Dentistry, Palacký University in Olomouc, Olomouc, Czech Republic
| | - Harald Renz
- Institute of Laboratory Medicine, member of the German Center for Lung Research (DZL), and the Universities of Giessen and Marburg Lung Center (UGMLC), Philipps University Marburg, Marburg, Germany
- Department of Clinical Immunology and Allergy, Laboratory of Immunopathology, I.M. Sechenov First Moscow State Medical University (Sechenov University), Moscow, Russia
| | - Els Melis
- Ortho Clinical Diagnostics, Zaventem, Belgium
| | - Pieter Vermeersch
- Clinical Department of Laboratory Medicine, University Hospitals Leuven, Leuven, Belgium
- Department of Cardiovascular Sciences, KU Leuven, Leuven, Belgium
- European Federation of Clinical Chemistry and Laboratory Medicine (EFLM), Milan, Italy
| | - Pilar Fernandez-Calle
- European Federation of Clinical Chemistry and Laboratory Medicine (EFLM), Milan, Italy
- Department of Laboratory Medicine, Hospital Universitario La Paz, Madrid, Spain
|
24
|
D’Amario D, Laborante R, Delvinioti A, Lenkowicz J, Iacomini C, Masciocchi C, Luraschi A, Damiani A, Rodolico D, Restivo A, Ciliberti G, Paglianiti DA, Canonico F, Patarnello S, Cesario A, Valentini V, Scambia G, Crea F. GENERATOR HEART FAILURE DataMart: An integrated framework for heart failure research. Front Cardiovasc Med 2023; 10:1104699. [PMID: 37034335 PMCID: PMC10073733 DOI: 10.3389/fcvm.2023.1104699] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 11/21/2022] [Accepted: 03/07/2023] [Indexed: 04/11/2023] Open
Abstract
Background Heart failure (HF) is a multifaceted clinical syndrome characterized by different etiologies, risk factors, comorbidities, and a heterogeneous clinical course. The current model, based on data from clinical trials, is limited by the biases related to a highly selected sample in a protected environment, constraining the applicability of evidence in the real-world scenario. If properly leveraged, the enormous amount of real-world data may have a groundbreaking impact on clinical care pathways. We present here the development of an HF DataMart framework for the management of clinical and research processes. Methods Within our institution, Fondazione Policlinico Universitario A. Gemelli in Rome (Italy), a digital platform dedicated to HF patients has been envisioned (GENERATOR HF DataMart), based on two building blocks: 1. All retrospective information has been integrated into a multimodal, longitudinal data repository, providing in one single place the description of individual patients with drill-down functionalities in multiple dimensions. This functionality might allow investigators to dynamically filter subsets of patient populations characterized by demographic characteristics, biomarkers, comorbidities, and clinical events (e.g., re-hospitalization), enabling agile analyses of the outcomes by subsets of patients. 2. With respect to expected long-term health status and response to treatments, the use of the disease trajectory toolset and predictive models for the evolution of HF has been implemented. The methodological scaffolding has been constructed in accordance with a set of the preferred standards recommended by the CODE-EHR framework. Results Several examples of GENERATOR HF DataMart utilization are presented as follows: to select a specific retrospective cohort of HF patients within a particular period, along with their clinical and laboratory data, to explore multiple associations between clinical and laboratory data, as well as to identify a potential cohort for enrollment in future studies; to create multi-parametric predictive models of early re-hospitalization after discharge; to cluster patients according to their ejection fraction (EF) variation, investigating its potential impact on hospital admissions. Conclusion The GENERATOR HF DataMart has been developed to exploit a large amount of data from patients with HF from our institution and generate evidence from real-world data. The two components of the HF platform might provide the infrastructural basis for a combined patient support program dedicated to continuous monitoring and remote care, assisting patients, caregivers, and healthcare professionals.
Affiliation(s)
- Domenico D’Amario
- Department of Cardiovascular and Pulmonary Sciences, Catholic University of the Sacred Heart, Rome, Italy
- Department of Cardiovascular Sciences, Fondazione Policlinico Universitario A. Gemelli IRCCS, Rome, Italy
- Università del Piemonte Orientale, Dipartimento Medicina Translazionale, Azienda Ospedaliero-Universitaria Maggiore della Carità, Dipartimento Toraco-Cardio-Vascolare, Unità Operativa Complessa di Cardiologia 1, Novara, Italy
| | - Renzo Laborante
- Department of Cardiovascular and Pulmonary Sciences, Catholic University of the Sacred Heart, Rome, Italy
| | - Agni Delvinioti
- Gemelli Generator, Fondazione Policlinico Universitario A. Gemelli IRCCS, Rome, Italy
| | - Jacopo Lenkowicz
- Gemelli Generator, Fondazione Policlinico Universitario A. Gemelli IRCCS, Rome, Italy
| | - Chiara Iacomini
- Gemelli Generator, Fondazione Policlinico Universitario A. Gemelli IRCCS, Rome, Italy
| | - Carlotta Masciocchi
- Gemelli Generator, Fondazione Policlinico Universitario A. Gemelli IRCCS, Rome, Italy
| | - Alice Luraschi
- Gemelli Generator, Fondazione Policlinico Universitario A. Gemelli IRCCS, Rome, Italy
| | - Andrea Damiani
- Gemelli Generator, Fondazione Policlinico Universitario A. Gemelli IRCCS, Rome, Italy
| | - Daniele Rodolico
- Department of Cardiovascular and Pulmonary Sciences, Catholic University of the Sacred Heart, Rome, Italy
| | - Attilio Restivo
- Department of Cardiovascular and Pulmonary Sciences, Catholic University of the Sacred Heart, Rome, Italy
| | - Giuseppe Ciliberti
- Department of Cardiovascular and Pulmonary Sciences, Catholic University of the Sacred Heart, Rome, Italy
| | - Donato Antonio Paglianiti
- Department of Cardiovascular and Pulmonary Sciences, Catholic University of the Sacred Heart, Rome, Italy
| | - Francesco Canonico
- Department of Cardiovascular and Pulmonary Sciences, Catholic University of the Sacred Heart, Rome, Italy
| | - Stefano Patarnello
- Gemelli Generator, Fondazione Policlinico Universitario A. Gemelli IRCCS, Rome, Italy
| | - Alfredo Cesario
- Gemelli Generator, Fondazione Policlinico Universitario A. Gemelli IRCCS, Rome, Italy
| | - Vincenzo Valentini
- Department of Bioimaging, Radiation Oncology and Hematology, Fondazione Policlinico Universitario “A. Gemelli” IRCCS, Università Cattolica S. Cuore, Rome, Italy
| | - Giovanni Scambia
- Gemelli Generator, Fondazione Policlinico Universitario A. Gemelli IRCCS, Rome, Italy
| | - Filippo Crea
- Department of Cardiovascular and Pulmonary Sciences, Catholic University of the Sacred Heart, Rome, Italy
- Department of Cardiovascular Sciences, Fondazione Policlinico Universitario A. Gemelli IRCCS, Rome, Italy
|
25
|
Bovée JVMG, Webster F, Amary F, Baumhoer D, Bloem JLH, Bridge JA, Cates JMM, de Alava E, Dei Tos AP, Jones KB, Mahar A, Nielsen GP, Righi A, Wagner AJ, Yoshida A, Fletcher CDM. Datasets for the reporting of primary tumour in bone: recommendations from the International Collaboration on Cancer Reporting (ICCR). Histopathology 2023; 82:531-540. [PMID: 36464647 PMCID: PMC10107487 DOI: 10.1111/his.14849] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/29/2022] [Revised: 11/28/2022] [Accepted: 11/29/2022] [Indexed: 12/12/2022]
Abstract
BACKGROUND AND OBJECTIVES Bone tumours are relatively rare and, as a consequence, treatment in a centre with expertise is required. Current treatment guidelines also recommend review by a specialised pathologist. Here we report on international consensus-based datasets for the pathology reporting of biopsy and resection specimens of bone sarcomas. The datasets were produced under the auspices of the International Collaboration on Cancer Reporting (ICCR), a global alliance of major (inter-)national pathology and cancer organisations. METHODS AND RESULTS According to the ICCR's process for dataset development, an international expert panel consisting of pathologists, an oncologic orthopaedic surgeon, a medical oncologist, and a radiologist produced a set of core and noncore data items for biopsy and resection specimens based on a critical review and discussion of current evidence. All professionals involved were bone tumour experts affiliated with tertiary referral centres. Commentary was provided for each data item to explain the rationale for selecting it as a core or noncore element, its clinical relevance, and to highlight potential areas of disagreement or lack of evidence, in which case a consensus position was formulated. Following international public consultation, the documents were finalised and ratified, and the datasets, including a synoptic reporting guide, were published on the ICCR website. CONCLUSION These first international datasets for bone sarcomas are intended to promote high-quality, standardised pathology reporting. Their widespread adoption will improve the consistency of reporting, facilitate multidisciplinary communication, and enhance comparability of data, all of which will help to improve management of bone sarcoma patients.
Affiliation(s)
- Judith V M G Bovée
- Department of Pathology, Leiden University Medical Center, Leiden, The Netherlands; Leiden Center for Computational Oncology, LUMC, Leiden, The Netherlands
| | - Fleur Webster
- International Collaboration on Cancer Reporting, Sydney, NSW, Australia
| | - Fernanda Amary
- Department of Histopathology, Royal National Orthopaedic Hospital, Stanmore, Greater London, UK; Cancer Institute, University College London, London, UK
| | - Daniel Baumhoer
- Bone Tumour Reference Centre, Institute of Pathology, University Hospital Basel, Basel, Switzerland
| | - J L Hans Bloem
- Department of Radiology, Leiden University Medical Center, Leiden, The Netherlands
| | - Julia A Bridge
- Division of Molecular Pathology, ProPath, Dallas, TX, USA; Department of Pathology and Microbiology, University of Nebraska Medical Center, Omaha, NE, USA
| | - Justin M M Cates
- Department of Pathology, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Enrique de Alava
- Institute of Biomedicine of Sevilla (IBiS), Virgen del Rocio University Hospital, CSIC, University of Seville, Seville, Spain; Department of Normal and Pathological Cytology and Histology, School of Medicine, University of Seville, Seville, Spain
| | - Angelo Paolo Dei Tos
- Department of Pathology, Azienda Ospedaliera Universitaria di Padova, Padova, Italy; Department of Medicine, University of Padua, School of Medicine, Padua, Italy
| | - Kevin B Jones
- Department of Orthopaedics, Huntsman Cancer Institute, University of Utah School of Medicine, Salt Lake City, UT, USA; Department of Oncological Sciences, Huntsman Cancer Institute, University of Utah School of Medicine, Salt Lake City, UT, USA
| | - Annabelle Mahar
- Department of Tissue Pathology and Diagnostic Oncology, Royal Prince Alfred Hospital, Camperdown, NSW, Australia
| | - G Petur Nielsen
- Department of Pathology, Massachusetts General Hospital, Boston, MA, USA; Harvard Medical School, Boston, MA, USA
| | - Alberto Righi
- Department of Pathology, IRCCS Istituto Ortopedico Rizzoli, Bologna, Italy
| | - Andrew J Wagner
- Harvard Medical School, Boston, MA, USA; Center for Sarcoma and Bone Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Akihiko Yoshida
- Department of Diagnostic Pathology, National Cancer Center Hospital, Tokyo, Japan; Rare Cancer Center, National Cancer Center Hospital, Tokyo, Japan
|
26
|
Moshawrab M, Adda M, Bouzouane A, Ibrahim H, Raad A. Reviewing Federated Machine Learning and Its Use in Diseases Prediction. Sensors (Basel, Switzerland) 2023; 23:s23042112. [PMID: 36850717 PMCID: PMC9958993 DOI: 10.3390/s23042112] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 01/20/2023] [Revised: 02/04/2023] [Accepted: 02/09/2023] [Indexed: 05/31/2023]
Abstract
Machine learning (ML) has succeeded in improving our daily routines by enabling automation and improved decision making in a variety of industries such as healthcare, finance, and transportation, resulting in increased efficiency and productivity. However, the development and widespread use of this technology has been significantly hampered by concerns about data privacy, confidentiality, and sensitivity, particularly in healthcare and finance. These concerns arise because of the "data hunger" of ML: additional data can increase performance and accuracy. Federated learning (FL) has emerged as a technology that helps solve the privacy problem by eliminating the need to send data to a central server where it would be collected, processed, and used to train the model. To maintain privacy and improve model performance, FL shares parameters rather than data during training, in contrast to the typical ML practice of sending user data during model development. Although FL is still in its infancy, there are already applications in various industries such as healthcare, finance, and transportation. In addition, 32% of companies have implemented or plan to implement federated learning in the next 12-24 months, according to the latest figures from KPMG, which forecasts an increase in investment in this area from USD 107 million in 2020 to USD 538 million in 2025. In this context, this article reviews federated learning, describes it technically, differentiates it from other technologies, and discusses current FL aggregation algorithms. It also discusses the use of FL in the diagnosis of cardiovascular disease, diabetes, and cancer. Finally, the problems hindering progress in this area and future strategies to overcome these limitations are discussed in detail.
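Illustrative note: the idea that FL shares parameters rather than data can be sketched with a weighted parameter average in the style of federated averaging (FedAvg). This is only a minimal sketch under stated assumptions, not an aggregation algorithm taken from the reviewed article: the function name `federated_average`, the flat list representation of model parameters, and the weighting by local sample count are all hypothetical choices made for illustration.

```python
from typing import List, Sequence


def federated_average(
    client_params: Sequence[Sequence[float]],
    client_sizes: Sequence[int],
) -> List[float]:
    """Combine locally trained parameter vectors into one global vector.

    Only parameters and sample counts reach the aggregator; the raw
    training records never leave the clients. (Hypothetical helper.)
    """
    if not client_params or len(client_params) != len(client_sizes):
        raise ValueError("need one sample count per client parameter vector")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(client_params[0])
    global_params = [0.0] * dim
    for params, size in zip(client_params, client_sizes):
        if len(params) != dim:
            raise ValueError("all parameter vectors must have the same length")
        # Each client contributes in proportion to its local sample count.
        for i, value in enumerate(params):
            global_params[i] += value * size / total
    return global_params


if __name__ == "__main__":
    # Two hypothetical sites: one with 100 local records, one with 300.
    site_a = [1.0, 2.0]
    site_b = [3.0, 4.0]
    print(federated_average([site_a, site_b], [100, 300]))
```

In a full training loop, the averaged vector would be sent back to the clients as the starting point for the next round of local training; that loop is omitted here.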
Affiliation(s)
- Mohammad Moshawrab
- Département de Mathématiques, Informatique et Génie, Université du Québec à Rimouski, 300 Allée des Ursulines, Rimouski, QC G5L 3A1, Canada
| | - Mehdi Adda
- Département de Mathématiques, Informatique et Génie, Université du Québec à Rimouski, 300 Allée des Ursulines, Rimouski, QC G5L 3A1, Canada
| | - Abdenour Bouzouane
- Département d’Informatique et de Mathématique, Université du Québec à Chicoutimi, 555 Boulevard de l’Université, Chicoutimi, QC G7H 2B1, Canada
| | - Hussein Ibrahim
- Institut Technologique de Maintenance Industrielle, 175 Rue de la Vérendrye, Sept-Îles, QC G4R 5B7, Canada
| | - Ali Raad
- Faculty of Arts & Sciences, Islamic University of Lebanon, Wardaniyeh P.O. Box 30014, Lebanon
|
27
|
Personal Health Train Architecture with Dynamic Cloud Staging. SN Computer Science 2023; 4:14. [PMID: 36274815 PMCID: PMC9574821 DOI: 10.1007/s42979-022-01422-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/17/2022] [Accepted: 09/16/2022] [Indexed: 11/05/2022]
Abstract
Scientific advances, especially in the healthcare domain, can be accelerated by making data available for analysis. However, in traditional data analysis systems, data need to be moved to a central processing unit that performs analyses, which may be undesirable, e.g. due to privacy regulations in case these data contain personal information. This paper discusses the Personal Health Train (PHT) approach, in which data processing is brought to the (personal health) data rather than the other way around, allowing access to (private) data to be controlled and ethical and legal concerns to be observed. This paper introduces the PHT architecture and discusses the data staging solution that allows processing to be delegated to components spawned in a private cloud environment in case the (health) organisation hosting the data has limited resources to execute the required processing. This paper shows the feasibility and suitability of the solution with a relatively simple, yet representative, case study of data analysis of Covid-19 infections, which is performed by components that are created on demand and run on the Amazon Web Services platform. This paper also shows that the performance of our solution is acceptable, and that our solution is scalable. This paper demonstrates that the PHT approach enables data analysis with controlled access, preserving privacy and complying with regulations such as GDPR, while the solution is deployed in a private cloud environment.
|
28
|
Nguyen TX, Ran AR, Hu X, Yang D, Jiang M, Dou Q, Cheung CY. Federated Learning in Ocular Imaging: Current Progress and Future Direction. Diagnostics (Basel) 2022; 12:2835. [PMID: 36428895 PMCID: PMC9689273 DOI: 10.3390/diagnostics12112835] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 10/31/2022] [Revised: 11/11/2022] [Accepted: 11/14/2022] [Indexed: 11/18/2022] Open
Abstract
Advances in artificial intelligence, particularly deep learning (DL), have had a tremendous impact on the field of ocular imaging over the last few years. Specifically, DL has been utilised to detect and classify various ocular diseases on retinal photographs, optical coherence tomography (OCT) images, and OCT-angiography images. In order to achieve good robustness and generalisability of model performance, DL training strategies traditionally require extensive and diverse training datasets from various sites to be transferred and pooled into a "centralised location". However, such a data transfer process could raise practical concerns related to data security and patient privacy. Federated learning (FL) is a distributed collaborative learning paradigm which enables the coordination of multiple collaborators without the need for sharing confidential data. This distributed training approach has great potential to ensure data privacy among different institutions and reduce the potential risk of data leakage from data pooling or centralisation. This review article aims to introduce the concept of FL, provide current evidence of FL in ocular imaging, and discuss potential challenges as well as future applications.
Affiliation(s)
- Truong X. Nguyen
- Department of Ophthalmology and Visual Sciences, The Chinese University of Hong Kong, Hong Kong SAR, China
| | - An Ran Ran
- Department of Ophthalmology and Visual Sciences, The Chinese University of Hong Kong, Hong Kong SAR, China
| | - Xiaoyan Hu
- Department of Ophthalmology and Visual Sciences, The Chinese University of Hong Kong, Hong Kong SAR, China
| | - Dawei Yang
- Department of Ophthalmology and Visual Sciences, The Chinese University of Hong Kong, Hong Kong SAR, China
| | - Meirui Jiang
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong SAR, China
| | - Qi Dou
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong SAR, China
| | - Carol Y. Cheung
- Department of Ophthalmology and Visual Sciences, The Chinese University of Hong Kong, Hong Kong SAR, China
|
29
|
Rønn Hansen C, Price G, Field M, Sarup N, Zukauskaite R, Johansen J, Eriksen JG, Aly F, McPartlin A, Holloway L, Thwaites D, Brink C. Larynx cancer survival model developed through open-source federated learning. Radiother Oncol 2022; 176:179-186. [PMID: 36208652 DOI: 10.1016/j.radonc.2022.09.023] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 05/30/2022] [Revised: 08/12/2022] [Accepted: 09/28/2022] [Indexed: 12/14/2022]
Abstract
INTRODUCTION Federated learning has the potential to perform analysis on decentralised data; however, survival analyses face obstacles because of the risk of data leakage. This study demonstrates how to perform a stratified Cox regression survival analysis, specifically designed to avoid data leakage, using federated learning on larynx cancer patients from centres in three different countries. METHODS Data were obtained from 1821 larynx cancer patients treated with radiotherapy in three centres. Tumour volume was available for all 786 of the included patients. Parameter selection among eleven clinical and radiotherapy parameters was performed using best subset selection and cross-validation through the federated learning system, AusCAT. After parameter selection, β regression coefficients were estimated using bootstrap. Calibration plots were generated at 2- and 5-year survival, and inner and outer risk groups' Kaplan-Meier curves were compared to the Cox model prediction. RESULTS The best performing Cox model included log(GTV), performance status, age, smoking, haemoglobin and N-classification; however, the simplest model with similar statistical prediction power included log(GTV) and performance status only. The Harrell C-indices for the simplest model were 0.75 [0.71-0.78], 0.65 [0.59-0.71] and 0.69 [0.59-0.77] for Odense, Christie and Liverpool, respectively. The values are slightly higher for the full model, with C-indices of 0.77 [0.74-0.80], 0.67 [0.62-0.73] and 0.71 [0.61-0.80], respectively. A patient smoking during treatment has the same hazard as a nonsmoking patient ten years older. CONCLUSION Without any patient-specific data leaving the hospitals, a stratified Cox regression model based on data from centres in three countries was developed without data leakage risks. The overall survival model is primarily driven by tumour volume and performance status.
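Illustrative note: the Harrell C-index reported per centre above measures how often a model ranks patients' risks in the same order as their observed survival. The following is a minimal sketch of that concordance calculation, assuming a simplified definition that ignores pairs with tied survival times; it is not the AusCAT implementation, and the function name `harrell_c_index` and the toy inputs are hypothetical.

```python
from typing import Sequence


def harrell_c_index(
    times: Sequence[float],
    events: Sequence[bool],
    risk_scores: Sequence[float],
) -> float:
    """Fraction of comparable patient pairs that the risk score orders correctly.

    A pair (i, j) is comparable when patient i had an observed event strictly
    before patient j's follow-up time. It is concordant when patient i also
    has the higher predicted risk; tied risk scores count as half.
    """
    if not (len(times) == len(events) == len(risk_scores)):
        raise ValueError("inputs must have the same length")
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


if __name__ == "__main__":
    # Toy data: four patients, the third one censored.
    follow_up = [1.0, 2.0, 3.0, 4.0]
    observed = [True, True, False, True]
    print(harrell_c_index(follow_up, observed, [4.0, 3.0, 2.0, 1.0]))
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which gives a scale for reading the per-centre values quoted in the abstract.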
Affiliation(s)
- Christian Rønn Hansen
- Laboratory of Radiation Physics, Odense University Hospital, Odense, Denmark; Department of Clinical Research, University of Southern Denmark, Odense, Denmark; Danish Centre for Particle Therapy, Aarhus University Hospital, Denmark; Institute of Medical Physics, School of Physics, University of Sydney, Sydney, Australia.
| | - Gareth Price
- Radiotherapy department, The Christie NHS Foundation Trust, Manchester, United Kingdom
| | - Matthew Field
- Ingham Institute for Applied Medical Research, Sydney, Australia
| | - Nis Sarup
- Laboratory of Radiation Physics, Odense University Hospital, Odense, Denmark
| | - Ruta Zukauskaite
- Department of Clinical Research, University of Southern Denmark, Odense, Denmark; Department of Oncology, Odense University Hospital, Odense, Denmark
| | - Jørgen Johansen
- Department of Oncology, Odense University Hospital, Odense, Denmark
| | - Jesper Grau Eriksen
- Department of Oncology, Odense University Hospital, Odense, Denmark; Department of Experimental Clinical Oncology, Aarhus University Hospital, Denmark; Department of Oncology, Aarhus University Hospital, Denmark
| | - Farhannah Aly
- Ingham Institute for Applied Medical Research, Sydney, Australia; Southwest Sydney Clinical Campus, University of New South Wales, Sydney, Australia; Liverpool and Macarthur Cancer Therapy Centres, Sydney, Australia
| | - Andrew McPartlin
- Radiotherapy department, The Christie NHS Foundation Trust, Manchester, United Kingdom
| | - Lois Holloway
- Institute of Medical Physics, School of Physics, University of Sydney, Sydney, Australia; Ingham Institute for Applied Medical Research, Sydney, Australia; Southwest Sydney Clinical Campus, University of New South Wales, Sydney, Australia; Liverpool and Macarthur Cancer Therapy Centres, Sydney, Australia
| | - David Thwaites
- Institute of Medical Physics, School of Physics, University of Sydney, Sydney, Australia
| | - Carsten Brink
- Laboratory of Radiation Physics, Odense University Hospital, Odense, Denmark; Department of Clinical Research, University of Southern Denmark, Odense, Denmark
| |
|
30
|
Federated learning review: Fundamentals, enabling technologies, and future applications. Inf Process Manag 2022. [DOI: 10.1016/j.ipm.2022.103061] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/25/2022]
|
31
|
Zhang P, Kamel Boulos MN. Privacy-by-Design Environments for Large-Scale Health Research and Federated Learning from Data. INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH 2022; 19:11876. [PMID: 36231175 PMCID: PMC9565554 DOI: 10.3390/ijerph191911876] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 08/15/2022] [Revised: 09/07/2022] [Accepted: 09/16/2022] [Indexed: 06/16/2023]
Abstract
This article offers a brief overview of 'privacy-by-design (or data-protection-by-design) research environments', namely Trusted Research Environments (TREs, most commonly used in the United Kingdom) and Personal Health Trains (PHTs, most commonly used in mainland Europe). These secure environments are designed to enable the safe analysis of multiple, linked (and often big) data sources, including sensitive personal data and data owned by, and distributed across, different institutions. They take data protection and privacy requirements into account from the very start (conception phase, during system design) rather than as an afterthought or 'patch' implemented at a later stage on top of an existing environment. TREs and PHTs are becoming increasingly important for conducting large-scale privacy-preserving health research and for enabling federated learning and discoveries from big healthcare datasets. The paper also presents select examples of successful TRE and PHT implementations and of large-scale studies that used them.
Affiliation(s)
- Peng Zhang
- Data Science Institute & Department of Computer Science, Vanderbilt University, Nashville, TN 37240, USA
| | | |
|
32
|
Sun C, van Soest J, Koster A, Eussen SJPM, Schram MT, Stehouwer CDA, Dagnelie PC, Dumontier M. Studying the association of diabetes and healthcare cost on distributed data from the Maastricht Study and Statistics Netherlands using a privacy-preserving federated learning infrastructure. J Biomed Inform 2022; 134:104194. [PMID: 36064113 DOI: 10.1016/j.jbi.2022.104194] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/25/2022] [Revised: 08/26/2022] [Accepted: 08/29/2022] [Indexed: 11/28/2022]
Abstract
The mining of personal data collected by multiple organizations remains challenging in the presence of technical barriers, privacy concerns, and legal and/or organizational restrictions. While a number of privacy-preserving and data mining frameworks have recently emerged, much remains to show their practical utility. In this study, we implement and utilize a secure infrastructure using data from Statistics Netherlands and the Maastricht Study to learn the association between Type 2 Diabetes Mellitus (T2DM) and healthcare expenses considering the impact of lifestyle, physical activities, and complications of T2DM. Through experiments using real-world distributed personal data, we present the feasibility and effectiveness of the secure infrastructure for practical use cases of linking and analyzing vertically partitioned data across multiple organizations. We discovered that individuals diagnosed with T2DM had significantly higher expenses than those with prediabetes, while participants with prediabetes spent more than those without T2DM in all the included healthcare categories to different degrees. We further discuss a joint effort from technical, ethical-legal, and domain-specific experts that is highly valued for applying such a secure infrastructure to real-life use cases to protect data privacy.
Affiliation(s)
- Chang Sun
- Institute of Data Science, Faculty of Science and Engineering, Maastricht University, Maastricht, The Netherlands.
| | - Johan van Soest
- Brightlands Institute of Smart Society, Faculty of Science and Engineering, Maastricht University, Heerlen, The Netherlands
| | - Annemarie Koster
- Department of Social Medicine, Faculty of Health, Medicine and Life Sciences, Maastricht University, Maastricht, The Netherlands; Care and Public Health Research Institute, Faculty of Health, Medicine and Life Sciences, Maastricht University, Maastricht, The Netherlands
| | - Simone J P M Eussen
- School for Cardiovascular Diseases, Faculty of Health, Medicine and Life Sciences, Maastricht University, Maastricht, The Netherlands; Department of Epidemiology, Faculty of Health, Medicine and Life Sciences, Maastricht University, Maastricht, The Netherlands
| | - Miranda T Schram
- School for Cardiovascular Diseases, Faculty of Health, Medicine and Life Sciences, Maastricht University, Maastricht, The Netherlands; Department of Internal Medicine, Maastricht University Medical Centre+, Maastricht, The Netherlands; School for Mental Health and Neuroscience, Faculty of Health, Medicine and Life Sciences, Maastricht University, Maastricht, The Netherlands; Maastricht Heart & Vascular Center, Maastricht University Medical Center+, Maastricht, The Netherlands
| | - Coen D A Stehouwer
- School for Cardiovascular Diseases, Faculty of Health, Medicine and Life Sciences, Maastricht University, Maastricht, The Netherlands; Department of Internal Medicine, Maastricht University Medical Centre+, Maastricht, The Netherlands
| | - Pieter C Dagnelie
- School for Cardiovascular Diseases, Faculty of Health, Medicine and Life Sciences, Maastricht University, Maastricht, The Netherlands; Department of Internal Medicine, Maastricht University Medical Centre+, Maastricht, The Netherlands
| | - Michel Dumontier
- Institute of Data Science, Faculty of Science and Engineering, Maastricht University, Maastricht, The Netherlands
| |
|
33
|
Field M, I Thwaites D, Carolan M, Delaney GP, Lehmann J, Sykes J, Vinod S, Holloway L. Infrastructure platform for privacy-preserving distributed machine learning development of computer-assisted theragnostics in cancer. J Biomed Inform 2022; 134:104181. [PMID: 36055639 DOI: 10.1016/j.jbi.2022.104181] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/13/2022] [Revised: 04/29/2022] [Accepted: 08/20/2022] [Indexed: 11/26/2022]
Abstract
INTRODUCTION Emerging evidence suggests that data-driven support tools have found their way into clinical decision-making in a number of areas, including cancer care. Improving them and widening their scope of availability in various differing clinical scenarios, including for prognostic models derived from retrospective data, requires co-ordinated data sharing between clinical centres, secondary analyses of large multi-institutional clinical trial data, or distributed (federated) learning infrastructures. A systematic approach to utilizing routinely collected data across cancer care clinics remains a significant challenge due to privacy, administrative and political barriers. METHODS An information technology infrastructure and web service software was developed and implemented which uses machine learning to construct clinical decision support systems in a privacy-preserving manner across datasets geographically distributed in different hospitals. The infrastructure was deployed in a network of Australian hospitals. A harmonized, international ontology-linked, set of lung cancer databases were built with the routine clinical and imaging data at each centre. The infrastructure was demonstrated with the development of logistic regression models to predict major cardiovascular events following radiation therapy. RESULTS The infrastructure implemented forms the basis of the Australian computer-assisted theragnostics (AusCAT) network for radiation oncology data extraction, reporting and distributed learning. Four radiation oncology departments (across seven hospitals) in New South Wales (NSW) participated in this demonstration study. Infrastructure was deployed at each centre and used to develop a model predicting for cardiovascular admission within a year of receiving curative radiotherapy for non-small cell lung cancer. A total of 10417 lung cancer patients were identified with 802 being eligible for the model. 
Twenty features were chosen for analysis from the clinical record and linked registries. After selection, 8 features were included and a logistic regression model achieved an area under the receiver operating characteristic (AUROC) curve of 0.70 and C-index of 0.65 on out-of-sample data. CONCLUSION The infrastructure developed was demonstrated to be usable in practice between clinical centres to harmonize routinely collected oncology data and develop models with federated learning. It provides a promising approach to enable further research studies in radiation oncology using real world clinical data.
Affiliation(s)
- Matthew Field
- South Western Sydney Clinical Campus, School of Clinical Medicine, University of New South Wales, NSW, Australia; South Western Sydney Cancer Services, NSW Health, Sydney, NSW, Australia; Ingham Institute for Applied Medical Research, Liverpool, NSW, Australia.
| | - David I Thwaites
- Institute of Medical Physics, School of Physics, University of Sydney, NSW, Australia
| | - Martin Carolan
- Illawarra Cancer Care Centre, Wollongong, NSW, Australia
| | - Geoff P Delaney
- South Western Sydney Clinical Campus, School of Clinical Medicine, University of New South Wales, NSW, Australia; South Western Sydney Cancer Services, NSW Health, Sydney, NSW, Australia; Ingham Institute for Applied Medical Research, Liverpool, NSW, Australia
| | - Joerg Lehmann
- Institute of Medical Physics, School of Physics, University of Sydney, NSW, Australia; Department of Radiation Oncology, Calvary Mater Newcastle, NSW, Australia
| | - Jonathan Sykes
- Institute of Medical Physics, School of Physics, University of Sydney, NSW, Australia; Blacktown Haematology and Oncology Cancer Care Centre, Blacktown Hospital, Blacktown, NSW, Australia; Crown Princess Mary Cancer Centre, Westmead Hospital, Westmead, NSW, Australia
| | - Shalini Vinod
- South Western Sydney Clinical Campus, School of Clinical Medicine, University of New South Wales, NSW, Australia; South Western Sydney Cancer Services, NSW Health, Sydney, NSW, Australia; Ingham Institute for Applied Medical Research, Liverpool, NSW, Australia
| | - Lois Holloway
- South Western Sydney Clinical Campus, School of Clinical Medicine, University of New South Wales, NSW, Australia; South Western Sydney Cancer Services, NSW Health, Sydney, NSW, Australia; Ingham Institute for Applied Medical Research, Liverpool, NSW, Australia; Institute of Medical Physics, School of Physics, University of Sydney, NSW, Australia
| |
|
34
|
Theophanous S, Lønne PI, Choudhury A, Berbee M, Dekker A, Dennis K, Dewdney A, Gambacorta MA, Gilbert A, Guren MG, Holloway L, Jadon R, Kochhar R, Mohamed AA, Muirhead R, Parés O, Raszewski L, Roy R, Scarsbrook A, Sebag-Montefiore D, Spezi E, Spindler KLG, van Triest B, Vassiliou V, Malinen E, Wee L, Appelt AL. Development and validation of prognostic models for anal cancer outcomes using distributed learning: protocol for the international multi-centre atomCAT2 study. Diagn Progn Res 2022; 6:14. [PMID: 35922837 PMCID: PMC9351222 DOI: 10.1186/s41512-022-00128-8] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 03/11/2022] [Accepted: 06/09/2022] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Anal cancer is a rare cancer with rising incidence. Despite the relatively good outcomes conferred by state-of-the-art chemoradiotherapy, further improving disease control and reducing toxicity has proven challenging. Developing and validating prognostic models using routinely collected data may provide new insights for treatment development and selection. However, due to the rarity of the cancer, it can be difficult to obtain sufficient data, especially from single centres, to develop and validate robust models. Moreover, multi-centre model development is hampered by ethical barriers and data protection regulations that often limit accessibility to patient data. Distributed (or federated) learning allows models to be developed using data from multiple centres without any individual-level patient data leaving the originating centre, therefore preserving patient data privacy. This work builds on the proof-of-concept three-centre atomCAT1 study and describes the protocol for the multi-centre atomCAT2 study, which aims to develop and validate robust prognostic models for three clinically important outcomes in anal cancer following chemoradiotherapy. METHODS This is a retrospective multi-centre cohort study, investigating overall survival, locoregional control and freedom from distant metastasis after primary chemoradiotherapy for anal squamous cell carcinoma. Patient data will be extracted and organised at each participating radiotherapy centre (n = 18). Candidate prognostic factors have been identified through literature review and expert opinion. Summary statistics will be calculated and exchanged between centres prior to modelling. The primary analysis will involve developing and validating Cox proportional hazards models across centres for each outcome through distributed learning. Outcomes at specific timepoints of interest and factor effect estimates will be reported, allowing for outcome prediction for future patients. 
DISCUSSION The atomCAT2 study will analyse one of the largest available cross-institutional cohorts of patients with anal cancer treated with chemoradiotherapy. The analysis aims to provide information on current international clinical practice outcomes and may aid the personalisation and design of future anal cancer clinical trials through contributing to a better understanding of patient risk stratification.
Affiliation(s)
- Stelios Theophanous
- Leeds Institute of Medical Research at St James's, University of Leeds, Leeds, UK.
| | - Per-Ivar Lønne
- Department of Medical Physics, Oslo University Hospital, Oslo, Norway
| | - Ananya Choudhury
- MAASTRO (Dept of Radiotherapy), GROW School of Oncology and Developmental Biology, Maastricht University and Maastricht University Medical Centre+, P. Debyelaan 25, 6229, Maastricht, Netherlands
| | - Maaike Berbee
- MAASTRO (Dept of Radiotherapy), GROW School of Oncology and Developmental Biology, Maastricht University and Maastricht University Medical Centre+, P. Debyelaan 25, 6229, Maastricht, Netherlands
| | - Andre Dekker
- MAASTRO (Dept of Radiotherapy), GROW School of Oncology and Developmental Biology, Maastricht University and Maastricht University Medical Centre+, P. Debyelaan 25, 6229, Maastricht, Netherlands
| | - Alexandra Gilbert
- Leeds Institute of Medical Research at St James's, University of Leeds, Leeds, UK
| | - Marianne Grønlie Guren
- Department of Oncology, Oslo University Hospital, and Institute of Clinical Medicine, University of Oslo, Oslo, Norway
| | - Lois Holloway
- Ingham Research Institute and Liverpool Hospital, Liverpool, New South Wales, Australia
| | - Rajarshi Roy
- Hull University Teaching Hospitals NHS Trust, Hull, UK
| | - Andrew Scarsbrook
- Leeds Institute of Medical Research at St James's, University of Leeds, Leeds, UK
- Leeds Teaching Hospitals NHS Trust, Leeds, UK
| | - Baukelien van Triest
- The Netherlands Cancer Institute-Antoni van Leeuwenhoek (NKI-AVL), Amsterdam, The Netherlands
| | - Eirik Malinen
- Department of Medical Physics, Oslo University Hospital, Oslo, Norway
| | - Leonard Wee
- MAASTRO (Dept of Radiotherapy), GROW School of Oncology and Developmental Biology, Maastricht University and Maastricht University Medical Centre+, P. Debyelaan 25, 6229, Maastricht, Netherlands
| | - Ane L Appelt
- Leeds Institute of Medical Research at St James's, University of Leeds, Leeds, UK
- Leeds Teaching Hospitals NHS Trust, Leeds, UK
| |
|
35
|
Huang B, Sollee J, Luo YH, Reddy A, Zhong Z, Wu J, Mammarappallil J, Healey T, Cheng G, Azzoli C, Korogodsky D, Zhang P, Feng X, Li J, Yang L, Jiao Z, Bai HX. Prediction of lung malignancy progression and survival with machine learning based on pre-treatment FDG-PET/CT. EBioMedicine 2022; 82:104127. [PMID: 35810561 PMCID: PMC9278031 DOI: 10.1016/j.ebiom.2022.104127] [Citation(s) in RCA: 49] [Impact Index Per Article: 16.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/25/2021] [Revised: 05/16/2022] [Accepted: 06/09/2022] [Indexed: 12/02/2022] Open
Abstract
BACKGROUND Pre-treatment FDG-PET/CT scans were analyzed with machine learning to predict progression of lung malignancies and overall survival (OS). METHODS A retrospective review across three institutions identified patients with a pre-procedure FDG-PET/CT and an associated malignancy diagnosis. Lesions were manually and automatically segmented, and convolutional neural networks (CNNs) were trained using FDG-PET/CT inputs to predict malignancy progression. Performance was evaluated using area under the receiver operating characteristic curve (AUC), accuracy, sensitivity, and specificity. Image features were extracted from CNNs and by radiomics feature extraction, and random survival forests (RSF) were constructed to predict OS. Concordance index (C-index) and integrated brier score (IBS) were used to evaluate OS prediction. FINDINGS 1168 nodules (n=965 patients) were identified. 792 nodules had progression and 376 were progression-free. The most common malignancies were adenocarcinoma (n=740) and squamous cell carcinoma (n=179). For progression risk, the PET+CT ensemble model with manual segmentation (accuracy=0.790, AUC=0.876) performed similarly to the CT only (accuracy=0.723, AUC=0.888) and better compared to the PET only (accuracy=0.664, AUC=0.669) models. For OS prediction with deep learning features, the PET+CT+clinical RSF ensemble model (C-index=0.737) performed similarly to the CT only (C-index=0.730) and better than the PET only (C-index=0.595), and clinical only (C-index=0.595) models. RSF models constructed with radiomics features had comparable performance to those with CNN features. INTERPRETATION CNNs trained using pre-treatment FDG-PET/CT and extracted performed well in predicting lung malignancy progression and OS. OS prediction performance with CNN features was comparable to a radiomics approach. The prognostic models could inform treatment options and improve patient care. FUNDING NIH NHLBI training grant (5T35HL094308-12, John Sollee).
Affiliation(s)
- Brian Huang
- Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA 19104, USA
| | - John Sollee
- Warren Alpert Medical School of Brown University, Providence, RI 02903, USA
- Department of Diagnostic Radiology, Rhode Island Hospital, 593 Eddy St. Providence, Providence, RI 02903, USA
| | - Yong-Heng Luo
- Department of Radiology, The Second Xiangya Hospital of Central South University, Changsha, Hunan 410011, China
| | - Ashwin Reddy
- Warren Alpert Medical School of Brown University, Providence, RI 02903, USA
- Department of Diagnostic Radiology, Rhode Island Hospital, 593 Eddy St. Providence, Providence, RI 02903, USA
| | - Zhusi Zhong
- School of Electronic Engineering, Xidian University, Xi'an 710071, China
| | - Jing Wu
- Department of Radiology, The Second Xiangya Hospital of Central South University, Changsha, Hunan 410011, China
| | - Joseph Mammarappallil
- Department of Diagnostic Radiology, Duke University School of Medicine, Durham, NC 27708, USA
| | - Terrance Healey
- Warren Alpert Medical School of Brown University, Providence, RI 02903, USA
- Department of Diagnostic Radiology, Rhode Island Hospital, 593 Eddy St. Providence, Providence, RI 02903, USA
| | - Gang Cheng
- Department of Diagnostic Radiology, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Christopher Azzoli
- Department of Thoracic Oncology, Rhode Island Hospital, Providence, RI 02903, USA
| | - Dana Korogodsky
- Warren Alpert Medical School of Brown University, Providence, RI 02903, USA
| | - Paul Zhang
- Department of Pathology and Laboratory Medicine, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Xue Feng
- Carina Medical Inc., Lexington, KY 40507, USA
| | - Jie Li
- School of Electronic Engineering, Xidian University, Xi'an 710071, China
| | - Li Yang
- Department of Neurology, The Second Xiangya Hospital of Central South University, Changsha, Hunan 410011, China
| | - Zhicheng Jiao
- Warren Alpert Medical School of Brown University, Providence, RI 02903, USA
- Department of Diagnostic Radiology, Rhode Island Hospital, 593 Eddy St. Providence, Providence, RI 02903, USA
| | - Harrison Xiao Bai
- Department of Radiology and Radiological Sciences, Johns Hopkins University, 601 N. Carolina St., Baltimore, MD 21287, USA
| |
|
36
|
Hansen CR, Price G, Field M, Sarup N, Zukauskaite R, Johansen J, Eriksen JG, Aly F, McPartlin A, Holloway L, Thwaites D, Brink C. Open-source distributed learning validation for a larynx cancer survival model following radiotherapy. Radiother Oncol 2022; 173:319-326. [PMID: 35738481 DOI: 10.1016/j.radonc.2022.06.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/14/2022] [Revised: 05/30/2022] [Accepted: 06/15/2022] [Indexed: 10/18/2022]
Abstract
INTRODUCTION Prediction models are useful to design personalised treatment. However, safe and effective implementation relies on external validation. Retrospective data are available in many institutions, but sharing between institutions can be challenging due to patient data sensitivity and governance or legal barriers. This study validates a larynx cancer survival model performed using distributed learning without any sensitive data leaving the institution. METHODS Open-source distributed learning software based on a stratified Cox proportional hazard model was developed and used to validate the Egelmeer et al. MAASTRO survival model across two hospitals in two countries. The validation optimised a single scaling parameter multiplied by the original predicted prognostic index. All analyses and figures were based on the distributed system, ensuring no information leakage from the individual centres. All applied software is provided as freeware to facilitate distributed learning in other institutions. RESULTS 1745 patients received radiotherapy for larynx cancer in the two centres from Jan 2005 to Dec 2018. Limiting to a maximum of one missing value in the parameters of the survival model reduced the cohort to 1095 patients. The Harrell C-index was 0.74 (CI95%, 0.71-0.76) and 0.70 (0.66-0.75) for the two centres. However, the model needed a scaling update. In addition, it was found that survival predictions of patients undergoing hypofractionation were less precise. CONCLUSION Open-source distributed learning software was able to validate, and suggest a minor update to the original survival model without central access to patient sensitive information. Even without the update, the original MAASTRO survival model of Egelmeer et al. performed reasonably well, providing similar results in this validation as in its original validation.
Affiliation(s)
- Christian Rønn Hansen
- Laboratory of Radiation Physics, Odense University Hospital, Denmark; Department of Clinical Research, University of Southern Denmark, Odense, Denmark; Danish Centre for Particle Therapy, Aarhus University Hospital, Denmark; Institute of Medical Physics, School of Physics, University of Sydney, Australia.
| | - Gareth Price
- Radiotherapy Department, The Christie NHS Foundation Trust, Manchester, United Kingdom
| | - Matthew Field
- Ingham Institute for Applied Medical Research, Sydney, Australia
| | - Nis Sarup
- Laboratory of Radiation Physics, Odense University Hospital, Denmark
| | - Ruta Zukauskaite
- Department of Clinical Research, University of Southern Denmark, Odense, Denmark; Department of Oncology, Odense University Hospital, Denmark
| | - Jesper Grau Eriksen
- Department of Oncology, Odense University Hospital, Denmark; Department of Experimental Clinical Oncology, Aarhus University Hospital, Denmark; Department of Oncology, Aarhus University Hospital, Denmark
| | - Farhannah Aly
- Ingham Institute for Applied Medical Research, Sydney, Australia; Southwest Sydney Clinical Campus, University of New South Wales, Sydney, Australia; Liverpool and Macarthur Cancer Therapy Centres, Sydney, Australia
| | - Andrew McPartlin
- Radiotherapy Department, The Christie NHS Foundation Trust, Manchester, United Kingdom
| | - Lois Holloway
- Institute of Medical Physics, School of Physics, University of Sydney, Australia; Ingham Institute for Applied Medical Research, Sydney, Australia; Southwest Sydney Clinical Campus, University of New South Wales, Sydney, Australia; Liverpool and Macarthur Cancer Therapy Centres, Sydney, Australia
| | - David Thwaites
- Institute of Medical Physics, School of Physics, University of Sydney, Australia
| | - Carsten Brink
- Laboratory of Radiation Physics, Odense University Hospital, Denmark; Department of Clinical Research, University of Southern Denmark, Odense, Denmark
| |
|
37
|
Turner RJ, Coenen F, Roelofs F, Hagoort K, Härmä A, Grünwald PD, Velders FP, Scheepers FE. Information extraction from free text for aiding transdiagnostic psychiatry: constructing NLP pipelines tailored to clinicians' needs. BMC Psychiatry 2022; 22:407. [PMID: 35715745 PMCID: PMC9206307 DOI: 10.1186/s12888-022-04058-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 01/12/2022] [Accepted: 06/10/2022] [Indexed: 12/02/2022] Open
Abstract
BACKGROUND Developing predictive models for precision psychiatry is challenging because of unavailability of the necessary data: extracting useful information from existing electronic health record (EHR) data is not straightforward, and available clinical trial datasets are often not representative of heterogeneous patient groups. The aim of this study was to construct a natural language processing (NLP) pipeline that extracts variables for building predictive models from EHRs. We specifically tailor the pipeline for extracting information on outcomes of psychiatry treatment trajectories, applicable throughout the entire spectrum of mental health disorders ("transdiagnostic"). METHODS A qualitative study into beliefs of clinical staff on measuring treatment outcomes was conducted to construct a candidate list of variables to extract from the EHR. To investigate if the proposed variables are suitable for measuring treatment effects, resulting themes were compared to transdiagnostic outcome measures currently used in psychiatry research and compared to the HDRS (as a gold standard) through systematic review, resulting in an ideal set of variables. To extract these from EHR data, a semi-rule based NLP pipeline was constructed and tailored to the candidate variables using Prodigy. Classification accuracy and F1-scores were calculated and pipeline output was compared to HDRS scores using clinical notes from patients admitted in 2019 and 2020. RESULTS Analysis of 34 questionnaires answered by clinical staff resulted in four themes defining treatment outcomes: symptom reduction, general well-being, social functioning and personalization. Systematic review revealed 242 different transdiagnostic outcome measures, with the 36-item Short-Form Survey for quality of life (SF36) being used most consistently, showing substantial overlap with the themes from the qualitative study. 
Comparing SF36 to HDRS scores in 26 studies revealed moderate to good correlations (0.62-0.79) and good positive predictive values (0.75-0.88). The NLP pipeline developed with notes from 22,170 patients reached an accuracy of 95 to 99 percent (F1 scores: 0.38 - 0.86) on detecting these themes, evaluated on data from 361 patients. CONCLUSIONS The NLP pipeline developed in this study extracts outcome measures from the EHR that cater specifically to the needs of clinical staff and align with outcome measures used to detect treatment effects in clinical trials.
Affiliation(s)
- Rosanne J. Turner
- University Medical Center Utrecht, Brain Center, Amsterdam, Netherlands; Machine Learning Group, CWI, Amsterdam, Netherlands
| | - Femke Coenen
- University Medical Center Utrecht, Brain Center, Amsterdam, Netherlands
| | - Femke Roelofs
- University Medical Center Utrecht, Brain Center, Amsterdam, Netherlands
| | - Karin Hagoort
- University Medical Center Utrecht, Brain Center, Amsterdam, Netherlands
| | - Aki Härmä
- Philips Research, Eindhoven, Netherlands
| | - Peter D. Grünwald
- Machine Learning Group, CWI, Amsterdam, Netherlands; Department of Mathematics, Leiden University, Leiden, Netherlands
| | - Fleur P. Velders
- University Medical Center Utrecht, Brain Center, Amsterdam, Netherlands
| | - Floortje E. Scheepers
- University Medical Center Utrecht, Brain Center, Amsterdam, Netherlands
|
38
|
Reps JM, Williams RD, Schuemie MJ, Ryan PB, Rijnbeek PR. Learning patient-level prediction models across multiple healthcare databases: evaluation of ensembles for increasing model transportability. BMC Med Inform Decis Mak 2022; 22:142. [PMID: 35614485 PMCID: PMC9134686 DOI: 10.1186/s12911-022-01879-6] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 12/30/2021] [Accepted: 05/13/2022] [Indexed: 11/29/2022] Open
Abstract
Background Accurate prognostic models could aid medical decision making. Large observational databases often contain temporal medical data for large and diverse populations of patients, and it may be possible to learn prognostic models from these data. However, the performance of a prognostic model often worsens when it is transported to a different database (or into a clinical setting). In this study we investigate different ensemble approaches that combine prognostic models independently developed using different databases (a simple federated learning approach) to determine whether ensembles that combine models developed across databases can improve model transportability (that is, perform better in new data than single database models). Methods For a given prediction question we independently trained five single database models, each using a different observational healthcare database. We then developed and investigated numerous ensemble models (fusion, stacking and mixture of experts) that combined the different database models. Performance of each model was investigated via discrimination and calibration using a leave-one-dataset-out technique, i.e., holding out one database for validation and using the remaining four datasets for model development. The internal validation of a model developed using the held-out database was calculated and presented as the ‘internal benchmark’ for comparison. Results In this study the fusion ensembles generally outperformed the single database models when transported to a previously unseen database and the performances were more consistent across unseen databases. Stacking ensembles performed poorly in terms of discrimination when the labels in the unseen database were limited. Calibration was consistently poor when both ensembles and single database models were applied to previously unseen databases. 
Conclusion A simple federated learning approach that implements ensemble techniques to combine models independently developed across different databases for the same prediction question may improve the discriminative performance in new data (new database or clinical setting) but will need to be recalibrated using the new data. This could help medical decision making by improving prognostic model performance.
Supplementary Information The online version contains supplementary material available at 10.1186/s12911-022-01879-6.
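The leave-one-database-out evaluation with a fusion ensemble can be sketched as follows. This is an illustrative toy under stated assumptions, not the authors' implementation: the "model" is a simple outcome rate per binary risk-factor level, fusion is taken to be an unweighted mean of the single-database predictions (one simple variant), and the database names and records are hypothetical.

```python
from statistics import mean


def train_prevalence_model(records):
    """Toy single-database 'model': outcome rate per risk-factor level (0 or 1).

    records is a list of (risk_factor, outcome) pairs; hypothetical stand-in
    for a real prognostic model trained on one database.
    """
    rates = {}
    for level in (0, 1):
        outcomes = [y for x, y in records if x == level]
        rates[level] = mean(outcomes) if outcomes else 0.0
    return lambda x: rates[x]


def fusion_predict(models, x):
    """Fusion ensemble (assumed unweighted): average the models' predictions."""
    return mean(model(x) for model in models)


def leave_one_database_out(databases):
    """Hold out each database in turn; build the ensemble from the others."""
    ensembles = {}
    for held_out in databases:
        models = [train_prevalence_model(recs)
                  for name, recs in databases.items() if name != held_out]
        # Bind the current model list as a default argument so each
        # ensemble keeps its own set of models.
        ensembles[held_out] = lambda x, models=models: fusion_predict(models, x)
    return ensembles


# Hypothetical data: three small databases of (risk_factor, outcome) pairs.
databases = {
    "db_a": [(0, 0), (0, 0), (1, 1), (1, 0)],
    "db_b": [(0, 0), (0, 1), (1, 1), (1, 1)],
    "db_c": [(0, 0), (0, 0), (1, 1), (1, 1)],
}
ensembles = leave_one_database_out(databases)
risk_exposed = ensembles["db_c"](1)    # ensemble of db_a and db_b models
risk_unexposed = ensembles["db_c"](0)
print(risk_exposed, risk_unexposed)
```

In a real evaluation the held-out database would then be used to compute discrimination and calibration of each ensemble, which this sketch leaves out.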
Affiliation(s)
- Ross D Williams
  - Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, The Netherlands
- Peter R Rijnbeek
  - Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, The Netherlands
|
39
|
Ankolekar A, van der Heijden B, Dekker A, Roumen C, De Ruysscher D, Reymen B, Berlanga A, Oberije C, Fijten R. Clinician perspectives on clinical decision support systems in lung cancer: Implications for shared decision-making. Health Expect 2022; 25:1342-1351. [PMID: 35535474 PMCID: PMC9327823 DOI: 10.1111/hex.13457] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 03/12/2021] [Revised: 01/28/2022] [Accepted: 02/07/2022] [Indexed: 11/27/2022] Open
Abstract
Background Lung cancer treatment decisions are typically made among clinical experts in a multidisciplinary tumour board (MTB) based on clinical data and guidelines. The rise of artificial intelligence and cultural shifts towards patient autonomy are changing the nature of clinical decision‐making towards personalized treatments. This can be supported by clinical decision support systems (CDSSs) that generate personalized treatment information as a basis for shared decision‐making (SDM). Little is known about lung cancer patients' treatment decisions and the potential for SDM supported by CDSSs. The aim of this study is to understand to what extent SDM is done in current practice and what clinicians need to improve it. Objective To explore (1) the extent to which patient preferences are taken into consideration in non‐small‐cell lung cancer (NSCLC) treatment decisions; (2) clinician perspectives on using CDSSs to support SDM. Design Mixed methods study consisting of a retrospective cohort study on patient deviation from MTB advice and reasons for deviation, qualitative interviews with lung cancer specialists and observations of MTB discussions and patient consultations. Setting and Participants NSCLC patients (N = 257) treated at a single radiotherapy clinic and nine lung cancer specialists from six Dutch clinics. Results We found a 10.9% (n = 28) deviation rate from MTB advice; 50% (n = 14) were due to patient preference, of which 85.7% (n = 12) chose a less intensive treatment than MTB advice. Current MTB recommendations are based on clinician experience, guidelines and patients' performance status. Most specialists (n = 7) were receptive towards CDSSs but cited barriers, such as lack of trust, lack of validation studies and time. CDSSs were considered valuable during MTB discussions rather than in consultations. Conclusion Lung cancer decisions are heavily influenced by clinical guidelines and experience, yet many patients prefer less intensive treatments. 
CDSSs can support SDM by presenting the harms and benefits of different treatment options rather than giving single treatment advice. External validation of CDSSs should be prioritized. Patient or Public Contribution This study did not involve patients or the public explicitly; however, the study design was informed by prior interviews with volunteers of a cancer patient advocacy group. The study objectives and data collection were supported by Dutch health care insurer CZ for a project titled ‘My Best Treatment’ that improves patient‐centeredness and the lung cancer patient pathway in the Netherlands.
Affiliation(s)
- Anshu Ankolekar
  - Department of Radiation Oncology (MAASTRO), GROW School for Oncology, Maastricht University Medical Center+, Maastricht, The Netherlands
- Britt van der Heijden
  - Department of Radiation Oncology (MAASTRO), GROW School for Oncology, Maastricht University Medical Center+, Maastricht, The Netherlands
- Andre Dekker
  - Department of Radiation Oncology (MAASTRO), GROW School for Oncology, Maastricht University Medical Center+, Maastricht, The Netherlands
- Cheryl Roumen
  - Department of Radiation Oncology (MAASTRO), GROW School for Oncology, Maastricht University Medical Center+, Maastricht, The Netherlands
- Dirk De Ruysscher
  - Department of Radiation Oncology (MAASTRO), GROW School for Oncology, Maastricht University Medical Center+, Maastricht, The Netherlands
- Bart Reymen
  - Department of Radiation Oncology (MAASTRO), GROW School for Oncology, Maastricht University Medical Center+, Maastricht, The Netherlands
- Adriana Berlanga
  - Department of Radiation Oncology (MAASTRO), GROW School for Oncology, Maastricht University Medical Center+, Maastricht, The Netherlands
- Cary Oberije
  - The D-Lab, GROW School for Oncology, Maastricht University Medical Center+, Maastricht University, Maastricht, The Netherlands
- Rianne Fijten
  - Department of Radiation Oncology (MAASTRO), GROW School for Oncology, Maastricht University Medical Center+, Maastricht, The Netherlands
|
40
|
Hunger M, Bardenheuer K, Passey A, Schade R, Sharma R, Hague C. The Value of Federated Data Networks in Oncology: What Research Questions Do They Answer? Outcomes From a Systematic Literature Review. VALUE IN HEALTH: THE JOURNAL OF THE INTERNATIONAL SOCIETY FOR PHARMACOECONOMICS AND OUTCOMES RESEARCH 2022; 25:855-868. [PMID: 35249830 DOI: 10.1016/j.jval.2021.11.1357] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/08/2021] [Revised: 10/22/2021] [Accepted: 11/14/2021] [Indexed: 06/14/2023]
Abstract
OBJECTIVES Real-world evidence (RWE) plays an important role in addressing key research questions of interest to healthcare decision makers. Federated data networks (FDNs) apply novel technology to enable the conduct of RWE studies with multiple partners, without the need to share the individual partner's data set. A systematic review of the published literature was performed to determine which types of research questions can best be addressed through FDNs, specifically in the field of oncology. METHODS Systematic searches of MEDLINE and Embase were undertaken to identify the types of research questions that had been addressed in studies using FDNs. Additional information was retrieved about study characteristics, statistical methods, and the FDN itself. RESULTS In total, 40 publications were included where research questions on the following had been addressed (multiple categories possible): disease natural history (58%), safety surveillance (18%), treatment pathways (15%), comparative effectiveness (10%), and cost/resource use studies (3%); 13% of studies had to be left uncategorized. A total of 50% of the studies were run with data partners in networks of ≤5. The size of the networks ranged from 227 patients to >5 million patients. Statistical methods used included distributed learning and distributed regression methods. CONCLUSIONS Further work is needed to raise awareness of the important role that FDNs can play in leveraging readily available RWE to address key research questions of interest in cancer and the benefits to the research community in engaging in federated data initiatives with a long-term perspective.
Affiliation(s)
- Matthias Hunger
  - ICON plc, Global Health Economics, Outcomes Research and Epidemiology, Dublin
- René Schade
  - ICON plc, Global Health Economics, Outcomes Research and Epidemiology, Dublin
- Ruchika Sharma
  - ICON plc, Global Health Economics, Outcomes Research and Epidemiology, Dublin
|
41
|
Crowson MG, Moukheiber D, Arévalo AR, Lam BD, Mantena S, Rana A, Goss D, Bates DW, Celi LA. A systematic review of federated learning applications for biomedical data. PLOS DIGITAL HEALTH 2022; 1:e0000033. [PMID: 36812504 PMCID: PMC9931322 DOI: 10.1371/journal.pdig.0000033] [Citation(s) in RCA: 36] [Impact Index Per Article: 12.0] [Received: 01/24/2022] [Accepted: 03/30/2022] [Indexed: 11/18/2022]
Abstract
OBJECTIVES Federated learning (FL) allows multiple institutions to collaboratively develop a machine learning algorithm without sharing their data. Organizations instead share model parameters only, allowing them to benefit from a model built with a larger dataset while maintaining the privacy of their own data. We conducted a systematic review to evaluate the current state of FL in healthcare and discuss the limitations and promise of this technology. METHODS We conducted a literature search using PRISMA guidelines. At least two reviewers assessed each study for eligibility and extracted a predetermined set of data. The quality of each study was determined using the TRIPOD guideline and PROBAST tool. RESULTS Thirteen studies were included in the full systematic review. Most were in the field of oncology (6 of 13; 46.2%), followed by radiology (5 of 13; 38.5%). The majority evaluated imaging results, performed a binary classification prediction task via offline learning (n = 12; 92.3%), and used a centralized topology with an aggregation server workflow (n = 10; 76.9%). Most studies were compliant with the major reporting requirements of the TRIPOD guidelines. In all, 6 of 13 studies (46.2%) were judged at high risk of bias using the PROBAST tool, and only 5 studies used publicly available data. CONCLUSION Federated learning is a growing field in machine learning with many promising uses in healthcare. Few studies have been published to date. Our evaluation found that investigators can do more to address the risk of bias and increase transparency by adding steps for data homogeneity or sharing required metadata and code.
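The idea of sharing model parameters instead of data, with a central aggregation server, can be illustrated with a sample-weighted parameter average (often called federated averaging). This is a minimal sketch under assumptions: the review does not prescribe this particular aggregation rule, and the function name, weights, and sample counts below are hypothetical.

```python
def federated_average(site_updates):
    """Aggregate locally trained parameters on a central server.

    site_updates: list of (weights, n_samples) pairs, one per institution,
    where weights is a list of floats of equal length across sites.
    Returns the sample-size-weighted mean of the weight vectors.
    Only parameters and sample counts are exchanged, never patient records.
    """
    total = sum(n for _, n in site_updates)
    dim = len(site_updates[0][0])
    return [
        sum(weights[i] * n for weights, n in site_updates) / total
        for i in range(dim)
    ]


# Hypothetical round: two hospitals report locally trained weights.
global_weights = federated_average([
    ([1.0, 2.0], 100),   # hospital A, 100 local samples
    ([3.0, 4.0], 300),   # hospital B, 300 local samples
])
print(global_weights)
```

In a full training loop the server would send `global_weights` back to each site for another round of local training; that loop is omitted here.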
Affiliation(s)
- Matthew G. Crowson
  - Department of Otolaryngology-Head & Neck Surgery, Massachusetts Eye & Ear, Boston, Massachusetts, United States of America
  - Department of Otolaryngology-Head & Neck Surgery, Harvard Medical School, Massachusetts, United States of America
- Dana Moukheiber
  - Laboratory for Computational Physiology, Massachusetts Institute of Technology, Cambridge, MA, United States of America
- Aldo Robles Arévalo
  - IDMEC, Instituto Superior Técnico, Universidade de Lisboa, Lisbon, Portugal
  - Data & Analytics, NTT DATA Portugal, Lisbon, Portugal
- Barbara D. Lam
  - Department of Hematology & Oncology, Beth Israel Deaconess Medical Center, Boston, Massachusetts, United States of America
- Sreekar Mantena
  - Harvard College, Boston, Massachusetts, United States of America
- Aakanksha Rana
  - Massachusetts Institute of Technology, Boston, Massachusetts, United States of America
- Deborah Goss
  - Department of Otolaryngology-Head & Neck Surgery, Massachusetts Eye & Ear, Boston, Massachusetts, United States of America
- David W. Bates
  - Division of General Internal Medicine and Primary Care, Brigham and Women’s Hospital, Boston, MA, United States of America
  - Department of Health Policy and Management, Harvard T. H. Chan School of Public Health, Boston, MA, United States of America
- Leo Anthony Celi
  - Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
  - Division of Pulmonary, Critical Care and Sleep Medicine, Beth Israel Deaconess Medical Center, Boston, Massachusetts, United States of America
|
42
|
Multi-Institutional Breast Cancer Detection Using a Secure On-Boarding Service for Distributed Analytics. APPLIED SCIENCES-BASEL 2022. [DOI: 10.3390/app12094336] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 02/05/2023]
Abstract
The steady rise of data-driven medicine as a valuable option to enhance daily clinical practice has brought new challenges for data analysts, who need access to valuable but sensitive data that is restricted by privacy considerations. Distributed Analytics (DA) infrastructures offer one solution to most of these challenges: they are technologies that foster collaboration between healthcare institutions by establishing a privacy-preserving network for data sharing. However, in order to participate in such a network, many technical and administrative prerequisites have to be met, which could pose bottlenecks and new obstacles for non-technical personnel during deployment. We have identified three major problems in the current state of the art, namely, the missing compliance with FAIR data principles, the automation of processes, and the installation. In this work, we present a seamless on-boarding workflow based on a DA reference architecture for data sharing institutions to address these problems. The on-boarding service manages all technical configurations and necessities to reduce the deployment time. Our aim is to use well-established and conventional technologies to gain acceptance through enhanced ease of use. We evaluate our development with six institutions across Germany by conducting a DA study with open-source breast cancer data, which represents the second contribution of this work. We find that our on-boarding solution lowers technical barriers and efficiently deploys all necessary components and is, therefore, indeed an enabler for collaborative data sharing.
|
43
|
Husson O, Reeve BB, Darlington AS, Cheung CK, Sodergren S, van der Graaf WTA, Salsman JM. Next Step for Global Adolescent and Young Adult Oncology: A Core Patient-Centered Outcome Set. J Natl Cancer Inst 2022; 114:496-502. [PMID: 34865066 PMCID: PMC9002284 DOI: 10.1093/jnci/djab217] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 07/12/2021] [Revised: 10/11/2021] [Accepted: 11/23/2021] [Indexed: 12/24/2022] Open
Abstract
The relatively small number of cancers in the adolescent and young adult (AYA) age group, those aged 15-39 years, does not appropriately reflect the personal and societal costs of cancer in this population, as reflected in the potential years of life lost or saved, the decreased productivity and health-related quality of life due to the impact of the disease during formative years, and long-term complications or disabilities. Improvements in care and outcomes for AYAs with cancer require collaboration among different stakeholders at different levels (patients, caregivers, health-care professionals, researchers, industry, and policymakers). Development of a Core Outcome Set (COS), an agreed minimum set of outcomes that should be measured globally in research and routine clinical practice, specifically for AYAs with cancer, with outcomes that are well defined based on the perspective of those who are affected and assessed with validated measures, is urgently required. A globally implemented COS for AYAs with cancer will facilitate better pooling of research data and the implementation of high-quality health-care registries, which, by benchmarking not only nationally but also internationally, may ultimately improve the value of the care given to these underserved young cancer patients. We reflect on the need to develop a COS for AYAs with cancer, the arenas of application, and the challenges of implementing an age-specific COS in research and clinical practice.
Affiliation(s)
- Olga Husson
  - Department of Medical Oncology, Netherlands Cancer Institute, Amsterdam, the Netherlands
  - Division of Clinical Studies, Institute of Cancer Research, London, UK
  - Department of Surgical Oncology, Erasmus MC Cancer Institute, Rotterdam, the Netherlands
  - Division of Psychosocial Research and Epidemiology, Netherlands Cancer Institute, Amsterdam, the Netherlands
- Bryce B Reeve
  - Department of Population Health Sciences, Duke University School of Medicine, Durham, NC, USA
- Winette T A van der Graaf
  - Department of Medical Oncology, Netherlands Cancer Institute, Amsterdam, the Netherlands
  - Department of Medical Oncology, Erasmus MC Cancer Institute, Erasmus University Medical Center, Rotterdam, the Netherlands
- John M Salsman
  - Department of Social Sciences and Health Policy, Wake Forest School of Medicine, Wake Forest Baptist Comprehensive Cancer Center, Winston Salem, NC, USA
|
44
|
Chen K, Li H, Pan Z, Wu Z, Song E. Insights into artificial intelligence in clinical oncology: opportunities and challenges. SCIENCE CHINA. LIFE SCIENCES 2022; 65:643-647. [PMID: 34846642 DOI: 10.1007/s11427-021-2010-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/26/2021] [Accepted: 09/18/2021] [Indexed: 06/13/2023]
Affiliation(s)
- Kai Chen
  - Guangdong Provincial Key Laboratory of Malignant Tumor Epigenetics and Gene Regulation, Sun Yat-sen Memorial Hospital, Sun Yat-sen University, Guangzhou, 510120, China
  - Breast Tumor Center, Sun Yat-sen Memorial Hospital, Sun Yat-sen University, Guangzhou, 510120, China
  - Artificial Intelligence Lab, Sun Yat-sen Memorial Hospital, Sun Yat-sen University, Guangzhou, 510120, China
- Hanwei Li
  - Guangdong Provincial Key Laboratory of Malignant Tumor Epigenetics and Gene Regulation, Sun Yat-sen Memorial Hospital, Sun Yat-sen University, Guangzhou, 510120, China
  - Artificial Intelligence Lab, Sun Yat-sen Memorial Hospital, Sun Yat-sen University, Guangzhou, 510120, China
- Zhanpeng Pan
  - Guangdong Provincial Key Laboratory of Malignant Tumor Epigenetics and Gene Regulation, Sun Yat-sen Memorial Hospital, Sun Yat-sen University, Guangzhou, 510120, China
  - Artificial Intelligence Lab, Sun Yat-sen Memorial Hospital, Sun Yat-sen University, Guangzhou, 510120, China
- Zhuo Wu
  - Guangdong Provincial Key Laboratory of Malignant Tumor Epigenetics and Gene Regulation, Sun Yat-sen Memorial Hospital, Sun Yat-sen University, Guangzhou, 510120, China
  - Artificial Intelligence Lab, Sun Yat-sen Memorial Hospital, Sun Yat-sen University, Guangzhou, 510120, China
  - Department of Radiology, Sun Yat-sen Memorial Hospital, Sun Yat-sen University, Guangzhou, 510120, China
- Erwei Song
  - Guangdong Provincial Key Laboratory of Malignant Tumor Epigenetics and Gene Regulation, Sun Yat-sen Memorial Hospital, Sun Yat-sen University, Guangzhou, 510120, China
  - Breast Tumor Center, Sun Yat-sen Memorial Hospital, Sun Yat-sen University, Guangzhou, 510120, China
  - Fountain-Valley Institute for Life Sciences, Guangzhou Institute of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou, 510530, China
  - Bioland Laboratory, Guangzhou, 510005, China
|
45
|
Kamphorst B, Rooijakkers T, Veugen T, Cellamare M, Knoors D. Accurate training of the Cox proportional hazards model on vertically-partitioned data while preserving privacy. BMC Med Inform Decis Mak 2022; 22:49. [PMID: 35209883 PMCID: PMC8867891 DOI: 10.1186/s12911-022-01771-3] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 06/08/2021] [Accepted: 01/20/2022] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Analysing distributed medical data is challenging because of data sensitivity and various regulations to access and combine data. Some privacy-preserving methods are known for analyzing horizontally-partitioned data, where different organisations have similar data on disjoint sets of people. Technically more challenging is the case of vertically-partitioned data, dealing with data on overlapping sets of people. We use an emerging technology based on cryptographic techniques called secure multi-party computation (MPC), and apply it to perform privacy-preserving survival analysis on vertically-distributed data by means of the Cox proportional hazards (CPH) model. Both MPC and CPH are explained. METHODS We use a Newton-Raphson solver to securely train the CPH model with MPC, jointly with all data holders, without revealing any sensitive data. In order to securely compute the log-partial likelihood in each iteration, we run into several technical challenges to preserve the efficiency and security of our solution. To tackle these technical challenges, we generalize a cryptographic protocol for securely computing the inverse of the Hessian matrix and develop a new method for securely computing exponentiations. A theoretical complexity estimate is given to get insight into the computational and communication effort that is needed. RESULTS Our secure solution is implemented in a setting with three different machines, each representing a different data holder, which can communicate through the internet. The MPyC platform is used for implementing this privacy-preserving solution to obtain the CPH model. We test the accuracy and computation time of our methods on three standard benchmark survival datasets. We identify future work to make our solution more efficient. CONCLUSIONS Our secure solution is comparable with the standard, non-secure solver in terms of accuracy and convergence speed. The computation time is considerably larger, although the theoretical complexity is still cubic in the number of covariates and quadratic in the number of subjects. We conclude that this is a promising way of performing parametric survival analysis on vertically-distributed medical data, while realising a high level of security and privacy.
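For orientation, the quantities named in the abstract can be written out in the standard textbook form. The notation is an assumption made here, not taken from the paper: $x_i$ is the covariate vector of subject $i$, $\delta_i$ the event indicator, $R(t_i)$ the set of subjects still at risk at time $t_i$, and event times are assumed to have no ties.

```latex
% Log-partial likelihood of the Cox proportional hazards model
\ell(\beta) = \sum_{i:\,\delta_i = 1}
  \left[ x_i^{\top}\beta - \log \sum_{j \in R(t_i)} \exp\!\left(x_j^{\top}\beta\right) \right]

% Gradient with respect to beta
\nabla \ell(\beta) = \sum_{i:\,\delta_i = 1}
  \left[ x_i - \frac{\sum_{j \in R(t_i)} x_j \exp\!\left(x_j^{\top}\beta\right)}
                    {\sum_{j \in R(t_i)} \exp\!\left(x_j^{\top}\beta\right)} \right]

% Newton-Raphson update, with H the Hessian of \ell
\beta^{(k+1)} = \beta^{(k)} - H\!\left(\beta^{(k)}\right)^{-1} \nabla \ell\!\left(\beta^{(k)}\right)
```

Each iteration needs the exponentials $\exp(x_j^{\top}\beta)$ and the inverse of the Hessian, which correspond to the two operations the abstract says had to be computed securely.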
Affiliation(s)
- Bart Kamphorst
  - Cyber Security and Robustness, Netherlands Organisation for Applied Scientific Research, The Hague, The Netherlands
- Thomas Rooijakkers
  - Cyber Security and Robustness, Netherlands Organisation for Applied Scientific Research, The Hague, The Netherlands
- Thijs Veugen
  - Cyber Security and Robustness, Netherlands Organisation for Applied Scientific Research, The Hague, The Netherlands
  - Cryptology, Centrum Wiskunde and Informatica, Amsterdam, The Netherlands
- Matteo Cellamare
  - Research and Development, Netherlands Comprehensive Cancer Organisation, Eindhoven, The Netherlands
- Daan Knoors
  - Research and Development, Netherlands Comprehensive Cancer Organisation, Eindhoven, The Netherlands
|
46
|
Antunes RS, da Costa CA, Küderle A, Yari IA, Eskofier B. Federated Learning for Healthcare: Systematic Review and Architecture Proposal. ACM T INTEL SYST TEC 2022. [DOI: 10.1145/3501813] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 12/28/2022]
Abstract
The use of machine learning (ML) with electronic health records (EHR) is growing in popularity as a means to extract knowledge that can improve the decision-making process in healthcare. Such methods require training of high-quality learning models based on diverse and comprehensive datasets, which are hard to obtain due to the sensitive nature of medical data from patients. In this context, federated learning (FL) is a methodology that enables the distributed training of machine learning models with remotely hosted datasets without the need to accumulate data and, therefore, compromise it. FL is a promising solution to improve ML-based systems, better aligning them to regulatory requirements, improving trustworthiness and data sovereignty. However, many open questions must be addressed before the use of FL becomes widespread. This article aims at presenting a systematic literature review on current research about FL in the context of EHR data for healthcare applications. Our analysis highlights the main research topics, proposed solutions, case studies, and respective ML methods. Furthermore, the article discusses a general architecture for FL applied to healthcare data based on the main insights obtained from the literature review. The collected literature corpus indicates that there is extensive research on the privacy and confidentiality aspects of training data and model sharing, which is expected given the sensitive nature of medical data. Studies also explore improvements to the aggregation mechanisms required to generate the learning model from distributed contributions and case studies with different types of medical data.
Affiliation(s)
- Björn Eskofier
  - Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany
|
47
|
Fu Y, Zhang H, Morris ED, Glide-Hurst CK, Pai S, Traverso A, Wee L, Hadzic I, Lønne PI, Shen C, Liu T, Yang X. Artificial Intelligence in Radiation Therapy. IEEE TRANSACTIONS ON RADIATION AND PLASMA MEDICAL SCIENCES 2022; 6:158-181. [PMID: 35992632 PMCID: PMC9385128 DOI: 10.1109/trpms.2021.3107454] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 02/03/2023]
Abstract
Artificial intelligence (AI) has great potential to transform the clinical workflow of radiotherapy. Since the introduction of deep neural networks, many AI-based methods have been proposed to address challenges in different aspects of radiotherapy. Commercial vendors have started to release AI-based tools that can be readily integrated into the established clinical workflow. To show the recent progress in AI-aided radiotherapy, we have reviewed AI-based studies in five major aspects of radiotherapy: image reconstruction, image registration, image segmentation, image synthesis, and automatic treatment planning. In each section, we summarized and categorized the recently published methods, followed by a discussion of the challenges, concerns, and future development. Given the rapid development of AI-aided radiotherapy, the efficiency and effectiveness of radiotherapy in the future could be substantially improved through intelligent automation of various aspects of radiotherapy.
Affiliation(s)
- Yabo Fu
  - Department of Radiation Oncology and Winship Cancer Institute, Emory University, Atlanta, GA 30322, USA
- Hao Zhang
  - Department of Medical Physics, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
- Eric D. Morris
  - Department of Radiation Oncology, University of California-Los Angeles, Los Angeles, CA 90095, USA
- Carri K. Glide-Hurst
  - Department of Human Oncology, School of Medicine and Public Health, University of Wisconsin-Madison, Madison, WI 53792, USA
- Suraj Pai
  - Maastricht University Medical Centre, Netherlands
- Leonard Wee
  - Maastricht University Medical Centre, Netherlands
- Per-Ivar Lønne
  - Department of Medical Physics, Oslo University Hospital, PO Box 4953 Nydalen, 0424 Oslo, Norway
- Chenyang Shen
  - Department of Radiation Oncology, University of Texas Southwestern Medical Center, Dallas, TX 75002, USA
- Tian Liu
  - Department of Radiation Oncology and Winship Cancer Institute, Emory University, Atlanta, GA 30322, USA
- Xiaofeng Yang
  - Department of Radiation Oncology and Winship Cancer Institute, Emory University, Atlanta, GA 30322, USA
|
48
|
Welten S, Mou Y, Neumann L, Jaberansary M, Yediel Ucer Y, Kirsten T, Decker S, Beyan O. A Privacy-Preserving Distributed Analytics Platform for Health Care Data. Methods Inf Med 2022; 61:e1-e11. [PMID: 35038764 PMCID: PMC9246511 DOI: 10.1055/s-0041-1740564] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 11/08/2022]
Abstract
Background
In recent years, data-driven medicine has gained increasing importance in terms of diagnosis, treatment, and research due to the exponential growth of health care data. However, data protection regulations prohibit data centralisation for analysis purposes because of potential privacy risks like the accidental disclosure of data to third parties. Therefore, alternative data usage policies, which comply with present privacy guidelines, are of particular interest.
Objective
We aim to enable analyses on sensitive patient data while simultaneously complying with local data protection regulations, using an approach called the Personal Health Train (PHT), a paradigm that utilises distributed analytics (DA) methods. The main principle of the PHT is that the analytical task is brought to the data provider and the data instances remain in their original location.
Methods
In this work, we present our implementation of the PHT paradigm, which preserves the sovereignty and autonomy of the data providers and operates with a limited number of communication channels. We further conduct a DA use case on data stored in three different and distributed data providers.
Results
We show that our infrastructure enables the training of data models based on distributed data sources.
Conclusion
Our work presents the capabilities of DA infrastructures in the health care sector, which lower the regulatory obstacles of sharing patient data. We further demonstrate its ability to fuel medical science by making distributed data sets available for scientists or health care practitioners.
Affiliation(s)
- Sascha Welten
- Chair of Computer Science 5, RWTH Aachen University, Aachen, Germany
- Yongli Mou
- Chair of Computer Science 5, RWTH Aachen University, Aachen, Germany
- Laurenz Neumann
- Chair of Computer Science 5, RWTH Aachen University, Aachen, Germany
- Yeliz Yediel Ucer
- Department of Data Science and Artificial Intelligence, Fraunhofer FIT, Sankt Augustin, Germany
- Toralf Kirsten
- Department of Medical Data Science, University Medical Center Leipzig, Leipzig, Germany
- Stefan Decker
- Chair of Computer Science 5, RWTH Aachen University, Aachen, Germany; Department of Data Science and Artificial Intelligence, Fraunhofer FIT, Sankt Augustin, Germany
- Oya Beyan
- Department of Data Science and Artificial Intelligence, Fraunhofer FIT, Sankt Augustin, Germany; Institute for Medical Informatics, Faculty of Medicine, University Hospital Cologne, University of Cologne, Cologne, Germany
|
49
|
Jha AK, Mithun S, Sherkhane UB, Jaiswar V, Shi Z, Kalendralis P, Kulkarni C, M.S. D, Rajamenakshi R, Sunder G, Purandare N, Wee L, Rangarajan V, van Soest J, Dekker A. Implementation of Big Imaging Data Pipeline Adhering to FAIR Principles for Federated Machine Learning in Oncology. IEEE Transactions on Radiation and Plasma Medical Sciences 2022. [DOI: 10.1109/trpms.2021.3113860] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022]
|
50
|
Chen X, Fu R, Shao Q, Chen Y, Ye Q, Li S, He X, Zhu J. Application of artificial intelligence to pancreatic adenocarcinoma. Front Oncol 2022; 12:960056. [PMID: 35936738 PMCID: PMC9353734 DOI: 10.3389/fonc.2022.960056] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 06/02/2022] [Accepted: 06/24/2022] [Indexed: 02/05/2023] Open
Abstract
BACKGROUND AND OBJECTIVES Pancreatic cancer (PC) is one of the deadliest cancers worldwide although substantial advancement has been made in its comprehensive treatment. The development of artificial intelligence (AI) technology has allowed its clinical applications to expand remarkably in recent years. Diverse methods and algorithms are employed by AI to extrapolate new data from clinical records to aid in the treatment of PC. In this review, we will summarize AI's use in several aspects of PC diagnosis and therapy, as well as its limits and potential future research avenues. METHODS We examine the most recent research on the use of AI in PC. The articles are categorized and examined according to the medical task of their algorithm. Two search engines, PubMed and Google Scholar, were used to screen the articles. RESULTS Overall, 66 papers published in 2001 and after were selected. Of the four medical tasks (risk assessment, diagnosis, treatment, and prognosis prediction), diagnosis was the most frequently researched, and retrospective single-center studies were the most prevalent. We found that the different medical tasks and algorithms included in the reviewed studies caused the performance of their models to vary greatly. Deep learning algorithms, on the other hand, produced excellent results in all of the subdivisions studied. CONCLUSIONS AI is a promising tool for helping PC patients and may contribute to improved patient outcomes. The integration of humans and AI in clinical medicine is still in its infancy and requires the in-depth cooperation of multidisciplinary personnel.
Affiliation(s)
- Xi Chen
- Department of General Surgery, Second Affiliated Hospital Zhejiang University School of Medicine, Hangzhou, China
- Ruibiao Fu
- Department of General Surgery, Second Affiliated Hospital Zhejiang University School of Medicine, Hangzhou, China
- Qian Shao
- Department of Surgical Ward 1, Ningbo Women and Children’s Hospital, Ningbo, China
- Yan Chen
- Department of General Surgery, Second Affiliated Hospital Zhejiang University School of Medicine, Hangzhou, China
- Qinghuang Ye
- Department of General Surgery, Second Affiliated Hospital Zhejiang University School of Medicine, Hangzhou, China
- Sheng Li
- College of Information Engineering, Zhejiang University of Technology, Hangzhou, China
- Xiongxiong He
- College of Information Engineering, Zhejiang University of Technology, Hangzhou, China
- Jinhui Zhu
- Department of General Surgery, Second Affiliated Hospital Zhejiang University School of Medicine, Hangzhou, China
- *Correspondence: Jinhui Zhu,
|