1
Radanliev P, Santos O, Ani UD. Generative AI cybersecurity and resilience. Front Artif Intell 2025; 8:1568360. [PMID: 40529644 PMCID: PMC12171450 DOI: 10.3389/frai.2025.1568360] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/29/2025] [Accepted: 05/06/2025] [Indexed: 06/20/2025] Open
Abstract
Generative Artificial Intelligence marks a critical inflection point in the evolution of machine learning systems, enabling the autonomous synthesis of content across text, image, audio, and biomedical domains. While these capabilities are advancing at pace, their deployment raises profound ethical, security, and privacy concerns that remain inadequately addressed by existing governance mechanisms. This study undertakes a systematic inquiry into these challenges, combining a PRISMA-guided literature review with thematic and quantitative analyses to interrogate the socio-technical implications of generative Artificial Intelligence. The article develops an integrated theoretical framework, grounded in established models of technology adoption, cybersecurity resilience, and normative governance. Structured across five lifecycle stages (design, implementation, monitoring, compliance, and feedback), the framework offers a practical schema for evaluating and guiding responsible AI deployment. The analysis reveals a disconnect between the rapid adoption of generative systems and the maturity of institutional safeguards, resulting in new risks from shadow Artificial Intelligence and underscoring the need for adaptive, sector-specific governance. This study offers a coherent pathway towards ethically aligned and secure application of Artificial Intelligence in national critical infrastructure.
Affiliation(s)
- Petar Radanliev
  - Department of Computer Sciences, University of Oxford, Oxford, United Kingdom
  - Alan Turing Institute, British Library, London, United Kingdom
- Omar Santos
  - Cisco Systems, RTP, San Jose, NC, United States
- Uchenna Daniel Ani
  - School of Computer Science and Mathematics, Keele University, Keele, United Kingdom
2
Thompson RAM, Shah YB, Aguirre F, Stewart C, Lallas CD, Shah MS. Artificial Intelligence Use in Medical Education: Best Practices and Future Directions. Curr Urol Rep 2025; 26:45. [PMID: 40439780 PMCID: PMC12122599 DOI: 10.1007/s11934-025-01277-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 05/18/2025] [Indexed: 06/02/2025]
Abstract
PURPOSE OF REVIEW This review examines the various ways artificial intelligence (AI) has been utilized in medical education (MedEd) and presents ideas that will ethically and effectively leverage AI in enhancing the learning experience of medical trainees. RECENT FINDINGS AI has improved accessibility to learning material in a manner that engages the wider population. It has utility as a reference tool and can assist academic writing by generating outlines and summaries and by identifying relevant reference articles. As AI is increasingly integrated into MedEd and practice, its regulation should become a priority to prevent drawbacks to the education of trainees. By involving physicians in AI design and development, we can best preserve the integrity, quality, and clinical relevance of AI-generated content. In adopting the best practices for AI use, we can maximize its benefits while preserving the ethical standards of MedEd with the goal of improving learning outcomes.
Affiliation(s)
- Rasheed A M Thompson
  - Department of Urology, Sidney Kimmel Medical College, Thomas Jefferson University, Philadelphia, PA, United States
- Yash B Shah
  - Department of Urology, Sidney Kimmel Medical College, Thomas Jefferson University, Philadelphia, PA, United States
- Francisco Aguirre
  - Department of Urology, Sidney Kimmel Medical College, Thomas Jefferson University, Philadelphia, PA, United States
- Courtney Stewart
  - Department of Urology, Sidney Kimmel Medical College, Thomas Jefferson University, Philadelphia, PA, United States
- Costas D Lallas
  - Department of Urology, Sidney Kimmel Medical College, Thomas Jefferson University, Philadelphia, PA, United States
- Mihir S Shah
  - Department of Urology, Sidney Kimmel Medical College, Thomas Jefferson University, Philadelphia, PA, United States
3
Meurers T, Otte K, Abu Attieh H, Briki F, Despraz J, Halilovic M, Kaabachi B, Milicevic V, Müller A, Papapostolou G, Wirth FN, Raisaro JL, Prasser F. A quantitative analysis of the use of anonymization in biomedical research. NPJ Digit Med 2025; 8:279. [PMID: 40369095 PMCID: PMC12078711 DOI: 10.1038/s41746-025-01644-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/28/2024] [Accepted: 04/16/2025] [Indexed: 05/16/2025] Open
Abstract
Anonymized biomedical data sharing faces several challenges. This systematic review analyzes 1084 PubMed-indexed studies (2018-2022) using anonymized biomedical data to quantify usage trends across geographic, regulatory, and cultural regions to identify effective approaches and inform implementation agendas. We identified a significant yearly increase in such studies with a slope of 2.16 articles per 100,000 when normalized against the total number of PubMed-indexed articles (p = 0.021). Most studies used data from the US, UK, and Australia (78.2%). This trend remained when normalized by country-specific research output. Cross-border sharing was rare (10.5% of studies). We identified twelve common data sources, primarily in the US (seven) and UK (three), including commercial (seven) and public entities (five). The prevalence of anonymization in the US, UK, and Australia suggests their practices could guide broader adoption. Rare cross-border anonymized data sharing and differences between countries with comparable regulations underscore the need for global standards.
Affiliation(s)
- Thierry Meurers
  - Health Data Science Center, Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Berlin, Germany
- Karen Otte
  - Health Data Science Center, Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Berlin, Germany
- Hammam Abu Attieh
  - Health Data Science Center, Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Berlin, Germany
- Farah Briki
  - Biomedical Data Science Center, Centre Hospitalier Universitaire Vaudois, Lausanne, Switzerland
- Jérémie Despraz
  - Biomedical Data Science Center, Centre Hospitalier Universitaire Vaudois, Lausanne, Switzerland
- Mehmed Halilovic
  - Health Data Science Center, Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Berlin, Germany
- Bayrem Kaabachi
  - Biomedical Data Science Center, Centre Hospitalier Universitaire Vaudois, Lausanne, Switzerland
- Vladimir Milicevic
  - Health Data Science Center, Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Berlin, Germany
- Armin Müller
  - Health Data Science Center, Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Berlin, Germany
- Grigorios Papapostolou
  - Health Data Science Center, Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Berlin, Germany
- Felix Nikolaus Wirth
  - Health Data Science Center, Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Berlin, Germany
- Jean Louis Raisaro
  - Biomedical Data Science Center, Centre Hospitalier Universitaire Vaudois, Lausanne, Switzerland
- Fabian Prasser
  - Health Data Science Center, Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Berlin, Germany
4
Cheema B, Hourmozdi J, Kline A, Ahmad F, Khera R. Artificial Intelligence in the Management of Heart Failure. J Card Fail 2025:S1071-9164(25)00194-0. [PMID: 40345521 DOI: 10.1016/j.cardfail.2025.02.020] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2024] [Revised: 02/10/2025] [Accepted: 02/10/2025] [Indexed: 05/11/2025]
Abstract
Artificial intelligence (AI) has the potential to revolutionize the management of heart failure. AI-based tools can guide the diagnosis and treatment of known risk factors, identify asymptomatic structural heart disease, improve cardiomyopathy diagnosis and symptomatic heart failure treatment, and uncover patients transitioning to advanced disease. By integrating multimodal data, including omics, imaging, signals, and electronic health records, state-of-the-art algorithms allow for a more tailored approach to patient care, addressing the unique needs of the individual. The past decade has led to the development of numerous AI solutions targeting each aspect of the heart failure syndrome. However, significant barriers to implementation remain and have limited clinical uptake. Data-privacy concerns, real-world model performance, integration challenges, trust in AI, model governance, and concerns about fairness and bias are some of the topics requiring additional research and the development of best practices. This review highlights progress in the use of AI to guide the diagnosis and management of heart failure while underscoring the importance of overcoming key implementation challenges that are currently slowing progress.
Affiliation(s)
- Baljash Cheema
  - Bluhm Cardiovascular Institute, Center for Artificial Intelligence, Northwestern Medicine, Chicago, IL; Northwestern University, Feinberg School of Medicine, Chicago, IL
- Adrienne Kline
  - Bluhm Cardiovascular Institute, Center for Artificial Intelligence, Northwestern Medicine, Chicago, IL; Northwestern University, Feinberg School of Medicine, Chicago, IL
- Faraz Ahmad
  - Bluhm Cardiovascular Institute, Center for Artificial Intelligence, Northwestern Medicine, Chicago, IL; Northwestern University, Feinberg School of Medicine, Chicago, IL
- Rohan Khera
  - Section of Cardiovascular Medicine, Department of Internal Medicine, Yale School of Medicine, New Haven, CT; Section of Health Informatics, Department of Biostatistics, Yale School of Public Health, New Haven, CT; Section of Biomedical Informatics and Data Science, Yale School of Medicine, New Haven, CT; Center for Outcomes Research and Evaluation, Yale-New Haven Hospital, New Haven, CT
5
Largent EA, Kim Y, Karlawish J, Wexler A. Ethics From the Outset: Incorporating Ethical Considerations into the Artificial Intelligence and Technology Collaboratories for Aging Research Pilot Projects. J Gerontol A Biol Sci Med Sci 2025; 80:glaf035. [PMID: 40166843 PMCID: PMC12066003 DOI: 10.1093/gerona/glaf035] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/29/2024] [Indexed: 04/02/2025] Open
Abstract
There is an urgent need to develop tools to enable older adults to live healthy, independent lives for as long as possible. To address this need, the National Institute on Aging (NIA) Artificial Intelligence and Technology Collaboratories (AITCs) for Aging Research were created to identify, develop, evaluate, commercialize, and disseminate innovative technologies and artificial intelligence (AI) methods to promote healthy aging and to support persons with Alzheimer's disease and Alzheimer's disease-related dementias (AD/ADRD). In 2023, AITC pilot grant applicants were required to answer questions about how, if at all, they would safeguard older adults' data privacy and confidentiality, advance health equity, address bias, and protect vulnerable participants. Our team analyzed applicants' answers to these ethics-focused questions using a constructivist grounded theory approach. In this article, we present what we learned and discuss modifications to our approach moving forward.
Affiliation(s)
- Emily A Largent
  - Department of Medical Ethics and Health Policy, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania, USA
- Yungjee Kim
  - Department of Medical Ethics and Health Policy, University of Pennsylvania Perelman School of Medicine, University of Pennsylvania Carey Law School, Philadelphia, Pennsylvania, USA
- Jason Karlawish
  - Department of Medicine, Department of Medical Ethics and Health Policy, Department of Neurology, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania, USA
- Anna Wexler
  - Department of Medical Ethics and Health Policy, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania, USA
6
Eiermann M, Sernaker S. The Impact of Data Suppression on Re-Identification Risk and Data Access in the National Child Abuse and Neglect Data System. CHILD MALTREATMENT 2025:10775595251337073. [PMID: 40257230 DOI: 10.1177/10775595251337073] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/22/2025]
Abstract
In summer 2024, the Children's Bureau revised the rules that govern data suppression in datasets from the National Child Abuse and Neglect Data System (NCANDS). To minimize the risk of re-identification, researchers had previously been unable to identify counties with fewer than 1000 annual maltreatment cases. Under the new data suppression rule, county identifiers will only be suppressed for counties with fewer than 700 cases. In this report, we document the consequences of this shift for research data access and re-identification risks, showing that reducing the data suppression threshold increased the number of identifiable counties from 835 to 1096 (a 31.3% increase) and doubled the number of identifiable rural counties. The percentage of reported children who face a particularly elevated re-identification risk due to having unique demographic characteristics increased from 0.7% to 1.0%.
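The threshold rule described in this abstract can be sketched in a few lines. This is an illustrative sketch only: the function names and the three county counts are hypothetical, and only the thresholds (1000 and 700) and the county totals (835 and 1096) come from the abstract.

```python
def suppress_county_ids(case_counts, threshold):
    """Keep a county identifier only when its annual case count meets the threshold.

    Counties with fewer than `threshold` cases map to None (suppressed).
    """
    return {
        county: (county if n >= threshold else None)
        for county, n in case_counts.items()
    }


def percent_increase(before, after):
    """Relative change from `before` to `after`, in percent."""
    return 100.0 * (after - before) / before


# Hypothetical annual maltreatment case counts for three made-up counties.
counts = {"county_a": 1200, "county_b": 850, "county_c": 640}

old_rule = suppress_county_ids(counts, threshold=1000)
new_rule = suppress_county_ids(counts, threshold=700)

identifiable_old = sum(v is not None for v in old_rule.values())
identifiable_new = sum(v is not None for v in new_rule.values())

# Check of the figure reported in the abstract: 835 -> 1096 identifiable counties.
reported_increase = round(percent_increase(835, 1096), 1)
```

Under the lower threshold, `county_b` becomes identifiable while `county_c` stays suppressed, and the reported totals do work out to a 31.3% increase.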
7
Cervera de la Cruz P, Shabani M. Conceptualizing fairness in the secondary use of health data for research: A scoping review. Account Res 2025; 32:233-262. [PMID: 37851101 DOI: 10.1080/08989621.2023.2271394] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/28/2023] [Accepted: 10/12/2023] [Indexed: 10/19/2023]
Abstract
With the introduction of the European Health Data Space (EHDS), the secondary use of health data for research purposes is attracting more attention. Secondary health data processing promises to address novel research questions, inform the design of future research and improve healthcare delivery generally. To comply with the existing data protection regulations, the secondary data use must be fair, among other things. However, there is no clear understanding of what fairness means in the context of secondary use of health data for scientific research purposes. In response, we conducted a scoping review of argument-based literature to explore how fairness in the secondary use of health data has been conceptualized. A total of 35 publications were included in the final synthesis after abstract and full-text screening. Using an inductive approach and a thematic analysis, our review has revealed that balancing individual and public interests, reducing power asymmetries, setting conditions for commercial involvement, and implementing benefit sharing are essential to guarantee fair secondary use research. The findings of this review can inform current and future research practices and policy development to adequately address concerns about fairness in the secondary use of health data.
Affiliation(s)
- Mahsa Shabani
  - Metamedica, Faculty of Law and Criminology, University of Ghent, Ghent, Belgium
  - Law Centre for Health and Life, Faculty of Law, University of Amsterdam, Amsterdam, The Netherlands
8
Zhang J, Bell MAL. Overfit detection method for deep neural networks trained to beamform ultrasound images. ULTRASONICS 2025; 148:107562. [PMID: 39746284 PMCID: PMC11839378 DOI: 10.1016/j.ultras.2024.107562] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2024] [Revised: 12/18/2024] [Accepted: 12/20/2024] [Indexed: 01/04/2025]
Abstract
Deep neural networks (DNNs) have remarkable potential to reconstruct ultrasound images. However, this promise can suffer from overfitting to training data, which is typically detected via loss function monitoring during an otherwise time-consuming training process or via access to new sources of test data. We present a method to detect overfitting with associated evaluation approaches that only require knowledge of a network architecture and associated trained weights. Three types of artificial DNN inputs (i.e., zeros, ones, and Gaussian noise), unseen during DNN training, were input to three DNNs designed for ultrasound image formation, trained on multi-site data, and submitted to the Challenge on Ultrasound Beamforming with Deep Learning (CUBDL). Overfitting was detected using these artificial DNN inputs. Qualitative and quantitative comparisons of DNN-created images to ground truth images immediately revealed signs of overfitting (e.g., zeros input produced mean output values ≥0.08, ones input produced mean output values ≤0.07, with corresponding image-to-image normalized correlations ≤0.8). The proposed approach is promising to detect overfitting without requiring lengthy network retraining or the curation of additional test data. Potential applications include sanity checks during federated learning, as well as optimization, security, public policy, regulation creation, and benchmarking.
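The general idea of probing a trained network with artificial inputs can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: `toy_model` is a hypothetical stand-in for a trained beamforming DNN, the zero-mean form of the normalized correlation is an assumption (the abstract does not define it), and using each input as its own reference image is purely for illustration.

```python
import numpy as np


def artificial_inputs(shape, seed=0):
    """Build the three artificial input types named in the abstract."""
    rng = np.random.default_rng(seed)
    return {
        "zeros": np.zeros(shape),
        "ones": np.ones(shape),
        "gaussian_noise": rng.standard_normal(shape),
    }


def normalized_correlation(a, b):
    """Zero-mean normalized correlation (assumed definition).

    Returns 0.0 when either image has no variance.
    """
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(np.sum(a * b) / denom) if denom > 0 else 0.0


def probe(model, inputs, references):
    """Run each artificial input through the model and summarize the output."""
    report = {}
    for name, x in inputs.items():
        y = model(x)
        report[name] = {
            "mean_output": float(np.mean(y)),
            "correlation": normalized_correlation(y, references[name]),
        }
    return report


def toy_model(x):
    """Hypothetical stand-in for a trained network: an affine map."""
    return 0.5 * x + 0.1


inputs = artificial_inputs((8, 8))
# Assumption for illustration: each input doubles as its own reference image.
report = probe(toy_model, inputs, references=inputs)
```

In practice `model` would wrap the forward pass of the network under test, and the summary statistics would be compared against whatever reference values the evaluation protocol specifies.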
Affiliation(s)
- Jiaxin Zhang
  - Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, MD, USA
- Muyinatu A Lediju Bell
  - Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, MD, USA; Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA; Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
9
Liu T, Krentz AJ, Huo Z, Ćurčin V. Opportunities and Challenges of Cardiovascular Disease Risk Prediction for Primary Prevention Using Machine Learning and Electronic Health Records: A Systematic Review. Rev Cardiovasc Med 2025; 26:37443. [PMID: 40351688 PMCID: PMC12059770 DOI: 10.31083/rcm37443] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/29/2025] [Revised: 03/13/2025] [Accepted: 03/20/2025] [Indexed: 05/14/2025] Open
Abstract
Background Cardiovascular disease (CVD) remains the foremost cause of morbidity and mortality worldwide. Recent advancements in machine learning (ML) have demonstrated substantial potential in augmenting risk stratification for primary prevention, surpassing conventional statistical models in predictive performance. Thus, integrating ML with Electronic Health Records (EHRs) enables refined risk estimation by leveraging the granularity and breadth of longitudinal individual patient data. However, fundamental barriers persist, including limited generalizability, challenges in interpretability, and the absence of rigorous external validation, all of which impede widespread clinical deployment. Methods This review adheres to the methodological rigor of the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) and Scale for the Assessment of Narrative Review Articles (SANRA) guidelines. A systematic literature search was performed in March 2024, encompassing the Medline and Embase databases, to identify studies published since 2010. Supplementary references were retrieved from the Institute for Scientific Information (ISI) Web of Science, and manual searches were curated. The selection process, conducted via Rayyan, focused on systematic and narrative reviews evaluating ML-driven models for long-term CVD risk prediction within primary prevention contexts utilizing EHR data. Studies investigating short-term prognostication, highly specific comorbid cohorts, or conventional models devoid of ML components were excluded. Results Following an exhaustive screening of 1757 records, 22 studies met the inclusion criteria. Of these, 10 were systematic reviews (four incorporating meta-analyses), while 12 constituted narrative reviews, with the majority published post-2020. The synthesis underscores the superiority of ML in modeling intricate EHR-derived risk factors, facilitating precision-driven cardiovascular risk assessment. 
Nonetheless, salient challenges endure: heterogeneity in CVD outcome definitions undermines comparability, data incompleteness and inconsistency compromise model robustness, and a dearth of external validation constrains clinical translatability. Moreover, ethical and regulatory considerations, including algorithmic opacity, equity in predictive performance, and the absence of standardized evaluation frameworks, pose formidable obstacles to seamless integration into clinical workflows. Conclusions Despite the transformative potential of ML-based CVD risk prediction, it remains encumbered by methodological, technical, and regulatory impediments that hinder its full-scale adoption into real-world healthcare settings. This review underscores the imperative for standardized validation protocols, stringent regulatory oversight, and interdisciplinary collaboration to bridge the translational divide. Our findings establish an integrative framework for developing, validating, and applying ML-based CVD risk prediction algorithms, addressing both clinical and technical dimensions. To further advance this field, we propose a standardized, transparent, and regulated EHR platform that facilitates fair model evaluation, reproducibility, and clinical translation by providing a high-quality, representative dataset with structured governance and benchmarking mechanisms. Meanwhile, future endeavors must prioritize enhancing model transparency, mitigating biases, and ensuring adaptability to heterogeneous clinical populations, fostering equitable and evidence-based implementation of ML-driven predictive analytics in cardiovascular medicine.
Affiliation(s)
- Tianyi Liu
  - School of Life Course & Population Sciences, King’s College London, SE1 1UL London, UK
- Andrew J. Krentz
  - School of Life Course & Population Sciences, King’s College London, SE1 1UL London, UK
  - Metadvice, 1025 St-Sulpice, Switzerland
- Zhiqiang Huo
  - School of Life Course & Population Sciences, King’s College London, SE1 1UL London, UK
- Vasa Ćurčin
  - School of Life Course & Population Sciences, King’s College London, SE1 1UL London, UK
10
Yin SQ, Li YH. Advancing the diagnosis of major depressive disorder: Integrating neuroimaging and machine learning. World J Psychiatry 2025; 15:103321. [PMID: 40109992 PMCID: PMC11886342 DOI: 10.5498/wjp.v15.i3.103321] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2024] [Revised: 12/27/2024] [Accepted: 01/08/2025] [Indexed: 02/26/2025] Open
Abstract
Major depressive disorder (MDD), a psychiatric disorder characterized by functional brain deficits, poses considerable diagnostic and treatment challenges, especially in adolescents owing to varying clinical presentations. Biomarkers hold substantial clinical potential in the field of mental health, enabling objective assessments of physiological and pathological states, facilitating early diagnosis, and enhancing clinical decision-making and patient outcomes. Recent breakthroughs combine neuroimaging with machine learning (ML) to distinguish brain activity patterns between MDD patients and healthy controls, paving the way for diagnostic support and personalized treatment. However, the accuracy of the results depends on the selection of neuroimaging features and algorithms. Ensuring privacy protection, ML model accuracy, and fostering trust are essential steps prior to clinical implementation. Future research should prioritize the establishment of comprehensive legal frameworks and regulatory mechanisms for using ML in MDD diagnosis while safeguarding patient privacy and rights. By doing so, we can advance accuracy and personalized care for MDD.
Affiliation(s)
- Shi-Qi Yin
  - School of Pharmaceutical Sciences, Capital Medical University, Beijing 100069, China
- Ying-Huan Li
  - School of Pharmaceutical Sciences, Capital Medical University, Beijing 100069, China
11
Clunie DA, Flanders A, Taylor A, Erickson B, Bialecki B, Brundage D, Gutman D, Prior F, Seibert JA, Perry J, Gichoya JW, Kirby J, Andriole K, Geneslaw L, Moore S, Fitzgerald TJ, Tellis W, Xiao Y, Farahani K. Report of the Medical Image De-Identification (MIDI) Task Group -- Best Practices and Recommendations. ARXIV 2025:arXiv:2303.10473v3. [PMID: 37033463 PMCID: PMC10081345] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/11/2023]
Abstract
This report addresses the technical aspects of de-identification of medical images of human subjects and biospecimens, such that re-identification risk of ethical, moral, and legal concern is sufficiently reduced to allow unrestricted public sharing for any purpose, regardless of the jurisdiction of the source and distribution sites. All medical images, regardless of the mode of acquisition, are considered, though the primary emphasis is on those with accompanying data elements, especially those encoded in formats in which the data elements are embedded, particularly Digital Imaging and Communications in Medicine (DICOM). These images include image-like objects such as Segmentations, Parametric Maps, and Radiotherapy (RT) Dose objects. The scope also includes related non-image objects, such as RT Structure Sets, Plans and Dose Volume Histograms, Structured Reports, and Presentation States. Only de-identification of publicly released data is considered, and alternative approaches to privacy preservation, such as federated learning for artificial intelligence (AI) model development, are out of scope, as are issues of privacy leakage from AI model sharing. Only technical issues of public sharing are addressed.
12
Hanna MG, Pantanowitz L, Jackson B, Palmer O, Visweswaran S, Pantanowitz J, Deebajah M, Rashidi HH. Ethical and Bias Considerations in Artificial Intelligence/Machine Learning. Mod Pathol 2025; 38:100686. [PMID: 39694331 DOI: 10.1016/j.modpat.2024.100686] [Citation(s) in RCA: 20] [Impact Index Per Article: 20.0] [Received: 08/17/2024] [Accepted: 11/27/2024] [Indexed: 12/20/2024]
Abstract
As artificial intelligence (AI) gains prominence in pathology and medicine, the ethical implications and potential biases within such integrated AI models will require careful scrutiny. Ethics and bias are important considerations in our practice settings, especially as an increased number of machine learning (ML) systems are being integrated within our various medical domains. Such ML-based systems have demonstrated remarkable capabilities in specified tasks such as, but not limited to, image recognition, natural language processing, and predictive analytics. However, the potential bias that may exist within such AI-ML models can also inadvertently lead to unfair and potentially detrimental outcomes. The source of bias within such ML models can be due to numerous factors but is typically categorized into 3 main buckets (data bias, development bias, and interaction bias). These could be due to the training data, algorithmic bias, feature engineering and selection issues, clinic and institutional bias (ie, practice variability), reporting bias, and temporal bias (ie, changes in technology, clinical practice, or disease patterns). Therefore, despite the potential of these AI-ML applications, their deployment in our day-to-day practice also raises noteworthy ethical concerns. To address ethics and bias in medicine, a comprehensive evaluation process is required, which will encompass all aspects of such systems, from model development through clinical deployment. Addressing these biases is crucial to ensure that AI-ML systems remain fair, transparent, and beneficial to all. This review will discuss the relevant ethical and bias considerations in AI-ML specifically within the pathology and medical domain.
Affiliation(s)
- Matthew G Hanna
  - Department of Pathology, University of Pittsburgh Medical Center, Pittsburgh, Pennsylvania; Computational Pathology and AI Center of Excellence (CPACE), University of Pittsburgh, Pittsburgh, Pennsylvania
- Liron Pantanowitz
  - Department of Pathology, University of Pittsburgh Medical Center, Pittsburgh, Pennsylvania; Computational Pathology and AI Center of Excellence (CPACE), University of Pittsburgh, Pittsburgh, Pennsylvania
- Brian Jackson
  - Department of Pathology, University of Utah, Salt Lake City, Utah; ARUP Laboratories, Salt Lake City, Utah
- Octavia Palmer
  - Department of Pathology, University of Pittsburgh Medical Center, Pittsburgh, Pennsylvania; Computational Pathology and AI Center of Excellence (CPACE), University of Pittsburgh, Pittsburgh, Pennsylvania
- Shyam Visweswaran
  - Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania
- Hooman H Rashidi
  - Department of Pathology, University of Pittsburgh Medical Center, Pittsburgh, Pennsylvania; Computational Pathology and AI Center of Excellence (CPACE), University of Pittsburgh, Pittsburgh, Pennsylvania
13
Van Biesen W, Ponikvar JB, Fontana M, Heering P, Sever MS, Sawhney S, Luyckx V. Ethical considerations on the use of big data and artificial intelligence in kidney research from the ERA ethics committee. Nephrol Dial Transplant 2025; 40:455-464. [PMID: 39572076 PMCID: PMC11879022 DOI: 10.1093/ndt/gfae267] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/28/2024] [Indexed: 03/06/2025] Open
Abstract
In the current paper, we will focus on requirements to ensure big data can advance the outcomes of our patients suffering from kidney disease. The associated ethical question is whether and how we as a nephrology community can and should encourage the collection of big data of our patients. We identify some ethical reflections on the use of big data, and their importance and relevance. Furthermore, we balance advantages and pitfalls and discuss requirements to make legitimate and ethical use of big data possible. The collection, organization, and curation of data come upfront in the pipeline before any analyses. Great care must therefore be taken to ensure quality of the data at this stage, to avoid the 'garbage in garbage out' problem and suboptimal patient care as a consequence of such analyses. Access to the data should be organized so that correct and efficient use of data is possible. This means that data must be stored safely, so that only those entitled to do so can access them. At the same time, those who are entitled to access the data should be able to do so in an efficient way, so as not to hinder relevant research. Analysis of observational data is itself prone to many errors and biases. Each of these biases can finally result in provision of low-quality medical care. Secure platforms should therefore also ensure correct methodology is used to interpret the available data. This requires close collaboration of a skilled workforce of experts in medical research and data scientists. Only then will our patients be able to benefit fully from the potential of AI and big data.
Affiliation(s)
- Wim Van Biesen
- Department of Nephrology, University Hospital Gent, Gent, Belgium
- Jadranka Buturovic Ponikvar
- University Medical Centre Ljubljana, Division of Internal Medicine, Department of Nephrology, Ljubljana, Slovenia; Faculty of Medicine, University of Ljubljana, Slovenia
- Monica Fontana
- European Renal Association, Headquarters, Parma, Emilia-Romagna, Italy
- Peter Heering
- KFH, Solingen General Hospital. Solingen, Germany. Dept of Nephrology and Hypertension, Univ of Cape Town, Cape Town, South Africa
- Mehmet S Sever
- Istanbul School of Medicine, Nephrology department, Millet Caddesi, Capa-Istanbul, Turkey
- Simon Sawhney
- Aberdeen Centre for Health Data Sciences, University of Aberdeen, Aberdeen, UK
- Valerie Luyckx
- Department of Public and Global Health, Epidemiology, Biostatistics and Prevention Institute, University of Zurich, Zurich, Switzerland
- Renal Division, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
14
Kuo TT, Gabriel RA, Koola J, Schooley RT, Ohno-Machado L. Distributed cross-learning for equitable federated models - privacy-preserving prediction on data from five California hospitals. Nat Commun 2025; 16:1371. [PMID: 39910076 PMCID: PMC11799213 DOI: 10.1038/s41467-025-56510-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2023] [Accepted: 01/22/2025] [Indexed: 02/07/2025]
Abstract
Quality improvement, clinical research, and patient care can be supported by medical predictive analytics. Predictive models can be improved by integrating more patient records from different healthcare centers (horizontal) or integrating parts of information of a patient from different centers (vertical). We introduce Distributed Cross-Learning for Equitable Federated models (D-CLEF), which incorporates horizontally- or vertically-partitioned data without disseminating patient-level records, to protect patients' privacy. We compared D-CLEF with centralized/siloed/federated learning in horizontal or vertical scenarios. Using data of more than 15,000 patients with COVID-19 from five University of California (UC) Health medical centers, surgical data from UC San Diego, and heart disease data from Edinburgh, UK, D-CLEF performed close to the centralized solution, outperforming the siloed ones, and equivalent to the federated learning counterparts, but with increased synchronization time. Here, we show that D-CLEF presents a promising accelerator for healthcare systems to collaborate without submitting their patient data outside their own systems.
Affiliation(s)
- Tsung-Ting Kuo
- Department of Biomedical Informatics and Data Science, School of Medicine, Yale University, New Haven, Connecticut, United States of America.
- Department of Surgery, School of Medicine, Yale University, New Haven, Connecticut, United States of America.
- Division of Biomedical Informatics, Department of Medicine, University of California San Diego, La Jolla, California, United States of America.
- Rodney A Gabriel
- Division of Biomedical Informatics, Department of Medicine, University of California San Diego, La Jolla, California, United States of America
- Department of Biomedical Informatics, University of California San Diego Health, La Jolla, California, United States of America
- Department of Anesthesiology, University of California San Diego, La Jolla, California, United States of America
- Jejo Koola
- Division of Biomedical Informatics, Department of Medicine, University of California San Diego, La Jolla, California, United States of America
- Department of Biomedical Informatics, University of California San Diego Health, La Jolla, California, United States of America
- Robert T Schooley
- Division of Infectious Diseases and Global Public Health, Department of Medicine, University of California San Diego, La Jolla, California, United States of America
- Lucila Ohno-Machado
- Department of Biomedical Informatics and Data Science, School of Medicine, Yale University, New Haven, Connecticut, United States of America
- Division of Biomedical Informatics, Department of Medicine, University of California San Diego, La Jolla, California, United States of America
15
Demuth S, De Sèze J, Edan G, Ziemssen T, Simon F, Gourraud PA. Digital Representation of Patients as Medical Digital Twins: Data-Centric Viewpoint. JMIR Med Inform 2025; 13:e53542. [PMID: 39881430 PMCID: PMC11793832 DOI: 10.2196/53542] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2023] [Revised: 09/20/2024] [Accepted: 10/13/2024] [Indexed: 01/31/2025]
Abstract
Precision medicine involves a paradigm shift toward personalized data-driven clinical decisions. The concept of a medical "digital twin" has recently become popular to designate digital representations of patients as a support for a wide range of data science applications. However, the concept is ambiguous when it comes to practical implementations. Here, we propose a medical digital twin framework with a data-centric approach. We argue that a single digital representation of patients cannot support all the data uses of digital twins for technical and regulatory reasons. Instead, we propose a data architecture leveraging three main families of digital representations: (1) multimodal dashboards integrating various raw health records at points of care to assist with perception and documentation, (2) virtual patients, which provide nonsensitive data for collective secondary uses, and (3) individual predictions that support clinical decisions. For a given patient, multiple digital representations may be generated according to the different clinical pathways the patient goes through, each tailored to balance the trade-offs associated with the respective intended uses. Therefore, our proposed framework conceives the medical digital twin as a data architecture leveraging several digital representations of patients along clinical pathways.
Affiliation(s)
- Stanislas Demuth
- INSERM U1064, CR2TI - Center for Research in Transplantation and Translational Immunology, Nantes University, 30 Bd Jean Monnet, Nantes, 44093, France, 33 2 40 08 74 10
- INSERM CIC 1434 Clinical Investigation Center, University Hospital of Strasbourg, Strasbourg, France
- Jérôme De Sèze
- INSERM CIC 1434 Clinical Investigation Center, University Hospital of Strasbourg, Strasbourg, France
- Department of Neurology, University Hospital of Strasbourg, Strasbourg, France
- Gilles Edan
- Department of Neurology, University Hospital of Rennes, Rennes, France
- Tjalf Ziemssen
- Center of Clinical Neuroscience, University Hospital Carl Gustav Carus, Dresden, Germany
- Françoise Simon
- Department of Health Policy & Management, Columbia University, New York, NY, United States
- Mount Sinai School of Medicine, New York, NY, United States
- Pierre-Antoine Gourraud
- INSERM U1064, CR2TI - Center for Research in Transplantation and Translational Immunology, Nantes University, 30 Bd Jean Monnet, Nantes, 44093, France, 33 2 40 08 74 10
- Pôle Hospitalo-Universitaire 11: Santé Publique, Clinique des données, INSERM, CIC 1413, Nantes University Hospital, Nantes, France
16
Idaikkadar N, Bodin E, Cholli P, Navon L, Ortmann L, Banja J, Waller LA, Alic A, Yuan K, Law R. Advancing Ethical Considerations for Data Science in Injury and Violence Prevention. Public Health Rep 2025:333549241312055. [PMID: 39834075 PMCID: PMC11748135 DOI: 10.1177/00333549241312055] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/22/2025]
Abstract
Data science is an emerging field that provides new analytical methods. It incorporates novel data sources (eg, internet data) and methods (eg, machine learning) that offer valuable and timely insights into public health issues, including injury and violence prevention. The objective of this research was to describe ethical considerations for public health data scientists conducting injury and violence prevention-related data science projects to prevent unintended ethical, legal, and social consequences, such as loss of privacy or loss of public trust. We first reviewed foundational bioethics and public health ethics literature to identify key ethical concepts relevant to public health data science. After identifying these ethics concepts, we held a series of discussions to organize them under broad ethical domains. Within each domain, we examined relevant ethics concepts from our review of the primary literature. Lastly, we developed questions for each ethical domain to facilitate the early conceptualization stage of the ethical analysis of injury and violence prevention projects. We identified 4 ethical domains: privacy, responsible stewardship, justice as fairness, and inclusivity and engagement. We determined that each domain carries equal weight, with no consideration bearing more importance than the others. Examples of ethical considerations are clearly identifying project goals, determining whether people included in projects are at risk of reidentification through external sources or linkages, and evaluating and minimizing the potential for bias in data sources used. As data science methodologies are incorporated into public health research to work toward reducing the effect of injury and violence on individuals, families, and communities in the United States, we recommend that relevant ethical issues be identified, considered, and addressed.
Affiliation(s)
- Nimi Idaikkadar
- Division of Injury Prevention, National Center for Injury Prevention and Control, Centers for Disease Control and Prevention, Atlanta, GA, USA
- Eva Bodin
- Office of Readiness and Response, Immediate Office of the Director, Centers for Disease Control and Prevention, Atlanta, GA, USA
- Preetam Cholli
- National Center for HIV, Viral Hepatitis, STD, and TB Prevention, Centers for Disease Control and Prevention, Atlanta, GA, USA
- Livia Navon
- Division of Injury Prevention, National Center for Injury Prevention and Control, Centers for Disease Control and Prevention, Atlanta, GA, USA
- Leonard Ortmann
- Office of Public Health Ethics and Regulations, Office of Science, Centers for Disease Control and Prevention, Atlanta, GA, USA
- John Banja
- Center for Ethics, Emory University, Atlanta, GA, USA
- Lance A. Waller
- Department of Biostatistics and Bioinformatics, Rollins School of Public Health, Emory University, Atlanta, GA, USA
- Alen Alic
- Division of Injury Prevention, National Center for Injury Prevention and Control, Centers for Disease Control and Prevention, Atlanta, GA, USA
- Keming Yuan
- Division of Injury Prevention, National Center for Injury Prevention and Control, Centers for Disease Control and Prevention, Atlanta, GA, USA
- Royal Law
- Division of Injury Prevention, National Center for Injury Prevention and Control, Centers for Disease Control and Prevention, Atlanta, GA, USA
17
Rocher L, Hendrickx JM, Montjoye YAD. A scaling law to model the effectiveness of identification techniques. Nat Commun 2025; 16:347. [PMID: 39788959 PMCID: PMC11718298 DOI: 10.1038/s41467-024-55296-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/29/2023] [Accepted: 12/06/2024] [Indexed: 01/12/2025]
Abstract
AI techniques are increasingly being used to identify individuals both offline and online. However, quantifying their effectiveness at scale and, by extension, the risks they pose remains a significant challenge. Here, we propose a two-parameter Bayesian model for exact matching techniques and derive an analytical expression for correctness (κ), the fraction of people accurately identified in a population. We then generalize the model to forecast how κ scales from small-scale experiments to the real world, for exact, sparse, and machine learning-based robust identification techniques. Despite having only two degrees of freedom, our method closely fits 476 correctness curves and strongly outperforms curve-fitting methods and entropy-based rules of thumb. Our work provides a principled framework for forecasting the privacy risks posed by identification techniques, while also supporting independent accountability efforts for AI-based biometric systems.
Affiliation(s)
- Luc Rocher
- Oxford Internet Institute, University of Oxford, Oxford, UK.
- Information and Communication Technologies, Electronics and Applied Mathematics (ICTEAM), Université catholique de Louvain, Louvain-la-Neuve, Belgium.
- Data Science Institute, Imperial College London, London, UK.
- Julien M Hendrickx
- Information and Communication Technologies, Electronics and Applied Mathematics (ICTEAM), Université catholique de Louvain, Louvain-la-Neuve, Belgium
- Yves-Alexandre de Montjoye
- Data Science Institute, Imperial College London, London, UK.
- Department of Computing, Imperial College London, London, UK.
18
Jackson BR, Kaplan B, Schreiber R, DeMuro PR, Nichols-Johnson V, Ozeran L, Solomonides A, Koppel R. Ethical Dimensions of Clinical Data Sharing by U.S. Health Care Organizations for Purposes beyond Direct Patient Care: Interviews with Health Care Leaders. Appl Clin Inform 2025; 16:90-100. [PMID: 39362293 PMCID: PMC11779532 DOI: 10.1055/a-2432-0329] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/08/2024] [Accepted: 10/01/2024] [Indexed: 10/05/2024]
Abstract
OBJECTIVES This study aimed to (1) empirically investigate current practices and analyze ethical dimensions of clinical data sharing by health care organizations for uses other than treatment, payment, and operations; and (2) make recommendations to inform research and policy for health care organizations to protect patients' privacy and autonomy when sharing data with unrelated third parties. METHODS Semistructured interviews and surveys involving 24 informatics leaders from 22 U.S. health care organizations, accompanied by thematic and ethical analyses. RESULTS We found considerable heterogeneity across organizations in policies and practices. Respondents understood "data sharing" and "research" in very different ways. Their interpretations of these terms ranged from making data available for academic and public health uses, and to health information exchanges; to selling data for corporate research; and to contracting with aggregators for future resale or use. The nine interview themes were that health care organizations: (1) share clinical data with many types of organizations, (2) have a variety of motivations for sharing data, (3) do not make data-sharing policies readily available, (4) have widely varying data-sharing approval processes, (5) most commonly rely on Health Insurance Portability and Accountability Act (HIPAA) de-identification to protect privacy, (6) were concerned about clinical data use by electronic health record vendors, (7) lacked data-sharing transparency to the general public, (8) allowed individual patients little control over sharing of their data, and (9) had not yet changed data-sharing practices within the year following the U.S. Supreme Court 2022 decision denying rights to abortion. CONCLUSION Our analysis identified gaps between ethical principles and health care organizations' data-sharing policies and practices. To better align clinical data-sharing practices with patient expectations and biomedical ethical principles, we recommend updating HIPAA, including re-identification and upstream sharing restrictions in data-sharing contracts, better coordination across data-sharing approval processes, fuller transparency and opt-out options for patients, and accountability for data-sharing and consequent harms.
Affiliation(s)
- Bonnie Kaplan
- Department of Biostatistics (Health Informatics), Bioethics Center, Information Society Project, Solomon Center for Health Law and Policy, Center for Biomedical Data Science, and Program for Biomedical Ethics, Yale University, Yale School of Public Health, New Haven, Connecticut, United States
- Richard Schreiber
- Information Services, Penn State Health, Camp Hill, Pennsylvania, United States
- Department of Biomedical Informatics and Data Science, Johns Hopkins School of Medicine, University of Maryland Graduate School of Medicine, Baltimore, Maryland, United States
- University of Maryland Graduate School, Clinical Informatics Master of Science Program, Baltimore, Maryland, United States
- Larry Ozeran
- Clinical Informatics Inc., Woodland, California, United States
- Ross Koppel
- Department of Biomedical Informatics, Perelman School of Medicine and The Leonard Davis Institute of Health Economics, University of Pennsylvania, Philadelphia, Pennsylvania, United States
- Department of Biomedical Informatics, Jacobs School of Medicine, University at Buffalo, Buffalo, New York, United States
19
Vigezzi GP, Maggioni E, Bert F, de Vito C, Siliquini R, Odone A. Who is (not) vaccinated? A proposal for a comprehensive immunization information system. Hum Vaccin Immunother 2024; 20:2386739. [PMID: 39103249 DOI: 10.1080/21645515.2024.2386739] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2024] [Accepted: 07/27/2024] [Indexed: 08/07/2024]
Abstract
The role of immunization in public health is crucial, offering widespread protection against infectious diseases and underpinning societal well-being. However, achieving optimal vaccination coverage is impeded by vaccine hesitancy, a significant challenge that necessitates comprehensive strategies to understand and mitigate its effects. We propose the integration of Population Health Management principles with Immunization Information Systems (IISs) to address vaccine hesitancy more effectively. Our approach leverages systematic health determinants analysis to identify at-risk populations and tailor interventions, thereby promoting vaccination coverage and public health responses. We call for the development of an enhanced version of the Italian National Vaccination Registry, which aims to facilitate real-time tracking of individuals' vaccination status while improving data accuracy and interoperability among healthcare systems. This registry is designed to overcome current barriers by ensuring robust data protection, addressing cultural and organizational challenges, and integrating behavioral insights to foster informed public health campaigns. Our proposal aligns with the Italian National Vaccination Prevention Plan 2023-2025 and emphasizes proactive, evidence-based strategies to increase vaccination uptake and contrast the spread of vaccine-preventable diseases. The ultimate goal is to establish a data-driven, ethically sound framework that enhances public health outcomes and addresses the complexities of vaccine hesitancy within the Italian context and beyond.
Affiliation(s)
- Giacomo Pietro Vigezzi
- Department of Public Health, Experimental and Forensic Medicine, University of Pavia, Pavia, Italy
- Elena Maggioni
- Department of Public Health, Experimental and Forensic Medicine, University of Pavia, Pavia, Italy
- Fabrizio Bert
- Department of Public Health and Pediatrics Sciences, University of Torino, Torino, Italy
- Corrado de Vito
- Department of Public Health and Infectious Diseases, Sapienza University of Rome, Rome, Italy
- Roberta Siliquini
- Department of Public Health and Pediatrics Sciences, University of Torino, Torino, Italy
- Anna Odone
- Department of Public Health, Experimental and Forensic Medicine, University of Pavia, Pavia, Italy
- Medical Direction, IRCCS Fondazione Policlinico San Matteo, Pavia, Italy
20
Vickers P, Adamo L, Alfano M, Clark C, Cresto E, Cui H, Dang H, Dellsén F, Dupin N, Gradowski L, Graf S, Guevara A, Hallap M, Hamilton J, Hardey M, Helm P, Landrum A, Levy N, Machery E, Mills S, Muller S, Sheppard J, N. K. S, Slater M, Stegenga J, Strandin H, Stuart MT, Sweet D, Tasdan U, Taylor H, Towler O, Tulodziecki D, Tworek H, Wallbank R, Wiltsche H, Mitchell Finnigan S. Development of a novel methodology for ascertaining scientific opinion and extent of agreement. PLoS One 2024; 19:e0313541. [PMID: 39642116 PMCID: PMC11623554 DOI: 10.1371/journal.pone.0313541] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/03/2024] [Accepted: 10/27/2024] [Indexed: 12/08/2024]
Abstract
We take up the challenge of developing an international network with capacity to survey the world's scientists on an ongoing basis, providing rich datasets regarding the opinions of scientists and scientific sub-communities, both at a time and also over time. The novel methodology employed sees local coordinators, at each institution in the network, sending survey invitation emails internally to scientists at their home institution. The emails link to a '10 second survey', where the participant is presented with a single statement to consider, and a standard five-point Likert scale. In June 2023, a group of 30 philosophers and social scientists invited 20,085 scientists across 30 institutions in 12 countries to participate, gathering 6,807 responses to the statement 'Science has put it beyond reasonable doubt that COVID-19 is caused by a virus'. The study demonstrates that it is possible to establish a global network to quickly ascertain scientific opinion on a large international scale, with high response rate, low opt-out rate, and in a way that allows for significant (perhaps indefinite) repeatability. Measuring scientific opinion in this new way would be a valuable complement to currently available approaches, potentially informing policy decisions and public understanding across diverse fields.
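As a rough check on the "high response rate" mentioned in the abstract, the two counts it quotes can be combined directly. The snippet below is only an illustrative calculation; the variable names are assumptions, and the abstract itself does not state the computed rate.

```python
# Counts quoted in the abstract for the June 2023 survey round.
invited = 20_085    # scientists invited across 30 institutions
responses = 6_807   # responses gathered

# Response rate = responses received / invitations sent.
response_rate = responses / invited
print(f"Response rate: {response_rate:.1%}")  # prints "Response rate: 33.9%"
```

On these figures, roughly one in three invited scientists responded.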
Affiliation(s)
- Peter Vickers
- Department of Philosophy, University of Durham, Durham, United Kingdom
- Ludovica Adamo
- School of Philosophy, Religion and History of Science, University of Leeds, Leeds, United Kingdom
- Mark Alfano
- Department of Philosophy, Macquarie University, Sydney, Australia
- Cory Clark
- The Wharton School, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- School of Arts and Sciences, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Eleonora Cresto
- Institute for Philosophical Research - SADAF, National Council for Scientific and Technical Research (CONICET), Buenos Aires, Argentina
- He Cui
- Department of Philosophy, University of Durham, Durham, United Kingdom
- Haixin Dang
- Department of Philosophy, University of Nebraska Omaha, Omaha, Nebraska, United States of America
- Finnur Dellsén
- Faculty of Philosophy, History, and Archeology, University of Iceland, Reykjavik, Iceland
- Department of Philosophy, Law, and International Studies, Inland Norway University of Applied Sciences, Lillehammer, Norway
- Department of Philosophy, Classics, History of Art and Ideas, University of Oslo, Oslo, Norway
- Nathalie Dupin
- School of Social and Political Science, University of Edinburgh, Edinburgh, United Kingdom
- Laura Gradowski
- Department of History and Philosophy of Science, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
- Simon Graf
- School of Philosophy, Religion and History of Science, University of Leeds, Leeds, United Kingdom
- Aline Guevara
- Science Communication Unit, Institute of Nuclear Sciences, National Autonomous University of Mexico (UNAM), Mexico City, Mexico
- Mark Hallap
- Department of Philosophy, University of Toronto, Toronto, Canada
- Jesse Hamilton
- Department of Philosophy, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Mariann Hardey
- Durham University Business School, University of Durham, Durham, United Kingdom
- Paula Helm
- Department of Media and Culture, University of Amsterdam (UvA), Amsterdam, Netherlands
- Asheley Landrum
- Walter Cronkite School of Journalism and Mass Communication, Arizona State University, Phoenix, Arizona, United States of America
- Neil Levy
- Department of Philosophy, Macquarie University, Sydney, Australia
- Uehiro Oxford Institute, University of Oxford, Oxford, United Kingdom
- Edouard Machery
- Department of History and Philosophy of Science, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
- Sarah Mills
- Department of Philosophy, University of Durham, Durham, United Kingdom
- Seán Muller
- Johannesburg Institute for Advanced Study, University of Johannesburg, Johannesburg, South Africa
- Joanne Sheppard
- Janelia Research Campus, Howard Hughes Medical Institute, Ashburn, Virginia, United States of America
- Shinod N. K.
- Department of Philosophy, University of Hyderabad, Hyderabad, India
- Matthew Slater
- Department of Philosophy, Bucknell University, Lewisburg, Pennsylvania, United States of America
- Jacob Stegenga
- Leverhulme Centre for the Future of Intelligence, University of Cambridge, Cambridge, United Kingdom
- School of Humanities, Nanyang Technological University (NTU), Singapore, Singapore
- Henning Strandin
- Department of Philosophy, Stockholm University, Stockholm, Sweden
- David Sweet
- Department of Emergency Medicine, University of British Columbia, Vancouver, Canada
- Ufuk Tasdan
- Department of Philosophy, University of Durham, Durham, United Kingdom
- Henry Taylor
- Department of Philosophy, University of Birmingham, Birmingham, United Kingdom
- Owen Towler
- Department of Philosophy, University of Durham, Durham, United Kingdom
- Dana Tulodziecki
- Department of Philosophy, Purdue University, West Lafayette, Indiana, United States of America
- Heidi Tworek
- Department of History and School of Public Policy and Global Affairs, University of British Columbia, Vancouver, Canada
- Harald Wiltsche
- Division of Philosophy and Applied Ethics, Linköping University, Linköping, Sweden
21
Josephson CB, Aronica E, Beniczky S, Boyce D, Cavalleri G, Denaxas S, French J, Jehi L, Koh H, Kwan P, McDonald C, Mitchell JW, Rampp S, Sadleir L, Sisodiya SM, Wang I, Wiebe S, Yasuda C, Youngerman B, the ILAE Big Data Commission. Big data research is everyone's research - Making epilepsy data science accessible to the global community: Report of the ILAE big data commission. Epileptic Disord 2024; 26:733-752. [PMID: 39446076 PMCID: PMC11651381 DOI: 10.1002/epd2.20288] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/23/2024] [Revised: 07/24/2024] [Accepted: 09/04/2024] [Indexed: 10/25/2024]
Abstract
Epilepsy care generates multiple sources of high-dimensional data, including clinical, imaging, electroencephalographic, genomic, and neuropsychological information, that are collected routinely to establish the diagnosis and guide management. Thanks to high-performance computing, sophisticated graphics processing units, and advanced analytics, we are now on the cusp of being able to use these data to significantly improve individualized care for people with epilepsy. Despite this, many clinicians, health care providers, and people with epilepsy are apprehensive about implementing Big Data and accompanying technologies such as artificial intelligence (AI). Practical, ethical, privacy, and climate issues represent real and enduring concerns that have yet to be completely resolved. Similarly, Big Data and AI-related biases have the potential to exacerbate local and global disparities. These are highly germane concerns to the field of epilepsy, given its high burden in developing nations and areas of socioeconomic deprivation. This educational paper from the International League Against Epilepsy's (ILAE) Big Data Commission aims to help clinicians caring for people with epilepsy become familiar with how Big Data is collected and processed, how they are applied to studies using AI, and outline the immense potential positive impact Big Data can have on diagnosis and management.
Affiliation(s)
- Colin B. Josephson
- Department of Clinical Neurosciences, Cumming School of MedicineUniversity of CalgaryCalgaryAlbertaCanada
- Hotchkiss Brain InstituteUniversity of CalgaryCalgaryAlbertaCanada
- Department of Community Health Sciences, Cumming School of MedicineUniversity of CalgaryAlbertaCanada
- O'Brien Institute for Public HealthUniversity of CalgaryCalgaryAlbertaCanada
- Centre for Health InformaticsUniversity of CalgaryCalgaryAlbertaCanada
- Institute for Health InformaticsUniversity College LondonLondonUK
| | - Eleonora Aronica
- Department of (Neuro)Pathology, Amsterdam UMCUniversity of Amsterdam, Amsterdam NeuroscienceAmsterdamThe Netherlands
- Stichting Epilepsie Instellingen Nederland (SEIN)HeemstedeThe Netherlands
| | - Sandor Beniczky
- Department of Neurology, Albert Szent‐Györgyi Medical School, University of Szeged, Szeged, Hungary
- Department of Neurophysiology, Danish Epilepsy Center, Dianalund, Denmark
- Department of Clinical Medicine, Aarhus University and Department of Clinical Neurophysiology, Aarhus University Hospital, Aarhus, Denmark
| | - Danielle Boyce
- Tufts University School of Medicine, Boston, Massachusetts, USA
- Johns Hopkins University Biomedical Informatics and Data Science Section, Baltimore, Maryland, USA
- West Chester University Department of Public Policy and Administration, West Chester, Pennsylvania, USA
| | - Gianpiero Cavalleri
- School of Pharmacy and Biomolecular Sciences, The Royal College of Surgeons in Ireland, Dublin, Ireland
- FutureNeuro SFI Research Centre, The Royal College of Surgeons in Ireland, Dublin, Ireland
| | - Spiros Denaxas
- Institute for Health Informatics, University College London, London, UK
- British Heart Foundation Data Science Center, Health Data Research UK, London, UK
| | - Jacqueline French
- Department of Neurology, Grossman School of Medicine, New York University, New York, New York, USA
| | - Lara Jehi
- Epilepsy Center, Cleveland Clinic, Cleveland, Ohio, USA
- Center for Computational Life Sciences, Cleveland, Ohio, USA
| | - Hyunyong Koh
- Harvard Brain Science Initiative, Harvard University, Boston, Massachusetts, USA
| | - Patrick Kwan
- Department of Neuroscience, School of Translational Medicine, Monash University, Melbourne, Victoria, Australia
- Department of Neurology, Alfred Health, Melbourne, Victoria, Australia
- Department of Neurology, The Royal Melbourne Hospital, Parkville, Victoria, Australia
| | - Carrie McDonald
- Department of Radiation Medicine and Applied Sciences & Psychiatry, University of California, San Diego, California, USA
| | - James W. Mitchell
- Institute of Systems, Molecular and Integrative Biology (ISMIB), University of Liverpool, Liverpool, UK
- Department of Neurology, The Walton Centre NHS Foundation Trust, Liverpool, UK
| | - Stefan Rampp
- Department of Neurosurgery and Department of Neuroradiology, University Hospital Erlangen, Department of Neurosurgery, University Hospital Halle (Saale), Halle (Saale), Germany
| | - Lynette Sadleir
- Department of Paediatrics and Child Health, University of Otago, Wellington, New Zealand
| | - Sanjay M. Sisodiya
- Department of Clinical and Experimental Epilepsy, UCL Queen Square Institute of Neurology, London WC1N 3BG and Chalfont Centre for Epilepsy, London, UK
| | - Irene Wang
- Epilepsy Center, Neurological Institute, Cleveland Clinic, Cleveland, Ohio, USA
| | - Samuel Wiebe
- Department of Clinical Neurosciences, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
- Hotchkiss Brain Institute, University of Calgary, Calgary, Alberta, Canada
- Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Alberta, Canada
- O'Brien Institute for Public Health, University of Calgary, Calgary, Alberta, Canada
- Clinical Research Unit, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
| | | | - Brett Youngerman
- Department of Neurological Surgery, Columbia University Vagelos College of Physicians and Surgeons, New York, New York, USA
|
22
|
Zwiers LC, Grobbee DE, Uijl A, Ong DSY. Federated learning as a smart tool for research on infectious diseases. BMC Infect Dis 2024; 24:1327. [PMID: 39573994 PMCID: PMC11580691 DOI: 10.1186/s12879-024-10230-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/07/2024] [Accepted: 11/14/2024] [Indexed: 11/25/2024] Open
Abstract
BACKGROUND: The use of real-world data has become increasingly popular, also in the field of infectious disease (ID), particularly since the COVID-19 pandemic emerged. While much useful data for research is being collected, these data are generally stored across different sources. Privacy concerns limit the possibility to store the data centrally, thereby also limiting the possibility of fully leveraging the potential power of combined data. Federated learning (FL) has been suggested to overcome privacy issues by making it possible to perform research on data from various sources without those data leaving local servers. In this review, we discuss existing applications of FL in ID research, as well as the most relevant opportunities and challenges of this method. METHODS: References for this review were identified through searches of MEDLINE/PubMed, Google Scholar, Embase and Scopus until July 2023. We searched for studies using FL in different applications related to ID. RESULTS: Thirty references were included and divided into four sub-topics: disease screening, prediction of clinical outcomes, infection epidemiology, and vaccine research. Most research was related to COVID-19. In all studies, FL achieved good accuracy when predicting diseases and outcomes, also in comparison to non-federated methods. However, most studies did not make use of real-world federated data, but rather showed the potential of FL by using data that was manually partitioned. CONCLUSIONS: FL is a promising methodology which allows using data from several sources, potentially generating stronger and more generalisable results. However, further exploration of FL application possibilities in ID research is needed.
Affiliation(s)
- Laura C Zwiers
- Julius Global Health, Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands.
- Julius Clinical, Zeist, The Netherlands.
| | - Diederick E Grobbee
- Julius Global Health, Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
- Julius Clinical, Zeist, The Netherlands
| | - Alicia Uijl
- Julius Global Health, Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
- Department of Cardiology, Amsterdam University Medical Centers, Amsterdam Cardiovascular Sciences, University of Amsterdam, Amsterdam, The Netherlands
- Division of Cardiology, Department of Medicine, Karolinska Institutet, Stockholm, Sweden
| | - David S Y Ong
- Julius Global Health, Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
- Julius Clinical, Zeist, The Netherlands
- Department of Medical Microbiology and Infection Control, Franciscus Gasthuis & Vlietland, Rotterdam, The Netherlands
|
23
|
Kupers ER, Knapen T, Merriam EP, Kay KN. Principles of intensive human neuroimaging. Trends Neurosci 2024; 47:856-864. [PMID: 39455343 PMCID: PMC11563852 DOI: 10.1016/j.tins.2024.09.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2024] [Revised: 08/28/2024] [Accepted: 09/27/2024] [Indexed: 10/28/2024]
Abstract
The rise of large, publicly shared functional magnetic resonance imaging (fMRI) data sets in human neuroscience has focused on acquiring either a few hours of data on many individuals ('wide' fMRI) or many hours of data on a few individuals ('deep' fMRI). In this opinion article, we highlight an emerging approach within deep fMRI, which we refer to as 'intensive' fMRI: one that strives for extensive sampling of cognitive phenomena to support computational modeling and detailed investigation of brain function at the single voxel level. We discuss the fundamental principles, trade-offs, and practical considerations of intensive fMRI. We also emphasize that intensive fMRI does not simply mean collecting more data: it requires careful design of experiments to enable a rich hypothesis space, optimizing data quality, and strategically curating public resources to maximize community impact.
Affiliation(s)
- Eline R Kupers
- Center for Magnetic Resonance Research, Department of Radiology, University of Minnesota, Minneapolis, MN, USA; Department of Psychology, Stanford University, Stanford, CA, USA.
| | - Tomas Knapen
- Spinoza Centre for Neuroimaging, Amsterdam, the Netherlands; Netherlands Institute for Neuroscience, Royal Netherlands Academy of Sciences, Amsterdam, the Netherlands; Cognitive Psychology, Faculty of Behavioural and Movement Sciences, Vrije Universiteit, Amsterdam, the Netherlands
| | - Elisha P Merriam
- Laboratory of Brain and Cognition, National Institute of Mental Health, NIH, Bethesda, MD, USA
| | - Kendrick N Kay
- Center for Magnetic Resonance Research, Department of Radiology, University of Minnesota, Minneapolis, MN, USA.
|
24
|
Wang R, Liu J, Jiang B, Gao B, Luo H, Yang F, Ye Y, Chen Z, Liu H, Cui C, Xu K, Li B, Yang X. A single-cell perspective on immunotherapy for pancreatic cancer: from microenvironment analysis to therapeutic strategy innovation. Front Immunol 2024; 15:1454833. [PMID: 39539544 PMCID: PMC11557317 DOI: 10.3389/fimmu.2024.1454833] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/25/2024] [Accepted: 10/08/2024] [Indexed: 11/16/2024] Open
Abstract
Pancreatic cancer remains one of the most lethal malignancies, with conventional treatment options providing limited efficacy. Recent advancements in immunotherapy have offered new hope, yet the unique tumor microenvironment (TME) of pancreatic cancer poses significant challenges to its successful application. This review explores the transformative impact of single-cell technology on the understanding and treatment of pancreatic cancer. By enabling high-resolution analysis of cellular heterogeneity within the TME, single-cell approaches have elucidated the complex interplay between various immune and tumor cell populations. These insights have led to the identification of predictive biomarkers and the development of innovative, personalized immunotherapeutic strategies. The review discusses the role of single-cell technology in dissecting the intricate immune landscape of pancreatic cancer, highlighting the discovery of T cell exhaustion profiles and macrophage polarization states that influence treatment response. Moreover, it outlines the potential of single-cell data in guiding the selection of immunotherapy drugs and optimizing treatment plans. The review also addresses the challenges and prospects of translating these single-cell-based innovations into clinical practice, emphasizing the need for interdisciplinary research and the integration of artificial intelligence to overcome current limitations. Ultimately, the review underscores the promise of single-cell technology in driving therapeutic strategy innovation and improving patient outcomes in the battle against pancreatic cancer.
Affiliation(s)
- Rui Wang
- Department of General Surgery (Hepatopancreatobiliary Surgery), The Affiliated Hospital of Southwest Medical University, Luzhou, China
- Academician (Expert) Workstation of Sichuan Province, Metabolic Hepatobiliary and Pancreatic Diseases Key Laboratory of Luzhou City, The Affiliated Hospital of Southwest Medical University, Luzhou, China
- General Surgery Day Ward, Department of General Surgery, The Third People’s Hospital of Chengdu, Affiliated Hospital of Southwest Jiaotong University, The Second Affiliated Hospital of Chengdu, Chongqing Medical University, Chengdu, China
| | - Jie Liu
- Department of General Surgery (Hepatopancreatobiliary Surgery), The Affiliated Hospital of Southwest Medical University, Luzhou, China
- Academician (Expert) Workstation of Sichuan Province, Metabolic Hepatobiliary and Pancreatic Diseases Key Laboratory of Luzhou City, The Affiliated Hospital of Southwest Medical University, Luzhou, China
| | - Bo Jiang
- Department of General Surgery (Hepatopancreatobiliary Surgery), The Affiliated Hospital of Southwest Medical University, Luzhou, China
- Academician (Expert) Workstation of Sichuan Province, Metabolic Hepatobiliary and Pancreatic Diseases Key Laboratory of Luzhou City, The Affiliated Hospital of Southwest Medical University, Luzhou, China
| | - Benjian Gao
- Department of General Surgery (Hepatopancreatobiliary Surgery), The Affiliated Hospital of Southwest Medical University, Luzhou, China
- Academician (Expert) Workstation of Sichuan Province, Metabolic Hepatobiliary and Pancreatic Diseases Key Laboratory of Luzhou City, The Affiliated Hospital of Southwest Medical University, Luzhou, China
| | - Honghao Luo
- Department of Radiology, Xichong People’s Hospital, Nanchong, China
| | - Fengyi Yang
- Department of General Surgery (Hepatopancreatobiliary Surgery), The Affiliated Hospital of Southwest Medical University, Luzhou, China
- Academician (Expert) Workstation of Sichuan Province, Metabolic Hepatobiliary and Pancreatic Diseases Key Laboratory of Luzhou City, The Affiliated Hospital of Southwest Medical University, Luzhou, China
| | - Yuntao Ye
- Department of General Surgery (Hepatopancreatobiliary Surgery), The Affiliated Hospital of Southwest Medical University, Luzhou, China
- Academician (Expert) Workstation of Sichuan Province, Metabolic Hepatobiliary and Pancreatic Diseases Key Laboratory of Luzhou City, The Affiliated Hospital of Southwest Medical University, Luzhou, China
| | - Zhuo Chen
- Department of General Surgery (Hepatopancreatobiliary Surgery), The Affiliated Hospital of Southwest Medical University, Luzhou, China
- Academician (Expert) Workstation of Sichuan Province, Metabolic Hepatobiliary and Pancreatic Diseases Key Laboratory of Luzhou City, The Affiliated Hospital of Southwest Medical University, Luzhou, China
| | - Hong Liu
- Department of General Surgery (Hepatopancreatobiliary Surgery), The Affiliated Hospital of Southwest Medical University, Luzhou, China
- Academician (Expert) Workstation of Sichuan Province, Metabolic Hepatobiliary and Pancreatic Diseases Key Laboratory of Luzhou City, The Affiliated Hospital of Southwest Medical University, Luzhou, China
| | - Cheng Cui
- Department of General Surgery (Hepatopancreatobiliary Surgery), The Affiliated Hospital of Southwest Medical University, Luzhou, China
- Academician (Expert) Workstation of Sichuan Province, Metabolic Hepatobiliary and Pancreatic Diseases Key Laboratory of Luzhou City, The Affiliated Hospital of Southwest Medical University, Luzhou, China
| | - Ke Xu
- Department of Oncology, Chongqing General Hospital, Chongqing University, Chongqing, China
| | - Bo Li
- Department of General Surgery (Hepatopancreatobiliary Surgery), The Affiliated Hospital of Southwest Medical University, Luzhou, China
- Academician (Expert) Workstation of Sichuan Province, Metabolic Hepatobiliary and Pancreatic Diseases Key Laboratory of Luzhou City, The Affiliated Hospital of Southwest Medical University, Luzhou, China
| | - Xiaoli Yang
- Department of General Surgery (Hepatopancreatobiliary Surgery), The Affiliated Hospital of Southwest Medical University, Luzhou, China
- Academician (Expert) Workstation of Sichuan Province, Metabolic Hepatobiliary and Pancreatic Diseases Key Laboratory of Luzhou City, The Affiliated Hospital of Southwest Medical University, Luzhou, China
|
25
|
Manna A, Dall’Amico L, Tizzoni M, Karsai M, Perra N. Generalized contact matrices allow integrating socioeconomic variables into epidemic models. SCIENCE ADVANCES 2024; 10:eadk4606. [PMID: 39392883 PMCID: PMC11468902 DOI: 10.1126/sciadv.adk4606] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2023] [Accepted: 09/09/2024] [Indexed: 10/13/2024]
Abstract
Variables related to socioeconomic status (SES), including income, ethnicity, and education, shape contact structures and affect the spread of infectious diseases. However, these factors are often overlooked in epidemic models, which typically stratify social contacts by age and interaction contexts. Here, we introduce and study generalized contact matrices that stratify contacts across multiple dimensions. We demonstrate a lower-bound theorem proving that disregarding additional dimensions, besides age and context, might lead to an underestimation of the basic reproductive number. By using SES variables in both synthetic and empirical data, we illustrate how generalized contact matrices enhance epidemic models, capturing variations in behaviors such as heterogeneous levels of adherence to nonpharmaceutical interventions among demographic groups. Moreover, we highlight the importance of integrating SES traits into epidemic models, as neglecting them might lead to substantial misrepresentation of epidemic outcomes and dynamics. Our research contributes to the efforts aiming at incorporating socioeconomic and other dimensions into epidemic modeling.
Affiliation(s)
- Adriana Manna
- Department of Network and Data Science, Central European University, Vienna, Austria
| | | | - Michele Tizzoni
- Department of Sociology and Social Research, University of Trento, Trento, Italy
| | - Márton Karsai
- Department of Network and Data Science, Central European University, Vienna, Austria
- National Laboratory for Health Security, HUN-REN Rényi Institute of Mathematics, Budapest, Hungary
| | - Nicola Perra
- School of Mathematical Sciences, Queen Mary University of London, London, UK
|
26
|
Lam K, Simister C, Yiu A, Kinross JM. Barriers to the adoption of routine surgical video recording: a mixed-methods qualitative study of a real-world implementation of a video recording platform. Surg Endosc 2024; 38:5793-5802. [PMID: 39148005 PMCID: PMC11458650 DOI: 10.1007/s00464-024-11174-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/03/2024] [Accepted: 08/05/2024] [Indexed: 08/17/2024]
Abstract
BACKGROUND: Routine surgical video recording has multiple benefits. Video acts as an objective record of the operation, allows video-based coaching and is integral to the development of digital technologies. Despite these benefits, adoption is not widespread. To date, only questionnaire studies have explored this failure in adoption. This study aims to determine the barriers and provide recommendations for the implementation of routine surgical video recording. MATERIALS AND METHODS: A pre- and post-pilot questionnaire surrounding a real-world implementation of C-SATS©, an educational recording and surgical analytics platform, was conducted in a university teaching hospital trust. Usage metrics from the pilot study and descriptive analyses of questionnaire responses were used with the non-adoption, abandonment, scale-up, spread, sustainability (NASSS) framework to create topic guides for semi-structured interviews. Transcripts of interviews were evaluated in an inductive thematic analysis. RESULTS: Engagement with the C-SATS© platform failed to reach consistent levels with only 57 videos uploaded. Three attending surgeons, four surgical residents, one scrub nurse, three patients, one lawyer, and one industry representative were interviewed, all of whom perceived value in recording. Barriers of 'change,' 'resource,' and 'governance,' were identified as the main themes. Resistance was centred on patient misinterpretation of videos. Participants believed availability of infrastructure would facilitate adoption but integration into surgical workflow is required. Regulatory uncertainty was centred around anonymity and data ownership. CONCLUSION: Barriers to the adoption of routine surgical video recording exist beyond technological barriers alone. Priorities for implementation include integrating recording into the patient record, engaging all stakeholders to ensure buy-in, and formalising consent processes to establish patient trust.
Affiliation(s)
- Kyle Lam
- Department of Surgery and Cancer, Imperial College, 10th Floor Queen Elizabeth Queen Mother Building, St Mary's Hospital, London, W2 1NY, UK.
| | - Catherine Simister
- Department of Surgery and Cancer, Imperial College, 10th Floor Queen Elizabeth Queen Mother Building, St Mary's Hospital, London, W2 1NY, UK
| | - Andrew Yiu
- Department of Surgery and Cancer, Imperial College, 10th Floor Queen Elizabeth Queen Mother Building, St Mary's Hospital, London, W2 1NY, UK
| | - James M Kinross
- Department of Surgery and Cancer, Imperial College, 10th Floor Queen Elizabeth Queen Mother Building, St Mary's Hospital, London, W2 1NY, UK
|
27
|
Islam MS, Kalmady SV, Hindle A, Sandhu R, Sun W, Sepehrvand N, Greiner R, Kaul P. Diagnostic and Prognostic Electrocardiogram-Based Models for Rapid Clinical Applications. Can J Cardiol 2024; 40:1788-1803. [PMID: 38992812 DOI: 10.1016/j.cjca.2024.07.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/26/2024] [Revised: 07/04/2024] [Accepted: 07/05/2024] [Indexed: 07/13/2024] Open
Abstract
Leveraging artificial intelligence (AI) for the analysis of electrocardiograms (ECGs) has the potential to transform diagnosis and estimate the prognosis of not only cardiac but, increasingly, noncardiac conditions. In this review, we summarize clinical studies and AI-enhanced ECG-based clinical applications in the early detection, diagnosis, and prognosis estimation of cardiovascular diseases in the past 5 years (2019-2023). With advancements in deep learning and the rapidly increasing use of ECG technologies, a large number of clinical studies have been published. However, most of these studies are single-centre, retrospective, proof-of-concept studies that lack external validation. Prospective studies that progress from development toward deployment in clinical settings account for < 15% of the studies. Successful implementations of ECG-based AI applications that have received approval from the Food and Drug Administration have been developed through commercial collaborations, with approximately half of them being for mobile or wearable devices. The field is in its early stages, and overcoming several obstacles is essential, such as prospective validation in multicentre large data sets, addressing technical issues, bias, privacy, data security, model generalizability, and global scalability. This review concludes with a discussion of these challenges and potential solutions. By providing a holistic view of the state of AI in ECG analysis, this review aims to set a foundation for future research directions, emphasizing the need for comprehensive, clinically integrated, and globally deployable AI solutions in cardiovascular disease management.
Affiliation(s)
- Md Saiful Islam
- Canadian VIGOUR Centre, University of Alberta, Edmonton, Alberta, Canada; Department of Medicine, University of Alberta, Edmonton, Alberta, Canada
| | - Sunil Vasu Kalmady
- Canadian VIGOUR Centre, University of Alberta, Edmonton, Alberta, Canada; Department of Computing Science, University of Alberta, Edmonton, Alberta, Canada
| | - Abram Hindle
- Department of Computing Science, University of Alberta, Edmonton, Alberta, Canada
| | - Roopinder Sandhu
- Canadian VIGOUR Centre, University of Alberta, Edmonton, Alberta, Canada; Smidt Heart Institute, Cedars-Sinai Medical Center Hospital System, Los Angeles, California, USA
| | - Weijie Sun
- Department of Computing Science, University of Alberta, Edmonton, Alberta, Canada
| | - Nariman Sepehrvand
- Canadian VIGOUR Centre, University of Alberta, Edmonton, Alberta, Canada; Department of Medicine, University of Calgary, Calgary, Alberta, Canada
| | - Russell Greiner
- Department of Computing Science, University of Alberta, Edmonton, Alberta, Canada; Alberta Machine Intelligence Institute, Edmonton, Alberta, Canada
| | - Padma Kaul
- Canadian VIGOUR Centre, University of Alberta, Edmonton, Alberta, Canada; Department of Medicine, University of Alberta, Edmonton, Alberta, Canada.
|
28
|
Gibson M, Newman-Norlund R, Bonilha L, Fridriksson J, Hickok G, Hillis AE, den Ouden DB, Rorden C. The Aphasia Recovery Cohort, an open-source chronic stroke repository. Sci Data 2024; 11:981. [PMID: 39251640 PMCID: PMC11384737 DOI: 10.1038/s41597-024-03819-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/08/2024] [Accepted: 08/22/2024] [Indexed: 09/11/2024] Open
Abstract
Sharing neuroimaging datasets enables reproducibility, education, tool development, and new discoveries. Neuroimaging data from many studies are publicly available, providing a glimpse into progressive disorders and human development. In contrast, few stroke studies are shared, and these datasets lack longitudinal sampling of functional imaging, diffusion imaging, as well as the behavioral and demographic data that encourage novel applications. This is surprising, as stroke is a leading cause of disability, and acquiring brain imaging is considered standard of care. The first release of the Aphasia Recovery Cohort includes imaging data, demographics and behavioral measures from 230 chronic stroke survivors who experienced aphasia. We also share scripts to illustrate how the imaging data can predict impairment. In conclusion, recent advances in machine learning thrive on large, diverse datasets. Clinical data sharing can contribute to improvements in automated detection of brain injury, identification of white matter hyperintensities, measures of brain health, and prognostic abilities to guide care.
Affiliation(s)
- Makayla Gibson
- Department of Psychology, University of South Carolina, Columbia, SC, USA
| | | | - Leonardo Bonilha
- Department of Neurology, University of South Carolina School of Medicine, Columbia, SC, USA
| | - Julius Fridriksson
- Department of Communication Sciences and Disorders, University of South Carolina, Columbia, SC, USA
| | - Gregory Hickok
- Department of Cognitive Sciences, University of California, Irvine, CA, USA
| | - Argye E Hillis
- Department of Neurology, Johns Hopkins University, Baltimore, MD, USA
| | - Dirk-Bart den Ouden
- Department of Communication Sciences and Disorders, University of South Carolina, Columbia, SC, USA
| | - Christopher Rorden
- Department of Psychology, University of South Carolina, Columbia, SC, USA.
|
29
|
Farhadyar K, Bonofiglio F, Hackenberg M, Behrens M, Zöller D, Binder H. Combining propensity score methods with variational autoencoders for generating synthetic data in presence of latent sub-groups. BMC Med Res Methodol 2024; 24:198. [PMID: 39251921 PMCID: PMC11382494 DOI: 10.1186/s12874-024-02327-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/21/2024] [Accepted: 08/29/2024] [Indexed: 09/11/2024] Open
Abstract
In settings requiring synthetic data generation based on a clinical cohort, e.g., due to data protection regulations, heterogeneity across individuals might be a nuisance that we need to control or faithfully preserve. The sources of such heterogeneity might be known, e.g., as indicated by sub-group labels, or might be unknown and thus reflected only in properties of distributions, such as bimodality or skewness. We investigate how such heterogeneity can be preserved and controlled when obtaining synthetic data from variational autoencoders (VAEs), i.e., a generative deep learning technique that utilizes a low-dimensional latent representation. To faithfully reproduce unknown heterogeneity reflected in marginal distributions, we propose to combine VAEs with pre-transformations. For dealing with known heterogeneity due to sub-groups, we complement VAEs with models for group membership, specifically from propensity score regression. The evaluation is performed with a realistic simulation design that features sub-groups and challenging marginal distributions. The proposed approach faithfully recovers the latter, compared to synthetic data approaches that focus purely on marginal distributions. Propensity scores add complementary information, e.g., when visualized in the latent space, and enable sampling of synthetic data with or without sub-group specific characteristics. We also illustrate the proposed approach with real data from an international stroke trial that exhibits considerable distribution differences between study sites, in addition to bimodality. These results indicate that describing heterogeneity by statistical approaches, such as propensity score regression, might be more generally useful for complementing generative deep learning for obtaining synthetic data that faithfully reflects structure from clinical cohorts.
Affiliation(s)
- Kiana Farhadyar
- Institute of Medical Biometry and Statistics, University of Freiburg, Freiburg, Germany.
- Freiburg Center for Data Analysis and Modeling, University of Freiburg, Freiburg, Germany.
| | - Federico Bonofiglio
- National Research Council of Italy, ISMAR, Forte Santa Teresa, Lerici, Italy
| | - Maren Hackenberg
- Institute of Medical Biometry and Statistics, University of Freiburg, Freiburg, Germany
- Freiburg Center for Data Analysis and Modeling, University of Freiburg, Freiburg, Germany
| | - Max Behrens
- Institute of Medical Biometry and Statistics, University of Freiburg, Freiburg, Germany
- Freiburg Center for Data Analysis and Modeling, University of Freiburg, Freiburg, Germany
| | - Daniela Zöller
- Institute of Medical Biometry and Statistics, University of Freiburg, Freiburg, Germany
- Freiburg Center for Data Analysis and Modeling, University of Freiburg, Freiburg, Germany
| | - Harald Binder
- Institute of Medical Biometry and Statistics, University of Freiburg, Freiburg, Germany
- Freiburg Center for Data Analysis and Modeling, University of Freiburg, Freiburg, Germany
|
30
|
Palaniappan K, Lin EYT, Vogel S, Lim JCW. Gaps in the Global Regulatory Frameworks for the Use of Artificial Intelligence (AI) in the Healthcare Services Sector and Key Recommendations. Healthcare (Basel) 2024; 12:1730. [PMID: 39273754 PMCID: PMC11394803 DOI: 10.3390/healthcare12171730] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/06/2024] [Revised: 08/23/2024] [Accepted: 08/27/2024] [Indexed: 09/15/2024] Open
Abstract
Artificial Intelligence (AI) has shown remarkable potential to revolutionise healthcare by enhancing diagnostics, improving treatment outcomes, and streamlining administrative processes. In the global regulatory landscape, several countries are working on regulating AI in healthcare. There are five key regulatory issues that need to be addressed: (i) data security and protection: measures to cover the "digital health footprints" left unknowingly by patients when they access AI in health services; (ii) data quality: availability of safe and secure data and more open database sources for AI, algorithms, and datasets to ensure equity and prevent demographic bias; (iii) validation of algorithms: mapping of the explainability and causability of the AI system; (iv) accountability: whether this lies with the healthcare professional, healthcare organisation, or the personified AI algorithm; (v) ethics and equitable access: whether fundamental rights of people are met in an ethical manner. Policymakers may need to consider the entire life cycle of AI in healthcare services and the databases that were used for the training of the AI system, along with requirements for their risk assessments to be publicly accessible for effective regulatory oversight. AI services that enhance their functionality over time need to undergo repeated algorithmic impact assessment and must also demonstrate real-time performance. Harmonising regulatory frameworks at the international level would help to resolve cross-border issues of AI in healthcare services.
Affiliation(s)
- Kavitha Palaniappan
- Centre of Regulatory Excellence, Duke-NUS Medical School, Singapore 169857, Singapore
| | - Elaine Yan Ting Lin
- Centre of Regulatory Excellence, Duke-NUS Medical School, Singapore 169857, Singapore
| | - Silke Vogel
- Centre of Regulatory Excellence, Duke-NUS Medical School, Singapore 169857, Singapore
| | - John C W Lim
- Centre of Regulatory Excellence, Duke-NUS Medical School, Singapore 169857, Singapore
|
31
|
Affiliation(s)
- Giles Dawnay
- Writer and GP in Leominster, Herefordshire. Find more of Giles' work at his website: https://gilesdawnay.com
|
32
|
Federico CA, Trotsyuk AA. Biomedical Data Science, Artificial Intelligence, and Ethics: Navigating Challenges in the Face of Explosive Growth. Annu Rev Biomed Data Sci 2024; 7:1-14. [PMID: 38598860 DOI: 10.1146/annurev-biodatasci-102623-104553] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/12/2024]
Abstract
Advances in biomedical data science and artificial intelligence (AI) are profoundly changing the landscape of healthcare. This article reviews the ethical issues that arise with the development of AI technologies, including threats to privacy, data security, consent, and justice, as they relate to donors of tissue and data. It also considers broader societal obligations, including the importance of assessing the unintended consequences of AI research in biomedicine. In addition, this article highlights the challenge of rapid AI development against the backdrop of disparate regulatory frameworks, calling for a global approach to address concerns around data misuse, unintended surveillance, and the equitable distribution of AI's benefits and burdens. Finally, a number of potential solutions to these ethical quandaries are offered. Namely, the merits of advocating for a collaborative, informed, and flexible regulatory approach that balances innovation with individual rights and public welfare, fostering a trustworthy AI-driven healthcare ecosystem, are discussed.
Affiliation(s)
- Carole A Federico
- Center for Biomedical Ethics, Stanford University School of Medicine, Stanford, California, USA
| | - Artem A Trotsyuk
- Center for Biomedical Ethics, Stanford University School of Medicine, Stanford, California, USA
|
33
|
Zhui L, Fenghe L, Xuehu W, Qining F, Wei R. Ethical Considerations and Fundamental Principles of Large Language Models in Medical Education: Viewpoint. J Med Internet Res 2024; 26:e60083. [PMID: 38971715 PMCID: PMC11327620 DOI: 10.2196/60083] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2024] [Accepted: 07/06/2024] [Indexed: 07/08/2024] Open
Abstract
This viewpoint article first explores the ethical challenges associated with the future application of large language models (LLMs) in the context of medical education. These challenges include not only ethical concerns related to the development of LLMs, such as artificial intelligence (AI) hallucinations, information bias, privacy and data risks, and deficiencies in terms of transparency and interpretability but also issues concerning the application of LLMs, including deficiencies in emotional intelligence, educational inequities, problems with academic integrity, and questions of responsibility and copyright ownership. This paper then analyzes existing AI-related legal and ethical frameworks and highlights their limitations with regard to the application of LLMs in the context of medical education. To ensure that LLMs are integrated in a responsible and safe manner, the authors recommend the development of a unified ethical framework that is specifically tailored for LLMs in this field. This framework should be based on 8 fundamental principles: quality control and supervision mechanisms; privacy and data protection; transparency and interpretability; fairness and equal treatment; academic integrity and moral norms; accountability and traceability; protection and respect for intellectual property; and the promotion of educational research and innovation. The authors further discuss specific measures that can be taken to implement these principles, thereby laying a solid foundation for the development of a comprehensive and actionable ethical framework. Such a unified ethical framework based on these 8 fundamental principles can provide clear guidance and support for the application of LLMs in the context of medical education. 
This approach can help establish a balance between technological advancement and ethical safeguards, thereby ensuring that medical education can progress without compromising the principles of fairness, justice, or patient safety and establishing a more equitable, safer, and more efficient environment for medical education.
Affiliation(s)
- Li Zhui
- Department of Vascular Surgery, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China
| | - Li Fenghe
- Department of Vascular Surgery, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China
| | - Wang Xuehu
- Department of Vascular Surgery, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China
| | - Fu Qining
- Department of Vascular Surgery, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China
| | - Ren Wei
- Department of Vascular Surgery, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China
|
34
|
Fadel M, Petot J, Gourraud PA, Descatha A. Flexibility of a large blindly synthetized avatar database for occupational research: Example from the CONSTANCES cohort for stroke and knee pain. PLoS One 2024; 19:e0308063. [PMID: 39083487 PMCID: PMC11290644 DOI: 10.1371/journal.pone.0308063] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/07/2023] [Accepted: 07/16/2024] [Indexed: 08/02/2024] Open
Abstract
OBJECTIVES: Although the rise of big data in occupational health offers new opportunities, especially for cross-cutting research, it raises the issue of data privacy and security, especially when linking sensitive data from insurance, occupational health, or compensation claims. We aimed to validate a large, blinded synthesized database developed from the CONSTANCES cohort by comparing associations between three independently selected outcomes and various exposures. METHODS: From the CONSTANCES cohort, a large synthetic dataset was constructed using the avatar method (Octopize), which is agnostic to primary or secondary data uses. Three main analyses of interest were chosen to compare associations between the raw and avatar datasets: risk of stroke (any stroke, and subtypes of stroke), risk of knee pain, and limitations associated with knee pain. Logistic models were computed, and a qualitative comparison of paired odds ratios (ORs) was made. RESULTS: Both raw and avatar datasets included 162,434 observations and 19 relevant variables. Of the 172 paired raw/avatar ORs that were computed, including analyses stratified by sex, more than 77% of the comparisons had an OR difference ≤0.5 and less than 7% had a discrepancy in the statistical significance of the associations, with a Cohen's kappa coefficient of 0.80. CONCLUSIONS: This study shows the flexibility and multiple uses of a synthetic database created with the avatar method in the particular field of occupational health; such a database can be shared in open access without risking re-identification and privacy issues, and can help bring new insights into complex phenomena such as return to work.
Affiliation(s)
- Marc Fadel
- Univ Angers, CHU Angers, Univ Rennes, Inserm, EHESP, Irset (Institut de recherche en santé, environnement et travail) ‐ UMR_S, IRSET-ESTER, SFR ICAT, CAPTV CDC, Angers, France
| | | | - Pierre-Antoine Gourraud
- Nantes Université, INSERM, CR2TI ‐ Center for Research in Transplantation and Translational Immunology, Nantes, France
- Nantes Université, CHU Nantes, Pôle Hospitalo-Universitaire 11: Santé Publique, Clinique des données, INSERM, CIC 1413, Nantes, France
| | - Alexis Descatha
- Univ Angers, CHU Angers, Univ Rennes, Inserm, EHESP, Irset (Institut de recherche en santé, environnement et travail) ‐ UMR_S, IRSET-ESTER, SFR ICAT, CAPTV CDC, Angers, France
- Department of Occupational Medicine, Epidemiology and Prevention, Donald and Barbara Zucker School of Medicine, Hofstra/Northwell, United States of America
|
35
|
Eiermann M. The Impact of Data Suppression Rules on Data Access and Re-Identification Risk in Adoption and Foster Care Analysis and Reporting System Annual Files. Child Maltreatment 2024:10775595241270042. [PMID: 39075035 DOI: 10.1177/10775595241270042] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/31/2024]
Abstract
One of the most widely used data sources for research on foster care and adoption is the Adoption and Foster Care Analysis and Reporting System (AFCARS). County identifiers in AFCARS are suppressed for all counties with fewer than 1000 cases to prevent the re-identification of vulnerable children, but this also impacts researchers' ability to study smaller communities and analyze how local environments may affect out-of-home placements. This study uses non-public AFCARS datasets to assess, for the first time, how data suppression rules impact data access and re-identification risk. It compares the long-standing 1000-case threshold against a wide range of potential alternatives and finds substantial data access gains coupled with moderate risk increases for thresholds between 400 and 700. Adopting a 700-case threshold leads to a 50% increase in the number of identifiable counties while also keeping the percentage of fostered children who face an elevated risk of re-identification below 1%. Making data from a substantial number of rural counties available to researchers requires much larger threshold changes, which in turn increases re-identification risks.
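The threshold rule described in this abstract can be illustrated with a minimal Python sketch. It is not the study's code and does not use AFCARS data: the record layout, the `county` key, the `SUPPRESSED` marker, and both function names are assumptions made for illustration, and the toy counts are invented.

```python
from collections import Counter


def identifiable_counties(records, threshold):
    """Return the counties whose case count meets the threshold (hypothetical helper)."""
    counts = Counter(r["county"] for r in records)
    return sorted(c for c, n in counts.items() if n >= threshold)


def suppress_small_counties(records, threshold=1000, masked="SUPPRESSED"):
    """Mask the county identifier for every record from a county below the threshold."""
    counts = Counter(r["county"] for r in records)
    return [
        {**r, "county": r["county"] if counts[r["county"]] >= threshold else masked}
        for r in records
    ]


# Invented toy data: three counties with 1200, 800, and 300 cases.
records = (
    [{"county": "A", "case_id": i} for i in range(1200)]
    + [{"county": "B", "case_id": i} for i in range(800)]
    + [{"county": "C", "case_id": i} for i in range(300)]
)

at_1000 = identifiable_counties(records, 1000)
at_700 = identifiable_counties(records, 700)
```

Lowering the threshold from 1000 to 700 in this toy example makes county B identifiable as well, which mirrors the access-versus-risk trade-off the study quantifies; the sketch says nothing about the re-identification risk itself.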
|
36
|
Largent EA, Karlawish J, Wexler A. From an idea to the marketplace: Identifying and addressing ethical and regulatory considerations across the digital health product-development lifecycle. BMC Digital Health 2024; 2:41. [PMID: 39130168 PMCID: PMC11308106 DOI: 10.1186/s44247-024-00098-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/07/2023] [Accepted: 05/09/2024] [Indexed: 08/13/2024]
Abstract
Widespread adoption of digital health tools has the potential to improve health and health care for individuals and their communities, but realizing this potential requires anticipating and addressing numerous ethical and regulatory challenges. Here, we help digital health tool developers identify ethical and regulatory considerations - and opportunities to advance desirable outcomes - by organizing them within a general product-development lifecycle that spans generation of ideas to commercialization of a product.
Affiliation(s)
- Emily A Largent
- Department of Medical Ethics and Health Policy, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
| | - Jason Karlawish
- Department of Medicine, Department of Medical Ethics and Health Policy, Department of Neurology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
| | - Anna Wexler
- Department of Medical Ethics and Health Policy, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
|
37
|
Zhui L, Yhap N, Liping L, Zhengjie W, Zhonghao X, Xiaoshu Y, Hong C, Xuexiu L, Wei R. Impact of Large Language Models on Medical Education and Teaching Adaptations. JMIR Med Inform 2024; 12:e55933. [PMID: 39087590 PMCID: PMC11294775 DOI: 10.2196/55933] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/29/2023] [Revised: 04/25/2024] [Accepted: 06/08/2024] [Indexed: 08/02/2024] Open
Abstract
This viewpoint article explores the transformative role of large language models (LLMs) in the field of medical education, highlighting their potential to enhance teaching quality, promote personalized learning paths, strengthen clinical skills training, optimize teaching assessment processes, boost the efficiency of medical research, and support continuing medical education. However, the use of LLMs entails certain challenges, such as questions regarding the accuracy of information, the risk of overreliance on technology, a lack of emotional recognition capabilities, and concerns related to ethics, privacy, and data security. This article emphasizes that to maximize the potential of LLMs and overcome these challenges, educators must exhibit leadership in medical education, adjust their teaching strategies flexibly, cultivate students' critical thinking, and emphasize the importance of practical experience, thus ensuring that students can use LLMs correctly and effectively. By adopting such a comprehensive and balanced approach, educators can train health care professionals who are proficient in the use of advanced technologies and who exhibit solid professional ethics and practical skills, thus laying a strong foundation for these professionals to overcome future challenges in the health care sector.
Affiliation(s)
- Li Zhui
- Department of Vascular Surgery, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China
| | - Nina Yhap
- Department of General Surgery, Queen Elizabeth Hospital, St Michael, Barbados
| | - Liu Liping
- Department of Ultrasound, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China
| | - Wang Zhengjie
- Department of Nuclear Medicine, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China
| | - Xiong Zhonghao
- Department of Acupuncture and Moxibustion, Chongqing Traditional Chinese Medicine Hospital, Chongqing, China
| | - Yuan Xiaoshu
- Department of Anesthesia, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China
| | - Cui Hong
- Department of Anesthesia, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China
| | - Liu Xuexiu
- Department of Neonatology, Children’s Hospital of Chongqing Medical University, Chongqing, China
| | - Ren Wei
- Department of Vascular Surgery, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China
|
38
|
Gadotti A, Rocher L, Houssiau F, Creţu AM, de Montjoye YA. Anonymization: The imperfect science of using data while preserving privacy. Science Advances 2024; 10:eadn7053. [PMID: 39018389 PMCID: PMC466941 DOI: 10.1126/sciadv.adn7053] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/22/2023] [Accepted: 06/10/2024] [Indexed: 07/19/2024]
Abstract
Information about us, our actions, and our preferences is created at scale through surveys or scientific studies or as a result of our interaction with digital devices such as smartphones and fitness trackers. The ability to safely share and analyze such data is key for scientific and societal progress. Anonymization is considered by scientists and policy-makers as one of the main ways to share data while minimizing privacy risks. In this review, we offer a pragmatic perspective on the modern literature on privacy attacks and anonymization techniques. We discuss traditional de-identification techniques and their strong limitations in the age of big data. We then turn our attention to modern approaches to share anonymous aggregate data, such as data query systems, synthetic data, and differential privacy. We find that, although no perfect solution exists, applying modern techniques while auditing their guarantees against attacks is the best approach to safely use and share data today.
Affiliation(s)
- Andrea Gadotti
- Imperial College London, Exhibition Road, London SW7 2AZ, UK
- University of Oxford, Wellington Square, Oxford OX1 2JD, UK
| | - Luc Rocher
- Imperial College London, Exhibition Road, London SW7 2AZ, UK
- University of Oxford, Wellington Square, Oxford OX1 2JD, UK
| | - Florimond Houssiau
- Imperial College London, Exhibition Road, London SW7 2AZ, UK
- Alan Turing Institute, 96 Euston Road, London NW1 2DB, UK
| | - Ana-Maria Creţu
- Imperial College London, Exhibition Road, London SW7 2AZ, UK
- EPFL, CH-1015 Lausanne, Switzerland
|
39
|
Stelter L, Corbetta V, Beets-Tan R, Silva W. Assessing the Impact of Federated Learning and Differential Privacy on Multi-centre Polyp Segmentation. Annu Int Conf IEEE Eng Med Biol Soc 2024; 2024:1-4. [PMID: 40039412 DOI: 10.1109/embc53108.2024.10782682] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/06/2025]
Abstract
Federated Learning (FL) is emerging in the medical field to address the need for diverse datasets while complying with data protection regulations. This decentralised learning paradigm allows hospitals (clients) to train machine learning models locally, ensuring that patient data remains within the confines of its originating institution. Nonetheless, FL by itself is not enough to guarantee privacy, as the central aggregation process may still be susceptible to identity-exposing attacks, potentially compromising data protection compliance. To strengthen privacy, differential privacy (DP) is often introduced. In this work, we conduct a comprehensive comparative analysis to evaluate the impact of DP in both traditional Centralised Learning (CL) frameworks and FL for polyp segmentation, a common medical image analysis task. Experiments are performed in PolypGen, a multi-centre publicly available dataset designed for polyp segmentation. The results show a clear drop in performance with the introduction of DP, exposing the trade-off between privacy and performance and highlighting the need to develop novel privacy-preserving techniques.
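The privacy and performance trade-off mentioned in this abstract can be made concrete with a generic sketch of how differential privacy is commonly applied to a model update: the update is clipped to a maximum L2 norm, then Gaussian noise scaled to that norm is added. This is an illustration under stated assumptions, not the authors' implementation; the function names, `max_norm`, and `noise_multiplier` are hypothetical, and no privacy accounting (epsilon, delta) is performed.

```python
import math
import random


def clip_l2(update, max_norm):
    """Scale the update down so its L2 norm does not exceed max_norm."""
    norm = math.sqrt(sum(x * x for x in update))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [x * scale for x in update]


def privatize(update, max_norm, noise_multiplier, rng):
    """Clip the update, then add Gaussian noise with std = noise_multiplier * max_norm."""
    clipped = clip_l2(update, max_norm)
    sigma = noise_multiplier * max_norm
    return [x + rng.gauss(0.0, sigma) for x in clipped]


rng = random.Random(0)  # fixed seed so the illustration is repeatable
raw_update = [3.0, 4.0]  # invented gradient-like vector with L2 norm 5
clipped_update = clip_l2(raw_update, max_norm=1.0)
noisy_update = privatize(raw_update, max_norm=1.0, noise_multiplier=1.1, rng=rng)
```

A larger `noise_multiplier` gives stronger protection but distorts the update more, which is one intuitive reading of the performance drop the authors report.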
|
40
|
Moreau D, Wiebels K. Nine quick tips for open meta-analyses. PLoS Comput Biol 2024; 20:e1012252. [PMID: 39052540 PMCID: PMC11271959 DOI: 10.1371/journal.pcbi.1012252] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/27/2024] Open
Abstract
Open science principles are revolutionizing the transparency, reproducibility, and accessibility of research. Meta-analysis has become a key technique for synthesizing data across studies in a principled way; however, its impact is contingent on adherence to open science practices. Here, we outline 9 quick tips for open meta-analyses, aimed at guiding researchers to maximize the reach and utility of their findings. We advocate for preregistering clear protocols, opting for open tools and software, and using version control systems to ensure transparency and facilitate collaboration. We further emphasize the importance of reproducibility, for example, by sharing search syntax and analysis scripts, and discuss the benefits of planning for dynamic updating to enable living meta-analyses. We also recommend open data, open code, and publication in open-access formats. We close by encouraging active promotion of research findings to bridge the gap between complex syntheses and public discourse, and provide a detailed submission checklist to equip researchers, reviewers and journal editors with a structured approach to conducting and reporting open meta-analyses.
Affiliation(s)
- David Moreau
- School of Psychology and Centre for Brain Research, University of Auckland, Auckland, New Zealand
| | - Kristina Wiebels
- School of Psychology and Centre for Brain Research, University of Auckland, Auckland, New Zealand
|
41
|
Farag N, Noë A, Patrinos D, Zawati MH. Mapping the Apps: Ethical and Legal Issues with Crowdsourced Smartphone Data using mHealth Applications. Asian Bioeth Rev 2024; 16:437-470. [PMID: 39022376 PMCID: PMC11250705 DOI: 10.1007/s41649-024-00296-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2024] [Revised: 04/03/2024] [Accepted: 04/14/2024] [Indexed: 07/20/2024] Open
Abstract
More than 5 billion people in the world own a smartphone. More than half of these have been used to collect and process health-related data. As such, the existing volume of potentially exploitable health data is unprecedentedly large and growing rapidly. Mobile health applications (apps) on smartphones are some of the worst offenders and are increasingly being used for gathering and exchanging significant amounts of personal health data from the public. This data is often utilized for health research purposes and for algorithm training. While there are advantages to utilizing this data for expanding health knowledge, there are associated risks for the users of these apps, such as privacy concerns and the protection of their data. Consequently, gaining a deeper comprehension of how apps collect and crowdsource data is crucial. To explore how apps are crowdsourcing data and to identify potential ethical, legal, and social issues (ELSI), we conducted an examination of the Apple App Store and the Google Play Store in North America and Europe to identify apps that could potentially gather health data through crowdsourcing. Subsequently, we analyzed their privacy policies, terms of use, and other related documentation to gain insights into the utilization of users' data and the possibility of repurposing it for research or algorithm training purposes. More specifically, we reviewed privacy policies to identify clauses pertaining to the following key categories: research, data sharing, privacy/confidentiality, commercialization, and return of findings. Based on the results of this app search, we developed an App Atlas that presents apps which crowdsource data for research or algorithm training. We identified 46 apps available in the European and Canadian markets that either openly crowdsource health data for research or algorithm training or retain the legal or technical capability to do so. 
This app search showed an overall lack of consistency and transparency in privacy policies that poses challenges to user comprehensibility, trust, and informed consent. A significant proportion of applications presented contradictions or exhibited considerable ambiguity. For instance, the vast majority of privacy policies in the App Atlas contain ambiguous or contradictory language regarding the sharing of users' data with third parties. This raises a number of ethico-legal concerns which will require further academic and policy attention to ensure a balance between protecting individual interests and maximizing the scientific utility of crowdsourced data. This article represents a key first step in better understanding these concerns and bringing attention to this important issue. Supplementary Information The online version contains supplementary material available at 10.1007/s41649-024-00296-3.
Affiliation(s)
- Nada Farag
- Centre of Genomics and Policy, McGill University, Montreal, Canada
| | - Alycia Noë
- Centre of Genomics and Policy, McGill University, Montreal, Canada
| | - Dimitri Patrinos
- Centre of Genomics and Policy, McGill University, Montreal, Canada
| | - Ma’n H. Zawati
- Centre of Genomics and Policy, McGill University, Montreal, Canada
|
42
|
Kühnel L, Schneider J, Perrar I, Adams T, Moazemi S, Prasser F, Nöthlings U, Fröhlich H, Fluck J. Synthetic data generation for a longitudinal cohort study - evaluation, method extension and reproduction of published data analysis results. Sci Rep 2024; 14:14412. [PMID: 38909025 PMCID: PMC11193715 DOI: 10.1038/s41598-024-62102-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/23/2023] [Accepted: 05/13/2024] [Indexed: 06/24/2024] Open
Abstract
Access to individual-level health data is essential for gaining new insights and advancing science. In particular, modern methods based on artificial intelligence rely on the availability of and access to large datasets. In the health sector, access to individual-level data is often challenging due to privacy concerns. A promising alternative is the generation of fully synthetic data, i.e., data generated through a randomised process that have similar statistical properties as the original data, but do not have a one-to-one correspondence with the original individual-level records. In this study, we use a state-of-the-art synthetic data generation method and perform in-depth quality analyses of the generated data for a specific use case in the field of nutrition. We demonstrate the need for careful analyses of synthetic data that go beyond descriptive statistics and provide valuable insights into how to realise the full potential of synthetic datasets. By extending the methods, but also by thoroughly analysing the effects of sampling from a trained model, we are able to largely reproduce significant real-world analysis results in the chosen use case.
Affiliation(s)
- Lisa Kühnel
- Knowledge Management, ZB MED - Information Centre for Life Sciences, 50931, Cologne, Germany.
- Faculty of Technology, Graduate School DILS, Bielefeld Institute for Bioinformatics Infrastructure (BIBI), Bielefeld University, 33615, Bielefeld, Germany.
| | - Julian Schneider
- Knowledge Management, ZB MED - Information Centre for Life Sciences, 50931, Cologne, Germany
| | - Ines Perrar
- Institute of Nutritional and Food Sciences - Nutritional Epidemiology, University of Bonn, 53115, Bonn, Germany
| | - Tim Adams
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing SCAI, 53757, Sankt Augustin, Germany
| | - Sobhan Moazemi
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing SCAI, 53757, Sankt Augustin, Germany
| | - Fabian Prasser
- Medical Informatics Group, Berlin Institute of Health at Charité - Universitätsmedizin Berlin, 10117, Berlin, Germany
| | - Ute Nöthlings
- Institute of Nutritional and Food Sciences - Nutritional Epidemiology, University of Bonn, 53115, Bonn, Germany
| | - Holger Fröhlich
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing SCAI, 53757, Sankt Augustin, Germany
- Bonn-Aachen International Center for IT, University of Bonn, Friedrich Hirzebruch-Allee 6, 53115, Bonn, Germany
| | - Juliane Fluck
- Knowledge Management, ZB MED - Information Centre for Life Sciences, 50931, Cologne, Germany
- The Agricultural Faculty, University of Bonn, 53115, Bonn, Germany
|
43
|
van Genderen ME, Cecconi M, Jung C. Federated data access and federated learning: improved data sharing, AI model development, and learning in intensive care. Intensive Care Med 2024; 50:974-977. [PMID: 38635044 PMCID: PMC11164808 DOI: 10.1007/s00134-024-07408-5] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 09/21/2023] [Accepted: 03/23/2024] [Indexed: 04/19/2024]
Affiliation(s)
- Michel E van Genderen
- Department of Adult Intensive Care, Erasmus MC, University Medical Center Rotterdam, (internal postadress-Room Ne-403), Doctor molewaterplein 40, 3015 GD, Rotterdam, The Netherlands.
| | - Maurizio Cecconi
- Biomedical Sciences Department, Humanitas University, Milan, Italy
- Department of Anaesthesia and Intensive Care, IRCCS Humanitas Research Hospital, Milan, Italy
| | - Christian Jung
- Medical Faculty, Department of Cardiology, Pulmonology and Vascular Medicine, Heinrich-Heine-University Duesseldorf, Duesseldorf, Germany
- Medical Faculty and University Hospital of Düsseldorf, Cardiovascular Research Institute Düsseldorf (CARID), Heinrich-Heine University Düsseldorf, 40225, Düsseldorf, Germany
|
44
|
Petit-Jean T, Gérardin C, Berthelot E, Chatellier G, Frank M, Tannier X, Kempf E, Bey R. Collaborative and privacy-enhancing workflows on a clinical data warehouse: an example developing natural language processing pipelines to detect medical conditions. J Am Med Inform Assoc 2024; 31:1280-1290. [PMID: 38573195 PMCID: PMC11105139 DOI: 10.1093/jamia/ocae069] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2023] [Revised: 02/28/2024] [Accepted: 03/13/2024] [Indexed: 04/05/2024] Open
Abstract
OBJECTIVE: To develop and validate a natural language processing (NLP) pipeline that detects 18 conditions in French clinical notes, including 16 comorbidities of the Charlson index, while exploring a collaborative and privacy-enhancing workflow. MATERIALS AND METHODS: The detection pipeline relied on rule-based and machine learning algorithms for named entity recognition and entity qualification, respectively. We used a large language model pre-trained on millions of clinical notes along with annotated clinical notes in the context of 3 cohort studies related to oncology, cardiology, and rheumatology. The overall workflow was conceived to foster collaboration between studies while respecting the privacy constraints of the data warehouse. We estimated the added values of the advanced technologies and of the collaborative setting. RESULTS: The pipeline reached macro-averaged F1-score, positive predictive value, sensitivity, and specificity of 95.7 (95%CI 94.5-96.3), 95.4 (95%CI 94.0-96.3), 96.0 (95%CI 94.0-96.7), and 99.2 (95%CI 99.0-99.4), respectively. F1-scores were superior to those observed using alternative technologies or non-collaborative settings. The models were shared through a secured registry. CONCLUSIONS: We demonstrated that a community of investigators working on a common clinical data warehouse could efficiently and securely collaborate to develop, validate and use sensitive artificial intelligence models. In particular, we provided an efficient and robust NLP pipeline that detects conditions mentioned in clinical notes.
Affiliation(s)
- Thomas Petit-Jean
- Innovation and Data Unit, IT Department, Assistance Publique-Hôpitaux de Paris, Paris, 75012, France
| | - Christel Gérardin
- Innovation and Data Unit, IT Department, Assistance Publique-Hôpitaux de Paris, Paris, 75012, France
- Institut Pierre-Louis d’Epidémiologie et de Santé Publique, INSERM, Sorbonne Université, Paris, 75012, France
| | - Emmanuelle Berthelot
- Department of Cardiology, Hôpital Bicêtre, Assistance Publique-Hôpitaux de Paris, Le Kremlin Bicêtre, 94270, France
| | - Gilles Chatellier
- Innovation and Data Unit, IT Department, Assistance Publique-Hôpitaux de Paris, Paris, 75012, France
- Department of Medical Informatics, Assistance Publique-Hôpitaux de Paris, Centre-Université de Paris (APHP-CUP), Université de Paris, Paris, 75015, France
| | - Marie Frank
- Department of Medical Informatics, Hôpitaux Universitaires Paris-Saclay, Assistance Publique-Hôpitaux de Paris, Le Kremlin-Bicêtre, 94270, France
| | - Xavier Tannier
- Laboratoire d'Informatique Médicale et d'Ingénierie des Connaissances pour la e-Santé (LIMICS), INSERM, Université Sorbonne Paris Nord, Sorbonne Université, Paris, 75005, France
| | - Emmanuelle Kempf
- Laboratoire d'Informatique Médicale et d'Ingénierie des Connaissances pour la e-Santé (LIMICS), INSERM, Université Sorbonne Paris Nord, Sorbonne Université, Paris, 75005, France
- Department of Medical Oncology, Henri Mondor and Albert Chenevier Teaching Hospital, Assistance Publique-Hôpitaux de Paris, Créteil, 94000, France
| | - Romain Bey
- Innovation and Data Unit, IT Department, Assistance Publique-Hôpitaux de Paris, Paris, 75012, France
|
45
|
Ngo H, Fang H, Rumbut J, Wang H. Federated Fuzzy Clustering for Decentralized Incomplete Longitudinal Behavioral Data. IEEE Internet of Things Journal 2024; 11:14657-14670. [PMID: 38605934 PMCID: PMC11006372 DOI: 10.1109/jiot.2023.3343719] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/13/2024]
Abstract
The use of medical data for machine learning, including unsupervised methods such as clustering, is often restricted by privacy regulations such as the Health Insurance Portability and Accountability Act (HIPAA). Medical data is sensitive and highly regulated and anonymization is often insufficient to protect a patient's identity. Traditional clustering algorithms are also unsuitable for longitudinal behavioral health trials, which often have missing data and observe individual behaviors over varying time periods. In this work, we develop a new decentralized federated multiple imputation-based fuzzy clustering algorithm for complex longitudinal behavioral trial data collected from multisite randomized controlled trials over different time periods. Federated learning (FL) preserves privacy by aggregating model parameters instead of data. Unlike previous FL methods, this proposed algorithm requires only two rounds of communication and handles clients with varying numbers of time points for incomplete longitudinal data. The model is evaluated on both empirical longitudinal dietary health data and simulated clusters with different numbers of clients, effect sizes, correlations, and sample sizes. The proposed algorithm converges rapidly and achieves desirable performance on multiple clustering metrics. This new method allows for targeted treatments for various patient groups while preserving their data privacy and enables the potential for broader applications in the Internet of Medical Things.
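The statement that federated learning aggregates model parameters instead of data can be sketched as a sample-size-weighted average of client parameter vectors (the familiar federated averaging step). This is a generic illustration, not the paper's multiple-imputation fuzzy clustering algorithm; the function name `federated_average` and the toy values are assumptions.

```python
def federated_average(client_params, client_sizes):
    """Average client parameter vectors, weighting each client by its sample count.

    Only parameters and counts are passed in; no client records leave the client.
    """
    if not client_params or len(client_params) != len(client_sizes):
        raise ValueError("need one sample count per client parameter vector")
    total = sum(client_sizes)
    dim = len(client_params[0])
    return [
        sum(params[i] * n for params, n in zip(client_params, client_sizes)) / total
        for i in range(dim)
    ]


# Invented example: two clients holding 1 and 3 samples respectively.
global_params = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

In this toy case the second client contributes three quarters of the weight, so the aggregate sits closer to its parameters.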
Affiliation(s)
- Hieu Ngo
- College of Engineering, University of Massachusetts Dartmouth, North Dartmouth, MA, 02747
- Hua Fang
- Department of Computer and Information Science, University of Massachusetts Dartmouth, North Dartmouth, MA, 02747 and the Department of Population and Quantitative Health Science, University of Massachusetts Chan Medical School, Worcester, MA 01655 USA
- Joshua Rumbut
- College of Engineering, University of Massachusetts Dartmouth, North Dartmouth, MA, 02747 and the Department of Population and Quantitative Health Science, University of Massachusetts Chan Medical School, Worcester, MA 01655 USA
- Honggang Wang
- Department of Graduate Computer Science and Engineering, Katz School of Science and Health, Yeshiva University, New York City, NY, 10033
46
Piciocchi A, Cipriani M, Messina M, Marconi G, Arena V, Soddu S, Crea E, Feraco MV, Ferrante M, La Sala E, Fazi P, Buccisano F, Voso MT, Martinelli G, Venditti A, Vignetti M. Unlocking the potential of synthetic patients for accelerating clinical trials: Results of the first GIMEMA experience on acute myeloid leukemia patients. EJHAEM 2024; 5:353-359. [PMID: 38633115 PMCID: PMC11020105 DOI: 10.1002/jha2.873] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/12/2024] [Accepted: 02/13/2024] [Indexed: 04/19/2024]
Abstract
Artificial Intelligence has the potential to reshape the landscape of clinical trials through innovative applications, with a notable advancement being the emergence of synthetic patient generation. This process involves simulating cohorts of virtual patients that can either replace or supplement real individuals within trial settings. By leveraging synthetic patients, it becomes possible to eliminate the need for obtaining patient consent and to create control groups that mimic patients in active treatment arms. This method not only streamlines trial processes, reducing time and costs, but also fortifies the protection of sensitive participant data. Furthermore, integrating synthetic patients amplifies trial efficiency by expanding the sample size. These straightforward and cost-effective methods also enable the development of personalized subject-specific models, enabling predictions of patient responses to interventions. Synthetic data holds great promise for generating real-world evidence in clinical trials while upholding rigorous confidentiality standards throughout the process. Therefore, this study aims to demonstrate the applicability and performance of these methods in the context of onco-hematological research, breaking through the theoretical and practical barriers associated with the implementation of artificial intelligence in medical trials.
Affiliation(s)
- Marta Cipriani
- Data Center, GIMEMA Foundation, Rome, Italy
- Department of Statistical Sciences, University of Rome La Sapienza, Rome, Italy
- Giovanni Marconi
- Hematology Unit, IRCCS Istituto Romagnolo per lo Studio dei Tumori (IRST) “Dino Amadori”, Meldola, Italy
- Marco Ferrante
- Department Health Care and Life Sciences, Studio Legale FLC, Rome, Italy
- Maria Teresa Voso
- Department of Biomedicine and Prevention, Tor Vergata University, Rome, Italy
- Giovanni Martinelli
- Hematology Unit, IRCCS Istituto Romagnolo per lo Studio dei Tumori (IRST) “Dino Amadori”, Meldola, Italy
- Adriano Venditti
- Department of Biomedicine and Prevention, Tor Vergata University, Rome, Italy
47
Wiepert D, Malin BA, Duffy JR, Utianski RL, Stricker JL, Jones DT, Botha H. Reidentification of Participants in Shared Clinical Data Sets: Experimental Study. JMIR AI 2024; 3:e52054. [PMID: 38875581 PMCID: PMC11041495 DOI: 10.2196/52054] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/21/2023] [Revised: 01/26/2024] [Accepted: 02/19/2024] [Indexed: 06/16/2024]
Abstract
BACKGROUND Large curated data sets are required to leverage speech-based tools in health care. These are costly to produce, resulting in increased interest in data sharing. As speech can potentially identify speakers (ie, voiceprints), sharing recordings raises privacy concerns. This is especially relevant when working with patient data protected under the Health Insurance Portability and Accountability Act. OBJECTIVE We aimed to determine the reidentification risk for speech recordings, without reference to demographics or metadata, in clinical data sets considering both the size of the search space (ie, the number of comparisons that must be considered when reidentifying) and the nature of the speech recording (ie, the type of speech task). METHODS Using a state-of-the-art speaker identification model, we modeled an adversarial attack scenario in which an adversary uses a large data set of identified speech (hereafter, the known set) to reidentify as many unknown speakers in a shared data set (hereafter, the unknown set) as possible. We first considered the effect of search space size by attempting reidentification with various sizes of known and unknown sets using VoxCeleb, a data set with recordings of natural, connected speech from >7000 healthy speakers. We then repeated these tests with different types of recordings in each set to examine whether the nature of a speech recording influences reidentification risk. For these tests, we used our clinical data set composed of recordings of elicited speech tasks from 941 speakers. RESULTS We found that the risk was inversely related to the number of comparisons an adversary must consider (ie, the search space), with a positive linear correlation between the number of false acceptances (FAs) and the number of comparisons (r=0.69; P<.001). 
The true acceptances (TAs) stayed relatively stable, and the ratio between FAs and TAs rose from 0.02 at 1 × 10⁵ comparisons to 1.41 at 6 × 10⁶ comparisons, with a near 1:1 ratio at the midpoint of 3 × 10⁶ comparisons. In effect, risk was high for a small search space but dropped as the search space grew. We also found that the nature of a speech recording influenced reidentification risk, with nonconnected speech (eg, vowel prolongation: FA/TA=98.5; alternating motion rate: FA/TA=8) being harder to identify than connected speech (eg, sentence repetition: FA/TA=0.54) in cross-task conditions. The inverse was mostly true in within-task conditions, with the FA/TA ratio for vowel prolongation and alternating motion rate dropping to 0.39 and 1.17, respectively. CONCLUSIONS Our findings suggest that speaker identification models can be used to reidentify participants in specific circumstances, but in practice, the reidentification risk appears small. The variation in risk due to search space size and type of speech task provides actionable recommendations to further increase participant privacy and considerations for policy regarding public release of speech recordings.
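The FA/TA ratio reported above can be made concrete with a small sketch. It assumes each comparison yields a similarity score and that a comparison is "accepted" when the score meets a fixed threshold; the scores, the threshold, and the function name are hypothetical and do not come from the study's speaker identification model.

```python
from typing import Sequence, Tuple


def count_acceptances(
    scores: Sequence[float],
    is_same_speaker: Sequence[bool],
    threshold: float,
) -> Tuple[int, int]:
    """Count false and true acceptances among scored comparisons.

    A comparison is accepted when its score is at or above the threshold.
    An accepted comparison between different speakers is a false
    acceptance (FA); one between the same speaker is a true acceptance (TA).
    """
    if len(scores) != len(is_same_speaker):
        raise ValueError("scores and labels must have the same length")
    false_acc = 0
    true_acc = 0
    for score, same in zip(scores, is_same_speaker):
        if score >= threshold:
            if same:
                true_acc += 1
            else:
                false_acc += 1
    return false_acc, true_acc


# Toy data: five comparisons, two of which involve the same speaker.
fa, ta = count_acceptances(
    scores=[0.90, 0.80, 0.30, 0.70, 0.95],
    is_same_speaker=[True, False, False, False, True],
    threshold=0.75,
)
ratio = fa / ta if ta else float("inf")
```

Because the number of same-speaker pairs is bounded by the number of shared speakers while different-speaker pairs grow with the search space, a ratio like this one tends to rise as more comparisons are added, which matches the trend the abstract reports.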
Affiliation(s)
- Daniela Wiepert
- Department of Neurology, Mayo Clinic, Rochester, MN, United States
- Bradley A Malin
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States
- Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN, United States
- Department of Computer Science, Vanderbilt University Medical Center, Nashville, TN, United States
- Joseph R Duffy
- Department of Neurology, Mayo Clinic, Rochester, MN, United States
- Rene L Utianski
- Department of Neurology, Mayo Clinic, Rochester, MN, United States
- John L Stricker
- Department of Neurology, Mayo Clinic, Rochester, MN, United States
- David T Jones
- Department of Neurology, Mayo Clinic, Rochester, MN, United States
- Hugo Botha
- Department of Neurology, Mayo Clinic, Rochester, MN, United States
48
Soltan AAS, Thakur A, Yang J, Chauhan A, D'Cruz LG, Dickson P, Soltan MA, Thickett DR, Eyre DW, Zhu T, Clifton DA. A scalable federated learning solution for secondary care using low-cost microcomputing: privacy-preserving development and evaluation of a COVID-19 screening test in UK hospitals. Lancet Digit Health 2024; 6:e93-e104. [PMID: 38278619 DOI: 10.1016/s2589-7500(23)00226-1] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 01/04/2023] [Revised: 10/17/2023] [Accepted: 10/30/2023] [Indexed: 01/28/2024]
Abstract
BACKGROUND Multicentre training could reduce biases in medical artificial intelligence (AI); however, ethical, legal, and technical considerations can constrain the ability of hospitals to share data. Federated learning enables institutions to participate in algorithm development while retaining custody of their data, but uptake in hospitals has been limited, possibly as deployment requires specialist software and technical expertise at each site. We previously developed an artificial intelligence-driven screening test for COVID-19 in emergency departments, known as CURIAL-Lab, which uses vital signs and blood tests that are routinely available within 1 h of a patient's arrival. Here we aimed to federate our COVID-19 screening test by developing an easy-to-use embedded system (which we introduce as full-stack federated learning) to train and evaluate machine learning models across four UK hospital groups without centralising patient data. METHODS We supplied a Raspberry Pi 4 Model B preloaded with our federated learning software pipeline to four National Health Service (NHS) hospital groups in the UK: Oxford University Hospitals NHS Foundation Trust (OUH; through the locally linked research University, University of Oxford), University Hospitals Birmingham NHS Foundation Trust (UHB), Bedfordshire Hospitals NHS Foundation Trust (BH), and Portsmouth Hospitals University NHS Trust (PUH). OUH, PUH, and UHB participated in federated training, training a deep neural network and logistic regressor over 150 rounds to form and calibrate a global model to predict COVID-19 status, using clinical data from patients admitted before the pandemic (COVID-19-negative) and testing positive for COVID-19 during the first wave of the pandemic. We conducted a federated evaluation of the global model for admissions during the second wave of the pandemic at OUH, PUH, and externally at BH.
For OUH and PUH, we additionally performed local fine-tuning of the global model using the sites' individual training data, forming a site-tuned model, and evaluated the resultant model for admissions during the second wave of the pandemic. This study included data collected between Dec 1, 2018, and March 1, 2021; the exact date ranges used varied by site. The primary outcome was overall model performance, measured as the area under the receiver operating characteristic curve (AUROC). Removable micro secure digital (microSD) storage was destroyed on study completion. FINDINGS Clinical data from 130 941 patients (1772 COVID-19-positive), routinely collected across three hospital groups (OUH, PUH, and UHB), were included in federated training. The evaluation step included data from 32 986 patients (3549 COVID-19-positive) attending OUH, PUH, or BH during the second wave of the pandemic. Federated training of a global deep neural network classifier improved upon performance of models trained locally in terms of AUROC by a mean of 27·6% (SD 2·2): AUROC increased from 0·574 (95% CI 0·560-0·589) at OUH and 0·622 (0·608-0·637) at PUH using the locally trained models to 0·872 (0·862-0·882) at OUH and 0·876 (0·865-0·886) at PUH using the federated global model. Performance improvement was smaller for a logistic regression model, with a mean increase in AUROC of 13·9% (0·5%). During federated external evaluation at BH, AUROC for the global deep neural network model was 0·917 (0·893-0·942), with 89·7% sensitivity (83·6-93·6) and 76·6% specificity (73·9-79·1). Site-specific tuning of the global model did not significantly improve performance (change in AUROC <0·01). INTERPRETATION We developed an embedded system for federated learning, using microcomputing to optimise for ease of deployment. We deployed full-stack federated learning across four UK hospital groups to develop a COVID-19 screening test without centralising patient data. 
Federation improved model performance, and the resultant global models were generalisable. Full-stack federated learning could enable hospitals to contribute to AI development at low cost and without specialist technical expertise at each site. FUNDING The Wellcome Trust, University of Oxford Medical and Life Sciences Translational Fund.
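The primary outcome above, AUROC, can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below computes it directly from that pairwise definition; it is a plain illustration with hypothetical names and toy data, not the evaluation code used in the study.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Counts, over every (positive, negative) pair, how often the positive
    case scores higher; ties count as half. This is quadratic in the
    number of cases, which is acceptable for a small illustration.
    """
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy example: 3 of the 4 positive-negative pairs are ranked correctly.
toy_auroc = auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which gives a scale for reading the reported improvement from locally trained to federated models.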
Affiliation(s)
- Andrew A S Soltan
- Oxford University Hospitals NHS Foundation Trust, Oxford, UK; Department of Oncology, University of Oxford, Oxford, UK; Institute of Biomedical Engineering, Department of Engineering Science, University of Oxford, Oxford, UK; Big Data Institute, Nuffield Department of Population Health, University of Oxford, Oxford, UK; Nuffield Department of Primary Care Health Sciences, University of Oxford, Oxford, UK
- Anshul Thakur
- Institute of Biomedical Engineering, Department of Engineering Science, University of Oxford, Oxford, UK
- Jenny Yang
- Institute of Biomedical Engineering, Department of Engineering Science, University of Oxford, Oxford, UK
- Anoop Chauhan
- Portsmouth Hospitals University NHS Trust, Portsmouth, UK
- Leon G D'Cruz
- Portsmouth Hospitals University NHS Trust, Portsmouth, UK
- Marina A Soltan
- The Queen Elizabeth Hospital, University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK; Institute of Inflammation and Ageing, University of Birmingham, Birmingham, UK
- David R Thickett
- The Queen Elizabeth Hospital, University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK; Institute of Inflammation and Ageing, University of Birmingham, Birmingham, UK
- David W Eyre
- Oxford University Hospitals NHS Foundation Trust, Oxford, UK; Big Data Institute, Nuffield Department of Population Health, University of Oxford, Oxford, UK; NIHR Health Protection Research Unit in Healthcare Associated Infections and Antimicrobial Resistance, University of Oxford and Public Health England, Oxford, UK; NIHR Oxford Biomedical Research Centre, Oxford, UK
- Tingting Zhu
- Institute of Biomedical Engineering, Department of Engineering Science, University of Oxford, Oxford, UK
- David A Clifton
- Institute of Biomedical Engineering, Department of Engineering Science, University of Oxford, Oxford, UK; NIHR Oxford Biomedical Research Centre, Oxford, UK; Oxford-Suzhou Centre for Advanced Research, Suzhou, China
49
Song J, Song Z, Zhang J, Gong Y. Privacy-Preserving Identification of Cancer Subtype-Specific Driver Genes Based on Multigenomics Data with Privatedriver. J Comput Biol 2024; 31:99-116. [PMID: 38271572 DOI: 10.1089/cmb.2023.0115] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/27/2024]
Abstract
Identifying cancer subtype-specific driver genes from a large number of irrelevant passengers is crucial for targeted therapy in cancer treatment. Recently, the rapid accumulation of large-scale cancer genomics data from multiple institutions has presented remarkable opportunities for identification of cancer subtype-specific driver genes. However, insufficient subtype samples, privacy issues, and the heterogeneity of aberration events pose great challenges in precisely identifying cancer subtype-specific driver genes. To address this, we introduce privatedriver, the first model for identifying subtype-specific driver genes that integrates genomics data from multiple institutions in a privacy-preserving, collaborative manner. The process of identifying subtype-specific cancer driver genes using privatedriver involves the following two steps: genomics data integration and collaborative training. In the integration process, the aberration events from multiple genomics data sources are combined for each institution using the forward and backward propagation method of NetICS. In the collaborative training process, each institution utilizes the federated learning framework to upload encrypted model parameters instead of raw data of all institutions to train a global model by using the non-negative matrix factorization algorithm. We applied privatedriver to head and neck squamous cell and colon cancer from The Cancer Genome Atlas website and evaluated it with two benchmarks using macro-Fscore. The comparison analysis demonstrates that privatedriver achieves comparable results to centralized learning models and outperforms most other nonprivacy preserving models, all while ensuring the confidentiality of patient information. We also demonstrate that, for varying predicted driver gene distributions in subtype, our model fully considers the heterogeneity of subtype and identifies subtype-specific driver genes corresponding to the given prognosis and therapeutic effect.
The success of privatedriver reveals the feasibility and effectiveness of identifying cancer subtype-specific driver genes in a data-protecting manner, providing new insights for future privacy-preserving driver gene identification studies.
Affiliation(s)
- Junrong Song
- School of Information; Kunming, P.R. China
- Yunnan Key Laboratory of Service Computing; Yunnan University of Finance and Economics, Kunming, P.R. China
- Zhiming Song
- School of Information; Kunming, P.R. China
- Yunnan Key Laboratory of Service Computing; Yunnan University of Finance and Economics, Kunming, P.R. China
- Jinpeng Zhang
- School of Information; Kunming, P.R. China
- Yunnan Key Laboratory of Service Computing; Yunnan University of Finance and Economics, Kunming, P.R. China
- The School of Computer Science and Engineering, Yunnan University, Kunming, P.R. China
50
Cobanaj M, Corti C, Dee EC, McCullum L, Boldrini L, Schlam I, Tolaney SM, Celi LA, Curigliano G, Criscitiello C. Advancing equitable and personalized cancer care: Novel applications and priorities of artificial intelligence for fairness and inclusivity in the patient care workflow. Eur J Cancer 2024; 198:113504. [PMID: 38141549 PMCID: PMC11362966 DOI: 10.1016/j.ejca.2023.113504] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/04/2023] [Accepted: 12/13/2023] [Indexed: 12/25/2023]
Abstract
Patient care workflows are highly multimodal and intertwined: the intersection of data outputs provided from different disciplines and in different formats remains one of the main challenges of modern oncology. Artificial Intelligence (AI) has the potential to revolutionize the current clinical practice of oncology owing to advancements in digitalization, database expansion, computational technologies, and algorithmic innovations that facilitate discernment of complex relationships in multimodal data. Within oncology, radiation therapy (RT) represents an increasingly complex working procedure, involving many labor-intensive and operator-dependent tasks. In this context, AI has gained momentum as a powerful tool to standardize treatment performance and reduce inter-observer variability in a time-efficient manner. This review explores the hurdles associated with the development, implementation, and maintenance of AI platforms and highlights current measures in place to address them. In examining AI's role in oncology workflows, we underscore that a thorough and critical consideration of these challenges is the only way to ensure equitable and unbiased care delivery, ultimately serving patients' survival and quality of life.
Affiliation(s)
- Marisa Cobanaj
- National Center for Radiation Research in Oncology, OncoRay, Helmholtz-Zentrum Dresden-Rossendorf, Dresden, Germany
- Chiara Corti
- Breast Oncology Program, Dana-Farber Brigham Cancer Center, Boston, MA, USA; Harvard Medical School, Boston, MA, USA; Division of New Drugs and Early Drug Development for Innovative Therapies, European Institute of Oncology, IRCCS, Milan, Italy; Department of Oncology and Hematology-Oncology (DIPO), University of Milan, Milan, Italy
- Edward C Dee
- Department of Radiation Oncology, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Lucas McCullum
- Department of Radiation Oncology, MD Anderson Cancer Center, Houston, TX, USA
- Laura Boldrini
- Division of New Drugs and Early Drug Development for Innovative Therapies, European Institute of Oncology, IRCCS, Milan, Italy; Department of Oncology and Hematology-Oncology (DIPO), University of Milan, Milan, Italy
- Ilana Schlam
- Department of Hematology and Oncology, Tufts Medical Center, Boston, MA, USA; Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Sara M Tolaney
- Breast Oncology Program, Dana-Farber Brigham Cancer Center, Boston, MA, USA; Harvard Medical School, Boston, MA, USA; Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
- Leo A Celi
- Department of Medicine, Beth Israel Deaconess Medical Center, Boston, MA, USA; Laboratory for Computational Physiology, Massachusetts Institute of Technology, Cambridge, MA, USA; Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Giuseppe Curigliano
- Division of New Drugs and Early Drug Development for Innovative Therapies, European Institute of Oncology, IRCCS, Milan, Italy; Department of Oncology and Hematology-Oncology (DIPO), University of Milan, Milan, Italy
- Carmen Criscitiello
- Division of New Drugs and Early Drug Development for Innovative Therapies, European Institute of Oncology, IRCCS, Milan, Italy; Department of Oncology and Hematology-Oncology (DIPO), University of Milan, Milan, Italy