1
Lin WY, Lin C, Liu WC, Liu WT, Chang CH, Chen HY, Lee CC, Chen YC, Wu CS, Lee CC, Wang CH, Liao CC, Lin CS. Development of an Artificial Intelligence-Enabled Electrocardiography to Detect 23 Cardiac Arrhythmias and Predict Cardiovascular Outcomes. J Med Syst 2025; 49:51. [PMID: 40259136] [DOI: 10.1007/s10916-025-02177-0] [Received: 02/25/2024] [Accepted: 03/22/2025] [Indexed: 04/23/2025]
Abstract
Arrhythmias are common and can affect individuals with or without structural heart disease. Deep learning models (DLMs) have shown the ability to recognize arrhythmias using 12-lead electrocardiograms (ECGs). However, the limited types of arrhythmias and dataset robustness have hindered widespread adoption. This study aimed to develop a DLM capable of detecting various arrhythmias across diverse datasets. This algorithm development study utilized 22,130 ECGs, divided into development, tuning, validation, and competition sets. External validation was conducted on three open datasets (CODE-test, PTB-XL, CPSC2018) comprising 32,495 ECGs. The study also assessed the long-term risks of new-onset atrial fibrillation (AF), heart failure (HF), and mortality in individuals with false-positive AF detection by the DLM. In the validation set, the DLM achieved an area under the receiver operating characteristic curve (AUROC) above 0.97 and sensitivity/specificity exceeding 90% across most arrhythmia classes. It demonstrated cardiologist-level performance, ranking first in balanced accuracy in a human-machine competition. External validation confirmed comparable performance. Individuals with false-positive AF detection had a significantly higher risk of new-onset AF (hazard ratio [HR]: 1.69, 95% confidence interval [CI]: 1.11-2.59), HF (HR: 1.73, 95% CI: 1.20-2.51), and mortality (HR: 1.40, 95% CI: 1.02-1.92) compared to true-negative individuals after adjusting for age and sex. We developed an accurate DLM capable of detecting 23 cardiac arrhythmias across multiple datasets. This DLM serves as a valuable screening tool to aid physicians in identifying high-risk patients, with potential implications for early intervention and risk stratification.
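The headline metrics in this abstract (AUROC, sensitivity, specificity) can all be computed directly from model scores. A minimal stdlib-only sketch; the `pos`/`neg` score values are invented for illustration, not the study's data:

```python
def auroc(pos_scores, neg_scores):
    """AUROC via the Mann-Whitney U statistic: the probability that a
    randomly chosen positive case scores higher than a randomly chosen
    negative case (ties count one half)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

def sens_spec(pos_scores, neg_scores, threshold):
    """Sensitivity and specificity at a given score cutoff."""
    tp = sum(s >= threshold for s in pos_scores)
    tn = sum(s < threshold for s in neg_scores)
    return tp / len(pos_scores), tn / len(neg_scores)

pos = [0.91, 0.85, 0.78, 0.96, 0.70]   # hypothetical scores, arrhythmia ECGs
neg = [0.10, 0.35, 0.22, 0.05, 0.40]   # hypothetical scores, normal ECGs
print(auroc(pos, neg))                 # 1.0: perfect separation in this toy set
print(sens_spec(pos, neg, 0.5))        # (1.0, 1.0)
```

With overlapping score distributions the same functions yield the intermediate AUROC and sensitivity/specificity trade-offs reported in studies like this one.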
Affiliation(s)
- Wen-Yu Lin
- Division of Cardiology, Department of Internal Medicine, Tri-Service General Hospital, National Defense Medical Center, Taipei, Taiwan, R.O.C.
- Chin Lin
- Medical Technology Education Center, School of Medicine, National Defense Medical Center, Taipei, Taiwan, R.O.C.
- Graduate Institutes of Life Sciences, Tri-Service General Hospital, National Defense Medical Center, Taipei, Taiwan, R.O.C.
- School of Public Health, National Defense Medical Center, Taipei, Taiwan, R.O.C.
- Wen-Cheng Liu
- Division of Cardiology, Department of Internal Medicine, Tri-Service General Hospital, National Defense Medical Center, Taipei, Taiwan, R.O.C.
- Wei-Ting Liu
- Division of Cardiology, Department of Internal Medicine, Tri-Service General Hospital, National Defense Medical Center, Taipei, Taiwan, R.O.C.
- Chiao-Hsiang Chang
- Division of Cardiology, Department of Internal Medicine, Tri-Service General Hospital, National Defense Medical Center, Taipei, Taiwan, R.O.C.
- Hung-Yi Chen
- Department of Internal Medicine, Tri-Service General Hospital, National Defense Medical Center, Taipei, Taiwan, R.O.C.
- Chiao-Chin Lee
- Division of Cardiology, Department of Internal Medicine, Tri-Service General Hospital, National Defense Medical Center, Taipei, Taiwan, R.O.C.
- Yu-Cheng Chen
- Department of Internal Medicine, Tri-Service General Hospital, National Defense Medical Center, Taipei, Taiwan, R.O.C.
- Chen-Shu Wu
- Department of Internal Medicine, Tri-Service General Hospital, National Defense Medical Center, Taipei, Taiwan, R.O.C.
- Chia-Cheng Lee
- Medical Informatics Office, Tri-Service General Hospital, National Defense Medical Center, Taipei, Taiwan, R.O.C.
- Division of Colorectal Surgery, Department of Surgery, Tri-Service General Hospital, National Defense Medical Center, Taipei, Taiwan, R.O.C.
- Chih-Hung Wang
- Department of Otolaryngology-Head and Neck Surgery, Tri-Service General Hospital, National Defense Medical Center, Taipei, Taiwan, R.O.C.
- Graduate Institute of Medical Sciences, National Defense Medical Center, Taipei, Taiwan, R.O.C.
- Chun-Cheng Liao
- Department of Family Medicine, Taichung Armed Forces General Hospital, Taichung 411, Taiwan, R.O.C.
- Department of Medical Education and Research, Taichung Armed Forces General Hospital, Taichung 411, Taiwan, R.O.C.
- School of Medicine, National Defense Medical Center, Taipei 114, Taiwan, R.O.C.
- No. 348, Sec. 2, Chungshan Rd., Taiping Dist., Taichung City 411228, Taiwan, R.O.C.
- Chin-Sheng Lin
- Division of Cardiology, Department of Internal Medicine, Tri-Service General Hospital, National Defense Medical Center, Taipei, Taiwan, R.O.C.
- No. 325, Section 2, Cheng-Kung Rd., Neihu, Taipei 11490, Taiwan, R.O.C.
2
Huang Z, Yang E, Shen J, Gratzinger D, Eyerer F, Liang B, Nirschl J, Bingham D, Dussaq AM, Kunder C, Rojansky R, Gilbert A, Chang-Graham AL, Howitt BE, Liu Y, Ryan EE, Tenney TB, Zhang X, Folkins A, Fox EJ, Montine KS, Montine TJ, Zou J. A pathologist-AI collaboration framework for enhancing diagnostic accuracies and efficiencies. Nat Biomed Eng 2025; 9:455-470. [PMID: 38898173] [DOI: 10.1038/s41551-024-01223-5] [Received: 06/09/2023] [Accepted: 05/03/2024] [Indexed: 06/21/2024]
Abstract
In pathology, the deployment of artificial intelligence (AI) in clinical settings is constrained by limitations in data collection and in model transparency and interpretability. Here we describe a digital pathology framework, nuclei.io, that incorporates active learning and human-in-the-loop real-time feedback for the rapid creation of diverse datasets and models. We validate the effectiveness of the framework via two crossover user studies that leveraged collaboration between the AI and the pathologist, including the identification of plasma cells in endometrial biopsies and the detection of colorectal cancer metastasis in lymph nodes. In both studies, nuclei.io yielded considerable diagnostic performance improvements. Collaboration between clinicians and AI will aid digital pathology by enhancing accuracies and efficiencies.
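The rapid dataset creation described above rests on active learning with human-in-the-loop labeling. The toy sketch below is not the authors' nuclei.io code; the 1-D threshold "model", the `oracle` (standing in for the pathologist), and all names are invented to illustrate the core loop: fit, query the most uncertain unlabeled point, obtain a label, refit.

```python
import random

random.seed(1)

# Toy pool of unlabeled 1-D features; the true rule (unknown to the model)
# labels a point 1 when x > 0.6. The oracle stands in for the pathologist.
unlabeled = [random.random() for _ in range(200)]
oracle = lambda x: int(x > 0.6)

def fit_threshold(labeled):
    """Brute-force the best 1-D threshold classifier on the labeled set."""
    def acc(t):
        return sum((x > t) == bool(y) for x, y in labeled) / len(labeled)
    return max((x for x, _ in labeled), key=acc)

# Seed with a few random labels, then run uncertainty sampling:
seed_points = random.sample(unlabeled, 5)
labeled = [(x, oracle(x)) for x in seed_points]
for x in seed_points:
    unlabeled.remove(x)

for _ in range(20):
    t = fit_threshold(labeled)
    # Query the unlabeled point closest to the current decision boundary.
    x_star = min(unlabeled, key=lambda x: abs(x - t))
    unlabeled.remove(x_star)
    labeled.append((x_star, oracle(x_star)))

print(fit_threshold(labeled))   # should settle near the true boundary of 0.6
```

Because every query is spent at the decision boundary, far fewer expert labels are needed than with random labeling, which is the efficiency argument such frameworks make.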
Affiliation(s)
- Zhi Huang
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
- Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, CA, USA
- Eric Yang
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
- Jeanne Shen
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
- Dita Gratzinger
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
- Frederick Eyerer
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
- Brooke Liang
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
- Jeffrey Nirschl
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
- David Bingham
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
- Alex M Dussaq
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
- Christian Kunder
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
- Rebecca Rojansky
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
- Aubre Gilbert
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
- Brooke E Howitt
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
- Ying Liu
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
- Emily E Ryan
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
- Troy B Tenney
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
- Xiaoming Zhang
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
- Ann Folkins
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
- Edward J Fox
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
- Kathleen S Montine
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
- Thomas J Montine
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
- James Zou
- Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, CA, USA
3
Ramoni D, Scuricini A, Carbone F, Liberale L, Montecucco F. Artificial intelligence in gastroenterology: Ethical and diagnostic challenges in clinical practice. World J Gastroenterol 2025; 31:102725. [PMID: 40093670] [PMCID: PMC11886536] [DOI: 10.3748/wjg.v31.i10.102725] [Received: 10/28/2024] [Revised: 01/16/2025] [Accepted: 01/23/2025] [Indexed: 02/26/2025]
Abstract
This article discusses a manuscript recently published in the World Journal of Gastroenterology that explores the application of deep learning models to decision-making in wireless capsule endoscopy. Integrating artificial intelligence (AI) into gastrointestinal disease diagnosis represents a transformative step toward precision medicine, enhancing real-time accuracy in detecting multi-category lesions at earlier stages, including small bowel lesions and precancerous polyps, and ultimately improving patient outcomes. However, the use of AI in clinical settings raises ethical considerations that extend beyond technological potential. Issues of patient privacy, data security, and potential diagnostic bias require careful attention. AI models must prioritize diverse and representative datasets to mitigate inequities and ensure diagnostic accuracy across populations. Furthermore, balancing AI with clinical expertise is crucial, positioning AI as a supportive tool rather than a replacement for physician judgment. Addressing these ethical challenges will support the responsible deployment of AI and its equitable contribution to patient-centered care.
Affiliation(s)
- Davide Ramoni
- Department of Internal Medicine, University of Genoa, Genoa 16132, Italy
- Federico Carbone
- Department of Internal Medicine, University of Genoa, Genoa 16132, Italy
- First Clinic of Internal Medicine, Department of Internal Medicine, Italian Cardiovascular Network, IRCCS Ospedale Policlinico San Martino, Genoa 16132, Italy
- Luca Liberale
- Department of Internal Medicine, University of Genoa, Genoa 16132, Italy
- First Clinic of Internal Medicine, Department of Internal Medicine, Italian Cardiovascular Network, IRCCS Ospedale Policlinico San Martino, Genoa 16132, Italy
- Fabrizio Montecucco
- Department of Internal Medicine, University of Genoa, Genoa 16132, Italy
- First Clinic of Internal Medicine, Department of Internal Medicine, Italian Cardiovascular Network, IRCCS Ospedale Policlinico San Martino, Genoa 16132, Italy
4
Corfmat M, Martineau JT, Régis C. High-reward, high-risk technologies? An ethical and legal account of AI development in healthcare. BMC Med Ethics 2025; 26:4. [PMID: 39815254] [PMCID: PMC11734583] [DOI: 10.1186/s12910-024-01158-1] [Received: 10/24/2023] [Accepted: 12/10/2024] [Indexed: 01/18/2025]
Abstract
BACKGROUND Considering the disruptive potential of AI technology, its current and future impact in healthcare, as well as healthcare professionals' lack of training in how to use it, the paper summarizes how to approach the challenges of AI from an ethical and legal perspective. It concludes with suggestions for improvements to help healthcare professionals better navigate the AI wave. METHODS We analyzed the literature that specifically discusses ethics and law related to the development and implementation of AI in healthcare as well as relevant normative documents that pertain to both ethical and legal issues. After such analysis, we created categories regrouping the most frequently cited and discussed ethical and legal issues. We then proposed a breakdown within such categories that emphasizes the different - yet often interconnecting - ways in which ethics and law are approached for each category of issues. Finally, we identified several key ideas for healthcare professionals and organizations to better integrate ethics and law into their practices. RESULTS We identified six categories of issues related to AI development and implementation in healthcare: (1) privacy; (2) individual autonomy; (3) bias; (4) responsibility and liability; (5) evaluation and oversight; and (6) work, professions and the job market. While each one raises different questions depending on perspective, we propose three main legal and ethical priorities: education and training of healthcare professionals, offering support and guidance throughout the use of AI systems, and integrating the necessary ethical and legal reflection at the heart of the AI tools themselves. CONCLUSIONS By highlighting the main ethical and legal issues involved in the development and implementation of AI technologies in healthcare, we illustrate their profound effects on professionals as well as their relationship with patients and other organizations in the healthcare sector. We must be able to identify AI technologies in medical practices and distinguish them by their nature so we can better react and respond to them. Healthcare professionals need to work closely with ethicists and lawyers involved in the healthcare system, or the development of reliable and trusted AI will be jeopardized.
Affiliation(s)
- Maelenn Corfmat
- Faculty of Law, University of Montreal, Ch. de la Tour, Montreal, QC H3T 1J7, Canada
- Faculty of Law, Economics and Management, University of Paris Cité, Av. Pierre Larousse, Malakoff 92240, France
- Joé T Martineau
- Department of Management, HEC Montreal, 3000 chemin de la Cote-Sainte-Catherine, Montreal, QC H3T 2A7, Canada
- Catherine Régis
- Faculty of Law, University of Montreal, Ch. de la Tour, Montreal, QC H3T 1J7, Canada
- Canada-CIFAR Chair in Artificial Intelligence, Mila, St-Urbain, Montreal, QC H2S 3H1, Canada
5
Rainey C. Artificial intelligence and radiographer preliminary image evaluation: What might the future hold for radiographers providing x-ray interpretation in the acute setting? J Med Radiat Sci 2024; 71:495-498. [PMID: 39304330] [PMCID: PMC11638352] [DOI: 10.1002/jmrs.821] [Received: 07/22/2024] [Accepted: 08/19/2024] [Indexed: 09/22/2024]
Abstract
In a stretched healthcare system, radiographer preliminary image evaluation in the acute setting can be a means to optimise patient care by reducing error and increasing efficiencies in the patient journey. Radiographers have shown impressive accuracy in providing these initial evaluations; however, barriers such as a lack of confidence and increased workloads have been cited as reasons for radiographer reticence to engage with this practice. With advances in Artificial Intelligence (AI) technology for assistance in clinical decision-making, and indications that this may increase confidence in diagnostic decision-making among reporting radiographers, the author of this editorial considers what the impact of this technology might be on clinical decision-making by radiographers in the provision of Preliminary Image Evaluation (PIE).
Affiliation(s)
- Clare Rainey
- School of Health Sciences, Ulster University, Belfast, UK
6
Dzialas V, Doering E, Eich H, Strafella AP, Vaillancourt DE, Simonyan K, van Eimeren T, International Parkinson Movement Disorders Society-Neuroimaging Study Group. Houston, We Have AI Problem! Quality Issues with Neuroimaging-Based Artificial Intelligence in Parkinson's Disease: A Systematic Review. Mov Disord 2024; 39:2130-2143. [PMID: 39235364] [PMCID: PMC11657025] [DOI: 10.1002/mds.30002] [Received: 07/20/2024] [Revised: 08/07/2024] [Accepted: 08/08/2024] [Indexed: 09/06/2024]
Abstract
In recent years, many neuroimaging studies have applied artificial intelligence (AI) to facilitate existing challenges in Parkinson's disease (PD) diagnosis, prognosis, and intervention. The aim of this systematic review was to provide an overview of neuroimaging-based AI studies and to assess their methodological quality. A PubMed search yielded 810 studies, of which 244 that investigated the utility of neuroimaging-based AI for PD diagnosis, prognosis, or intervention were included. We systematically categorized studies by outcomes and rated them with respect to five minimal quality criteria (MQC) pertaining to data splitting, data leakage, model complexity, performance reporting, and indication of biological plausibility. We found that the majority of studies aimed to distinguish PD patients from healthy controls (54%) or atypical parkinsonian syndromes (25%), whereas prognostic or interventional studies were sparse. Only 20% of evaluated studies passed all five MQC, with data leakage, non-minimal model complexity, and reporting of biological plausibility as the primary factors for quality loss. Data leakage was associated with a significant inflation of accuracies. Very few studies employed external test sets (8%), where accuracy was significantly lower, and 19% of studies did not account for data imbalance. Adherence to MQC was low across all observed years and journal impact factors. This review outlines that AI has been applied to a wide variety of research questions pertaining to PD; however, the number of studies failing to pass the MQC is alarming. Therefore, we provide recommendations to enhance the interpretability, generalizability, and clinical utility of future AI applications using neuroimaging in PD. © 2024 The Author(s). Movement Disorders published by Wiley Periodicals LLC on behalf of International Parkinson and Movement Disorder Society.
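The review's central finding, that data leakage significantly inflates reported accuracy, is easy to reproduce. In the stdlib-only toy sketch below (invented data, not from the review), records from the same subject are near-duplicates and labels carry no signal, so honest test accuracy should sit near 50%; a record-wise split nonetheless lets a nearest-neighbour model "recognise" test subjects it already saw in training:

```python
import random

random.seed(0)

# 100 subjects, two near-duplicate records each; labels are random,
# so there is NO real signal and honest test accuracy should be ~50%.
subjects = []
for _ in range(100):
    x, y = random.random(), random.randint(0, 1)
    subjects.append([(x, y), (x + 1e-6, y)])

def one_nn_accuracy(train, test):
    """1-nearest-neighbour accuracy on 1-D features."""
    correct = 0
    for x, y in test:
        nearest = min(train, key=lambda r: abs(r[0] - x))
        correct += nearest[1] == y
    return correct / len(test)

# Leaky record-wise split: a subject's twin records can straddle the split.
records = [r for recs in subjects for r in recs]
random.shuffle(records)
leaky = one_nn_accuracy(records[:100], records[100:])

# Correct subject-wise split: each subject's records stay on one side.
random.shuffle(subjects)
train = [r for recs in subjects[:50] for r in recs]
test = [r for recs in subjects[50:] for r in recs]
clean = one_nn_accuracy(train, test)

print(leaky, clean)   # leaky accuracy lands well above the clean one
```

Splitting by subject before any preprocessing or model fitting is the standard guard against this failure mode.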
Affiliation(s)
- Verena Dzialas
- Department of Nuclear Medicine, Faculty of Medicine and University Hospital, University of Cologne, Cologne, Germany
- Faculty of Mathematics and Natural Sciences, University of Cologne, Cologne, Germany
- Elena Doering
- Department of Nuclear Medicine, Faculty of Medicine and University Hospital, University of Cologne, Cologne, Germany
- German Center for Neurodegenerative Diseases (DZNE), Bonn, Germany
- Helena Eich
- Department of Nuclear Medicine, Faculty of Medicine and University Hospital, University of Cologne, Cologne, Germany
- Antonio P. Strafella
- Edmond J. Safra Parkinson Disease Program, Neurology Division, Krembil Brain Institute, University Health Network, Toronto, Canada
- Brain Health Imaging Centre, Centre for Addiction and Mental Health, University of Toronto, Toronto, Canada
- Temerty Faculty of Medicine, University of Toronto, Toronto, Canada
- David E. Vaillancourt
- Department of Applied Physiology and Kinesiology, University of Florida, Gainesville, Florida, USA
- Kristina Simonyan
- Department of Otolaryngology-Head and Neck Surgery, Harvard Medical School and Massachusetts Eye and Ear, Boston, Massachusetts, USA
- Department of Neurology, Massachusetts General Hospital, Boston, Massachusetts, USA
- Thilo van Eimeren
- Department of Nuclear Medicine, Faculty of Medicine and University Hospital, University of Cologne, Cologne, Germany
- Department of Neurology, Faculty of Medicine and University Hospital, University of Cologne, Cologne, Germany
7
Aggarwal N, Drew DA, Parikh RB, Guha S. Ethical Implications of Artificial Intelligence in Gastroenterology: The Co-pilot or the Captain? Dig Dis Sci 2024; 69:2727-2733. [PMID: 39009918] [DOI: 10.1007/s10620-024-08557-9] [Received: 06/10/2024] [Accepted: 06/25/2024] [Indexed: 07/17/2024]
Abstract
Though artificial intelligence (AI) is being widely implemented in gastroenterology (GI) and hepatology and has the potential to be paradigm shifting for clinical practice, its pitfalls must be considered along with its advantages. Currently, although the use of AI is limited in practice to supporting clinical judgment, medicine is rapidly heading toward a global environment where AI will be increasingly autonomous. Broader implementation of AI will require careful ethical considerations, specifically related to bias, privacy, and consent. Widespread use of AI raises concerns related to increasing rates of systematic errors, potentially due to bias introduced in training datasets. We propose that a central repository for collection and analysis for training and validation datasets is essential to overcoming potential biases. Since AI does not have built-in concepts of bias and equality, humans involved in AI development and implementation must ensure its ethical use and development. Moreover, ethical concerns regarding data ownership and health information privacy are likely to emerge, obviating traditional methods of obtaining patient consent that cover all possible uses of patient data. The question of liability in case of adverse events related to use of AI in GI must be addressed among the physician, the healthcare institution, and the AI developer. Though the future of AI in GI is very promising, herein we review the ethical considerations in need of additional guidance informed by community experience and collective expertise.
Affiliation(s)
- Nishant Aggarwal
- Department of Internal Medicine, William Beaumont University Hospital, Royal Oak, MI, USA
- David A Drew
- Clinical & Translational Epidemiology Unit, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Division of Gastroenterology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Ravi B Parikh
- Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Corporal Michael J. Crescenz VA Medical Center, Philadelphia, PA, USA
- Sushovan Guha
- Gastroenterology and Hepatology, Houston Regional Gastroenterology Institute (HRGI), Houston, TX, USA
- Department of Clinical Sciences, Tilman J. Fertitta Family College of Medicine, University of Houston, Houston, TX, USA
8
Rainey C, Bond R, McConnell J, Hughes C, Kumar D, McFadden S. Reporting radiographers' interaction with Artificial Intelligence: How do different forms of AI feedback impact trust and decision switching? PLOS Digit Health 2024; 3:e0000560. [PMID: 39110687] [PMCID: PMC11305567] [DOI: 10.1371/journal.pdig.0000560] [Received: 02/22/2024] [Accepted: 06/22/2024] [Indexed: 08/10/2024]
Abstract
Artificial Intelligence (AI) has been increasingly integrated into healthcare settings, including the radiology department, to aid radiographic image interpretation, including reporting by radiographers. Trust has been cited as a barrier to effective clinical implementation of AI, and calibrating appropriate trust will be important to ensure the ethical use of these systems for the benefit of the patient, clinician and health services. Explainable AI methods, such as heatmaps, have been proposed to increase AI transparency and trust by elucidating which parts of an image the AI 'focussed on' when making its decision. The aim of this novel study was to quantify the impact of different forms of AI feedback on expert clinicians' trust. Whilst this study was conducted in the UK, it has potential international application and impact for AI interface design, either globally or in countries with similar cultural and/or economic status to the UK. A convolutional neural network was built for this study; trained, validated and tested on a publicly available dataset of MUsculoskeletal RAdiographs (MURA), with binary diagnoses and Gradient Class Activation Maps (GradCAM) as outputs. Reporting radiographers (n = 12) were recruited to this study from all four regions of the UK. Qualtrics was used to present each participant with a total of 18 complete examinations from the MURA test dataset (each examination contained more than one radiographic image). Participants were presented with the images first, images with heatmaps next and finally an AI binary diagnosis, in sequential order. Perception of trust in the AI systems was obtained following the presentation of each heatmap and binary feedback. The participants were asked to indicate whether they would change their mind (or decision switch) in response to the AI feedback.
Participants disagreed with the AI heatmaps for the abnormal examinations 45.8% of the time and agreed with the binary feedback on 86.7% of examinations (26/30 presentations). Only two participants indicated that they would decision switch in response to all AI feedback (GradCAM and binary) (0.7%, n = 2) across all datasets. 22.2% (n = 32) of participants agreed with the localisation of pathology on the heatmap. The level of agreement with the GradCAM and binary diagnosis was correlated with trust (GradCAM: -.515 to -.584, a significant large negative correlation at the 0.01 level (p < .01); binary diagnosis: -.309 to -.369, a significant medium negative correlation at the 0.01 level (p < .01)). This study shows that the extent of agreement with both the AI binary diagnosis and the heatmap is correlated with trust in AI for the participants in this study, where greater agreement with the form of AI feedback is associated with greater trust in AI, particularly for the heatmap form of AI feedback. Forms of explainable AI should be developed with cognisance of the need for precision and accuracy in localisation to promote appropriate trust in clinical end users.
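The trust-agreement association reported above is a plain correlation between two rating vectors. A minimal stdlib sketch of the computation; the `agreement` and `trust` values below are invented for illustration, not the study's data:

```python
def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Hypothetical per-participant ratings: agreement with the AI feedback
# and stated trust, each on a 0-10 scale.
agreement = [9, 8, 7, 7, 5, 4, 3, 2]
trust     = [9, 9, 8, 6, 5, 5, 3, 1]
print(round(pearson_r(agreement, trust), 3))   # strongly positive here
```

The study's reported coefficients are negative only because of how agreement was coded; the computation itself is symmetric in direction.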
Affiliation(s)
- Clare Rainey
- Ulster University, School of Health Sciences, York St, Belfast, Northern Ireland
- Raymond Bond
- Ulster University, School of Computing, York St, Belfast, Northern Ireland
- Ciara Hughes
- Ulster University, School of Health Sciences, York St, Belfast, Northern Ireland
- Devinder Kumar
- School of Medicine, Stanford University, California, United States of America
- Sonyia McFadden
- Ulster University, School of Health Sciences, York St, Belfast, Northern Ireland
9
Lakkimsetti M, Devella SG, Patel KB, Dhandibhotla S, Kaur J, Mathew M, Kataria J, Nallani M, Farwa UE, Patel T, Egbujo UC, Meenashi Sundaram D, Kenawy S, Roy M, Khan SF. Optimizing the Clinical Direction of Artificial Intelligence With Health Policy: A Narrative Review of the Literature. Cureus 2024; 16:e58400. [PMID: 38756258] [PMCID: PMC11098056] [DOI: 10.7759/cureus.58400] [Accepted: 04/16/2024] [Indexed: 05/18/2024]
Abstract
Artificial intelligence (AI) has the ability to transform the healthcare industry by enhancing diagnosis, treatment, and resource allocation. To realize its full potential while ensuring patient safety and equitable access to healthcare, however, the ethical issues around data privacy, bias, and transparency, as well as the practical difficulties posed by workforce adaptability and statutory frameworks, must be carefully addressed. While there is growing knowledge about the advantages of AI in healthcare, there is a significant lack of knowledge about the moral and practical issues that accompany its application, particularly in the setting of emergency and critical care. The majority of current research concentrates on the benefits of AI, but thorough studies that investigate the potential disadvantages and ethical issues are scarce. The purpose of our article is to identify and examine the ethical and practical difficulties that arise when implementing AI in emergency medicine and critical care, to provide solutions to these issues, and to give suggestions to healthcare professionals and policymakers. To responsibly and successfully integrate AI in these important healthcare domains, policymakers and healthcare professionals must collaborate to create strong regulatory frameworks, safeguard data privacy, mitigate bias, and give healthcare workers the necessary training.
Affiliation(s)
- Swati G Devella
- Medicine, Kempegowda Institute of Medical Sciences, Bangalore, IND
- Keval B Patel
- Surgery, Narendra Modi Medical College, Ahmedabad, IND
- Midhun Mathew
- Internal Medicine, Trinitas Regional Medical Center, Elizabeth, USA
- Manisha Nallani
- Medicine, Kamineni Academy of Medical Sciences and Research Center, Hyderabad, IND
- Umm E Farwa
- Emergency Medicine, Jinnah Sindh Medical University, Karachi, PAK
- Tirath Patel
- Medicine, American University of Antigua, Saint John's, ATG
- Dakshin Meenashi Sundaram
- Internal Medicine, Employees' State Insurance Corporation (ESIC) Medical College & Post Graduate Institute of Medical Science and Research (PGIMSR), Chennai, IND
- Mehak Roy
- Internal Medicine, School of Medicine Science and Research, Delhi, IND
10
Carmichael J, Costanza E, Blandford A, Struyven R, Keane PA, Balaskas K. Diagnostic decisions of specialist optometrists exposed to ambiguous deep-learning outputs. Sci Rep 2024; 14:6775. [PMID: 38514657] [PMCID: PMC10958016] [DOI: 10.1038/s41598-024-55410-0] [Received: 08/24/2023] [Accepted: 02/23/2024] [Indexed: 03/23/2024]
Abstract
Artificial intelligence (AI) has great potential in ophthalmology. We investigated how ambiguous outputs from an AI diagnostic support system (AI-DSS) affected diagnostic responses from optometrists when assessing cases of suspected retinal disease. Thirty optometrists (15 more experienced, 15 less) assessed 30 clinical cases. For ten, participants saw an optical coherence tomography (OCT) scan, basic clinical information and retinal photography ('no AI'). For another ten, they were also given AI-generated OCT-based probabilistic diagnoses ('AI diagnosis'); and for ten, both AI diagnoses and AI-generated OCT segmentations ('AI diagnosis + segmentation') were provided. Cases were matched across the three types of presentation and were selected to include 40% ambiguous and 20% incorrect AI outputs. Optometrist diagnostic agreement with the predefined reference standard was lowest for 'AI diagnosis + segmentation' (204/300, 68%) compared to 'AI diagnosis' (224/300, 75%, p = 0.010) and 'no AI' (242/300, 81%, p < 0.001). With segmentations, agreement with AI diagnoses consistent with the reference standard decreased (174/210 vs 199/210, p = 0.003), but participants trusted the AI more (p = 0.029). Practitioner experience did not affect diagnostic responses (p = 0.24). More experienced participants were more confident (p = 0.012) and trusted the AI less (p = 0.038). Our findings also highlight issues around reference standard definition.
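The comparisons of agreement rates above (e.g. 242/300 under 'no AI' versus 204/300 under 'AI diagnosis + segmentation') are the kind of contrast a two-proportion z-test evaluates. A stdlib-only sketch using the pooled-standard-error normal approximation; this is an illustration of the comparison, not necessarily the exact test the authors used:

```python
import math

def two_prop_z(success_a, n_a, success_b, n_b):
    """Two-sided two-proportion z-test with a pooled standard error."""
    p_a, p_b = success_a / n_a, success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p

# Agreement figures from the abstract: 'no AI' vs 'AI diagnosis + segmentation'
z, p = two_prop_z(242, 300, 204, 300)
print(round(z, 2), p < 0.001)   # consistent with the reported p < 0.001
```

The resulting p-value falls below 0.001, matching the significance level reported in the abstract for this contrast.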
Affiliation(s)
- Josie Carmichael
- University College London Interaction Centre (UCLIC), UCL, London, UK
- Institute of Ophthalmology, NIHR Biomedical Research Centre at Moorfields Eye Hospital NHS Foundation Trust and UCL, London, UK
- Enrico Costanza
- University College London Interaction Centre (UCLIC), UCL, London, UK
- Ann Blandford
- University College London Interaction Centre (UCLIC), UCL, London, UK
- Robbert Struyven
- Institute of Ophthalmology, NIHR Biomedical Research Centre at Moorfields Eye Hospital NHS Foundation Trust and UCL, London, UK
- Pearse A Keane
- Institute of Ophthalmology, NIHR Biomedical Research Centre at Moorfields Eye Hospital NHS Foundation Trust and UCL, London, UK
- Konstantinos Balaskas
- Institute of Ophthalmology, NIHR Biomedical Research Centre at Moorfields Eye Hospital NHS Foundation Trust and UCL, London, UK
11
Roberfroid B, Lee JA, Geets X, Sterpin E, Barragán-Montero AM. DIVE-ART: A tool to guide clinicians towards dosimetrically informed volume editions of automatically segmented volumes in adaptive radiation therapy. Radiother Oncol 2024; 192:110108. [PMID: 38272315] [DOI: 10.1016/j.radonc.2024.110108]
Affiliation(s)
- Benjamin Roberfroid
- Université catholique de Louvain - Center of Molecular Imaging, Radiotherapy and Oncology (MIRO), Brussels, Belgium
- John A Lee
- Université catholique de Louvain - Center of Molecular Imaging, Radiotherapy and Oncology (MIRO), Brussels, Belgium
- Xavier Geets
- Université catholique de Louvain - Center of Molecular Imaging, Radiotherapy and Oncology (MIRO), Brussels, Belgium; Cliniques universitaires Saint-Luc, Department of Radiation Oncology, Brussels, Belgium
- Edmond Sterpin
- Université catholique de Louvain - Center of Molecular Imaging, Radiotherapy and Oncology (MIRO), Brussels, Belgium; KU Leuven - Department of Oncology, Laboratory of Experimental Radiotherapy, Leuven, Belgium; Particle Therapy Interuniversity Center Leuven - PARTICLE, Leuven, Belgium
- Ana M Barragán-Montero
- Université catholique de Louvain - Center of Molecular Imaging, Radiotherapy and Oncology (MIRO), Brussels, Belgium
12
Puladi B, Gsaxner C, Kleesiek J, Hölzle F, Röhrig R, Egger J. The impact and opportunities of large language models like ChatGPT in oral and maxillofacial surgery: a narrative review. Int J Oral Maxillofac Surg 2024; 53:78-88. [PMID: 37798200] [DOI: 10.1016/j.ijom.2023.09.005]
Abstract
Since its release at the end of 2022, the social response to ChatGPT, a large language model (LLM), has been huge, as it has revolutionized the way we communicate with computers. This review was performed to describe the technical background of LLMs and to provide a review of the current literature on LLMs in the field of oral and maxillofacial surgery (OMS). The PubMed, Scopus, and Web of Science databases were searched for LLMs and OMS. Adjacent surgical disciplines were included to cover the entire literature, and records from Google Scholar and medRxiv were added. Out of the 57 records identified, 37 were included; 31 (84%) were related to GPT-3.5, four (11%) to GPT-4, and two (5%) to both. Current research on LLMs is mainly limited to research and scientific writing, patient information/communication, and medical education. Classic OMS diseases are underrepresented. The current literature related to LLMs in OMS has a limited evidence level. There is a need to investigate the use of LLMs scientifically and systematically in the core areas of OMS. Although LLMs are likely to add value outside the operating room, the use of LLMs raises ethical and medical regulatory issues that must first be addressed.
Affiliation(s)
- B Puladi
- Department of Oral and Maxillofacial Surgery, University Hospital RWTH Aachen, Aachen, Germany; Institute of Medical Informatics, University Hospital RWTH Aachen, Aachen, Germany
- C Gsaxner
- Department of Oral and Maxillofacial Surgery, University Hospital RWTH Aachen, Aachen, Germany; Institute of Medical Informatics, University Hospital RWTH Aachen, Aachen, Germany; Institute of Computer Graphics and Vision, Graz University of Technology, Graz, Austria; Department of Oral and Maxillofacial Surgery, Medical University of Graz, Graz, Austria
- J Kleesiek
- Institute for AI in Medicine (IKIM), University Hospital Essen (AöR), Essen, Germany
- F Hölzle
- Department of Oral and Maxillofacial Surgery, University Hospital RWTH Aachen, Aachen, Germany
- R Röhrig
- Institute of Medical Informatics, University Hospital RWTH Aachen, Aachen, Germany
- J Egger
- Institute of Computer Graphics and Vision, Graz University of Technology, Graz, Austria; Institute for AI in Medicine (IKIM), University Hospital Essen (AöR), Essen, Germany
13
Love CS. "Just the Facts Ma'am": Moral and Ethical Considerations for Artificial Intelligence in Medicine and its Potential to Impact Patient Autonomy and Hope. Linacre Q 2023; 90:375-394. [PMID: 37974568] [PMCID: PMC10638968] [DOI: 10.1177/00243639231162431]
Abstract
Applying machine-based learning and synthetic cognition, commonly referred to as artificial intelligence (AI), to medicine intimates prescient knowledge. The ability of these algorithms to potentially unlock secrets held within vast data sets makes them invaluable to healthcare. Complex computer algorithms are routinely used to enhance diagnoses in fields like oncology, cardiology, and neurology. These algorithms have found utility in making healthcare decisions that are often complicated by seemingly endless relationships between exogenous and endogenous variables. They have also found utility in the allocation of limited healthcare resources and the management of end-of-life issues. With the increase in computing power and the ability to test a virtually unlimited number of relationships, scientists and engineers have the unprecedented ability to increase the prognostic confidence that comes from complex data analysis. While these systems present exciting opportunities for the democratization and precision of healthcare, their use raises important moral and ethical considerations around Christian concepts of autonomy and hope. The purpose of this essay is to explore some of the practical limitations associated with AI in medicine and discuss some of the potential theological implications that machine-generated diagnoses may present. Specifically, this article examines how these systems may disrupt the patient and healthcare provider relationship emblematic of Christ's healing mission. Finally, this article seeks to offer insights that might help in the development of a more robust ethical framework for the application of these systems in the future.
14
Colicchio TK, Cimino JJ. Beyond the override: Using evidence of previous drug tolerance to suppress drug allergy alerts; a retrospective study of opioid alerts. J Biomed Inform 2023; 147:104508. [PMID: 37748541] [DOI: 10.1016/j.jbi.2023.104508]
Abstract
OBJECTIVE Despite the extensive literature exploring alert fatigue, most studies have focused on describing the phenomenon rather than fixing it. The authors aimed to identify data useful for averting clinically irrelevant alerts to inform future research on clinical decision support (CDS) design. METHODS We conducted a retrospective observational study of opioid drug allergy alert (DAA) overrides for the calendar year 2019 at a large academic medical center, to identify data elements useful for finding irrelevant alerts to be averted. RESULTS Overall, 227,815 DAAs were fired in 2019, with an override rate of 91% (n = 208,196). Opioids represented nearly two-thirds of these overrides (n = 129,063; 62%) and were the drug class with the highest override rate (96%). On average, 29 opioid DAAs were overridden per patient. While most opioid alerts (97.1%) were fired for a possible match (the drug class of the allergen matches the drug class of the prescribed drug), alerts for a definite match (exact match between allergen and prescribed drug) were overridden significantly less frequently (88% vs. 95.9%, p < 0.001). When the triggering drug was compared with previously administered drugs, override rates were equally high for definite match (95.9%), no match (95.5%), and possible match (95.1%). Likewise, when compared with home medications, override rates were high for possible match (96.3%), no match (96%), and definite match (94.4%). CONCLUSION We estimate that 74.5% of opioid DAAs (46.4% of all DAAs) at our institution could be relatively safely averted, since they either have a definite match with previous inpatient administrations suggesting drug tolerance or are fired as a possible match with low risk of cross-sensitivity. Future research should focus on identifying other relevant data elements, ideally with automated methods and emerging standards, to empower CDS systems to suppress false-positive alerts while avoiding safety hazards.
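The match hierarchy this abstract relies on (definite, possible, or no match between the recorded allergen and the ordered drug, combined with evidence of previous tolerance) can be sketched in a few lines. This is an illustrative reconstruction, not the authors' implementation; the function names and the simple string comparisons are assumptions.

```python
def classify_match(allergen: str, allergen_class: str,
                   drug: str, drug_class: str) -> str:
    """Classify a drug allergy alert: 'definite match' means the ordered
    drug is exactly the recorded allergen; 'possible match' means only
    the drug classes coincide; otherwise 'no match'."""
    if allergen.lower() == drug.lower():
        return "definite match"
    if allergen_class.lower() == drug_class.lower():
        return "possible match"
    return "no match"


def candidate_for_suppression(match_level: str, tolerated_before: bool) -> bool:
    """Per the abstract's estimate, an alert could be relatively safely
    averted if the patient previously tolerated the exact drug (suggesting
    tolerance) or if the alert fires only on a class-level (possible)
    match with low cross-sensitivity risk."""
    return (match_level == "definite match" and tolerated_before) \
        or match_level == "possible match"
```

In a real CDS system the tolerance flag would come from inpatient administration records, as the study describes.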
Affiliation(s)
- Tiago K Colicchio
- Informatics Institute, University of Alabama at Birmingham, AL, USA
- James J Cimino
- Informatics Institute, University of Alabama at Birmingham, AL, USA
15
Abdulkadir Y, Luximon D, Morris E, Chow P, Kishan AU, Mikaeilian A, Lamb JM. Human factors in the clinical implementation of deep learning-based automated contouring of pelvic organs at risk for MRI-guided radiotherapy. Med Phys 2023; 50:5969-5977. [PMID: 37646527] [DOI: 10.1002/mp.16676]
Abstract
PURPOSE Deep neural nets have revolutionized the science of auto-segmentation and present great promise for treatment planning automation. However, little data exists regarding clinical implementation and human factors. We evaluated the performance and clinical implementation of a novel deep learning-based auto-contouring workflow for 0.35T magnetic resonance imaging (MRI)-guided pelvic radiotherapy, focusing on automation bias and objective measures of workflow savings. METHODS An auto-contouring model was developed using a UNet-derived architecture for the femoral heads, bladder, and rectum in 0.35T MR images. Training data was taken from 75 patients treated with MRI-guided radiotherapy at our institution. The model was tested against 20 retrospective cases outside the training set, and subsequently was clinically implemented. Usability was evaluated on the first 30 clinical cases by computing Dice coefficient (DSC), Hausdorff distance (HD), and the fraction of slices that were used un-modified by planners. Final contours were retrospectively reviewed by an experienced planner and clinical significance of deviations was graded as negligible, low, moderate, and high probability of leading to actionable dosimetric variations. In order to assess whether the use of auto-contouring led to final contours more or less in agreement with an objective standard, 10 pre-treatment and 10 post-treatment blinded cases were re-contoured from scratch by three expert planners to get expert consensus contours (EC). EC was compared to clinically used (CU) contours using DSC. Student's t-test and Levene's statistic were used to test statistical significance of differences in mean and standard deviation, respectively. Finally, the dosimetric significance of the contour differences were assessed by comparing the difference in bladder and rectum maximum point doses between EC and CU before and after the introduction of automation. 
RESULTS Median (interquartile range) DSC for the retrospective test data were 0.92 (0.02), 0.92 (0.06), 0.93 (0.06), and 0.87 (0.04) for the post-processed contours of the right and left femoral heads, bladder, and rectum, respectively. Post-implementation median DSC were 1.0 (0.0), 1.0 (0.0), 0.98 (0.04), and 0.98 (0.06), respectively. For each organ, 96.2%, 95.4%, 59.5%, and 68.2% of slices were used unmodified by the planner. DSC between EC and pre-implementation CU contours were 0.91 (0.05*), 0.91* (0.05*), 0.95 (0.04), and 0.88 (0.04) for the right and left femoral heads, bladder, and rectum, respectively. The corresponding DSC for post-implementation CU contours were 0.93 (0.02*), 0.93* (0.01*), 0.96 (0.01), and 0.85 (0.02) (asterisks indicate statistically significant differences). In a retrospective review of contours used for planning, a total of four deviating slices in two patients were graded as of low potential clinical significance. No deviations were graded as moderate or high. Mean differences between EC and CU rectum max doses were 0.1 ± 2.6 Gy and -0.9 ± 2.5 Gy for pre- and post-implementation, respectively. Mean differences between EC and CU bladder/bladder-wall max doses were -0.9 ± 4.1 Gy and 0.0 ± 0.6 Gy for pre- and post-implementation, respectively. These differences were not statistically significant according to Student's t-test. CONCLUSION We have presented an analysis of the clinical implementation of a novel auto-contouring workflow. Substantial workflow savings were obtained. The introduction of auto-contouring into the clinical workflow changed the contouring behavior of planners. Automation bias was observed, but it had little deleterious effect on treatment planning.
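As a reminder of the agreement metric used throughout this abstract, the Dice similarity coefficient between two binary masks can be computed in a few lines. This is a generic sketch of the standard definition, independent of the study's software:

```python
import numpy as np

def dice_coefficient(a: np.ndarray, b: np.ndarray) -> float:
    """DSC = 2|A ∩ B| / (|A| + |B|) for two binary segmentation masks.
    Returns 1.0 for two empty masks by convention."""
    a = a.astype(bool)
    b = b.astype(bool)
    denom = a.sum() + b.sum()
    if denom == 0:
        return 1.0
    return 2.0 * np.logical_and(a, b).sum() / denom
```

A DSC of 1.0 between clinically used and auto-generated contours, as reported post-implementation for the femoral heads, means the planner accepted the automated contour without any edits.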
Affiliation(s)
- Yasin Abdulkadir
- Department of Radiation Oncology, David Geffen School of Medicine, University of California, Los Angeles, California, USA
- Dishane Luximon
- Department of Radiation Oncology, David Geffen School of Medicine, University of California, Los Angeles, California, USA
- Eric Morris
- Department of Radiation Oncology, David Geffen School of Medicine, University of California, Los Angeles, California, USA
- Phillip Chow
- Department of Radiation Oncology, David Geffen School of Medicine, University of California, Los Angeles, California, USA
- Amar U Kishan
- Department of Radiation Oncology, David Geffen School of Medicine, University of California, Los Angeles, California, USA
- Argin Mikaeilian
- Department of Radiation Oncology, David Geffen School of Medicine, University of California, Los Angeles, California, USA
- James M Lamb
- Department of Radiation Oncology, David Geffen School of Medicine, University of California, Los Angeles, California, USA
16
Rainey C, Villikudathil AT, McConnell J, Hughes C, Bond R, McFadden S. An experimental machine learning study investigating the decision-making process of students and qualified radiographers when interpreting radiographic images. PLOS Digit Health 2023; 2:e0000229. [PMID: 37878569] [PMCID: PMC10599497] [DOI: 10.1371/journal.pdig.0000229]
Abstract
AI is becoming more prevalent in healthcare and is predicted to be further integrated into workflows to ease the pressure on an already stretched service. The National Health Service in the UK has prioritised AI and digital health as part of its Long Term Plan. Few studies have examined human interaction with such systems in healthcare, despite reports of biases with the use of AI in other technologically advanced fields, such as finance and aviation. Understanding is needed of how certain user characteristics may affect how radiographers engage with AI systems in clinical use, to mitigate problems before they arise. The aim of this study was to determine correlations between skills, confidence in AI, and perceived knowledge amongst student and qualified radiographers in the UK healthcare system. A machine learning based AI model was built to predict whether the interpreter was a student (n = 67) or a qualified radiographer (n = 39), using important variables from a feature selection technique named Boruta. A survey, which required participants to interpret a series of plain radiographic examinations with and without AI assistance, was created on the Qualtrics survey platform and promoted via social media (Twitter/LinkedIn), thereby adopting convenience and snowball sampling. The survey was open to all UK radiographers, including students and retired radiographers. Pearson's correlation analysis revealed that males who were proficient in their profession were more likely than females to trust AI. Trust in AI was negatively correlated with age and with level of experience. The best machine learning model predicted whether the image interpreter was a qualified radiographer with an area under the curve of 0.93 and a prediction accuracy of 93%. Further testing in prospective validation cohorts using a larger sample size is required to determine the clinical utility of the proposed machine learning model.
Affiliation(s)
- Clare Rainey
- Faculty of Life and Health Sciences, School of Health Sciences, Ulster University, York Street, Belfast, Northern Ireland, United Kingdom
- Angelina T. Villikudathil
- Faculty of Life and Health Sciences, School of Health Sciences, Ulster University, York Street, Belfast, Northern Ireland, United Kingdom
- Ciara Hughes
- Faculty of Life and Health Sciences, School of Health Sciences, Ulster University, York Street, Belfast, Northern Ireland, United Kingdom
- Raymond Bond
- Faculty of Computing, School of Computing, Engineering and the Built Environment, Ulster University, York Street, Belfast, Northern Ireland, United Kingdom
- Sonyia McFadden
- Faculty of Life and Health Sciences, School of Health Sciences, Ulster University, York Street, Belfast, Northern Ireland, United Kingdom
17
Wang DY, Ding J, Sun AL, Liu SG, Jiang D, Li N, Yu JK. Artificial intelligence suppression as a strategy to mitigate artificial intelligence automation bias. J Am Med Inform Assoc 2023; 30:1684-1692. [PMID: 37561535] [PMCID: PMC10531198] [DOI: 10.1093/jamia/ocad118]
Abstract
BACKGROUND Incorporating artificial intelligence (AI) into clinics brings the risk of automation bias, which potentially misleads the clinician's decision-making. The purpose of this study was to propose a potential strategy to mitigate automation bias. METHODS This was a laboratory study with a randomized cross-over design. The diagnosis of anterior cruciate ligament (ACL) rupture, a common injury, on magnetic resonance imaging (MRI) was used as an example. Forty clinicians were invited to diagnose 200 ACLs with and without AI assistance. The AI's correcting and misleading (automation bias) effects on the clinicians' decision-making processes were analyzed. An ordinal logistic regression model was employed to predict the correcting and misleading probabilities of the AI. We further proposed an AI suppression strategy that retracted AI diagnoses with a higher misleading probability and provided AI diagnoses with a higher correcting probability. RESULTS The AI significantly increased clinicians' accuracy from 87.2%±13.1% to 96.4%±1.9% (P < .001). However, the clinicians' errors in the AI-assisted round were associated with automation bias, accounting for 45.5% of the total mistakes. The automation bias was found to affect clinicians of all levels of expertise. Using a logistic regression model, we identified an AI output zone with higher probability to generate misleading diagnoses. The proposed AI suppression strategy was estimated to decrease clinicians' automation bias by 41.7%. CONCLUSION Although AI improved clinicians' diagnostic performance, automation bias was a serious problem that should be addressed in clinical practice. The proposed AI suppression strategy is a practical method for decreasing automation bias.
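The suppression strategy can be illustrated with a minimal sketch: estimate, for a given AI output, the probabilities that showing it would correct or mislead the clinician, and withhold outputs in the high-misleading zone. The threshold and the probability inputs are assumptions for illustration; the paper itself derives these probabilities from a fitted ordinal logistic regression model.

```python
def should_show_ai_diagnosis(p_correcting: float, p_misleading: float,
                             misleading_threshold: float = 0.5) -> bool:
    """Retract the AI diagnosis when its estimated probability of
    misleading the clinician is high and exceeds its estimated
    probability of correcting them; otherwise present it."""
    if p_misleading >= misleading_threshold and p_misleading > p_correcting:
        return False
    return True
```

The design choice is asymmetric on purpose: suppression only removes outputs where the expected harm (automation bias) outweighs the expected benefit, so the AI's overall accuracy gain is largely preserved.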
Affiliation(s)
- Ding-Yu Wang
- Department of Sports Medicine, Peking University Third Hospital, Institute of Sports Medicine of Peking University, Beijing, China
- Beijing Key Laboratory of Sports Injuries, Beijing, China
- Engineering Research Center of Sports Trauma Treatment Technology and Devices, Ministry of Education, Beijing, China
- Jia Ding
- Beijing Yizhun Medical AI Co., Ltd, Beijing, China
- An-Lan Sun
- Beijing Yizhun Medical AI Co., Ltd, Beijing, China
- Shang-Gui Liu
- Department of Sports Medicine, Peking University Third Hospital, Institute of Sports Medicine of Peking University, Beijing, China
- Beijing Key Laboratory of Sports Injuries, Beijing, China
- Engineering Research Center of Sports Trauma Treatment Technology and Devices, Ministry of Education, Beijing, China
- Dong Jiang
- Department of Sports Medicine, Peking University Third Hospital, Institute of Sports Medicine of Peking University, Beijing, China
- Beijing Key Laboratory of Sports Injuries, Beijing, China
- Engineering Research Center of Sports Trauma Treatment Technology and Devices, Ministry of Education, Beijing, China
- Nan Li
- Research Center of Clinical Epidemiology, Peking University Third Hospital, Beijing, China
- Jia-Kuo Yu
- Department of Sports Medicine, Peking University Third Hospital, Institute of Sports Medicine of Peking University, Beijing, China
- Beijing Key Laboratory of Sports Injuries, Beijing, China
- Engineering Research Center of Sports Trauma Treatment Technology and Devices, Ministry of Education, Beijing, China
18
Danilov A, Aronow WS. Artificial Intelligence in Cardiology: Applications and Obstacles. Curr Probl Cardiol 2023; 48:101750. [PMID: 37088174] [DOI: 10.1016/j.cpcardiol.2023.101750]
Abstract
Artificial intelligence (AI) technology is poised to alter the flow of daily life, and in particular, medicine, where it may eventually complement the physician's work in diagnosing and treating disease. Despite the recent frenzy and uptick in AI research over the past decade, the integration of AI into medical practice is in its early stages. Cardiology stands to benefit due to its many diagnostic modalities and diverse treatments. AI methods have been applied to various domains within cardiology: imaging, electrocardiography, wearable devices, risk prediction, and disease classification. While many AI-based approaches have been developed that perform equal to or better than the state-of-the-art, few prospective randomized studies have evaluated their use. Furthermore, obstacles at the intersection of medicine and AI remain unsolved, including model understanding, bias, model evaluation, relevance and reproducibility, and legal and ethical dilemmas. We summarize recent and current applications of AI in cardiology, followed by a discussion of the aforementioned complications.
Affiliation(s)
- Wilbert S Aronow
- School of Medicine, New York Medical College, Valhalla, NY; Department of Cardiology, Westchester Medical Center, Valhalla, NY
19
Schmidt HG, Mamede S. Improving diagnostic decision support through deliberate reflection: a proposal. Diagnosis (Berl) 2023; 10:38-42. [PMID: 36000188] [DOI: 10.1515/dx-2022-0062]
Abstract
Digital decision support (DDS) is expected to play an important role in improving a physician's diagnostic performance and reducing the burden of diagnostic error. Studies with currently available DDS systems indicate that they lead to modest gains in diagnostic accuracy, and these systems are expected to evolve to become more effective and user-friendly in the future. In this position paper, we propose that a way towards this future is to rethink DDS systems based on deliberate reflection, a strategy by which physicians systematically review the clinical findings observed in a patient in the light of an initial diagnosis. Deliberate reflection has been demonstrated to improve diagnostic accuracy in several contexts. In this paper, we first describe the deliberate reflection strategy, including the crucial element that would make it useful in the interaction with a DDS system. We examine the nature of conventional DDS systems and their shortcomings. Finally, we propose what DDS based on deliberate reflection might look like, and consider why it would overcome downsides of conventional DDS.
Affiliation(s)
- Henk G Schmidt
- Department of Psychology, Education and Child Studies, Erasmus University Rotterdam, Rotterdam, The Netherlands; Institute of Medical Education Research Rotterdam, Erasmus Medical Center, Rotterdam, The Netherlands
- Sílvia Mamede
- Department of Psychology, Education and Child Studies, Erasmus University Rotterdam, Rotterdam, The Netherlands; Institute of Medical Education Research Rotterdam, Erasmus Medical Center, Rotterdam, The Netherlands
20
da Silva JHB, Cortez PC, Jagatheesaperumal SK, de Albuquerque VHC. ECG Measurement Uncertainty Based on Monte Carlo Approach: An Effective Analysis for a Successful Cardiac Health Monitoring System. Bioengineering (Basel) 2023; 10:115. [PMID: 36671687] [PMCID: PMC9854940] [DOI: 10.3390/bioengineering10010115]
Abstract
Measurement uncertainty is one of the widespread concepts applied in scientific works, particularly to estimate the accuracy of measurement results and to evaluate the conformity of products and processes. In this work, we propose a methodology to analyze the performance of measurement systems existing in the design phases, based on a probabilistic approach, by applying the Monte Carlo method (MCM). With this approach, it is feasible to identify the dominant contributing factors of imprecision in the evaluated system. In the design phase, this information can be used to identify where the most effective attention is required to improve the performance of equipment. This methodology was applied over a simulated electrocardiogram (ECG), for which a measurement uncertainty of the order of 3.54% of the measured value was estimated, with a confidence level of 95%. For this simulation, the ECG computational model was categorized into two modules: the preamplifier and the final stage. The outcomes of the analysis show that the preamplifier module had a greater influence on the measurement results over the final stage module, which indicates that interventions in the first module would promote more significant performance improvements in the system. Finally, it was identified that the main source of ECG measurement uncertainty is related to the measurand, focused towards the objective of better characterization of the metrological behavior of the measurements in the ECG.
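The Monte Carlo method (MCM) workflow described above can be sketched generically: draw samples for each uncertain component, propagate them through the measurement model, and report a 95% coverage interval. The amplifier gain model and the 1% component tolerances below are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(42)
N = 200_000  # number of Monte Carlo trials

# Uncertain inputs: two gain-setting resistors, each with a 1% relative
# standard uncertainty (illustrative tolerances).
r1 = rng.normal(10e3, 10e3 * 0.01, N)
r2 = rng.normal(100e3, 100e3 * 0.01, N)

# Measurement model: gain of a non-inverting amplifier stage.
gain = 1.0 + r2 / r1

mean_gain = gain.mean()
u_rel = gain.std() / mean_gain             # relative standard uncertainty
lo, hi = np.percentile(gain, [2.5, 97.5])  # 95% coverage interval
```

Sorting the contribution of each input to the output spread (e.g., by holding one input fixed at its nominal value and re-running) is how a dominant module, such as the preamplifier in the study, can be identified.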
Affiliation(s)
- Paulo Cesar Cortez
- Department of Teleinformatics Engineering, Federal University of Ceará, Fortaleza 60455-970, Brazil
- Senthil K. Jagatheesaperumal
- Department of Electronics and Communication Engineering, Mepco Schlenk Engineering College, Sivakasi 626005, India
- Victor Hugo C. de Albuquerque
- Department of Teleinformatics Engineering, Federal University of Ceará, Fortaleza 60455-970, Brazil
- Correspondence: ; Tel.: +55-85-985-246-835
21
Kostick-Quenet KM, Gerke S. AI in the hands of imperfect users. NPJ Digit Med 2022; 5:197. [PMID: 36577851] [PMCID: PMC9795935] [DOI: 10.1038/s41746-022-00737-z]
Abstract
As the use of artificial intelligence and machine learning (AI/ML) continues to expand in healthcare, much attention has been given to mitigating bias in algorithms to ensure they are employed fairly and transparently. Less attention has fallen to addressing potential bias among AI/ML's human users or factors that influence user reliance. We argue for a systematic approach to identifying the existence and impacts of user biases while using AI/ML tools and call for the development of embedded interface design features, drawing on insights from decision science and behavioral economics, to nudge users towards more critical and reflective decision making using AI/ML.
Affiliation(s)
- Sara Gerke
- Penn State Dickinson Law, Carlisle, PA, USA
22
Müller S. Is there a civic duty to support medical AI development by sharing electronic health records? BMC Med Ethics 2022; 23:134. [PMID: 36496427] [PMCID: PMC9736708] [DOI: 10.1186/s12910-022-00871-z]
Abstract
Medical artificial intelligence (AI) is considered to be one of the most important assets for the future of innovative individual and public health care. To develop innovative medical AI, it is necessary to repurpose data that are primarily generated in and for the health care context. Usually, health data can only be put to a secondary use if data subjects provide their informed consent (IC). This regulation, however, is believed to slow down or even prevent vital medical research, including AI development. For this reason, a number of scholars advocate a moral civic duty to share electronic health records (EHRs) that overrides IC requirements in certain contexts. In the medical AI context, the common arguments for such a duty have not been subjected to a comprehensive challenge. This article sheds light on the correlation between two normative discourses concerning informed consent for secondary health record use and the development and use of medical AI. There are three main arguments in favour of a civic duty to support certain developments in medical AI by sharing EHRs: the 'rule to rescue argument', the 'low risks, high benefits argument', and the 'property rights argument'. This article critiques all three arguments because they either derive a civic duty from premises that do not apply to the medical AI context, or they rely on inappropriate analogies, or they ignore significant risks entailed by the EHR sharing process and the use of medical AI. Given this result, the article proposes an alternative civic responsibility approach that can attribute different responsibilities to different social groups and individuals and that can contextualise those responsibilities for the purpose of medical AI development.
Affiliation(s)
- Sebastian Müller
- Center for Life Ethics/Heinrich Hertz Chair TRA4, University of Bonn, Schaumburg-Lippe-Straße 5-7, 53113 Bonn, Germany
23
Monteith S, Glenn T, Geddes J, Whybrow PC, Achtyes E, Bauer M. Expectations for Artificial Intelligence (AI) in Psychiatry. Curr Psychiatry Rep 2022; 24:709-721. [PMID: 36214931 PMCID: PMC9549456 DOI: 10.1007/s11920-022-01378-5]
Abstract
PURPOSE OF REVIEW Artificial intelligence (AI) is often presented as a transformative technology for clinical medicine even though the current technology maturity of AI is low. The purpose of this narrative review is to describe the complex reasons for the low technology maturity and set realistic expectations for the safe, routine use of AI in clinical medicine. RECENT FINDINGS For AI to be productive in clinical medicine, many diverse factors that contribute to the low maturity level need to be addressed. These include technical problems such as data quality, dataset shift, black-box opacity, validation and regulatory challenges, and human factors such as a lack of education in AI, workflow changes, automation bias, and deskilling. There will also be new and unanticipated safety risks with the introduction of AI. The solutions to these issues are complex and will take time to discover, develop, validate, and implement. However, addressing the many problems in a methodical manner will expedite the safe and beneficial use of AI to augment medical decision making in psychiatry.
Affiliation(s)
- Scott Monteith
- Michigan State University College of Human Medicine, Traverse City Campus, Traverse City, MI, 49684, USA
- Tasha Glenn
- ChronoRecord Association, Fullerton, CA, USA
- John Geddes
- Department of Psychiatry, University of Oxford, Warneford Hospital, Oxford, UK
- Peter C Whybrow
- Department of Psychiatry and Biobehavioral Sciences, Semel Institute for Neuroscience and Human Behavior, University of California Los Angeles (UCLA), Los Angeles, CA, USA
- Eric Achtyes
- Michigan State University College of Human Medicine, Grand Rapids, MI, 49684, USA
- Network180, Grand Rapids, MI, USA
- Michael Bauer
- Department of Psychiatry and Psychotherapy, University Hospital Carl Gustav Carus, Medical Faculty, Technische Universität Dresden, Dresden, Germany
24
Ho S, Doig GS, Ly A. Attitudes of optometrists towards artificial intelligence for the diagnosis of retinal disease: A cross-sectional mail-out survey. Ophthalmic Physiol Opt 2022; 42:1170-1179. [PMID: 35924658 DOI: 10.1111/opo.13034]
Abstract
PURPOSE Artificial intelligence (AI)-based systems have demonstrated great potential in improving the diagnostic accuracy of retinal disease but are yet to achieve widespread acceptance in routine clinical practice. Clinician attitudes are known to influence implementation. Therefore, this study aimed to identify optometrists' attitudes towards the use of AI to assist in diagnosing retinal disease. METHODS A paper-based survey was designed to assess general attitudes towards AI in diagnosing retinal disease and motivators/barriers for future use. Two clinical scenarios for using AI were evaluated: (1) at the point of care to obtain a diagnostic recommendation, versus (2) after the consultation to provide a second opinion. Relationships between participant characteristics and attitudes towards AI were explored. The survey was mailed to 252 randomly selected practising optometrists across Australia, with repeat mail-outs to non-respondents. RESULTS The response rate was 53% (133/252). Respondents' mean (SD) age was 42.7 (13.3) years, and 44.4% (59/133) identified as female, whilst 1.5% (2/133) identified as gender diverse. The mean number of years practising in primary eye care was 18.8 (13.2) years, with 64.7% (86/133) working in an independently owned practice. On average, responding optometrists reported positive attitudes (mean score 4.0 out of 5, SD 0.8) towards using AI as a tool to aid the diagnosis of retinal disease, and would be more likely to use AI if it is proven to increase patient access to healthcare (mean score 4.4 out of 5, SD 0.6). Furthermore, optometrists expressed a statistically significant preference for using AI after the consultation to provide a second opinion rather than during the consultation at the point of care (+0.12, p = 0.01). CONCLUSIONS Optometrists have positive attitudes towards the future use of AI as an aid to diagnose retinal disease. Understanding clinician attitudes and preferences for using AI may help maximise its clinical potential and ensure its successful translation into practice.
Affiliation(s)
- Sharon Ho
- Centre for Eye Health, The University of New South Wales, Sydney, New South Wales, Australia
- School of Optometry and Vision Science, The University of New South Wales, Sydney, New South Wales, Australia
- Gordon S Doig
- Centre for Eye Health, The University of New South Wales, Sydney, New South Wales, Australia
- School of Optometry and Vision Science, The University of New South Wales, Sydney, New South Wales, Australia
- Angelica Ly
- Centre for Eye Health, The University of New South Wales, Sydney, New South Wales, Australia
- School of Optometry and Vision Science, The University of New South Wales, Sydney, New South Wales, Australia
- Brien Holden Vision Institute, The University of New South Wales, Sydney, New South Wales, Australia
25
Peace A, Al-Zaiti SS, Finlay D, McGilligan V, Bond R. Exploring decision making 'noise' when interpreting the electrocardiogram in the context of cardiac cath lab activation. J Electrocardiol 2022; 73:157-161. [PMID: 35853754 DOI: 10.1016/j.jelectrocard.2022.07.002]
Abstract
In this commentary paper, we discuss the use of the electrocardiogram to help clinicians make diagnostic and patient referral decisions in acute care settings. The paper discusses the factors likely to contribute to variability and noise in the clinical decision making process for catheterization lab activation. These factors include variable competence in reading ECGs, intra/inter-rater reliability, the lack of standard ECG training, varying ECG machine and filter settings, cognitive biases (such as automation bias, the tendency to agree with a computer-aided or AI diagnosis), the order in which information is received, tiredness or decision fatigue, and ECG artefacts such as signal noise or lead misplacement. We also discuss potential research questions and tools that could be used to mitigate this 'noise' and improve the quality of ECG-based decision making.
Affiliation(s)
- Aaron Peace
- Clinical Translational Research and Innovation Centre, Northern Ireland, UK
26
Rainey C, O'Regan T, Matthew J, Skelton E, Woznitza N, Chu KY, Goodman S, McConnell J, Hughes C, Bond R, Malamateniou C, McFadden S. UK reporting radiographers' perceptions of AI in radiographic image interpretation - Current perspectives and future developments. Radiography (Lond) 2022; 28:881-888. [PMID: 35780627 DOI: 10.1016/j.radi.2022.06.006]
Abstract
INTRODUCTION Radiographer reporting is accepted practice in the UK. With a national shortage of radiographers and radiologists, artificial intelligence (AI) support in reporting may help minimise the backlog of unreported images. Modern AI is not well understood by human end-users. This may have ethical implications and impact human trust in these systems, due to over- and under-reliance. This study investigates the perceptions of reporting radiographers about AI, gathers information to explain how they may interact with AI in future and identifies features perceived as necessary for appropriate trust in these systems. METHODS A Qualtrics® survey was designed and piloted by a team of UK AI expert radiographers. This paper reports the third part of the survey, open to reporting radiographers only. RESULTS 86 responses were received. Respondents were confident in how an AI reached its decision (n = 53, 62%). Less than a third of respondents would be confident communicating the AI decision to stakeholders. Affirmation from AI would improve confidence (n = 49, 57%) and disagreement would make respondents seek a second opinion (n = 60, 70%). There is a moderate trust level in AI for image interpretation. System performance data and AI visual explanations would increase trust. CONCLUSIONS Responses indicate that AI will have a strong impact on reporting radiographers' decision making in the future. Respondents are confident in how an AI makes decisions but less confident explaining this to others. Trust levels could be improved with explainable AI solutions. IMPLICATIONS FOR PRACTICE This survey clarifies UK reporting radiographers' perceptions of AI, used for image interpretation, highlighting key issues with AI integration.
Affiliation(s)
- C Rainey
- Ulster University, School of Health Sciences, Faculty of Life and Health Sciences, Shore Road, Newtownabbey, N. Ireland
- T O'Regan
- The Society and College of Radiographers, 207 Providence Square, Mill Street, London, UK
- J Matthew
- School of Biomedical Engineering and Imaging Sciences, King's College London, St Thomas' Hospital, London, UK
- E Skelton
- School of Biomedical Engineering and Imaging Sciences, King's College London, St Thomas' Hospital, London, UK; Department of Radiography, Division of Midwifery and Radiography, School of Health Sciences, City, University of London, London, UK
- N Woznitza
- University College London Hospitals, Bloomsbury, London, UK; School of Allied & Public Health Professions, Canterbury Christ Church University, Canterbury, UK
- K-Y Chu
- Department of Oncology, Oxford Institute for Radiation Oncology, University of Oxford, Oxford, UK; Radiotherapy Department, Churchill Hospital, Oxford University Hospitals NHS FT, Oxford, UK
- S Goodman
- The Society and College of Radiographers, 207 Providence Square, Mill Street, London, UK
- J McConnell
- C Hughes
- Ulster University, School of Health Sciences, Faculty of Life and Health Sciences, Shore Road, Newtownabbey, N. Ireland
- R Bond
- Ulster University, School of Computing, Faculty of Computing, Engineering and the Built Environment, Shore Road, Newtownabbey, N. Ireland
- C Malamateniou
- School of Biomedical Engineering and Imaging Sciences, King's College London, St Thomas' Hospital, London, UK; Department of Radiography, Division of Midwifery and Radiography, School of Health Sciences, City, University of London, London, UK
- S McFadden
- Ulster University, School of Health Sciences, Faculty of Life and Health Sciences, Shore Road, Newtownabbey, N. Ireland
27
McCradden MD, Anderson JA, Stephenson EA, Drysdale E, Erdman L, Goldenberg A, Zlotnik Shaul R. A Research Ethics Framework for the Clinical Translation of Healthcare Machine Learning. Am J Bioeth 2022; 22:8-22. [PMID: 35048782 DOI: 10.1080/15265161.2021.2013977]
Abstract
The application of artificial intelligence and machine learning (ML) technologies in healthcare has immense potential to improve the care of patients. While there are some emerging practices surrounding responsible ML as well as regulatory frameworks, the traditional role of research ethics oversight has been relatively unexplored regarding its relevance for clinical ML. In this paper, we provide a comprehensive research ethics framework that can apply to the systematic inquiry of ML research across its development cycle. The pathway consists of three stages: (1) exploratory, hypothesis-generating data access; (2) silent period evaluation; (3) prospective clinical evaluation. We connect each stage to its literature and ethical justification and suggest adaptations to traditional paradigms to suit ML while maintaining ethical rigor and the protection of individuals. This pathway can accommodate a multitude of research designs from observational to controlled trials, and the stages can apply individually to a variety of ML applications.
Affiliation(s)
- Melissa D McCradden
- Department of Bioethics, The Hospital for Sick Children
- Genetics and Genome Biology, The Hospital for Sick Children, Peter Gilgan Centre for Research and Learning
- Division of Clinical & Public Health, Dalla Lana School of Public Health
- James A Anderson
- Department of Bioethics, The Hospital for Sick Children
- Institute of Health Policy, Management and Evaluation, University of Toronto
- Elizabeth A Stephenson
- Labatt Family Heart Centre, The Hospital for Sick Children
- Department of Pediatrics, The Hospital for Sick Children
- Erik Drysdale
- Genetics and Genome Biology, The Hospital for Sick Children, Peter Gilgan Centre for Research and Learning
- Lauren Erdman
- Genetics and Genome Biology, The Hospital for Sick Children, Peter Gilgan Centre for Research and Learning
- Vector Institute
- Department of Computer Science, University of Toronto
- Anna Goldenberg
- Department of Bioethics, The Hospital for Sick Children
- Vector Institute
- Department of Computer Science, University of Toronto
- CIFAR
- Randi Zlotnik Shaul
- Department of Bioethics, The Hospital for Sick Children
- Department of Pediatrics, The Hospital for Sick Children
- Child Health Evaluative Sciences, The Hospital for Sick Children
28
AI and Clinical Decision Making: The Limitations and Risks of Computational Reductionism in Bowel Cancer Screening. Appl Sci (Basel) 2022. [DOI: 10.3390/app12073341]
Abstract
Advances in artificial intelligence in healthcare are frequently promoted as ‘solutions’ to improve the accuracy, safety, and quality of clinical decisions, treatments, and care. Despite some diagnostic success, however, AI systems rely on forms of reductive reasoning and computational determinism that embed problematic assumptions about clinical decision-making and clinical practice. Clinician autonomy, experience, and judgement are reduced to inputs and outputs framed as binary or multi-class classification problems benchmarked against a clinician’s capacity to identify or predict disease states. This paper examines this reductive reasoning in AI systems for colorectal cancer (CRC) to highlight their limitations and risks: (1) in AI systems themselves due to inherent biases in (a) retrospective training datasets and (b) embedded assumptions in underlying AI architectures and algorithms; (2) in the problematic and limited evaluations being conducted on AI systems prior to system integration in clinical practice; and (3) in marginalising socio-technical factors in the context-dependent interactions between clinicians, their patients, and the broader health system. The paper argues that to optimise benefits from AI systems and to avoid negative unintended consequences for clinical decision-making and patient care, there is a need for more nuanced and balanced approaches to AI system deployment and evaluation in CRC.
29
Brisk R, Bond RR, Finlay D, McLaughlin JAD, Piadlo AJ, McEneaney DJ. WaSP-ECG: A Wave Segmentation Pretraining Toolkit for Electrocardiogram Analysis. Front Physiol 2022; 13:760000. [PMID: 35399264 PMCID: PMC8993503 DOI: 10.3389/fphys.2022.760000]
Abstract
Introduction Representation learning allows artificial intelligence (AI) models to learn useful features from large, unlabelled datasets. This can reduce the need for labelled data across a range of downstream tasks. It was hypothesised that wave segmentation would be a useful form of electrocardiogram (ECG) representation learning. In addition to reducing labelled data requirements, segmentation masks may provide a mechanism for explainable AI. This study details the development and evaluation of a Wave Segmentation Pretraining (WaSP) application. Materials and Methods Pretraining: A non-AI-based ECG signal and image simulator was developed to generate ECGs and wave segmentation masks. U-Net models were trained to segment waves from synthetic ECGs. Dataset: The raw sample files from the PTB-XL dataset were downloaded. Each ECG was also plotted into an image. Fine-tuning and evaluation: A hold-out approach was used with a 60:20:20 training/validation/test set split. The encoder portions of the U-Net models were fine-tuned to classify PTB-XL ECGs for two tasks: sinus rhythm (SR) vs atrial fibrillation (AF), and myocardial infarction (MI) vs normal ECGs. The fine-tuning was repeated without pretraining and the results were compared. Explainable AI: An example pipeline combining AI-derived segmentation masks and a rule-based AF detector was developed and evaluated. Results WaSP consistently improved model performance on downstream tasks for both ECG signals and images. The difference between non-pretrained models and models pretrained for wave segmentation was particularly marked for ECG image analysis. A selection of segmentation masks is shown. An AF detection algorithm comprising both AI and rule-based components performed less well than end-to-end AI models, but its outputs are proposed to be highly explainable. An example output is shown. Conclusion WaSP using synthetic data and labels allows AI models to learn useful features for downstream ECG analysis with real-world data. Segmentation masks provide an intermediate output that may facilitate confidence calibration in the context of end-to-end AI. It is possible to combine AI-derived segmentation masks and rule-based diagnostic classifiers for explainable ECG analysis.
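The 60:20:20 hold-out split described in this abstract can be sketched as follows. This is an illustrative sketch only, not the WaSP-ECG code; the record count and the shuffling seed are invented.

```python
import numpy as np

# Shuffle hypothetical record indices, then cut 60% / 20% / 20%.
rng = np.random.default_rng(seed=0)
record_ids = np.arange(1000)        # placeholder ECG record indices
rng.shuffle(record_ids)

n = len(record_ids)
n_train = int(0.6 * n)              # 60% training
n_val = int(0.2 * n)                # 20% validation; the remainder is test

train_ids = record_ids[:n_train]
val_ids = record_ids[n_train:n_train + n_val]
test_ids = record_ids[n_train + n_val:]

print(len(train_ids), len(val_ids), len(test_ids))  # 600 200 200
```

Shuffling before slicing keeps the three sets disjoint while avoiding any ordering bias in the source files.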
Affiliation(s)
- Rob Brisk
- Faculty of Computing, Engineering and the Built Environment, Ulster University, Belfast, United Kingdom
- Cardiology Department, Craigavon Area Hospital, Craigavon, United Kingdom
- Raymond R. Bond
- Faculty of Computing, Engineering and the Built Environment, Ulster University, Belfast, United Kingdom
- Dewar Finlay
- Faculty of Computing, Engineering and the Built Environment, Ulster University, Belfast, United Kingdom
- James A. D. McLaughlin
- Faculty of Computing, Engineering and the Built Environment, Ulster University, Belfast, United Kingdom
- Alicja J. Piadlo
- Faculty of Computing, Engineering and the Built Environment, Ulster University, Belfast, United Kingdom
- Cardiology Department, Craigavon Area Hospital, Craigavon, United Kingdom
- David J. McEneaney
- Faculty of Computing, Engineering and the Built Environment, Ulster University, Belfast, United Kingdom
- Cardiology Department, Craigavon Area Hospital, Craigavon, United Kingdom
30
Retson TA, Hasenstab KA, Kligerman SJ, Jacobs KE, Yen AC, Brouha SS, Hahn LD, Hsiao A. Reader Perceptions and Impact of AI on CT Assessment of Air Trapping. Radiol Artif Intell 2022; 4:e210160. [PMID: 35391767 DOI: 10.1148/ryai.2021210160]
Abstract
Quantitative imaging measurements can be facilitated by artificial intelligence (AI) algorithms, but how they might impact decision-making and be perceived by radiologists remains uncertain. After creation of a dedicated inspiratory-expiratory CT examination and concurrent deployment of a quantitative AI algorithm for assessing air trapping, five cardiothoracic radiologists retrospectively evaluated severity of air trapping on 17 examination studies. Air trapping severity of each lobe was evaluated in three stages: qualitatively (visually); semiquantitatively, allowing manual region-of-interest measurements; and quantitatively, using results from an AI algorithm. Readers were surveyed on each case for their perceptions of the AI algorithm. The algorithm improved interreader agreement (intraclass correlation coefficients: visual, 0.28; semiquantitative, 0.40; quantitative, 0.84; P < .001) and improved correlation with pulmonary function testing (forced expiratory volume in 1 second-to-forced vital capacity ratio) (visual r = -0.26, semiquantitative r = -0.32, quantitative r = -0.44). Readers perceived moderate agreement with the AI algorithm (Likert scale average, 3.7 of 5), a mild impact on their final assessment (average, 2.6), and a neutral perception of overall utility (average, 3.5). Though the AI algorithm objectively improved interreader consistency and correlation with pulmonary function testing, individual readers did not immediately perceive this benefit, revealing a potential barrier to clinical adoption.
Affiliation(s)
- Tara A Retson, Kyle A Hasenstab, Seth J Kligerman, Kathleen E Jacobs, Andrew C Yen, Sharon S Brouha, Lewis D Hahn, Albert Hsiao
- Department of Radiology, University of California, San Diego, 9452 Medical Center Dr, 4th Floor, La Jolla, CA 92037 (T.A.R., S.J.K., K.E.J., A.C.Y., S.S.B., L.D.H., A.H.); and Department of Mathematics and Statistics, San Diego State University, San Diego, Calif (K.A.H.)
31
Diagnostic Accuracy of the Deep Learning Model for the Detection of ST Elevation Myocardial Infarction on Electrocardiogram. J Pers Med 2022; 12:336. [PMID: 35330336 PMCID: PMC8956114 DOI: 10.3390/jpm12030336]
Abstract
We aimed to measure the diagnostic accuracy of a deep learning model (DLM) for ST-elevation myocardial infarction (STEMI) on a 12-lead electrocardiogram (ECG) according to culprit artery. From January 2017 to December 2019, we recruited patients with STEMI who received more than one stent insertion for culprit artery occlusion. The DLM was trained with STEMI and normal sinus rhythm ECGs for external validation. The primary outcome was the diagnostic accuracy of the DLM for STEMI according to the three different culprit arteries. The outcomes were measured using the area under the receiver operating characteristic curve (AUROC), sensitivity (SEN), and specificity (SPE) using the Youden index. A total of 60,157 ECGs were obtained, comprising 117 STEMI ECGs and 60,040 normal sinus rhythm ECGs. The DLM achieved an AUROC for overall STEMI of 0.998 (0.996-0.999) with SEN 97.4% (95.7-100) and SPE 99.2% (98.1-99.4). There were no significant differences in diagnostic accuracy among the three culprit arteries. Baseline wander in false-positive cases (83.7%, 345/412) significantly interfered with accurate interpretation of ST elevation on the ECG. The DLM showed high diagnostic accuracy for STEMI detection regardless of the type of culprit artery, although baseline wander in the ECGs could lead the DLM to misinterpret them.
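The Youden index mentioned in this abstract (J = sensitivity + specificity - 1) is a standard way to pick a single SEN/SPE operating point from a classifier's scores. A minimal sketch with toy scores and labels follows; this is not the study's data, model, or code.

```python
import numpy as np

def youden_threshold(scores, labels):
    """Return the score threshold maximising J = SEN + SPE - 1."""
    best_t, best_j = None, -1.0
    for t in np.unique(scores):
        pred = scores >= t                    # classify as positive at this cutoff
        tp = np.sum(pred & (labels == 1))
        fn = np.sum(~pred & (labels == 1))
        tn = np.sum(~pred & (labels == 0))
        fp = np.sum(pred & (labels == 0))
        sen = tp / (tp + fn)                  # sensitivity (recall on positives)
        spe = tn / (tn + fp)                  # specificity (recall on negatives)
        j = sen + spe - 1.0
        if j > best_j:
            best_j, best_t = j, t
    return best_t, best_j

# Toy model outputs: higher score means more STEMI-like.
scores = np.array([0.1, 0.2, 0.35, 0.4, 0.8, 0.9])
labels = np.array([0, 0, 0, 1, 1, 1])
t, j = youden_threshold(scores, labels)
print(t, j)  # 0.4 1.0
```

On real, overlapping score distributions J is below 1, and the chosen threshold trades sensitivity against specificity at the point farthest above the ROC diagonal.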
32
Buck C, Doctor E, Hennrich J, Jöhnk J, Eymann T. General Practitioners' Attitudes Toward Artificial Intelligence-Enabled Systems: Interview Study. J Med Internet Res 2022; 24:e28916. [PMID: 35084342 PMCID: PMC8832268 DOI: 10.2196/28916]
Abstract
Background General practitioners (GPs) care for a large number of patients with various diseases in very short timeframes under high uncertainty. Thus, systems enabled by artificial intelligence (AI) are promising and time-saving solutions that may increase the quality of care. Objective This study aims to understand GPs' attitudes toward AI-enabled systems in medical diagnosis. Methods We interviewed 18 GPs from Germany between March 2020 and May 2020 to identify determinants of GPs' attitudes toward AI-based systems in diagnosis. By analyzing the interview transcripts, we identified 307 open codes, which we then further structured to derive relevant attitude determinants. Results We merged the open codes into 21 concepts and finally into five categories: concerns, expectations, environmental influences, individual characteristics, and minimum requirements of AI-enabled systems. Concerns included all doubts and fears of the participants regarding AI-enabled systems. Expectations reflected GPs' thoughts and beliefs about expected benefits and limitations of AI-enabled systems in terms of GP care. Environmental influences included influences resulting from an evolving working environment, key stakeholders' perspectives and opinions, the available information technology hardware and software resources, and the media environment. Individual characteristics were determinants that describe a physician as a person, including character traits, demographic characteristics, and knowledge. In addition, the interviews also revealed the minimum requirements of AI-enabled systems, which were preconditions that must be met for GPs to contemplate using AI-enabled systems. Moreover, we identified relationships among these categories, which we conflate in our proposed model. Conclusions This study provides a thorough understanding of the perspective of future users of AI-enabled systems in primary care and lays the foundation for successful market penetration. We contribute to the research stream of analyzing and designing AI-enabled systems and the literature on attitudes toward technology and practice by fostering the understanding of GPs and their attitudes toward such systems. Our findings provide relevant information to technology developers, policymakers, and stakeholder institutions of GP care.
Affiliation(s)
- Christoph Buck
- Department of Business & Information Systems Engineering, University of Bayreuth, Bayreuth, Germany
- Centre for Future Enterprise, Queensland University of Technology, Brisbane, Australia
- Eileen Doctor
- Project Group Business & Information Systems Engineering, Fraunhofer Institute for Applied Information Technology, Bayreuth, Germany
- Jasmin Hennrich
- Project Group Business & Information Systems Engineering, Fraunhofer Institute for Applied Information Technology, Bayreuth, Germany
- Jan Jöhnk
- Finance & Information Management Research Center, Bayreuth, Germany
- Torsten Eymann
- Department of Business & Information Systems Engineering, University of Bayreuth, Bayreuth, Germany
- Finance & Information Management Research Center, Bayreuth, Germany
33
Oliva A, Grassi S, Vetrugno G, Rossi R, Della Morte G, Pinchi V, Caputo M. Management of Medico-Legal Risks in Digital Health Era: A Scoping Review. Front Med (Lausanne) 2022; 8:821756. [PMID: 35087854 PMCID: PMC8787306 DOI: 10.3389/fmed.2021.821756]
Abstract
Artificial intelligence needs big data to develop reliable predictions. Therefore, storing and processing health data is essential for the new diagnostic and decisional technologies but, at the same time, represents a risk for privacy protection. This scoping review is aimed at underlining the medico-legal and ethical implications of the main artificial intelligence applications to healthcare, also focusing on the issues of the COVID-19 era. Starting from a summary of the United States (US) and European Union (EU) regulatory frameworks, the current medico-legal and ethical challenges are discussed in general terms before focusing on the specific issues regarding informed consent, medical malpractice/cognitive biases, automation and interconnectedness of medical devices, diagnostic algorithms and telemedicine. We aim to underline that educating physicians on the management of this (new) kind of clinical risk can enhance compliance with regulations and avoid legal risks for healthcare professionals and institutions.
Affiliation(s)
- Antonio Oliva
- Legal Medicine, Department of Health Surveillance and Bioethics, Università Cattolica del Sacro Cuore, Rome, Italy
- Simone Grassi
- Legal Medicine, Department of Health Surveillance and Bioethics, Università Cattolica del Sacro Cuore, Rome, Italy
- Giuseppe Vetrugno
- Legal Medicine, Department of Health Surveillance and Bioethics, Università Cattolica del Sacro Cuore, Rome, Italy
- Risk Management Unit, Fondazione Policlinico A. Gemelli Istituto di Ricovero e Cura a Carattere Scientifico, Rome, Italy
- Riccardo Rossi
- Legal Medicine, Department of Health Surveillance and Bioethics, Università Cattolica del Sacro Cuore, Rome, Italy
- Gabriele Della Morte
- International Law, Institute of International Studies, Università Cattolica del Sacro Cuore, Milan, Italy
- Vilma Pinchi
- Department of Health Sciences, Section of Forensic Medical Sciences, University of Florence, Florence, Italy
- Matteo Caputo
- Criminal Law, Department of Juridical Science, Università Cattolica del Sacro Cuore, Milan, Italy
34
Ehrmann D, Harish V, Morgado F, Rosella L, Johnson A, Mema B, Mazwi M. Ignorance Isn't Bliss: We Must Close the Machine Learning Knowledge Gap in Pediatric Critical Care. Front Pediatr 2022; 10:864755. [PMID: 35620143] [PMCID: PMC9127438] [DOI: 10.3389/fped.2022.864755]
Abstract
Pediatric intensivists are bombarded with more patient data than ever before. Integration and interpretation of data from patient monitors and the electronic health record (EHR) can be cognitively expensive in a manner that results in delayed or suboptimal medical decision making and patient harm. Machine learning (ML) can be used to facilitate insights from healthcare data and has been successfully applied to pediatric critical care data with that intent. However, many pediatric critical care medicine (PCCM) trainees and clinicians lack an understanding of foundational ML principles. This presents a major problem for the field. We outline the reasons why in this perspective and provide a roadmap for competency-based ML education for PCCM trainees and other stakeholders.
Affiliation(s)
- Daniel Ehrmann
- Department of Critical Care Medicine, Hospital for Sick Children, Toronto, ON, Canada; Temerty Centre for Artificial Intelligence Research and Education in Medicine, University of Toronto, Toronto, ON, Canada
- Vinyas Harish
- Temerty Centre for Artificial Intelligence Research and Education in Medicine, University of Toronto, Toronto, ON, Canada; MD/PhD Program, Temerty Faculty of Medicine, University of Toronto, Toronto, ON, Canada; Institute for Health Policy, Management and Evaluation, Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada
- Felipe Morgado
- Temerty Centre for Artificial Intelligence Research and Education in Medicine, University of Toronto, Toronto, ON, Canada; MD/PhD Program, Temerty Faculty of Medicine, University of Toronto, Toronto, ON, Canada; Department of Medical Biophysics, Temerty Faculty of Medicine, University of Toronto, Toronto, ON, Canada
- Laura Rosella
- MD/PhD Program, Temerty Faculty of Medicine, University of Toronto, Toronto, ON, Canada; Institute for Health Policy, Management and Evaluation, Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada
- Alistair Johnson
- MD/PhD Program, Temerty Faculty of Medicine, University of Toronto, Toronto, ON, Canada; Program in Child Health Evaluative Sciences, The Hospital for Sick Children, Toronto, ON, Canada
- Briseida Mema
- Department of Critical Care Medicine, Hospital for Sick Children, Toronto, ON, Canada
- Mjaye Mazwi
- Department of Critical Care Medicine, Hospital for Sick Children, Toronto, ON, Canada; MD/PhD Program, Temerty Faculty of Medicine, University of Toronto, Toronto, ON, Canada
35
Hasani N, Farhadi F, Morris MA, Nikpanah M, Rhamim A, Xu Y, Pariser A, Collins MT, Summers RM, Jones E, Siegel E, Saboury B. Artificial Intelligence in Medical Imaging and its Impact on the Rare Disease Community: Threats, Challenges and Opportunities. PET Clin 2021; 17:13-29. [PMID: 34809862] [DOI: 10.1016/j.cpet.2021.09.009]
Abstract
Almost 1 in 10 individuals suffers from one of many rare diseases (RDs). The average time to diagnosis for an RD patient is as high as 7 years. Artificial intelligence (AI)-based positron emission tomography (PET), if implemented appropriately, has tremendous potential to advance the diagnosis of RDs. Patient advocacy groups must be active stakeholders in the AI ecosystem if we are to avoid potential issues related to the implementation of AI into health care. AI medical devices must not only be RD-aware at each stage of their conceptualization and life cycle but should also be trained on diverse and augmented datasets representative of the end-user population, including RDs. Failure to do so leads to potential harm and unsustainable deployment of AI-based medical devices (AIMDs) into clinical practice.
Affiliation(s)
- Navid Hasani
- Department of Radiology and Imaging Sciences, Clinical Center, National Institutes of Health, 9000 Rockville Pike, Building 10, Room 1C455, Bethesda, MD 20892, USA; University of Queensland Faculty of Medicine, Ochsner Clinical School, New Orleans, LA 70121, USA
- Faraz Farhadi
- Department of Radiology and Imaging Sciences, Clinical Center, National Institutes of Health, 9000 Rockville Pike, Building 10, Room 1C455, Bethesda, MD 20892, USA
- Michael A Morris
- Department of Radiology and Imaging Sciences, Clinical Center, National Institutes of Health, 9000 Rockville Pike, Building 10, Room 1C455, Bethesda, MD 20892, USA; Department of Computer Science and Electrical Engineering, University of Maryland-Baltimore County, Baltimore, MD, USA
- Moozhan Nikpanah
- Department of Radiology and Imaging Sciences, Clinical Center, National Institutes of Health, 9000 Rockville Pike, Building 10, Room 1C455, Bethesda, MD 20892, USA
- Arman Rhamim
- Department of Radiology, BC Cancer Research Institute, University of British Columbia, 675 West 10th Avenue, Vancouver, British Columbia, V5Z 1L3, Canada; Department of Physics, BC Cancer Research Institute, University of British Columbia, Vancouver, British Columbia, Canada
- Yanji Xu
- Office of Rare Diseases Research, National Center for Advancing Translational Sciences, National Institutes of Health (NIH), Bethesda, MD 20892, USA
- Anne Pariser
- Office of Rare Diseases Research, National Center for Advancing Translational Sciences, National Institutes of Health (NIH), Bethesda, MD 20892, USA
- Michael T Collins
- Skeletal Disorders and Mineral Homeostasis Section, National Institute of Dental and Craniofacial Research, National Institutes of Health (NIH), Bethesda, MD, USA
- Ronald M Summers
- Department of Radiology and Imaging Sciences, Clinical Center, National Institutes of Health, 9000 Rockville Pike, Building 10, Room 1C455, Bethesda, MD 20892, USA
- Elizabeth Jones
- Department of Radiology and Imaging Sciences, Clinical Center, National Institutes of Health, 9000 Rockville Pike, Building 10, Room 1C455, Bethesda, MD 20892, USA
- Eliot Siegel
- Department of Radiology and Nuclear Medicine, University of Maryland Medical Center, 655 W. Baltimore Street, Baltimore, MD 21201, USA
- Babak Saboury
- Department of Radiology and Imaging Sciences, Clinical Center, National Institutes of Health, 9000 Rockville Pike, Building 10, Room 1C455, Bethesda, MD 20892, USA; Department of Computer Science and Electrical Engineering, University of Maryland-Baltimore County, Baltimore, MD, USA; Department of Radiology, Hospital of the University of Pennsylvania, Philadelphia, PA, USA
36
Bond R, Finlay D, Al-Zaiti SS, Macfarlane P. Machine learning with electrocardiograms: A call for guidelines and best practices for 'stress testing' algorithms. J Electrocardiol 2021; 69S:1-6. [PMID: 34340817] [DOI: 10.1016/j.jelectrocard.2021.07.003]
Abstract
This paper briefly describes how computer programs automatically interpret electrocardiograms (ECGs) and discusses new opportunities. The algorithms typically used in hospitals today are knowledge-engineered: a computer programmer manually writes code and logical statements that are then used to deduce a possible diagnosis. The programmer's code represents the criteria and knowledge clinicians use when reading ECGs. This contrasts with supervised machine learning (ML) approaches, which use large, labelled ECG datasets to induce their own 'rules' for automatically classifying ECGs. Although many ML techniques exist, deep neural networks are increasingly explored as ECG classification algorithms when trained on large ECG datasets. Whilst this paper presents some of the pros and cons of each approach, there may be opportunities to develop hybridised algorithms that combine knowledge-driven and data-driven techniques. The paper points out that open ECG data can dramatically influence what international ECG ML researchers focus on and that, ideally, open datasets should align with real-world clinical challenges. It also outlines some of the pitfalls and opportunities for ML with ECGs. A potential opportunity for the ECG community is to provide guidelines to help guide ECG ML practices: whilst general ML guidelines exist, there is a need to recommend approaches for 'stress testing' and evaluating ML algorithms for ECG analysis, e.g. testing an algorithm with noisy ECGs and ECGs acquired with common lead and electrode misplacements. The paper provides a primer on ECG ML and discusses some of the key challenges and opportunities.
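The knowledge-engineered approach described in this abstract can be illustrated with a toy sketch: explicit hand-written rules standing in for clinician criteria, in contrast to rules induced from data by an ML model. All feature names and thresholds below are hypothetical illustrations, not actual clinical criteria or the logic of any deployed interpreter.

```python
# A toy knowledge-engineered ECG interpreter: a programmer encodes
# simplified clinician criteria as explicit rules. Illustrative only.

def rule_based_diagnosis(features):
    """Apply hand-written rules to a dict of (hypothetical) ECG features."""
    # Irregular rhythm without P waves suggests atrial fibrillation.
    if features["rhythm_irregular"] and not features["p_wave_present"]:
        return "atrial fibrillation"
    # Marked ST elevation suggests STEMI (oversimplified single cutoff).
    if features["st_elevation_mm"] >= 2.0:
        return "possible STEMI"
    # Slow rate with otherwise normal findings suggests sinus bradycardia.
    if features["heart_rate_bpm"] < 60:
        return "sinus bradycardia"
    return "normal"

print(rule_based_diagnosis({
    "rhythm_irregular": True, "p_wave_present": False,
    "st_elevation_mm": 0.5, "heart_rate_bpm": 110,
}))  # -> atrial fibrillation
```

A supervised ML model would instead learn such decision boundaries from labelled examples, which is why the paper's call for 'stress testing' (noisy signals, lead misplacement) applies to both the hand-written thresholds and the learned ones.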
Affiliation(s)
- Raymond Bond
- Faculty of Computing, Engineering and the Built Environment, Ulster University, Jordanstown Campus, Northern Ireland, UK
- Dewar Finlay
- Faculty of Computing, Engineering and the Built Environment, Ulster University, Jordanstown Campus, Northern Ireland, UK
- Peter Macfarlane
- Institute of Health and Wellbeing, University of Glasgow, Glasgow, Scotland, UK
37
Ronzio L, Campagner A, Cabitza F, Gensini GF. Unity Is Intelligence: A Collective Intelligence Experiment on ECG Reading to Improve Diagnostic Performance in Cardiology. J Intell 2021; 9:17. [PMID: 33915991] [PMCID: PMC8167709] [DOI: 10.3390/jintelligence9020017]
Abstract
Medical errors have a huge impact on clinical practice in terms of economic and human costs. As a result, technology-based solutions, such as those grounded in artificial intelligence (AI) or collective intelligence (CI), have attracted increasing interest as a means of reducing error rates and their impacts. Previous studies have shown that combining individual opinions based on rules, weighting mechanisms, or other CI solutions can improve diagnostic accuracy with respect to individual doctors. We conducted a study to investigate the potential of this approach in cardiology, specifically in electrocardiogram (ECG) reading. To achieve this aim, we designed and conducted an experiment involving medical students, recent graduates, and residents, who were asked to annotate a collection of 10 ECGs of varying complexity and difficulty. For each ECG, we considered groups of increasing size (from three to 30 members) and applied three different CI protocols. In all cases, the results showed a statistically significant improvement (ranging from 9% to 88%) in diagnostic accuracy compared to the performance of individual readers; this difference held not only for large groups but also for smaller ones. In light of these results, we conclude that CI approaches can support the tasks mentioned above, and possibly other similar ones as well. We discuss the implications of applying CI solutions to clinical settings, such as augmented 'second opinions' and decision-making.
Affiliation(s)
- Luca Ronzio
- Dipartimento di Informatica, Sistemistica e Comunicazione, University of Milano-Bicocca, Viale Sarca 336, 20126 Milan, Italy
- Andrea Campagner
- Dipartimento di Informatica, Sistemistica e Comunicazione, University of Milano-Bicocca, Viale Sarca 336, 20126 Milan, Italy
- Federico Cabitza
- Dipartimento di Informatica, Sistemistica e Comunicazione, University of Milano-Bicocca, Viale Sarca 336, 20126 Milan, Italy
38
Felmingham CM, Adler NR, Ge Z, Morton RL, Janda M, Mar VJ. The Importance of Incorporating Human Factors in the Design and Implementation of Artificial Intelligence for Skin Cancer Diagnosis in the Real World. Am J Clin Dermatol 2021; 22:233-242. [PMID: 33354741] [DOI: 10.1007/s40257-020-00574-4]
Abstract
Artificial intelligence (AI) algorithms have been shown to diagnose skin lesions with impressive accuracy in experimental settings. The majority of the literature to date has compared AI and dermatologists as opponents in skin cancer diagnosis. However, in the real-world clinical setting, the clinician will work in collaboration with AI. Existing evidence regarding the integration of such AI diagnostic tools into clinical practice is limited. Human factors, such as cognitive style, personality, experience, preferences, and attitudes may influence clinicians' use of AI. In this review, we consider these human factors and the potential cognitive errors, biases, and unintended consequences that could arise when using an AI skin cancer diagnostic tool in the real world. Integrating this knowledge in the design and implementation of AI technology will assist in ensuring that the end product can be used effectively. Dermatologist leadership in the development of these tools will further improve their clinical relevance and safety.
Affiliation(s)
- Claire M Felmingham
- School of Public Health and Preventive Medicine, Monash University, Melbourne, VIC, Australia
- Victorian Melanoma Service, Alfred Hospital, 55 Commercial Road, Melbourne, VIC, 3004, Australia
- Nikki R Adler
- School of Public Health and Preventive Medicine, Monash University, Melbourne, VIC, Australia
- Zongyuan Ge
- Monash eResearch Centre, Monash University, Clayton, Australia
- Department of Electrical and Computer Systems Engineering, Faculty of Engineering, Monash University, Melbourne, VIC, Australia
- Monash-Airdoc Research Centre, Monash University, Melbourne, VIC, Australia
- Rachael L Morton
- NHMRC Clinical Trials Centre, Faculty of Medicine and Health, University of Sydney, Camperdown, NSW, Australia
- Monika Janda
- Centre for Health Services Research, Faculty of Medicine, The University of Queensland, Brisbane, QLD, Australia
- Victoria J Mar
- School of Public Health and Preventive Medicine, Monash University, Melbourne, VIC, Australia
- Victorian Melanoma Service, Alfred Hospital, 55 Commercial Road, Melbourne, VIC, 3004, Australia
39
Schmidt HG, Mamede S. How cognitive psychology changed the face of medical education research. Adv Health Sci Educ Theory Pract 2020; 25:1025-1043. [PMID: 33244724] [PMCID: PMC7704490] [DOI: 10.1007/s10459-020-10011-0]
Abstract
In this article, the contributions of cognitive psychology to research and development in medical education are assessed. The cognitive psychology of learning consists of activating prior knowledge while processing new information and elaborating on the resulting new knowledge to facilitate storage in long-term memory. This process is limited by the size of working memory. Six interventions based on cognitive theory that facilitate learning and expertise development are discussed: (1) fostering self-explanation, (2) elaborative discussion, (3) distributed practice, (4) decreasing cognitive load, (5) promoting retrieval practice, and (6) supporting interleaved practice. These interventions contribute in different measure to various instructional methods in use in medical education: problem-based learning, team-based learning, worked examples, mixed practice, serial-cue presentation, and deliberate reflection. The article concludes that systematic research into the applicability of these ideas to the practice of medical education is presently limited and should be intensified.
Affiliation(s)
- Henk G Schmidt
- Department of Psychology, Erasmus University, P.O. Box 1738, 3000 DR Rotterdam, the Netherlands
- Silvia Mamede
- Department of Psychology, Erasmus University, P.O. Box 1738, 3000 DR Rotterdam, the Netherlands
40
Bhat A, Podstawczyk D, Walther BK, Aggas JR, Machado-Aranda D, Ward KR, Guiseppi-Elie A. Toward a hemorrhagic trauma severity score: fusing five physiological biomarkers. J Transl Med 2020; 18:348. [PMID: 32928219] [PMCID: PMC7490913] [DOI: 10.1186/s12967-020-02516-4]
Abstract
BACKGROUND To introduce the Hemorrhage Intensive Severity and Survivability (HISS) score, based on the fusion of multi-biomarker data (glucose, lactate, pH, potassium, and oxygen tension), to serve as a patient-specific attribute in hemorrhagic trauma. MATERIALS AND METHODS One hundred instances of Sensible Fictitious Rationalized Patient (SFRP) data were synthetically generated, and the HISS score was assigned by five clinically active physician experts (100 [5]). The HISS score stratifies the criticality of the trauma patient as low (0), guarded (1), elevated (2), high (3), or severe (4). Four standard classifier algorithms were evaluated for their potential to similarly classify and predict a HISS score: linear support vector machine (SVM-L), multi-class ensemble bagged decision tree (EBDT), artificial neural network with Bayesian regularization (ANN:BR), and possibility rule-based classification using function approximation (PRBF). RESULTS SVM-L, EBDT, ANN:BR, and PRBF generated score predictions with testing accuracies (majority vote) of 0.91 ± 0.06, 0.93 ± 0.04, 0.92 ± 0.07, and 0.92 ± 0.03, respectively, with no statistically significant difference (p > 0.05). Targeted accuracies of 0.99 and 0.999 could be achieved with SFRP data sizes and clinical expert scores of 147[7] (0.99) and 154[9] (0.999), respectively. CONCLUSIONS The predictions of the data-driven model, in conjunction with an adjunct multi-analyte biosensor intended for point-of-care continual monitoring of trauma patients, can aid patient stratification and triage decision-making.
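The majority-vote fusion behind the reported testing accuracies can be sketched as follows. The classifier outputs and patient labels below are synthetic stand-ins, not the study's SFRP data: the point is only that fusing several imperfect classifiers by modal vote can outperform each one individually.

```python
from collections import Counter

def majority_vote(predictions):
    """Fuse one label per classifier into a single modal label."""
    return Counter(predictions).most_common(1)[0][0]

def accuracy(y_true, y_pred):
    """Fraction of predictions matching the reference labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Synthetic HISS-style scores (0-4) for five patients from three classifiers.
y_true = [0, 1, 2, 3, 4]
preds = [
    [0, 1, 2, 3, 3],  # classifier A: 0.8 accuracy alone
    [0, 1, 1, 3, 4],  # classifier B: 0.8 accuracy alone
    [0, 2, 2, 3, 4],  # classifier C: 0.8 accuracy alone
]
fused = [majority_vote(patient_preds) for patient_preds in zip(*preds)]
print(fused, accuracy(y_true, fused))  # the fused vote recovers all five labels
```

Here each classifier errs on a different patient, so the vote corrects all three errors; in practice the gain depends on the classifiers making sufficiently uncorrelated mistakes.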
Affiliation(s)
- Ankita Bhat
- Center for Bioelectronics, Biosensors and Biochips (C3B®), Department of Biomedical Engineering, Texas A&M University, College Station, TX 77843 USA
- Daria Podstawczyk
- Department of Process Engineering and Technology of Polymer and Carbon Materials, Wroclaw University of Science and Technology, Norwida 4/6, 50-373 Wroclaw, Poland
- Brandon K. Walther
- Center for Bioelectronics, Biosensors and Biochips (C3B®), Department of Biomedical Engineering, Texas A&M University, College Station, TX 77843 USA
- Department of Cardiovascular Sciences, Houston Methodist Institute for Academic Medicine and Houston Methodist Research Institute, 6670 Bertner Ave, Houston, TX 77030 USA
- John R. Aggas
- Center for Bioelectronics, Biosensors and Biochips (C3B®), Department of Biomedical Engineering, Texas A&M University, College Station, TX 77843 USA
- David Machado-Aranda
- Departments of Emergency Medicine and Biomedical Engineering, Michigan Center for Integrative Research in Critical Care, University of Michigan, Ann Arbor, MI 48109 USA
- Department of Surgery, Division of Acute Care Surgery, University of Michigan, Ann Arbor, MI 48109 USA
- Kevin R. Ward
- Department of Surgery, Division of Acute Care Surgery, University of Michigan, Ann Arbor, MI 48109 USA
- Anthony Guiseppi-Elie
- Center for Bioelectronics, Biosensors and Biochips (C3B®), Department of Biomedical Engineering, Texas A&M University, College Station, TX 77843 USA
- Department of Cardiovascular Sciences, Houston Methodist Institute for Academic Medicine and Houston Methodist Research Institute, 6670 Bertner Ave, Houston, TX 77030 USA
- Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX 77843 USA
- ABTECH Scientific, Inc, Biotechnology Research Park, 800 East Leigh Street, Richmond, VA 23219 USA
41
Knoery CR, Heaton J, Polson R, Bond R, Iftikhar A, Rjoob K, McGilligan V, Peace A, Leslie SJ. Systematic Review of Clinical Decision Support Systems for Prehospital Acute Coronary Syndrome Identification. Crit Pathw Cardiol 2020; 19:119-125. [PMID: 32209826] [PMCID: PMC7386869] [DOI: 10.1097/hpc.0000000000000217]
Abstract
OBJECTIVES Timely prehospital diagnosis and treatment of acute coronary syndrome (ACS) are required to achieve optimal outcomes. Clinical decision support systems (CDSS) are platforms designed to integrate multiple data sources and can aid management decisions in the prehospital environment. The aim of this review was to describe the accuracy of CDSS and of their individual components in prehospital ACS management. METHODS This systematic review examined the current literature on the accuracy of CDSS for ACS in the prehospital setting, the influence of computer-aided decision-making, and of four components: electrocardiogram, biomarkers, patient history, and examination findings. The impact of these components on sensitivity, specificity, and positive and negative predictive values was assessed. RESULTS A total of 11,439 articles were identified from a database search, of which 199 were screened against the eligibility criteria. Eight studies met the eligibility and quality criteria. Marked heterogeneity between studies precluded formal meta-analysis. However, analysis of individual components found that patient history led to significant improvements in sensitivity and negative predictive value. CDSS that incorporated all four components tended to show higher sensitivities and negative predictive values, while CDSS incorporating computer-aided electrocardiogram diagnosis showed higher specificities and positive predictive values. CONCLUSIONS Although heterogeneity precluded meta-analysis, this review emphasizes the potential of prehospital ACS CDSS that incorporate patient history alongside multiple integrated components. The higher sensitivity of certain components, along with the higher specificity of computer-aided decision-making, highlights the opportunity to develop an integrated algorithm with computer-aided decision support.
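The four metrics this review compares all derive from a standard 2x2 confusion matrix. A minimal sketch, using hypothetical prehospital triage counts rather than any figures from the reviewed studies:

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV, and NPV from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }

# Hypothetical counts: 100 true ACS cases among 1000 prehospital patients.
m = diagnostic_metrics(tp=90, fp=40, fn=10, tn=860)
print({k: round(v, 3) for k, v in m.items()})
```

Note why the review's two findings can coexist: adding patient history raises sensitivity and NPV (fewer missed cases), while computer-aided ECG diagnosis raises specificity and PPV (fewer false alarms), since the pairs depend on different cells of the table.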
Affiliation(s)
- Charles Richard Knoery
- Division of Rural Health and Wellbeing, University of the Highlands and Islands, Centre for Health Science, Inverness, United Kingdom
- Cardiac Unit, NHS Highland, Inverness, United Kingdom
- Janet Heaton
- Division of Rural Health and Wellbeing, University of the Highlands and Islands, Centre for Health Science, Inverness, United Kingdom
- Rob Polson
- Highland Health Sciences Library, University of the Highlands and Islands, Centre for Health Science, Inverness, United Kingdom
- Raymond Bond
- Ulster University, Jordanstown Campus, Newtownabbey, Northern Ireland, United Kingdom
- Aleeha Iftikhar
- Ulster University, Jordanstown Campus, Newtownabbey, Northern Ireland, United Kingdom
- Khaled Rjoob
- Ulster University, Jordanstown Campus, Newtownabbey, Northern Ireland, United Kingdom
- Victoria McGilligan
- Centre for Personalised Medicine, Ulster University, Londonderry, Northern Ireland, United Kingdom
- Aaron Peace
- Centre for Personalised Medicine, Ulster University, Londonderry, Northern Ireland, United Kingdom
- Altnagelvin Cardiology Department, Altnagelvin Hospital, Northern Ireland, United Kingdom
- Stephen James Leslie
- Division of Rural Health and Wellbeing, University of the Highlands and Islands, Centre for Health Science, Inverness, United Kingdom
- Cardiac Unit, NHS Highland, Inverness, United Kingdom
42
Knoery CR, Bond R, Iftikhar A, Rjoob K, McGilligan V, Peace A, Heaton J, Leslie SJ. SPICED-ACS: Study of the potential impact of a computer-generated ECG diagnostic algorithmic certainty index in STEMI diagnosis: Towards transparent AI. J Electrocardiol 2019; 57S:S86-S91. [PMID: 31472927] [DOI: 10.1016/j.jelectrocard.2019.08.006]
Abstract
BACKGROUND Computerised electrocardiogram (ECG) interpretation algorithms have been developed to guide clinical decisions in conditions such as ST-segment elevation myocardial infarction (STEMI), where time to decision is critical. These computer-generated diagnoses have been shown to strongly influence the clinician's final ECG diagnosis, an effect often called automation bias. However, the computerised diagnosis may be inaccurate and could result in wrong or delayed treatment and harm to the patient. We hypothesise that an algorithmic certainty index presented alongside a computer-generated diagnosis might mitigate automation bias. The impact of reporting a certainty index on the final diagnosis is not known. PURPOSE To ascertain whether knowledge of the computer-generated ECG algorithmic certainty index influences operator diagnostic accuracy. METHODOLOGY Clinicians who regularly analyse ECGs, such as cardiology or acute care doctors, cardiac nurses, and ambulance staff, were invited to complete an anonymous online survey between March and April 2019. The survey presented 36 ECGs, each with a clinical vignette of typical chest pain, that were either STEMI, normal, or borderline (not meeting STEMI criteria), along with an artificially created certainty index that was high, medium, low, or absent. Participants were asked whether the ECG showed a STEMI and to rate their confidence in the diagnosis. The primary outcomes were whether a computer-generated certainty index influenced interpreters' diagnostic decisions and improved their diagnostic accuracy. Secondary outcomes were the influence of the certainty index across different types of clinicians and on users' confidence in their own diagnoses. RESULTS A total of 91 participants undertook the survey and submitted 3262 ECG interpretations, of which 75% were correct. The presence of a certainty index significantly increased the odds of a correct ECG interpretation (OR 1.063, 95% CI 1.022-1.106, p = 0.004), but there was no significant difference between a correct and an incorrect certainty index (OR 1.028, 95% CI 0.923-1.145, p = 0.615). There was a trend for a low certainty index to increase the odds ratio compared to no certainty index (OR 1.153, 95% CI 0.898-1.482, p = 0.264), but a high certainty index significantly decreased the odds of a correct ECG interpretation (OR 0.492, 95% CI 0.391-0.619, p < 0.001). Neither the presence of a certainty index (p = 0.528) nor its correctness (p = 0.812) affected interpreters' confidence in their ECG interpretation. CONCLUSIONS Our results show that the presence of an ECG certainty index improves users' ECG interpretation accuracy. This effect is not uniform across confidence levels: interpretation success was reduced with a high certainty index but trended upward with a low certainty index. This suggests that a certainty index improves interpretation when there is an increased element of doubt, possibly forcing the ECG reader to spend more time and effort analysing the ECG. Further research is needed on the time spent analysing differing certainty indices with alternative ECG diagnoses.
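Odds ratios like those reported in this abstract can be computed from a 2x2 table of correct versus incorrect interpretations with and without a certainty index. The counts below are hypothetical, not the SPICED-ACS data, and the interval is a standard Wald 95% CI on the log odds ratio.

```python
import math

def odds_ratio_ci(a, b, c, d):
    """Odds ratio with a 95% Wald CI from a 2x2 table:
    a = correct with index,    b = incorrect with index,
    c = correct without index, d = incorrect without index."""
    or_ = (a * d) / (b * c)
    se_log_or = math.sqrt(1/a + 1/b + 1/c + 1/d)  # SE of log(OR)
    lo = or_ * math.exp(-1.96 * se_log_or)
    hi = or_ * math.exp(1.96 * se_log_or)
    return or_, lo, hi

# Hypothetical counts, for illustration only (not the study's data).
or_, lo, hi = odds_ratio_ci(a=800, b=200, c=750, d=250)
print(f"OR {or_:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```

An OR above 1 with a CI excluding 1, as in the primary result above, indicates the index group had significantly higher odds of a correct interpretation.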
Affiliation(s)
- C R Knoery
- Division of Rural Health and Wellbeing, University of Highlands and Islands, Inverness IV2 3JH, UK; Cardiology Department, Altnagelvin Hospital, Londonderry BT47 6SB, Northern Ireland, UK
- R Bond
- Ulster University, Jordanstown Campus, Shore Rd, Newtownabbey BT37 0QB, Northern Ireland, UK
- A Iftikhar
- Ulster University, Jordanstown Campus, Shore Rd, Newtownabbey BT37 0QB, Northern Ireland, UK
- K Rjoob
- Ulster University, Jordanstown Campus, Shore Rd, Newtownabbey BT37 0QB, Northern Ireland, UK
- V McGilligan
- Centre for Personalised Medicine, Ulster University, Londonderry BT47 6SB, Northern Ireland, UK
- A Peace
- Centre for Personalised Medicine, Ulster University, Londonderry BT47 6SB, Northern Ireland, UK; Cardiology Department, Altnagelvin Hospital, Londonderry BT47 6SB, Northern Ireland, UK
- J Heaton
- Division of Rural Health and Wellbeing, University of Highlands and Islands, Inverness IV2 3JH, UK
- S J Leslie
- Division of Rural Health and Wellbeing, University of Highlands and Islands, Inverness IV2 3JH, UK; Cardiac Unit, Raigmore Hospital, NHS Highland, Inverness IV2 3UJ, UK
43
McHugh LC, Snyder K, Yager TD. The effect of uncertainty in patient classification on diagnostic performance estimations. PLoS One 2019; 14:e0217146. [PMID: 31116772] [PMCID: PMC6530857] [DOI: 10.1371/journal.pone.0217146]
Abstract
Background The performance of a new diagnostic test is typically evaluated against a comparator that is assumed to correspond closely to some true state of interest. Judgments about the new test's performance are based on the differences between the outputs of the test and the comparator. It is commonly assumed that a small amount of uncertainty in the comparator's classifications will negligibly affect the measured performance of a diagnostic test. Methods Simulated datasets were generated to represent typical diagnostic scenarios. Comparator noise was introduced in the form of random misclassifications, and its effect on the apparent performance of the diagnostic test was determined. An actual dataset from a clinical trial of a new diagnostic test for sepsis was also analyzed. Results We demonstrate that as little as 5% misclassification of patients by the comparator can be enough to statistically invalidate performance estimates such as sensitivity, specificity, and area under the receiver operating characteristic curve if this uncertainty is not measured and taken into account. This distortion effect increases non-linearly with comparator uncertainty under some common diagnostic scenarios. For clinical populations exhibiting high degrees of classification uncertainty, failure to measure and account for this effect introduces a significant risk of drawing false conclusions. The effect of classification uncertainty is magnified further for high-performing tests that would otherwise reach near-perfection in diagnostic evaluation trials. A requirement of very high diagnostic performance for clinical adoption, such as 99% sensitivity, can be rendered nearly unachievable even for a perfect test if the comparator diagnosis contains even small amounts of uncertainty. This paper and an accompanying online simulation tool demonstrate the effect of classification uncertainty on the apparent performance of tests across a range of typical diagnostic scenarios; both simulated and real datasets are used to show the degradation of apparent test performance as comparator uncertainty increases. Conclusions Overall, a 5% or greater misclassification rate by the comparator can lead to significant underestimation of true test performance. An online simulation tool allows researchers to explore this effect using their own trial parameters (https://imperfect-gold-standard.shinyapps.io/classification-noise/) and the source code is freely available (https://github.com/ksny/Imperfect-Gold-Standard).
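The distortion effect described in this abstract is easy to reproduce in a small Monte Carlo sketch. The prevalence, error rates, and sample size below are assumed values for illustration, not the paper's exact scenarios: a 99%-sensitive test is scored against a comparator that randomly flips 5% of the true labels.

```python
import random

def apparent_sensitivity(true_sens, true_spec, prevalence, flip_rate,
                         n=200000, seed=0):
    """Estimate sensitivity as measured against a comparator that
    misclassifies the true disease state with probability flip_rate."""
    rng = random.Random(seed)
    tp = fn = 0
    for _ in range(n):
        disease = rng.random() < prevalence
        # The test result is generated relative to the TRUE state.
        test_pos = rng.random() < (true_sens if disease else 1 - true_spec)
        # The comparator label is the true state, randomly flipped.
        label_pos = disease ^ (rng.random() < flip_rate)
        if label_pos:  # performance is tallied against the noisy comparator
            tp += test_pos
            fn += not test_pos
    return tp / (tp + fn)

# With a clean comparator the test looks ~0.99 sensitive; with 5% label
# noise at 20% prevalence its apparent sensitivity collapses well below 0.9.
perfect = apparent_sensitivity(0.99, 0.99, 0.2, 0.00)
noisy = apparent_sensitivity(0.99, 0.99, 0.2, 0.05)
print(round(perfect, 3), round(noisy, 3))
```

The drop is driven by comparator false positives: truly healthy patients mislabeled as diseased almost never test positive, so they are counted as the test's "misses", which is exactly the non-linear penalty on near-perfect tests the paper describes.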
Affiliation(s)
- Leo C. McHugh
- Immunexpress, Inc., Seattle, Washington, United States of America
- Kevin Snyder
- Center for Drug Evaluation and Research, United States Food and Drug Administration, Silver Spring, Maryland, United States of America
- Thomas D. Yager
- Immunexpress, Inc., Seattle, Washington, United States of America
44
Litell JM, Meyers HP, Smith SW. Emergency physicians should be shown all triage ECGs, even those with a computer interpretation of "Normal". J Electrocardiol 2019; 54:79-81. [DOI: 10.1016/j.jelectrocard.2019.03.003]