Retrospective Study
Copyright ©The Author(s) 2022. Published by Baishideng Publishing Group Inc. All rights reserved.
World J Crit Care Med. Sep 9, 2022; 11(5): 317-329
Published online Sep 9, 2022. doi: 10.5492/wjccm.v11.i5.317
Prediction of hospital mortality in intensive care unit patients from clinical and laboratory data: A machine learning approach
Elena Caires Silveira, Soraya Mattos Pretti, Bruna Almeida Santos, Caio Fellipe Santos Corrêa, Leonardo Madureira Silva, Fabrício Freire de Melo
Elena Caires Silveira, Soraya Mattos Pretti, Bruna Almeida Santos, Caio Fellipe Santos Corrêa, Leonardo Madureira Silva, Fabrício Freire de Melo, Multidisciplinary Institute of Health, Federal University of Bahia, Vitória da Conquista 45-029094, Brazil
Author contributions: Caires Silveira E collected and entered the data, performed the data analysis/statistics and interpretation, and participated in preparation and review of manuscript; Mattos Pretti S and Santos BA participated in the preparation of manuscript and wrote the literature analysis/search; Santos Corrêa CF and Madureira Silva L participated in review of manuscript; Freire de Melo F designed the research and participated in review of manuscript.
Institutional review board statement: For this study, there was no need for an appraisal by an ethics committee, since only publicly available anonymized data were used.
Informed consent statement: No signed informed consent form was required for this manuscript, as it was produced from previously anonymized, publicly available, free-of-charge data, in accordance with the norms of medical bioethics. There was no direct or indirect contact between researchers and patients.
Conflict-of-interest statement: All the authors report no relevant conflicts of interest for this article.
Data sharing statement: No additional data are available.
Open-Access: This article is an open-access article that was selected by an in-house editor and fully peer-reviewed by external reviewers. It is distributed in accordance with the Creative Commons Attribution NonCommercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: https://creativecommons.org/Licenses/by-nc/4.0/
Corresponding author: Fabrício Freire de Melo, PhD, Professor, Multidisciplinary Institute of Health, Federal University of Bahia, Rua Hormindo Barros, 58, Quadra 17, Lote 58, Candeias, Vitória da Conquista 45-029094, Brazil. freiremeloufba@gmail.com
Received: June 22, 2021
Peer-review started: June 22, 2021
First decision: July 31, 2021
Revised: August 13, 2021
Accepted: July 5, 2022
Article in press: July 5, 2022
Published online: September 9, 2022
Processing time: 441 Days and 4.9 Hours
Abstract
BACKGROUND

Intensive care unit (ICU) patients require continuous monitoring of several clinical and laboratory parameters that directly influence their clinical course and the staff’s decision-making. These data are vital to the care of these patients and are already used by several scoring systems. In this context, machine learning approaches have been used to make medical predictions from clinical data, including predictions of patient outcomes.

AIM

To develop a binary classifier for the outcome of death in ICU patients based on clinical and laboratory parameters. For this purpose, a set of 1087 instances and 50 variables from ICU patients admitted to the emergency department was obtained from the “WiDS (Women in Data Science) Datathon 2020: ICU Mortality Prediction” dataset.

METHODS

For categorical variables, frequencies and risk ratios were calculated. Numerical variables were summarized as means and standard deviations, and Mann-Whitney U tests were performed. We then divided the data into training (80%) and test (20%) sets. The training set was used to train a predictive model based on the Random Forest algorithm, and the test set was used to evaluate the predictive performance of the model.
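As an illustration of two of the steps described above, the following minimal Python sketch computes a risk ratio from group counts and performs a shuffled 80/20 split of row indices. It is not the authors' code: the function names `risk_ratio` and `split_indices`, the random seed, and the choice to round the test-set size up are all assumptions made for this example. Rounding up is consistent with a 1087-row dataset yielding 218 test rows.

```python
import math
import random


def risk_ratio(exposed_deaths, exposed_total, unexposed_deaths, unexposed_total):
    """Risk of death among exposed patients divided by the risk among unexposed."""
    risk_exposed = exposed_deaths / exposed_total
    risk_unexposed = unexposed_deaths / unexposed_total
    return risk_exposed / risk_unexposed


def split_indices(n_rows, test_fraction=0.2, seed=0):
    """Shuffle row indices and return (train_indices, test_indices).

    Assumption: the test-set size is rounded up, so 1087 rows give 218 test rows.
    """
    n_test = math.ceil(n_rows * test_fraction)
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    return indices[n_test:], indices[:n_test]


# 1087 instances, as in the dataset subset described in the abstract.
train_idx, test_idx = split_indices(1087)
print(len(train_idx), len(test_idx))  # prints: 869 218
```

The counts passed to `risk_ratio` would come from a two-by-two table of a categorical variable against hospital death; no such counts are given in the abstract, so none are shown here.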

RESULTS

A statistically significant association was identified between need for intubation, as well as predominant systemic cardiovascular involvement, and hospital death. Several of the numerical variables analyzed (for instance Glasgow Coma Scale scores, mean arterial pressure, temperature, pH, and lactate, creatinine, albumin and bilirubin values) were also significantly associated with the outcome of death. The proposed binary Random Forest classifier, evaluated on the test set (n = 218), had an accuracy of 80.28%, sensitivity of 81.82%, specificity of 79.43%, positive predictive value of 73.26%, negative predictive value of 84.85%, F1 score of 0.74, and area under the curve of 0.85. The predictive variables of the greatest importance were the maximum and minimum lactate values, which together accounted for a predictive importance of 15.54%.
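For readers less familiar with the performance measures listed above, the sketch below shows how they are conventionally derived from the four cells of a binary confusion matrix. The function name `classification_metrics` is an assumption, and the counts in the example call are hypothetical values chosen for illustration; they are not the study's confusion matrix.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts.

    tp/fn: deaths predicted correctly / missed.
    tn/fp: survivors predicted correctly / flagged as deaths.
    """
    sensitivity = tp / (tp + fn)  # recall for the death class
    ppv = tp / (tp + fp)          # precision for the death class
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": sensitivity,
        "specificity": tn / (tn + fp),
        "ppv": ppv,
        "npv": tn / (tn + fn),
        "f1": 2 * ppv * sensitivity / (ppv + sensitivity),
    }


# Hypothetical counts for illustration only (not taken from the study).
example = classification_metrics(tp=30, fp=10, tn=50, fn=10)
for name, value in example.items():
    print(f"{name}: {value:.4f}")
```

The area under the curve is not included because it depends on the predicted probabilities for each patient rather than on a single confusion matrix.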

CONCLUSION

We demonstrated the efficacy of a Random Forest machine learning algorithm for handling clinical and laboratory data from patients under intensive monitoring. We therefore endorse the emerging notion that machine learning has great potential to support the critical appraisal of existing methodologies, allowing improvements that reduce mortality.

Keywords: Hospital mortality; Machine learning; Patient outcome assessment; Routinely collected health data; Intensive care units; Critical care outcomes

Core Tip: Considering the critical condition of patients admitted to intensive care units (ICUs), this study analyzed clinical and laboratory data using a machine learning model based on the Random Forest algorithm. We developed a binary classifier that predicts the outcome of death, achieving an area under the curve of 0.85 and identifying the variables that contributed the most to the prediction. With this, we aim to contribute to the methodological advancement of clinically relevant machine learning tools, seeking to make medical decisions more accurate and to reduce mortality in ICU patients.