Machine learning in data abstraction: A computable phenotype for sepsis and septic shock diagnosis in the intensive care unit

doi:10.5492/wjccm.v8.i7.120

Publication Name: World Journal of Critical Care Medicine

ISSN: 2220-3141

Publisher: Baishideng Publishing Group Inc, 7041 Koll Center Parkway, Suite 160, Pleasanton, CA 94566, USA

Retrospective Cohort Study

World J Crit Care Med. Nov 19, 2019; 8(7): 120-126
Published online Nov 19, 2019. doi: 10.5492/wjccm.v8.i7.120

Rahul Kashyap, Nathan Jerome Smischney, Timothy J Weister, Danette Bruns, Arnaldo Lopez Ruiz, Laura Piccolo Serafim, Prabij Dhungana

Prabij Dhungana, Nathan Jerome Smischney, Rahul Kashyap, Department of Anesthesiology and Perioperative Medicine, Mayo Clinic, Rochester, MN 55905, United States

Prabij Dhungana, Laura Piccolo Serafim, Arnaldo Lopez Ruiz, Nathan Jerome Smischney, Rahul Kashyap, Multidisciplinary Epidemiology and Translational Research in Intensive Care, Mayo Clinic, Rochester, MN 55905, United States

Laura Piccolo Serafim, Arnaldo Lopez Ruiz, Department of Medicine, Division of Pulmonary and Critical Care Medicine, Mayo Clinic, Rochester, MN 55905, United States

Danette Bruns, Timothy J Weister, Anesthesia Clinical Research Unit, Mayo Clinic, Rochester, MN 55905, United States

Author contributions: All listed authors provided intellectual contribution and made critical revisions of this paper; Kashyap R, Lopez Ruiz A and Smischney NJ contributed to study conception and design; Dhungana P, Piccolo Serafim L, Bruns D and Weister TJ contributed to data acquisition; Dhungana P, Piccolo Serafim L, Smischney NJ and Kashyap R contributed to data analysis; all authors approved the final version of the manuscript.

Institutional review board statement: The study was reviewed and approved by the Mayo Clinic Institutional Review Board.

Informed consent statement: This retrospective study was exempt from the need for informed consent.

Conflict-of-interest statement: The authors declare no conflicts of interest for this article.

STROBE statement: The authors have read the STROBE Statement-checklist of items, and the manuscript was prepared and revised according to the STROBE Statement-checklist of items.

Corresponding author: Rahul Kashyap, MBBS, Assistant Professor, MBA, Department of Anesthesiology and Perioperative Medicine, Mayo Clinic, 200 First Street SW, Rochester, MN 55905, United States. kashyap.rahul@mayo.edu

Telephone: +1-507-2557196

Received: April 23, 2019
Peer-review started: May 8, 2019
First decision: August 2, 2019
Revised: August 21, 2019
Accepted: October 27, 2019
Article in press: October 27, 2019
Published online: November 19, 2019
Processing time: 212 Days and 21.9 Hours

Abstract

BACKGROUND

With the recent change in the definition of sepsis and septic shock (the Sepsis-3 definition), an electronic search algorithm was required to identify cases for automated data abstraction. Such a supervised machine learning method would help screen large volumes of electronic medical records (EMR) efficiently for research purposes.

AIM

To develop and validate a computable phenotype, via a supervised machine learning method, for retrospectively identifying sepsis and septic shock in critical care patients.

METHODS

A supervised machine learning method was developed based on culture orders, Sequential Organ Failure Assessment (SOFA) scores, serum lactate levels and vasopressor use in the intensive care units (ICUs). The computable phenotype was derived from a retrospective analysis of a random cohort of 100 patients admitted to the medical ICU and was then validated in an independent cohort of 100 patients. We compared the results of the computable phenotype with a gold standard of manual EMR review by 2 blinded reviewers; disagreements were resolved by a critical care clinician. Patients with a SOFA score ≥ 2 during the ICU stay and a culture within 72 h before or after the time of admission were identified. Version 1 of the sepsis phenotype (Sepsis-1) was defined as a blood culture with a SOFA score ≥ 2, and version 2 (Sepsis-2) as any culture with a SOFA score ≥ 2. A serum lactate level ≥ 2 mmol/L at any time from 24 h before admission through the ICU stay, together with vasopressor use, in patients meeting Sepsis-1 or Sepsis-2 was identified as Septic Shock-1 or Septic Shock-2, respectively.
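The rules described above can be sketched in code. The following is a minimal illustration only, not the authors' implementation: the `IcuStay` record, its field names, and the `classify_stay` function are hypothetical, and the sketch assumes that the highest SOFA score, culture times, lactate values and vasopressor use have already been extracted from the EMR for each admission.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Dict, List, Tuple


@dataclass
class IcuStay:
    """Hypothetical per-admission record; all field names are illustrative."""
    icu_admit: datetime
    icu_discharge: datetime
    max_sofa: int  # highest SOFA score recorded during the ICU stay
    cultures: List[Tuple[datetime, str]] = field(default_factory=list)    # (time, culture type)
    lactates: List[Tuple[datetime, float]] = field(default_factory=list)  # (time, mmol/L)
    vasopressor_in_icu: bool = False


def classify_stay(stay: IcuStay) -> Dict[str, bool]:
    """Apply the four phenotype rules as stated in the abstract (assumed reading)."""
    # Cultures within 72 h before or after ICU admission.
    window = timedelta(hours=72)
    culture_types = [ctype for t, ctype in stay.cultures
                     if abs(t - stay.icu_admit) <= window]

    organ_dysfunction = stay.max_sofa >= 2
    sepsis_1 = organ_dysfunction and "blood" in culture_types  # blood culture only
    sepsis_2 = organ_dysfunction and len(culture_types) > 0    # any culture

    # Lactate >= 2 mmol/L from 24 h before admission through the ICU stay.
    lactate_start = stay.icu_admit - timedelta(hours=24)
    high_lactate = any(lactate_start <= t <= stay.icu_discharge and value >= 2.0
                       for t, value in stay.lactates)
    shock_criteria = high_lactate and stay.vasopressor_in_icu

    return {
        "sepsis_1": sepsis_1,
        "sepsis_2": sepsis_2,
        "septic_shock_1": sepsis_1 and shock_criteria,
        "septic_shock_2": sepsis_2 and shock_criteria,
    }
```

Because Sepsis-1 requires a blood culture while Sepsis-2 accepts any culture, every stay flagged as Sepsis-1 in this sketch is also flagged as Sepsis-2, and likewise for the two septic shock versions.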

RESULTS

In the derivation subset of 100 random patients, the final machine learning strategy achieved a sensitivity and specificity of 100% and 84% for Sepsis-1, 100% and 95% for Sepsis-2, 78% and 80% for Septic Shock-1, and 80% and 90% for Septic Shock-2, respectively. Agreement between the two blinded reviewers was κ = 0.86 for Sepsis-2 and κ = 0.90 for Septic Shock-2. In validation of the algorithm in a separate subset of 100 random patients, sensitivity and specificity were both 100% for all 4 diagnoses.
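For readers who want to reproduce this kind of comparison, the metrics reported above can be computed from paired binary labels. The sketch below is illustrative and uses hypothetical function names; it assumes one boolean per patient for the algorithm's output and for the manual-review gold standard, and it uses the standard formula for Cohen's kappa, which is assumed to be the agreement statistic reported here.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(predicted: Sequence[bool],
                            reference: Sequence[bool]) -> Tuple[float, float]:
    """Sensitivity and specificity of `predicted` against the gold-standard `reference`."""
    tp = sum(1 for p, r in zip(predicted, reference) if p and r)
    fn = sum(1 for p, r in zip(predicted, reference) if not p and r)
    tn = sum(1 for p, r in zip(predicted, reference) if not p and not r)
    fp = sum(1 for p, r in zip(predicted, reference) if p and not r)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("reference must contain both positive and negative cases")
    return tp / (tp + fn), tn / (tn + fp)


def cohen_kappa(rater_a: Sequence[bool], rater_b: Sequence[bool]) -> float:
    """Cohen's kappa for two raters assigning binary labels to the same patients."""
    n = len(rater_a)
    if n == 0 or n != len(rater_b):
        raise ValueError("raters must label the same non-empty set of patients")
    observed = sum(1 for a, b in zip(rater_a, rater_b) if a == b) / n
    a_yes = sum(1 for a in rater_a if a) / n
    b_yes = sum(1 for b in rater_b if b) / n
    # Agreement expected by chance if the two raters labelled independently.
    expected = a_yes * b_yes + (1 - a_yes) * (1 - b_yes)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)
```

Kappa corrects raw percent agreement for the agreement expected by chance, which is why it is the more informative figure when one label is much more common than the other.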

CONCLUSION

Supervised machine learning for identification of sepsis and septic shock is a reliable and efficient alternative to manual chart review.

Keywords: Machine learning; Computable phenotype; Critical care; Sepsis; Septic shock

Core tip: This study presents and validates a supervised machine learning model for the identification of sepsis and septic shock cases using electronic medical records as an alternative to manual chart review. The method proved to be an efficient, fast and reliable option for retrospective data abstraction, with the potential to be applied to other clinical conditions.