1
Li J, Guo Y, Weng C, Wang T, Lu W, Lin L, Wu J, Cheng G, Hu Q. Assessing the robustness of vascular surgery meta-analyses using the Fragility Index: a cross-sectional study. BMJ Open 2025; 15:e098320. [PMID: 40316351 PMCID: PMC12049906 DOI: 10.1136/bmjopen-2024-098320] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2024] [Accepted: 04/24/2025] [Indexed: 05/04/2025] [Open Access]
Abstract
OBJECTIVES To systematically assess the robustness of meta-analyses based on randomised controlled trials (RCTs) in vascular surgery using the Fragility Index (FI). DESIGN Cross-sectional study. SETTING Meta-analyses published in English from January 2019 to April 2025, identified from EMBASE, PubMed and Web of Science. PARTICIPANTS 67 articles, with 291 meta-analyses involving RCTs evaluating vascular surgical interventions, covering venous, aortic, peripheral arterial, vascular access and other relevant fields. MAIN OUTCOME MEASURES FI, defined as the minimum number of event changes required to alter the statistical significance of meta-analysis results, and its association with sample size and total number of events, analysed using frequency distribution histograms and restricted cubic spline models. RESULTS The median FI was 7, with considerable variation across different fields. Aortic meta-analyses demonstrated higher robustness compared with venous and vascular access meta-analyses. FI showed a non-linear relationship with sample size and total number of events, indicating robustness improved only up to specific thresholds, beyond which robustness declined or plateaued. CONCLUSION Overall robustness of meta-analyses in vascular surgery was moderate, with notable variability among research areas. FI provides valuable insight into the stability of synthesised evidence, suggesting the need for improved methodological quality and advocating broader adoption of FI in meta-analytical research.
Affiliation(s)
- Jiacheng Li
  - Department of Vascular Surgery, The Quzhou Affiliated Hospital of Wenzhou Medical University, Quzhou People's Hospital, Quzhou, Zhejiang, China
- Yi Guo
  - Department of Nosocomial Infection Control, The Quzhou Affiliated Hospital of Wenzhou Medical University, Quzhou People's Hospital, Quzhou, Zhejiang, China
- Chengxin Weng
  - Division of Vascular Surgery, Department of General Surgery, West China Hospital of Sichuan University, Chengdu, Sichuan, China
- Tiehao Wang
  - Division of Vascular Surgery, Department of General Surgery, West China Hospital of Sichuan University, Chengdu, Sichuan, China
- Wei Lu
  - Department of Vascular Surgery, The Quzhou Affiliated Hospital of Wenzhou Medical University, Quzhou People's Hospital, Quzhou, Zhejiang, China
- Lihong Lin
  - Department of Nosocomial Infection Control, The Quzhou Affiliated Hospital of Wenzhou Medical University, Quzhou People's Hospital, Quzhou, Zhejiang, China
- Jiawen Wu
  - Department of Vascular Surgery, The Quzhou Affiliated Hospital of Wenzhou Medical University, Quzhou People's Hospital, Quzhou, Zhejiang, China
- Guobing Cheng
  - Department of Vascular Surgery, The Quzhou Affiliated Hospital of Wenzhou Medical University, Quzhou People's Hospital, Quzhou, Zhejiang, China
- Qiang Hu
  - Department of Vascular Surgery, The Quzhou Affiliated Hospital of Wenzhou Medical University, Quzhou People's Hospital, Quzhou, Zhejiang, China
2
Patel S, Green A. Death by p-value: the overreliance on p-values in critical care research. Crit Care 2025; 29:73. [PMID: 39934845 PMCID: PMC11816520 DOI: 10.1186/s13054-025-05307-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/20/2025] [Accepted: 02/01/2025] [Indexed: 02/13/2025] [Open Access]
Abstract
The p-value has changed from a versatile tool for scientific reasoning to a strict judge of medical information, with the usual 0.05 cutoff frequently deciding a study's significance and subsequent clinical use. Through an examination of five critical care interventions that demonstrated meaningful treatment effects yet narrowly missed conventional statistical significance, this paper illustrates how rigid adherence to p-value thresholds may obscure therapeutically beneficial findings. By providing a clear, step-by-step illustration of a basic Bayesian calculation, we demonstrate that clinical importance can remain undetected when relying solely on p-values. These observations challenge current statistical paradigms and advocate for hybrid approaches, including both frequentist and Bayesian methodologies, to provide a more comprehensive understanding of clinical data, ultimately leading to better-informed medical decisions.
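The kind of basic Bayesian calculation the abstract alludes to might look like the sketch below. It is not necessarily the calculation the authors present: it assumes a flat prior and a normal approximation on the log risk ratio, and the function name `prob_benefit` and the example numbers are hypothetical.

```python
from math import log
from statistics import NormalDist


def prob_benefit(rr, ci_low, ci_high, z=1.959964):
    """Posterior probability that the true risk ratio is below 1.

    Assumptions (illustrative only): flat prior, normal likelihood on
    log(RR), and a reported two-sided 95% confidence interval.
    """
    # Recover the standard error of log(RR) from the width of the interval.
    se = (log(ci_high) - log(ci_low)) / (2 * z)
    # With a flat prior, the posterior for log(RR) is centred on the estimate.
    posterior = NormalDist(mu=log(rr), sigma=se)
    # Any benefit corresponds to log(RR) < 0.
    return posterior.cdf(0.0)


# Hypothetical trial that narrowly misses p < 0.05: RR 0.80, 95% CI 0.63 to 1.02.
print(f"P(RR < 1) = {prob_benefit(0.80, 0.63, 1.02):.3f}")
```

Under these assumptions, a result that is "non-significant" by the 0.05 cutoff can still carry a high posterior probability of benefit, which is the contrast the abstract draws.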
Affiliation(s)
- Sharad Patel
  - Department of Critical Care Medicine, Cooper University Health Care and Cooper Medical School of Rowan University, 1 Cooper Plaza, Camden, NJ, 08103, USA
- Adam Green
  - Department of Critical Care Medicine, Cooper University Health Care and Cooper Medical School of Rowan University, 1 Cooper Plaza, Camden, NJ, 08103, USA
3
Khan NS, Dhanda AK, Takashima M, Liu R, Yoshiyasu Y, Wu W, Jin W, McCoul ED, Ramanathan M, Ahmed OG. What is the robustness of randomized controlled trials supporting rhinosinusitis guidelines? Am J Otolaryngol 2025; 46:104575. [PMID: 39740532 DOI: 10.1016/j.amjoto.2024.104575] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2024] [Accepted: 12/17/2024] [Indexed: 01/02/2025]
Abstract
PURPOSE To determine the robustness of randomized controlled trials (RCTs) supporting the current rhinosinusitis guideline, the International Consensus Statement on Allergy and Rhinology: rhinosinusitis (ICAR-RS). MATERIALS & METHODS RCTs referenced by ICAR-RS with primary dichotomous outcomes were analyzed. The Fragility Index (FI) was calculated for trials with statistically significant findings. Trial characteristics, the FI, and FI minus number lost to follow-up (LTF) were assessed for associations. RESULTS A total of 317 RCTs were identified, with 38 trials possessing a primary dichotomous outcome. Thirty-one percent evaluated surgical interventions and 24% were industry-sponsored. The mean sample size was 116 with 9 patients, on average, LTF. Sixty-three percent were eligible for FI calculation and had a median FI of 2.5 (IQR 1, 4.25). Sixty-seven percent of trials had an FI ≤ 3, indicating low robustness. No difference in FI was observed between trials with and without industry support (p = 0.577). The FI was less than or equal to the number of patients LTF in 33% of trials (n = 8). Higher FI was strongly correlated with higher sample size, total number of events, p-value, and grade of recommendation (p < 0.001). After adjusting for covariates, higher sample size and total number of events were associated with higher FI. CONCLUSION The RCTs used to support the ICAR-RS have an overall low robustness, and future rhinosinusitis trials should report FI measures to provide improved context for their results.
Affiliation(s)
- Najm S Khan
  - Rutgers Robert Wood Johnson Medical School, New Brunswick, NJ, USA; Department of Otolaryngology - Head and Neck Surgery, Houston Methodist, Houston, TX, USA
- Aatin K Dhanda
  - Department of Otolaryngology - Head and Neck Surgery, Houston Methodist, Houston, TX, USA
- Masayoshi Takashima
  - Department of Otolaryngology - Head and Neck Surgery, Houston Methodist, Houston, TX, USA
- Richard Liu
  - Division of Biostatistics, Department of Population Health, New York University Grossman School of Medicine, NY, New York, USA
- Yuki Yoshiyasu
  - Department of Otolaryngology-Head and Neck Surgery, University of Texas Medical Branch, Galveston, TX, USA
- Wenbo Wu
  - Division of Biostatistics, Department of Population Health, New York University Grossman School of Medicine, NY, New York, USA
- Whitney Jin
  - Baylor College of Medicine, Houston, TX, USA
- Edward D McCoul
  - Department of Otolaryngology - Head and Neck Surgery, Tulane University School of Medicine, New Orleans, LA, USA; Department of Otorhinolaryngology and Communication Sciences, Ochsner Clinic Foundation, New Orleans, LA, USA
- Murugappan Ramanathan
  - Department of Otolaryngology - Head and Neck Surgery, Johns Hopkins School of Medicine, Baltimore, MD, USA
- Omar G Ahmed
  - Department of Otolaryngology - Head and Neck Surgery, Houston Methodist, Houston, TX, USA
4
Kahana N, Boaz E, Horesh N, Emile SH, Dourado J, Aeschbacher P, Rogers P, Gefen R, Lo Menzo E, Rosenthal RJ. Evaluation of the robustness of randomized controlled trials for the treatment modalities of esophageal cancer using the fragility index - a systematic review. Surg Endosc 2024; 38:7037-7044. [PMID: 39443379 DOI: 10.1007/s00464-024-11343-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/16/2024] [Accepted: 10/06/2024] [Indexed: 10/25/2024]
Abstract
BACKGROUND Esophageal cancer remains a significant global health challenge. Several treatment modalities were explored in randomized controlled trials (RCTs) in recent decades. This study evaluates the robustness of RCTs focusing on esophageal cancer treatment using the fragility index (FI) and reverse fragility index (RFI). METHODS A systematic review of RCTs studying different treatment modalities for esophageal cancer from 2000 to 2023 was conducted. The FI and RFI were utilized to gauge the robustness of statistically significant and non-significant outcomes, respectively. The FI represents the minimal number of patient outcomes that would need to change to overturn a trial's statistical significance, while the RFI indicates the minimal number of changes required to achieve significance in non-significant results. RESULTS Out of 4028 studies retrieved, 21 RCTs were included for final analysis. The studies spanned 2001 to 2023 with a mean follow-up of 66 months (range, 29-108 months) and a median number of patients of 194 (range, 45-802). The most common treatment modalities examined in these studies were neoadjuvant chemoradiotherapy (n = 7, 33.3%), neoadjuvant chemotherapy (n = 4, 19.0%), and neoadjuvant immunotherapy (n = 2, 9.5%). Only 5 studies (23.8%) had a statistically significant primary outcome result, with a median FI of 6 (IQR, 2.5-8.5). Non-significant primary outcomes were seen in 16 studies (76.2%), with a median RFI of 4 (IQR 1-11) and a median loss to follow-up of 0 (IQR 0-4). In the study with the highest FI (10), the FI was lower than the number of patients lost to follow-up (13). CONCLUSION Our findings demonstrate that most RCTs on esophageal cancer treatments did not report significant primary outcomes. The few studies that reported significant results had a low fragility index, suggesting a vulnerability in their findings.
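The RFI definition above can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes a two-sided Fisher exact test at alpha = 0.05 and a common convention of converting events to non-events in the arm with the lower event rate. The function names are hypothetical.

```python
from math import comb


def fisher_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x events in row 1, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total

    observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables that are no more likely than the observed one.
    return sum(p for p in map(prob, range(lo, hi + 1))
               if p <= observed * (1 + 1e-7))


def reverse_fragility_index(e1, n1, e2, n2, alpha=0.05):
    """Fewest event-to-non-event changes that make a non-significant result
    significant. Returns None if the result is already significant or if
    significance cannot be reached under this rule."""
    # Put the arm with the lower event rate first (assumed convention).
    if e1 * n2 > e2 * n1:
        e1, n1, e2, n2 = e2, n2, e1, n1
    a, b, c, d = e1, n1 - e1, e2, n2 - e2
    if fisher_two_sided(a, b, c, d) < alpha:
        return None
    changes = 0
    while fisher_two_sided(a, b, c, d) >= alpha:
        if a == 0:
            return None
        a, b = a - 1, b + 1  # one event becomes a non-event
        changes += 1
    return changes


# Hypothetical non-significant trial: 10/50 events in each arm.
print(reverse_fragility_index(10, 50, 10, 50))
```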
Affiliation(s)
- Noam Kahana
  - Department of General Surgery, Cleveland Clinic Florida, 2950 Cleveland Clinic Blvd., Weston, FL, 33331, USA
  - Department of General Surgery Shaare Zedek Medical Center, Faculty of Medicine, Hebrew University of Jerusalem, Jerusalem, Israel
- Elad Boaz
  - Department of General Surgery, Cleveland Clinic Florida, 2950 Cleveland Clinic Blvd., Weston, FL, 33331, USA
  - Department of General Surgery Shaare Zedek Medical Center, Faculty of Medicine, Hebrew University of Jerusalem, Jerusalem, Israel
- Nir Horesh
  - Department of General Surgery, Cleveland Clinic Florida, 2950 Cleveland Clinic Blvd., Weston, FL, 33331, USA
  - Department of Surgery and Transplantations, Sheba Medical Center, Ramat Gan, Israel, Tel Aviv University, Tel Aviv, Israel
- Sameh Hany Emile
  - Department of General Surgery, Cleveland Clinic Florida, 2950 Cleveland Clinic Blvd., Weston, FL, 33331, USA
  - Colorectal Surgery Unit, Faculty of Medicine, Mansoura University, Mansoura, Egypt
- Justin Dourado
  - Department of General Surgery, Cleveland Clinic Florida, 2950 Cleveland Clinic Blvd., Weston, FL, 33331, USA
- Pauline Aeschbacher
  - Department of General Surgery, Cleveland Clinic Florida, 2950 Cleveland Clinic Blvd., Weston, FL, 33331, USA
- Pete Rogers
  - Department of General Surgery, Cleveland Clinic Florida, 2950 Cleveland Clinic Blvd., Weston, FL, 33331, USA
- Rachel Gefen
  - Department of General Surgery, Cleveland Clinic Florida, 2950 Cleveland Clinic Blvd., Weston, FL, 33331, USA
  - Department of General Surgery, Hadassah Medical Organization and Faculty of Medicine, Hebrew University of Jerusalem, Jerusalem, Israel
- Emanuele Lo Menzo
  - Department of General Surgery, Cleveland Clinic Florida, 2950 Cleveland Clinic Blvd., Weston, FL, 33331, USA
- Raul J Rosenthal
  - Department of General Surgery, Cleveland Clinic Florida, 2950 Cleveland Clinic Blvd., Weston, FL, 33331, USA
5
Nanji K, Xie J, Hatamnejad A, Pur DR, Phillips M, Zeraatkar D, Wong TY, Guymer RH, Kaiser PK, Sivaprasad S, Bhandari M, Steel DH, Wykoff CC, Chaudhary V. Exploring the fragility of meta-analyses in ophthalmology: a systematic review. Eye (Lond) 2024; 38:3153-3160. [PMID: 39033242 PMCID: PMC11543934 DOI: 10.1038/s41433-024-03255-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2024] [Revised: 06/13/2024] [Accepted: 07/15/2024] [Indexed: 07/23/2024] [Open Access]
Abstract
OBJECTIVE The fragility index (FI) of a meta-analysis evaluates the extent that the statistical significance can be changed by modifying the event status of individuals from included trials. Understanding the FI improves the interpretation of the results of meta-analyses and can help to inform changes to clinical practice. This review determined the fragility of ophthalmology-related meta-analyses. METHODS Meta-analyses of randomized controlled trials with binary outcomes published in a journal classified as 'Ophthalmology' according to the Journal Citation Report or an Ophthalmology-related Cochrane Review were included. An iterative process determined the FI of each meta-analysis. Multivariable linear regression modeling evaluated the relationship between the FI and potential predictive factors in statistically significant and non-significant meta-analyses. RESULTS 175 meta-analyses were included. The median FI was 6 (Q1-Q3: 3-12). This meant that moving 6 outcomes from one group to another would reverse the study's findings. The FI was 1 for 18 (10.2%) of the included meta-analyses and was ≤5 for 75 (42.4%) of the included meta-analyses. The number of events (p < 0.001) and the p-value (p < 0.001) were the best predictors of the FI in both significant and non-significant meta-analyses. CONCLUSION The statistical significance of meta-analyses in ophthalmology often hinges on the outcome of a few patients. The number of events and the p-value are the most important factors in determining the fragility of the evidence. The FI is an easily interpretable measure that can supplement the reader's understanding of the strength of the evidence being presented. PROSPERO REGISTRATION CRD42022377589.
Affiliation(s)
- Keean Nanji
  - Department of Surgery, Division of Ophthalmology, McMaster University, Hamilton, ON, Canada
  - Department of Health Research Methods, Evidence and Impact, McMaster University, Hamilton, ON, Canada
- Jim Xie
  - Department of Surgery, Division of Ophthalmology, McMaster University, Hamilton, ON, Canada
- Amin Hatamnejad
  - Department of Surgery, Division of Ophthalmology, McMaster University, Hamilton, ON, Canada
- Daiana R Pur
  - Schulich School of Medicine and Dentistry, Western University, London, ON, Canada
- Mark Phillips
  - Department of Health Research Methods, Evidence and Impact, McMaster University, Hamilton, ON, Canada
- Dena Zeraatkar
  - Department of Health Research Methods, Evidence and Impact, McMaster University, Hamilton, ON, Canada
  - Department of Anesthesia, McMaster University, Hamilton, ON, Canada
- Tien Yin Wong
  - Singapore Eye Research Institute, Singapore National Eye Centre, Singapore, Singapore
  - Tsinghua Medicine, Tsinghua University, Beijing, China
- Robyn H Guymer
  - Centre for Eye Research Australia, Royal Victorian Eye and Ear Hospital, East Melbourne, Australia
  - Department of Surgery (Ophthalmology), The University of Melbourne, Melbourne, Australia
- Peter K Kaiser
  - Cole Eye Institute, Cleveland Clinic, Cleveland, OH, USA
- Sobha Sivaprasad
  - NIHR Moorfields Biomedical Research Centre, Moorfields Eye Hospital, London, UK
- Mohit Bhandari
  - Department of Health Research Methods, Evidence and Impact, McMaster University, Hamilton, ON, Canada
  - Department of Surgery, Division of Orthopedic Surgery, McMaster University, Hamilton, ON, Canada
- David H Steel
  - Bioscience Institute, Newcastle University, Newcastle Upon Tyne, UK
  - Sunderland Eye Infirmary, Sunderland, UK
- Charles C Wykoff
  - Retina Consultants of Texas, Houston, TX, USA
  - Blanton Eye Institute, Houston Methodist Hospital, Houston, TX, USA
- Varun Chaudhary
  - Department of Surgery, Division of Ophthalmology, McMaster University, Hamilton, ON, Canada
6
Muñoz J, Cedeño JA, Castañeda GF, Visedo LC. Personalized ventilation adjustment in ARDS: A systematic review and meta-analysis of image, driving pressure, transpulmonary pressure, and mechanical power. Heart Lung 2024; 68:305-315. [PMID: 39214040 DOI: 10.1016/j.hrtlng.2024.08.013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/13/2024] [Revised: 06/28/2024] [Accepted: 08/16/2024] [Indexed: 09/04/2024]
Abstract
BACKGROUND Acute Respiratory Distress Syndrome (ARDS) necessitates personalized treatment strategies due to its heterogeneity, aiming to mitigate Ventilator-Induced Lung Injury (VILI). Advanced monitoring techniques, including imaging, driving pressure, transpulmonary pressure, and mechanical power, present potential avenues for tailored interventions. OBJECTIVE To review some of the most important techniques for achieving greater personalization of mechanical ventilation in ARDS patients as evaluated in randomized clinical trials, by analyzing their effect on three clinically relevant aspects: mortality, ventilator-free days, and gas exchange. METHODS Following PRISMA guidelines, we conducted a systematic review and meta-analysis of Randomized Clinical Trials (RCTs) involving adult ARDS patients undergoing personalized ventilation adjustments. Outcomes were mortality (primary end-point), ventilator-free days, and oxygenation improvement. RESULTS Among 493 identified studies, 13 RCTs (n = 1255) met inclusion criteria. No personalized ventilation strategy demonstrated superior outcomes compared to traditional protocols. Meta-analysis revealed no significant reduction in mortality with image-guided (RR 0.88, 95 % CI 0.70-1.11), driving pressure-guided (RR 0.61, 95 % CI 0.29-1.30), or transpulmonary pressure-guided (RR 0.85, 95 % CI 0.58-1.24) strategies. Ventilator-free days and oxygenation outcomes showed no significant differences. CONCLUSION Our study does not support the superiority of personalized ventilation techniques over traditional protocols in ARDS patients. Further research is needed to standardize ventilation strategies and determine their impact on mechanical ventilation outcomes.
Affiliation(s)
- Javier Muñoz
  - ICU, Hospital General Universitario Gregorio Marañón, C/ Dr. Esquerdo 46, 28009 Madrid, Spain
- Jamil Antonio Cedeño
  - ICU, Hospital General Universitario Gregorio Marañón, C/ Dr. Esquerdo 46, 28009 Madrid, Spain
- Lourdes Carmen Visedo
  - C. S. San Juan de la Cruz, Pozuelo de Alarcón, C/ San Juan de la Cruz s/n, 28223 Madrid, Spain
7
McKinney JA, Day Carson K, Lin L, Sanchez-Ramos L. Fragility of statistically significant outcomes in obstetric randomized trials. Am J Obstet Gynecol MFM 2024; 6:101449. [PMID: 39095024 DOI: 10.1016/j.ajogmf.2024.101449] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2024] [Revised: 07/11/2024] [Accepted: 07/19/2024] [Indexed: 08/04/2024]
Affiliation(s)
- Jordan A McKinney
  - Department of Obstetrics and Gynecology, University of Florida College of Medicine, Jacksonville, FL
- Kelcey Day Carson
  - Department of Obstetrics and Gynecology, University of Florida College of Medicine, Jacksonville, FL
- Lifeng Lin
  - Department of Epidemiology and Biostatistics, University of Arizona, Tucson, AZ
- Luis Sanchez-Ramos
  - Division of Maternal-Fetal Medicine, Department of Obstetrics and Gynecology, University of Florida College of Medicine, Jacksonville, FL
8
Meade MH, Buchan L, Michael M, Woods B. The Fragility Index: Understanding Its Application in Clinical Research. Clin Spine Surg 2024; 37:337-339. [PMID: 39037066 DOI: 10.1097/bsd.0000000000001668] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/19/2022] [Accepted: 06/28/2024] [Indexed: 07/23/2024]
Abstract
With the vast increase in spinal surgery research and accessibility, critical evaluation of studies is paramount. Historically, P values and confidence intervals have been the gold standard, but more recently, the inclusion of the Fragility Index has brought a more holistic approach. The Fragility Index aims to communicate the robustness of a trial and how tenuous statistical significance may be. It can be used in conjunction with more traditional methods for evaluating research.
Affiliation(s)
- Matthew H Meade
  - Division of Orthopaedic Surgery, Rowan University, Stratford, NJ
- Levi Buchan
  - Division of Orthopaedic Surgery, Rowan University, Stratford, NJ
- Mark Michael
  - Division of Orthopaedic Surgery, Rowan University, Stratford, NJ
- Barrett Woods
  - The Rothman Institute at Thomas Jefferson University, Division of Orthopaedic Spine Surgery, Philadelphia, PA
9
Jones TW, Hendrick T, Chase AM. Heterogeneity, Bayesian thinking, and phenotyping in critical care: A primer. Am J Health Syst Pharm 2024; 81:812-832. [PMID: 38742459 DOI: 10.1093/ajhp/zxae139] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/11/2024] [Indexed: 05/16/2024] [Open Access]
Abstract
PURPOSE To familiarize clinicians with the emerging concepts in critical care research of Bayesian thinking and personalized medicine through phenotyping and explain their clinical relevance by highlighting how they address the issues of frequent negative trials and heterogeneity of treatment effect. SUMMARY The past decades have seen many negative (effect-neutral) critical care trials of promising interventions, culminating in calls to improve the field's research through adopting Bayesian thinking and increasing personalization of critical care medicine through phenotyping. Bayesian analyses add interpretive power for clinicians as they summarize treatment effects based on probabilities of benefit or harm, contrasting with conventional frequentist statistics that either affirm or reject a null hypothesis. Critical care trials are beginning to include prospective Bayesian analyses, and many trials have undergone reanalysis with Bayesian methods. Phenotyping seeks to identify treatable traits to target interventions to patients expected to derive benefit. Phenotyping and subphenotyping have gained prominence in the most syndromic and heterogeneous critical care disease states, acute respiratory distress syndrome and sepsis. Grouping of patients has been informative across a spectrum of clinically observable physiological parameters, biomarkers, and genomic data. Bayesian thinking and phenotyping are emerging as elements of adaptive clinical trials and predictive enrichment, paving the way for a new era of high-quality evidence. These concepts share a common goal, sifting through the noise of heterogeneity in critical care to increase the value of existing and future research. CONCLUSION The future of critical care medicine will inevitably involve modification of statistical methods through Bayesian analyses and targeted therapeutics via phenotyping. Clinicians must be familiar with these systems that support recommendations to improve decision-making in the gray areas of critical care practice.
Affiliation(s)
- Timothy W Jones
  - Department of Pharmacy, Piedmont Eastside Medical Center, Snellville, GA
  - Department of Clinical and Administrative Pharmacy, University of Georgia College of Pharmacy, Athens, GA, USA
- Tanner Hendrick
  - Department of Pharmacy, University of North Carolina Medical Center, Chapel Hill, NC, USA
- Aaron M Chase
  - Department of Clinical and Administrative Pharmacy, University of Georgia College of Pharmacy, Athens, GA
  - Department of Pharmacy, Augusta University Medical Center, Augusta, GA, USA
10
Li KD, Venishetty N, Fernandez AM, Hakam N, Ghaffar U, Gupta S, Patel HV, Breyer BN. Fragility of overactive bladder medication clinical trials: A systematic review. Neurourol Urodyn 2024; 43:1523-1533. [PMID: 38594889 DOI: 10.1002/nau.25468] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/26/2024] [Accepted: 04/01/2024] [Indexed: 04/11/2024]
Abstract
PURPOSE Overactive bladder (OAB) syndrome significantly impairs quality of life, often necessitating pharmacological interventions with associated risks. The fragility of OAB trial outcomes, as measured by the fragility index (FI: smallest number of event changes to reverse statistical significance) and quotient (FQ: FI divided by total sample size expressed as a percentage), is critical yet unstudied. MATERIALS AND METHODS We conducted a systematic search for randomized controlled trials on OAB medications published between January 2000 and August 2023. Inclusion criteria were trials with two parallel arms reporting binary outcomes related to OAB medications. We extracted trial details, outcomes, and statistical tests employed. We calculated FI and FQ, analyzing associations with trial characteristics through linear regression. RESULTS We included 57 trials with a median sample size of 211 participants and a 12% median lost to follow-up. Most studies investigated anticholinergics (37/57, 65%). The median FI/FQ was 5/3.5%. Larger trials were less fragile (median FI 8; FQ 1.0%) compared to medium (FI: 4; FQ 2.5%) and small trials (FI: 4; FQ 8.3%). Double-blinded studies exhibited lower FQs (median 2.9%) than unblinded trials (6.7%). Primary and secondary outcomes had higher FIs (median 5 and 6, respectively) than adverse events (FI: 4). Each increase of 10 participants was associated with a 0.19 increase in FI (p < 0.001). CONCLUSIONS A change in outcome for a median of five participants, or 3.5% of the total sample size, could reverse the direction of statistical significance in OAB trials. Studies with larger sample sizes and efficacy outcomes from blinded trials were less fragile.
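The FQ definition given in the abstract (FI divided by total sample size, expressed as a percentage) reduces to one line of arithmetic. The sketch below uses a hypothetical function name and hypothetical trial numbers, not data from the review.

```python
def fragility_quotient(fragility_index, total_sample_size):
    """Fragility quotient as a percentage: 100 * FI / N."""
    if total_sample_size <= 0:
        raise ValueError("total_sample_size must be positive")
    return 100.0 * fragility_index / total_sample_size


# Hypothetical trial: FI of 5 among 200 randomized participants.
print(f"FQ = {fragility_quotient(5, 200):.1f}%")
```

Because the FQ scales the FI by trial size, two trials with the same FI can have very different quotients, which is why the abstract reports both measures.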
Affiliation(s)
- Kevin D Li
  - Department of Urology, University of California San Francisco, San Francisco, California, USA
  - Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, California, USA
- Nikit Venishetty
  - Paul L. Foster School of Medicine, Texas Tech University Health Sciences Center, El Paso, Texas, USA
- Adrian M Fernandez
  - Department of Urology, University of California San Francisco, San Francisco, California, USA
- Nizar Hakam
  - Department of Urology, University of California San Francisco, San Francisco, California, USA
- Umar Ghaffar
  - Department of Urology, University of California San Francisco, San Francisco, California, USA
- Shiv Gupta
  - Department of Urology, University of California San Francisco, San Francisco, California, USA
- Hiren V Patel
  - Department of Urology, University of California San Francisco, San Francisco, California, USA
- Benjamin N Breyer
  - Department of Urology, University of California San Francisco, San Francisco, California, USA
  - Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, California, USA
11
Luo M, Huang J, Wang Y, Li Y, Liu Z, Liu M, Tao Y, Cao R, Chai Q, Liu J, Fei Y. How fragile the positive results of Chinese herbal medicine randomized controlled trials on irritable bowel syndrome are? BMC Complement Med Ther 2024; 24:300. [PMID: 39143474 PMCID: PMC11323352 DOI: 10.1186/s12906-024-04561-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/16/2024] [Accepted: 06/21/2024] [Indexed: 08/16/2024] [Open Access]
Abstract
OBJECTIVE The fragility index (FI), which is the minimum number of changes in status from "event" to "non-event" resulting in a loss of statistical significance, serves as a significant supplementary indicator for clinical physicians in interpreting clinical trial results and aids in understanding the outcomes of randomized controlled trials (RCTs). In this systematic literature survey, we evaluated the FI for RCTs evaluating Chinese herbal medicine (CHM) for irritable bowel syndrome (IBS), and explored potential associations between study characteristics and the robustness of RCTs. METHODS A comprehensive search was conducted in four databases in Chinese and four databases in English from their inception to January 1, 2023. RCTs that allocated participants in a 1:1 ratio to two parallel arms and reported at least one statistically significant binary outcome were included. FI was calculated by the iterative reduction of a target outcome event in the treatment group and concomitant subtraction of a non-target event from that group, until positive significance (defined as P < 0.05 by Fisher's exact test) is lost. The lower the FI (minimum 1) of a trial outcome, the more fragile the positive result of the outcome was. Linear regression models were used to explore factors influencing the FI. RESULTS A total of 30 trials from 2 4118 potentially relevant citations were finally included. The median FI of total trials included was 1.5 (interquartile range [IQR], 1-5), and half of the trials (n = 15) had an FI equal to 1. In 12 trials (40%), the total number of participants lost to follow-up surpassed the respective FI. The study also identified that increased FI was significantly associated with the absence of TCM syndrome differentiation in the patient inclusion criteria, larger total sample size, low risk of bias, and larger numbers of events. CONCLUSIONS The majority of CHM IBS RCTs with positive results were found to be fragile. Ensuring adequate sample size, scientifically rigorous study design, proper control of confounding factors, and a quality control calibration for consistency of TCM diagnostic results among clinicians should be addressed to increase the robustness of the RCTs. We recommend reporting the FI as one of the components of sensitivity analysis in future RCTs to facilitate the assessment of the fragility of trials.
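The iterative FI procedure described in METHODS can be sketched as follows. This is an illustrative reading, not the authors' code: it assumes each step converts one target event in the treatment arm into a non-target outcome with the arm size held fixed, it uses a pure-Python two-sided Fisher exact test, and the function names are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x events in row 1, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total

    observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables that are no more likely than the observed one.
    return sum(p for p in map(prob, range(lo, hi + 1))
               if p <= observed * (1 + 1e-7))


def fragility_index(events_trt, n_trt, events_ctl, n_ctl, alpha=0.05):
    """Number of treatment-arm events that must become non-events before
    P >= alpha. Returns None if the result is not significant to begin with."""
    if events_trt * n_ctl <= events_ctl * n_trt:
        raise ValueError("sketch assumes a higher event rate in the treatment arm")
    a, b = events_trt, n_trt - events_trt
    c, d = events_ctl, n_ctl - events_ctl
    if fisher_exact_two_sided(a, b, c, d) >= alpha:
        return None
    changes = 0
    while a > 0 and fisher_exact_two_sided(a, b, c, d) < alpha:
        a, b = a - 1, b + 1  # one responder becomes a non-responder
        changes += 1
    return changes


# Hypothetical trial: 30/40 responders with treatment vs 18/40 with control.
print(fragility_index(30, 40, 18, 40))
```

Comparing the returned value with the number of participants lost to follow-up gives the kind of check reported in RESULTS.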
Affiliation(s)
- Minjing Luo
- Centre for Evidence-Based Chinese Medicine, Beijing University of Chinese Medicine, No.11, Bei San Huan Dong Lu, Chaoyang District, Beijing, 100029, China
- School of Traditional Chinese Medicine, Beijing University of Chinese Medicine, Beijing, 100029, China
- Jinghan Huang
- Centre for Evidence-Based Chinese Medicine, Beijing University of Chinese Medicine, No.11, Bei San Huan Dong Lu, Chaoyang District, Beijing, 100029, China
- Yingqiao Wang
- Centre for Evidence-Based Chinese Medicine, Beijing University of Chinese Medicine, No.11, Bei San Huan Dong Lu, Chaoyang District, Beijing, 100029, China
- Yilin Li
- School of Qi-Huang Chinese Medicine, Beijing University of Chinese Medicine, Beijing, 100029, China
- Zhihan Liu
- Centre for Evidence-Based Chinese Medicine, Beijing University of Chinese Medicine, No.11, Bei San Huan Dong Lu, Chaoyang District, Beijing, 100029, China
- School of Traditional Chinese Medicine, Beijing University of Chinese Medicine, Beijing, 100029, China
- Meijun Liu
- Centre for Evidence-Based Chinese Medicine, Beijing University of Chinese Medicine, No.11, Bei San Huan Dong Lu, Chaoyang District, Beijing, 100029, China
- Yunci Tao
- Centre for Evidence-Based Chinese Medicine, Beijing University of Chinese Medicine, No.11, Bei San Huan Dong Lu, Chaoyang District, Beijing, 100029, China
- School of Traditional Chinese Medicine, Beijing University of Chinese Medicine, Beijing, 100029, China
- Rui Cao
- Centre for Evidence-Based Chinese Medicine, Beijing University of Chinese Medicine, No.11, Bei San Huan Dong Lu, Chaoyang District, Beijing, 100029, China
- School of Traditional Chinese Medicine, Beijing University of Chinese Medicine, Beijing, 100029, China
- Qianyun Chai
- Centre for Evidence-Based Chinese Medicine, Beijing University of Chinese Medicine, No.11, Bei San Huan Dong Lu, Chaoyang District, Beijing, 100029, China
- School of Traditional Chinese Medicine, Beijing University of Chinese Medicine, Beijing, 100029, China
- Jianping Liu
- Centre for Evidence-Based Chinese Medicine, Beijing University of Chinese Medicine, No.11, Bei San Huan Dong Lu, Chaoyang District, Beijing, 100029, China
- School of Traditional Chinese Medicine, Beijing University of Chinese Medicine, Beijing, 100029, China
- Yutong Fei
- Centre for Evidence-Based Chinese Medicine, Beijing University of Chinese Medicine, No.11, Bei San Huan Dong Lu, Chaoyang District, Beijing, 100029, China.
- School of Traditional Chinese Medicine, Beijing University of Chinese Medicine, Beijing, 100029, China.
|
12
|
Al-Asadi M, Sherren M, Abdel Khalik H, Leroux T, Ayeni OR, Madden K, Khan M. The Continuous Fragility Index of Statistically Significant Findings in Randomized Controlled Trials That Compare Interventions for Anterior Shoulder Instability. Am J Sports Med 2024; 52:2667-2675. [PMID: 38258495 PMCID: PMC11344964 DOI: 10.1177/03635465231202522] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2023] [Accepted: 07/31/2023] [Indexed: 01/24/2024]
Abstract
BACKGROUND Evidence-based care relies on robust research. The fragility index (FI) is used to assess the robustness of statistically significant findings in randomized controlled trials (RCTs). While the traditional FI is limited to dichotomous outcomes, a novel tool, the continuous fragility index (CFI), allows for the assessment of the robustness of continuous outcomes. PURPOSE To calculate the CFI of statistically significant continuous outcomes in RCTs evaluating interventions for managing anterior shoulder instability (ASI). STUDY DESIGN Meta-analysis; Level of evidence, 2. METHODS A search was conducted across the MEDLINE, Embase, and CENTRAL databases for RCTs assessing management strategies for ASI from inception to October 6, 2022. Studies that reported a statistically significant difference between study groups in ≥1 continuous outcome were included. The CFI was calculated and applied to all available RCTs reporting interventions for ASI. Multivariable linear regression was performed between the CFI and various study characteristics as predictors. RESULTS There were 27 RCTs, with a total of 1846 shoulders, included. The median sample size was 61 shoulders (IQR, 43). The median CFI across 27 RCTs was 8.2 (IQR, 17.2; 95% CI, 3.6-15.4). The median CFI was 7.9 (IQR, 21; 95% CI, 1-22) for 11 studies comparing surgical methods, 22.6 (IQR, 16; 95% CI, 8.2-30.4) for 6 studies comparing nonsurgical reduction interventions, 2.8 for 3 studies comparing immobilization methods, and 2.4 for 3 studies comparing surgical versus nonsurgical interventions. Significantly, 22 of 57 included outcomes (38.6%) from studies with completed follow-up data had a loss to follow-up exceeding their CFI. Multivariable regression demonstrated that there was a statistically significant positive correlation between a trial's sample size and the CFI of its outcomes (r = 0.23 [95% CI, 0.13-0.33]; P < .001). 
CONCLUSION More than a third of continuous outcomes in ASI trials had a CFI less than the reported loss to follow-up. This carries the significant risk of reversing trial findings and should be considered when evaluating available RCT data. We recommend including the FI, CFI, and loss to follow-up in the abstracts of future RCTs.
Affiliation(s)
- Mohammed Al-Asadi
- Faculty of Health Sciences, McMaster University, Hamilton, Ontario, Canada
- Hassaan Abdel Khalik
- Division of Orthopaedic Surgery, Department of Surgery, McMaster University, Hamilton, Ontario, Canada
- Timothy Leroux
- Division of Orthopaedic Surgery, Department of Surgery, University of Toronto, Toronto, Ontario, Canada
- Olufemi R. Ayeni
- Division of Orthopaedic Surgery, Department of Surgery, McMaster University, Hamilton, Ontario, Canada
- Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Ontario, Canada
- Kim Madden
- Division of Orthopaedic Surgery, Department of Surgery, McMaster University, Hamilton, Ontario, Canada
- Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Ontario, Canada
- Moin Khan
- Division of Orthopaedic Surgery, Department of Surgery, McMaster University, Hamilton, Ontario, Canada
- Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Ontario, Canada
|
13
|
Skorochod R, Gronovich Y. Fragility Index and Fragility Quotient in Statistically Significant Randomized Controlled Trials in Plastic Breast Surgery. PLASTIC AND RECONSTRUCTIVE SURGERY-GLOBAL OPEN 2024; 12:e5916. [PMID: 38903137 PMCID: PMC11188868 DOI: 10.1097/gox.0000000000005916] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/05/2024] [Accepted: 05/01/2024] [Indexed: 06/22/2024]
Abstract
Background The fragility index (FI) was conceived as an adjunct to the P value, signifying the strength of statistically significant results. The index states the minimal number of patients whose outcome must be changed from "event" to "nonevent" for the results to become statistically nonsignificant. The FI has been applied in various medical specialties to assess the robustness of results presented in studies. We aim to assess the robustness of statistically significant results in studies on plastic surgery of the breast and determine factors correlated with studies deemed fragile. Methods A systematic literature review of PubMed using designated keywords was performed. Background characteristics were extracted from the studies, alongside the significance of outcomes. FI and fragility quotient were calculated for each analyzed outcome and correlated with various baseline characteristics. Results FI and fragility quotient were both significantly correlated only with the P value of the analyzed outcomes. However, grouping studies based on the P value into three categories did not demonstrate a difference in FI. Comparisons of fragile and robust studies did not demonstrate a statistically significant difference in baseline variables, except for the mean P value of the outcome. Conclusion Statistically significant results of randomized controlled trials in plastic surgery of the breast suffer from extensive fragility, and researchers should apply their conclusions critically in practice.
Affiliation(s)
- Ron Skorochod
- From the Department of Plastic and Reconstructive Surgery, Shaare Zedek Medical Center; Hebrew University Faculty of Medicine, Jerusalem, Israel
- Yoav Gronovich
- From the Department of Plastic and Reconstructive Surgery, Shaare Zedek Medical Center; Hebrew University Faculty of Medicine, Jerusalem, Israel
|
14
|
Ramesh AV, Munby HNP, Thomas M. The fragility index in randomised controlled trials of interventions for aneurysmal subarachnoid haemorrhage: A systematic review. J Intensive Care Soc 2024; 25:164-170. [PMID: 38737309 PMCID: PMC11086711 DOI: 10.1177/17511437231218199] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/14/2024] Open
Abstract
Background Fragility analysis supplements the p-value and risk of bias assessment in the interpretation of results of randomised controlled trials. In this systematic review we determine the fragility index (FI) and fragility quotient (FQ) of randomised trials in aneurysmal subarachnoid haemorrhage. Methods This is a systematic review registered with PROSPERO (ID: CRD42020173604). Randomised controlled trials in adults with aneurysmal subarachnoid haemorrhage were analysed if they reported a statistically significant primary outcome of mortality, function (e.g. modified Rankin Scale), vasospasm or delayed neurological deterioration. Results We identified 4825 records with 18 randomised trials selected for analysis. The median fragility index was 2.5 (inter-quartile range 0.25-5) and the median fragility quotient was 0.015 (IQR 0.02-0.039). Five of 20 trial outcomes (25%) had a fragility index of 0. In seven trials (39.0%), the number of participants lost to follow-up was greater than or equal to the fragility index. Only 16.7% of trials are at low risk of bias. Conclusion Randomised controlled trial evidence supporting management of aneurysmal subarachnoid haemorrhage is weaker than indicated by conventional analysis using p-values alone. Increased use of fragility analysis by clinicians and researchers could improve the translation of evidence to practice.
Affiliation(s)
- Aravind V Ramesh
- ST6 Intensive Care Medicine, North Bristol NHS Trust, Bristol, UK
- Henry NP Munby
- ST7 Intensive Care Medicine & Respiratory Medicine, University Hospitals Bristol and Weston NHS Foundation Trust, Bristol, UK
- Matt Thomas
- Intensive Care Medicine, North Bristol NHS Trust, Bristol, UK
|
15
|
Bai X, Wan Z, Li Y, Jiang Q, Wu X, Xu B, Li X, Zhou R, Mi J, Sun Y, Ruan G, Han W, Li G, Yang H. Fragility index analysis for randomized controlled trials of approved biologicals and small molecule drugs in inflammatory bowel diseases. Int Immunopharmacol 2024; 130:111752. [PMID: 38422772 DOI: 10.1016/j.intimp.2024.111752] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/05/2023] [Revised: 01/30/2024] [Accepted: 02/21/2024] [Indexed: 03/02/2024]
Abstract
INTRODUCTION Biologics and small molecules have been increasingly applied in Crohn's disease (CD) and ulcerative colitis (UC), but the robustness of their trials has not been evaluated. METHODS We first collected all biologics and small molecules approved for CD or UC up to December 1, 2022. Databases were then queried using keywords combining the chemical name with CD or UC. Randomized controlled trials (RCTs) with a two-arm, 1:1 design were included. Fragility index (FI) and fragility quotient (FQ) were subsequently calculated. RESULTS We included twenty-eight RCTs: nine pivotal trials listed in approval labels and nineteen non-pivotal trials not included in the labels. The median sample size was 99 [IQR, 60-262] and the median number lost to follow-up (LFU) was 14 [IQR, 8-43]. Pivotal trials in the labels had a median FI of 8 [IQR, 4-14, n = 6], which was marginally higher than that of non-pivotal trials (3 [IQR, 2-4], p = 0.08). The median FQ was 0.0330 [IQR, 0.1220-0.0466] and 0.0310 [IQR, 0.0129-0.0540] for pivotal and non-pivotal trials, respectively (p = 1.0). The sample size and FI were significantly correlated (Spearman correlation coefficient [r] = 0.56, 95% CI 0.21-0.78, p = 0.003). The number of total events was also significantly correlated with FI (r = 0.53, 95% CI 0.17-0.77, p = 0.006). Study p-values were significantly associated with FI (p = 0.01): trials with p-values < 0.001 had the highest median FI of 10 [IQR, 6-17]. No factor was found to be strongly correlated with FQ. CONCLUSION As measured by FI or FQ, results from trials assessing approved biologics or small molecules for treating CD or UC were vulnerable to small changes. Pivotal studies contributing to regulatory approvals exhibited a relatively higher degree of resilience compared to non-pivotal trials.
Affiliation(s)
- Xiaoyin Bai
- Department of Gastroenterology, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China
- Ziqi Wan
- Eight-year Program, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing, China
- Yi Li
- Tsinghua Clinical Research Institute, School of Medicine, Tsinghua University, Beijing, China
- Qingwei Jiang
- Department of Gastroenterology, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China
- Xia Wu
- Department of Medicine, Tufts Medical Center, Boston, MA 02111, USA
- Runing Zhou
- Department of Gastroenterology, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China
- Jiarui Mi
- Department of Cell and Molecular Biology, Karolinska Institutet, Solna, Sweden
- Yinghao Sun
- Department of Gastroenterology, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China
- Gechong Ruan
- Department of Gastroenterology, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China
- Wei Han
- Institute of Basic Medical Sciences, School of Basic Medicine, Peking Union Medical College, Chinese Academy of Medical Sciences, Beijing, China
- Hong Yang
- Department of Gastroenterology, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China.
|
16
|
Kotani Y, Turi S, Ortalda A, Baiardo Redaelli M, Marchetti C, Landoni G, Bellomo R. Positive single-center randomized trials and subsequent multicenter randomized trials in critically ill patients: a systematic review. Crit Care 2023; 27:465. [PMID: 38017475 PMCID: PMC10685543 DOI: 10.1186/s13054-023-04755-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2023] [Accepted: 11/21/2023] [Indexed: 11/30/2023] Open
Abstract
BACKGROUND It is unclear how often survival benefits observed in single-center randomized controlled trials (sRCTs) involving critically ill patients are confirmed by subsequent multicenter randomized controlled trials (mRCTs). We aimed to perform a systematic literature review of sRCTs with a statistically significant mortality reduction and to evaluate whether subsequent mRCTs confirmed such reduction. METHODS We searched PubMed for sRCTs published in the New England Journal of Medicine, JAMA, or Lancet, from inception until December 31, 2016. We selected studies reporting a statistically significant mortality decrease using any intervention (drug, technique, or strategy) in adult critically ill patients. We then searched for subsequent mRCTs addressing the same research question tested by the sRCT. We compared the concordance of results between sRCTs and mRCTs when any mRCT was available. We registered this systematic review in the PROSPERO International Prospective Register of Systematic Reviews (CRD42023455362). RESULTS We identified 19 sRCTs reporting a significant mortality reduction in adult critically ill patients. For 16 sRCTs, we identified at least one subsequent mRCT (24 trials in total), while the interventions from three sRCTs have not yet been addressed in a subsequent mRCT. Only one out of 16 sRCTs (6%) was followed by an mRCT replicating a significant mortality reduction; 14 (88%) were followed by mRCTs with no mortality difference. The positive finding of one sRCT (6%) on intensive glycemic control was contradicted by a subsequent mRCT showing a significant mortality increase. Of the 14 sRCTs referenced at least once in international guidelines, six (43%) have since been either removed or suggested against in the most recent versions of relevant guidelines. CONCLUSION Mortality reduction shown by sRCTs is typically not replicated by mRCTs. The findings of sRCTs should be considered hypothesis-generating and should not contribute to guidelines.
Affiliation(s)
- Yuki Kotani
- Department of Anesthesia and Intensive Care, IRCCS San Raffaele Scientific Institute, Via Olgettina 60, 20132, Milan, Italy
- School of Medicine, Vita-Salute San Raffaele University, Via Olgettina 58, 20132, Milan, Italy
- Department of Intensive Care Medicine, Kameda Medical Center, 929 Higashi-cho, Kamogawa, Chiba, 296-8602, Japan
- Stefano Turi
- Department of Anesthesia and Intensive Care, IRCCS San Raffaele Scientific Institute, Via Olgettina 60, 20132, Milan, Italy
- Alessandro Ortalda
- Department of Anesthesia and Intensive Care, IRCCS San Raffaele Scientific Institute, Via Olgettina 60, 20132, Milan, Italy
- Martina Baiardo Redaelli
- Department of Anesthesia and Intensive Care, IRCCS San Raffaele Scientific Institute, Via Olgettina 60, 20132, Milan, Italy
- Cristiano Marchetti
- Department of Anesthesia and Intensive Care, IRCCS San Raffaele Scientific Institute, Via Olgettina 60, 20132, Milan, Italy
- Giovanni Landoni
- Department of Anesthesia and Intensive Care, IRCCS San Raffaele Scientific Institute, Via Olgettina 60, 20132, Milan, Italy.
- School of Medicine, Vita-Salute San Raffaele University, Via Olgettina 58, 20132, Milan, Italy.
- Rinaldo Bellomo
- Department of Critical Care, The University of Melbourne, Melbourne, Australia
- Australian and New Zealand Intensive Care Research Centre, Monash University, Melbourne, Australia
|
17
|
Pearsall C, Constant M, Saltzman BM, Parisien RL, Levine W, Trofa D. The Fragility of Statistical Significance in Sham Orthopaedic Surgery: A Systematic Review of Randomized Controlled Trials. J Am Acad Orthop Surg 2023; 31:e994-e1002. [PMID: 37678845 DOI: 10.5435/jaaos-d-23-00245] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/24/2023] [Accepted: 07/26/2023] [Indexed: 09/09/2023] Open
Abstract
OBJECTIVES The purpose of this study was to determine the stability of statistical findings among sham surgery randomized controlled trials (RCTs) in orthopaedic surgery using fragility analysis. METHODS A systematic review of PubMed was conducted to identify studies reporting dichotomous outcomes pertaining to sham surgery. The final review included eight RCTs, involving only partial meniscectomies and vertebroplasties, from 2009 to 2020. Holding the sample size fixed for each dichotomous outcome measure (events versus non-events), the Total Fragility Index (TFI), which is composed of the fragility index (FI) and reverse fragility index (RFI), was calculated by altering the ratio of events to non-events in an iterative fashion until results were reversed from significant to nonsignificant findings (FI) or vice versa (RFI). The TFI, FI, and RFI were divided by their sample sizes to obtain the respective total fragility quotient, fragility quotient (FQ), and reverse fragility quotient. Median fragility indices and quotients were reported for all studies. RESULTS The eight RCTs included 50 dichotomous outcomes involving either partial meniscectomies or vertebroplasties, with a median TFI and total fragility quotient of 5 [interquartile range (IQR) 4 to 6] and 0.035 (IQR 0.028 to 0.048), respectively, indicating that a median of five total patients or 3.5 per 100 patients would need to experience a different outcome to reverse significant or insignificant findings for each of the eight trials. Among the 8 statistically significant (P < 0.05) outcome events (16%), the respective FI and FQ were 2 (IQR 1 to 5) and 0.018 (IQR 0.010 to 0.044). Among the 42 statistically insignificant outcome events (84%), the respective RFI and reverse fragility quotient were 5 (IQR 4 to 6) and 0.04 (IQR 0.034 to 0.048). The median number of patients lost to follow-up was 1.5 (IQR 0.5 to 2). 
CONCLUSION The unstable findings in partial meniscectomy and vertebroplasty sham surgical RCTs undermine their study conclusions and recommendations. We recommend using fragility analysis in future sham surgical RCTs to contextualize statistical findings. LEVEL OF EVIDENCE Level IV; Systematic Review.
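The reverse fragility index and the quotients described in the Methods above can be sketched in Python as follows. This is a hedged illustration, not the authors' procedure: the abstract says only that the ratio of events to non-events was altered iteratively, so the specific rule used here (converting events to non-events in the arm with the lower event proportion until Fisher's exact P drops below 0.05) is an assumption, and the names `reverse_fragility_index` and `fragility_quotient` are hypothetical.

```python
from math import comb


def fisher_exact_p(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    r1, r2, c1 = a + b, c + d, a + c
    denom = comb(r1 + r2, c1)

    def prob(x):
        return comb(r1, x) * comb(r2, c1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


def reverse_fragility_index(e1, n1, e2, n2, alpha=0.05):
    """Event-to-non-event changes needed to turn a nonsignificant result significant.

    Assumed rule: changes are made in the arm with the lower event proportion.
    Returns None if the result is already significant or cannot be reversed.
    """
    if e1 * n2 > e2 * n1:  # make arm 1 the arm with the lower event proportion
        e1, n1, e2, n2 = e2, n2, e1, n1
    if fisher_exact_p(e1, n1 - e1, e2, n2 - e2) < alpha:
        return None
    rfi = 0
    while fisher_exact_p(e1, n1 - e1, e2, n2 - e2) >= alpha:
        if e1 == 0:
            return None  # no events left to convert
        e1 -= 1
        rfi += 1
    return rfi


def fragility_quotient(index, total_sample_size):
    """Fragility index (FI, RFI, or TFI) divided by the total sample size."""
    return index / total_sample_size


if __name__ == "__main__":
    # Hypothetical nonsignificant outcome: 10/30 versus 9/30 events
    rfi = reverse_fragility_index(10, 30, 9, 30)
    print(rfi, fragility_quotient(rfi, 60))
```

A quotient multiplied by 100 gives the reading used in the Results: 0.035 corresponds to 3.5 outcome changes per 100 patients.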
Affiliation(s)
- Christian Pearsall
- From the Department of Orthopedic Surgery, Columbia University Irving Medical Center, New York, NY (Pearsall, Constant, Levine, and Trofa), the Department of Orthopedic Surgery, OrthoCarolina, Charlotte, NC (Saltzman), and the Department of Orthopedic Surgery, Mount Sinai Health System, New York, NY (Parisien)
|
18
|
Bleakley CM, Wagemans J, Schurz AP, Smoliga JM. How robust are clinical trials in primary and secondary ankle sprain prevention? Phys Ther Sport 2023; 64:85-90. [PMID: 37801794 DOI: 10.1016/j.ptsp.2023.08.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2023] [Revised: 08/24/2023] [Accepted: 08/25/2023] [Indexed: 10/08/2023]
Abstract
OBJECTIVES Determine the statistical stability of RCTs examining primary and secondary prevention of ankle sprains. METHODS Databases were searched to August 2023. We included parallel design RCTs, using conservative interventions for preventing ankle sprain, reporting dichotomous injury event outcomes. Statistical stability was quantified using Fragility Index (FI) and Fragility Quotient (FQ). Subgroup analyses were undertaken to test if FI varied by study objective, original approach to analysis (frequency vs time to event), follow-up duration, and pre-registration. RESULTS 3559 studies were screened and 45 RCTs were included. The median number of events required to change the statistical significance (FI) was 4 (IQR 1-6). FI was similar regardless of study objective, original analysis, follow-up duration, and pre-registration status. Median (IQR) FQ was 0.015 (0.005-0.046); therefore, reversing events in <2 patients/100 would alter significance. In 80% of studies the number of patients lost to follow-up was greater than the FI. CONCLUSION RCTs informing primary and secondary prevention of ankle sprain are fragile. Only a small percentage of outcome event reversals would reverse study significance, and this is often exceeded by the number of drop outs. Robust reporting of dichotomous outcomes requires the use of P values and key metrics such as FI or FQ.
Affiliation(s)
- C M Bleakley
- Faculty of Life and Health Sciences, Ulster University, Belfast, United Kingdom.
- J Wagemans
- Faculty of Medicine and Health Sciences, University of Antwerp, Belgium
- A P Schurz
- Department of Health Professions, Bern University of Applied Sciences, Switzerland; Faculty of Physical Education and Physiotherapy, Vrije Universiteit Brussels, Belgium
- J M Smoliga
- Department of Physical Therapy, High Point University, United States; School of Medicine, Tufts University, United States
|
19
|
Berg A, Lyons NB, Badami A, Reynolds J, Pizano L, Pust GD, Meizoso J, Namias N, Yeh DD. Statistical Power of Randomized Controlled Trials in Trauma Surgery. J Am Coll Surg 2023; 237:731-736. [PMID: 37417653 DOI: 10.1097/xcs.0000000000000800] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 07/08/2023]
Abstract
BACKGROUND Our purpose was to conduct a bibliometric study investigating the prevalence of underpowered randomized controlled trials (RCTs) in trauma surgery. STUDY DESIGN A medical librarian conducted a search of RCTs in trauma published from 2000 to 2021. Data extracted included study type, sample size calculation, and power analyses. Post hoc calculations were performed using a power of 80% and an alpha level of 0.05. A CONSORT checklist was then tabulated from each study as well as a fragility index for studies with statistical significance. RESULTS In total 187 RCTs from multiple continents and 60 journals were examined. A total of 133 (71%) were found to have "positive" findings consistent with their hypothesis. When evaluating their methods, 51.3% of articles did not report how they calculated their intended sample size. Of those that did, 25 (27%) did not meet their target enrollment. When examining post hoc power, 46%, 57%, and 65% were adequately powered to detect small, medium, and large effect sizes, respectively. Only 11% of RCTs had complete adherence with CONSORT reporting guidelines and the average CONSORT score was 19 out of 25. For positive superiority trials with binary outcomes, the fragility index median (interquartile range) was 2 (2 to 8). CONCLUSIONS A concerningly large proportion of recently published RCTs in trauma surgery do not report a priori sample size calculations, do not meet enrollment targets, and are not adequately powered to detect even large effect sizes. There exists opportunity for improvement of trauma surgery study design, conduct, and reporting.
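The kind of post hoc power check described above can be sketched in Python under stated assumptions. The abstract does not say which test or effect size definitions were used, so this example assumes a two-sided, two-sample comparison of means under a normal approximation and takes Cohen's conventional standardized effects (0.2, 0.5, 0.8) for "small", "medium", and "large". The function names are hypothetical.

```python
from statistics import NormalDist

# Cohen's conventional standardized effect sizes (an assumption, not from the source)
EFFECTS = {"small": 0.2, "medium": 0.5, "large": 0.8}


def two_sample_power(d, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test for standardized effect d."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = abs(d) * (n_per_arm / 2) ** 0.5  # expected value of the test statistic
    # Probability of rejecting in either tail
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


def adequately_powered(n_per_arm, target=0.80, alpha=0.05):
    """Flag whether a trial of this size reaches the target power for each effect size."""
    return {
        label: two_sample_power(d, n_per_arm, alpha) >= target
        for label, d in EFFECTS.items()
    }


if __name__ == "__main__":
    # Hypothetical trial with 64 participants per arm
    print(adequately_powered(64))
```

Applying such a check to each trial's achieved enrollment, and counting the trials that pass at each effect size, gives proportions of the kind reported in the Results.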
Affiliation(s)
- Arthur Berg
- From the Department of Trauma and Surgical Critical Care, Jackson Memorial Hospital, Miami, FL (Berg, Lyons, Badami, Reynolds, Pizano, Pust, Meizoso, Namias)
- Nicole B Lyons
- From the Department of Trauma and Surgical Critical Care, Jackson Memorial Hospital, Miami, FL (Berg, Lyons, Badami, Reynolds, Pizano, Pust, Meizoso, Namias)
- Abbasali Badami
- From the Department of Trauma and Surgical Critical Care, Jackson Memorial Hospital, Miami, FL (Berg, Lyons, Badami, Reynolds, Pizano, Pust, Meizoso, Namias)
- John Reynolds
- From the Department of Trauma and Surgical Critical Care, Jackson Memorial Hospital, Miami, FL (Berg, Lyons, Badami, Reynolds, Pizano, Pust, Meizoso, Namias)
- Louis Pizano
- From the Department of Trauma and Surgical Critical Care, Jackson Memorial Hospital, Miami, FL (Berg, Lyons, Badami, Reynolds, Pizano, Pust, Meizoso, Namias)
- Gerd Daniel Pust
- From the Department of Trauma and Surgical Critical Care, Jackson Memorial Hospital, Miami, FL (Berg, Lyons, Badami, Reynolds, Pizano, Pust, Meizoso, Namias)
- Jonathan Meizoso
- From the Department of Trauma and Surgical Critical Care, Jackson Memorial Hospital, Miami, FL (Berg, Lyons, Badami, Reynolds, Pizano, Pust, Meizoso, Namias)
- Nicholas Namias
- From the Department of Trauma and Surgical Critical Care, Jackson Memorial Hospital, Miami, FL (Berg, Lyons, Badami, Reynolds, Pizano, Pust, Meizoso, Namias)
- Daniel Dante Yeh
- and the Department of Trauma and Surgical Critical Care, Denver Health, Denver, CO (Yeh)
|
20
|
Parhar KKS, Doig C. The authors reply. Crit Care Med 2023; 51:e188-e189. [PMID: 37589527 DOI: 10.1097/ccm.0000000000005971] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/18/2023]
Affiliation(s)
- Ken Kuljit S Parhar
- Department of Critical Care Medicine, University of Calgary and Alberta Health Services, Foothills Medical Center, Calgary, AB, Canada
- O'Brien Institute for Public Health, University of Calgary, Calgary, AB, Canada
- Libin Cardiovascular Institute, University of Calgary, Calgary, AB, Canada
- Department of Community Health Sciences, University of Calgary, Calgary, AB, Canada
- Christopher Doig
- Department of Critical Care Medicine, University of Calgary and Alberta Health Services, Foothills Medical Center, Calgary, AB, Canada
- O'Brien Institute for Public Health, University of Calgary, Calgary, AB, Canada
- Department of Community Health Sciences, University of Calgary, Calgary, AB, Canada
|
21
|
Hayes J, Zuercher M, Gai N, Chowdhury AR, Aoyama K. The Fragility Index of randomized controlled trials in pediatric anesthesiology. Can J Anaesth 2023; 70:1449-1460. [PMID: 37286747 DOI: 10.1007/s12630-023-02513-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 08/29/2022] [Revised: 01/16/2023] [Accepted: 01/23/2023] [Indexed: 06/09/2023] Open
Abstract
PURPOSE The P value is a widely used measure of statistical importance but has many drawbacks and limitations, one being that it does not reflect the robustness of the results of a clinical trial. The Fragility Index (FI) was developed as a measure of how many outcome events would need to change to nonevents to render a significant P value nonsignificant (P ≥ 0.05). The FI of trials from other medical specialties is typically < 5. We aimed to determine the FI of pediatric anesthesiology randomized controlled trials (RCTs) and to test for association with various characteristics of the included trials. METHODS We conducted a comprehensive systematic search of high-impact anesthesia, surgical, and medical journals from the last 25 years for trials comparing an intervention between two groups with a statistically significant P value (< 0.05) for a dichotomous outcome. We also compared FI values for variables that reflect the quality and importance of a trial. RESULTS The median [interquartile range] FI was 3 [1-7] and correlated positively with the number of participants (rS = 0.41; P < 0.001) and events (rS = 0.42; P < 0.001), and negatively with the P value (rPB = -0.36; P < 0.001). Other measures of trial quality and impact or importance were not strongly associated with the FI. CONCLUSIONS The FI of published trials in pediatric anesthesiology is as low as in other medical specialties. Larger trials with more events and P values ≤ 0.01 were associated with a higher FI.
Affiliation(s)
- Jason Hayes
  - Department of Anesthesia and Pain Medicine, The Hospital for Sick Children (SickKids), 555 University Avenue, Toronto, ON, M5G 1X8, Canada.
  - Department of Anesthesiology and Pain Medicine, University of Toronto, Toronto, ON, Canada.
- Mael Zuercher
  - Department of Anesthesia and Pain Medicine, The Hospital for Sick Children (SickKids), 555 University Avenue, Toronto, ON, M5G 1X8, Canada
- Nan Gai
  - Department of Anesthesia and Pain Medicine, The Hospital for Sick Children (SickKids), 555 University Avenue, Toronto, ON, M5G 1X8, Canada
  - Department of Anesthesiology and Pain Medicine, University of Toronto, Toronto, ON, Canada
- Apala Roy Chowdhury
  - Department of Anesthesia and Pain Medicine, The Hospital for Sick Children (SickKids), 555 University Avenue, Toronto, ON, M5G 1X8, Canada
- Kazuyoshi Aoyama
  - Department of Anesthesia and Pain Medicine, The Hospital for Sick Children (SickKids), 555 University Avenue, Toronto, ON, M5G 1X8, Canada
  - Department of Anesthesiology and Pain Medicine, University of Toronto, Toronto, ON, Canada
  - Program in Child Health Evaluative Sciences, SickKids Research Institute, Toronto, ON, Canada
|
22
|
Reynolds PM, Wells L, Powell M, MacLaren R. Associated Mortality Risk of Proton Pump Inhibitor Therapy for the Prevention of Stress Ulceration in Intensive Care Unit Patients: A Systematic Review and Meta-analysis. J Clin Gastroenterol 2023; 57:586-594. [PMID: 35648972 DOI: 10.1097/mcg.0000000000001723] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/02/2021] [Accepted: 04/27/2022] [Indexed: 12/10/2022]
Abstract
GOALS The aim was to systematically evaluate risks and benefits of proton pump inhibitor (PPI) use for stress ulcer prophylaxis in the critically ill patient. BACKGROUND Whether PPIs increase mortality in the critically ill patient remains controversial. STUDY Systematic review and meta-analysis of randomized controlled trials (RCTs) and cohort studies with trial sequential analysis, Bayesian sensitivity analysis, and fragility index analysis. RESULTS A total of 31 studies in 78,009 critically ill adults receiving PPIs versus any comparator were included. PPI use was associated with an increased mortality risk in all studies [19.6% PPI vs. 17.5% comparator; RR: 1.10; 95% confidence interval (CI): 1.02-1.20; P =0.01], in the subgroup of RCTs (19.4% vs. 18.7%; RR: 1.05; 95% CI: 1.0-1.09, P =0.04), but not cohort studies (19.9% vs. 16.7%; RR: 1.12; 95% CI: 0.98-1.28, P =0.09). Results were maintained with a Bayesian sensitivity analysis (RR: 1.13; 95% credible interval: 1.035-1.227) and a fragility index analysis, but not sequential analysis ( P =0.16). RCTs with a higher baseline severity of illness revealed the greatest mortality risk with PPI use (32.1% PPI vs. 29.4% comparator; RR: 1.09; 95% CI: 1.04-1.14; P <0.001). PPI use reduced clinically important bleeding in RCTs (1.4% PPI vs. 2.1% comparator; RR: 0.67; 95% CI: 0.5-0.9; P =0.009) but increased bleeding in cohort studies (2.7% PPI vs. 1.2% comparator; RR: 2.05; 95% CI: 1.2-3.52; P =0.009). PPI use was not associated with a lower incidence of clinically important bleeding when compared with histamine-2 receptor antagonists (1.3% vs. 1.9%; RR: 0.59; 95% CI: 0.28-1.25, P =0.09). CONCLUSIONS This meta-analysis demonstrated an association between PPI use and an increased risk of mortality.
Affiliation(s)
- Paul M Reynolds
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of Colorado, Denver, CO
- Lauren Wells
  - PGY2 Emergency Medicine Pharmacy Resident, Froedtert and the Medical College of Wisconsin, Wauwatosa, WI
- Robert MacLaren
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of Colorado, Denver, CO
|
23
|
Sidali S, Sritharan N, Campani C, Gregory J, Durand F, Ganne-Carrié N, Ronot M, Lévy V, Nault JC. Fragility index of positive phase II and III randomised clinical trials of treatments for hepatocellular carcinoma (2002-2022). JHEP Rep 2023; 5:100755. [PMID: 37425214 PMCID: PMC10326696 DOI: 10.1016/j.jhepr.2023.100755] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/27/2022] [Revised: 03/18/2023] [Accepted: 03/21/2023] [Indexed: 07/11/2023] Open
Abstract
Background & Aims The fragility index (FI), i.e., the minimum number of best survivors reassigned to the control group required to revert the statistically significant result of a clinical trial to non-significant, is a metric to evaluate the robustness of randomized controlled trials (RCTs). We aimed to assess the FI in the field of HCC. Methods This is a retrospective analysis of phase 2 and 3 RCTs for the treatment of HCC published between 2002 and 2022. We included two-arm studies with 1:1 randomization and significant positive results for a primary time-to-event endpoint for the FI calculation, which involves the iterative addition of a best survivor from the experimental group to the control group, until positive significance (p < 0.05, log-rank test) is lost. Results We identified 51 phase 2 and 3 positive RCTs, of which 29 (57%) were eligible for fragility index calculation. After reconstruction of the Kaplan-Meier curves, 25/29 studies remained significant, among which the analysis was performed. The median (interquartile range (IQR)) FI was 5 (2-10) and Fragility Quotient (FQ) was 3% (1%-6%). Ten trials (40%) had an FI of 2 or less. FI was positively correlated to the blind assessment of the primary endpoint (median FI 9 with blind assessment versus 2 without, p = 0.01), the number of reported events in the control arm (RS = 0.45, p = 0.02) and to impact factor (RS = 0.58, p = 0.003). Conclusions Several phase 2 and 3 RCTs in HCC have a low fragility index, underlining the limited robustness of the conclusion of their superiority over control treatments. The fragility index might provide an additional tool to assess the robustness of clinical trial data in HCC. Impact and implications The fragility index is a method to assess robustness of a clinical trial and is defined as the minimum number of best survivors reassigned to the control group required to revert the statistically significant result of a clinical trial to non-significant.
Among 25 randomised controlled trials in HCC, the median fragility index was 5, and 10 trials among 25 (40%) had a fragility index of 2 or less, indicating an important fragility.
Affiliation(s)
- Sabrina Sidali
  - Université de Paris, Service d’Hépatologie, DMU DIGEST, Hôpital Beaujon, APHP Nord, Clichy, France
  - Centre de Recherche des Cordeliers, Sorbonne Université, Inserm, Université de Paris, Team ‘Functional Genomics of Solid Tumors’, Equipe Labellisée Ligue Nationale Contre le Cancer, Labex OncoImmunology, Paris, France
- Nanthara Sritharan
  - Department of Clinical Research, Paris Seine Saint Denis Hospital, Sorbonne Paris University, APHP, Bobigny, France
- Claudia Campani
  - Centre de Recherche des Cordeliers, Sorbonne Université, Inserm, Université de Paris, Team ‘Functional Genomics of Solid Tumors’, Equipe Labellisée Ligue Nationale Contre le Cancer, Labex OncoImmunology, Paris, France
- Jules Gregory
  - Department of Radiology, FHU MOSAIC, Hôpital Beaujon APHP Nord, Clichy, France
  - Université de Paris, INSERM, UMR1153, Epidemiology and Biostatistics Sorbonne Paris Cité Center (CRESS), METHODS Team, Paris, France
- François Durand
  - Université de Paris, Service d’Hépatologie, DMU DIGEST, Hôpital Beaujon, APHP Nord, Clichy, France
- Nathalie Ganne-Carrié
  - Centre de Recherche des Cordeliers, Sorbonne Université, Inserm, Université de Paris, Team ‘Functional Genomics of Solid Tumors’, Equipe Labellisée Ligue Nationale Contre le Cancer, Labex OncoImmunology, Paris, France
  - Liver Unit, Hôpital Avicenne, Hôpitaux Universitaires Paris-Seine-Saint-Denis, Assistance-Publique Hôpitaux de Paris, Bobigny, France
  - Unité de Formation et de Recherche Santé Médecine et Biologie Humaine, Université Sorbonne Paris Nord, Bobigny, France
- Maxime Ronot
  - Department of Radiology, FHU MOSAIC, Hôpital Beaujon APHP Nord, Clichy, France
  - Université de Paris, INSERM U1149 ‘Centre de Recherche sur L'inflammation’, CRI, Paris, France
- Vincent Lévy
  - Department of Clinical Research, Paris Seine Saint Denis Hospital, Sorbonne Paris University, APHP, Bobigny, France
  - ECSTRRA Team, CRESS UMR 1153, Hôpital Saint-Louis, APHP, Paris, France
- Jean-Charles Nault
  - Centre de Recherche des Cordeliers, Sorbonne Université, Inserm, Université de Paris, Team ‘Functional Genomics of Solid Tumors’, Equipe Labellisée Ligue Nationale Contre le Cancer, Labex OncoImmunology, Paris, France
  - Liver Unit, Hôpital Avicenne, Hôpitaux Universitaires Paris-Seine-Saint-Denis, Assistance-Publique Hôpitaux de Paris, Bobigny, France
  - Unité de Formation et de Recherche Santé Médecine et Biologie Humaine, Université Sorbonne Paris Nord, Bobigny, France
|
24
|
Aslan A, Stevens C, Aldine AS, Mamilly A, De Alba L, Arevalo O, Ahuja C, Cuellar HH. The reproducibility of interventional radiology randomized controlled trials and external validation of a classification system. Diagn Interv Radiol 2023; 29:529-534. [PMID: 37070845 PMCID: PMC10679611 DOI: 10.4274/dir.2023.222052] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/19/2023] [Accepted: 02/24/2023] [Indexed: 04/19/2023]
Abstract
PURPOSE The fragility index (FI) measures the robustness of randomized controlled trials (RCTs). It complements the P value by taking into account the number of outcome events. In this study, the authors measured the FI for major interventional radiology RCTs. METHODS Interventional radiology RCTs published between January 2010 and December 2022 relating to trans-jugular intrahepatic portosystemic shunt, trans-arterial chemoembolization, needle biopsy, angiography, angioplasty, thrombolysis, and nephrostomy tube insertion were analyzed to measure the FI and robustness of the studies. RESULTS A total of 34 RCTs were included. The median FI of those studies was 4.5 (range 1-68). Seven trials (20.6%) had a number of patients lost to follow-up that was higher than their FI, and 15 (44.1%) had an FI of 1-3. CONCLUSION The median FI, and hence the reproducibility of interventional radiology RCTs, is low compared to other medical fields, with some having an FI of 1, which should be interpreted cautiously.
Affiliation(s)
- Assala Aslan
  - Department of Radiology and Interventional Radiology, Ochsner-Louisiana State University, Shreveport, United States
- Christopher Stevens
  - Department of Radiology and Interventional Radiology, Ochsner-Louisiana State University, Shreveport, United States
- Amro Saad Aldine
  - Department of Radiology and Interventional Radiology, Ochsner-Louisiana State University, Shreveport, United States
- Ahmed Mamilly
  - Department of Radiology and Interventional Radiology, Ochsner-Louisiana State University, Shreveport, United States
- Luis De Alba
  - Department of Radiology and Interventional Radiology, Ochsner-Louisiana State University, Shreveport, United States
- Octavio Arevalo
  - Department of Radiology and Interventional Radiology, Ochsner-Louisiana State University, Shreveport, United States
- Chaitanya Ahuja
  - Department of Radiology and Interventional Radiology, Ochsner-Louisiana State University, Shreveport, United States
- Hugo H. Cuellar
  - Department of Radiology and Interventional Radiology, Ochsner-Louisiana State University, Shreveport, United States
|
25
|
Lee Y, Samarasinghe Y, Chen LH, Jong A, Hapugall A, Javidan A, McKechnie T, Doumouras A, Hong D. Fragility of statistically significant findings from randomized trials in comparing laparoscopic versus robotic abdominopelvic surgeries. Surg Endosc 2023:10.1007/s00464-023-10063-4. [PMID: 37095233 DOI: 10.1007/s00464-023-10063-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/10/2023] [Accepted: 04/01/2023] [Indexed: 04/26/2023]
Abstract
BACKGROUND The utility of the robotic over the laparoscopic approach has been an area of debate across all surgical specialties over the past decade. The fragility index (FI) is a metric that evaluates the frailty of randomized controlled trial (RCT) findings by altering the status of patients from an event to non-event until significance is lost. This study aims to evaluate the robustness of RCTs comparing laparoscopic and robotic abdominopelvic surgeries through the FI. METHODS A search was conducted in MEDLINE and EMBASE for RCTs with dichotomous outcomes comparing laparoscopic and robot-assisted surgery in general surgery, gynecology, and urology. The FI and reverse fragility index (RFI) metrics were used to assess the strength of findings reported by RCTs, and bivariate correlation was conducted to analyze relationships between FI and trial characteristics. RESULTS A total of 21 RCTs were included, with a median sample size of 89 participants (interquartile range [IQR] 62-126). The median FI was 2 (IQR 0-15) and median RFI 5.5 (IQR 4-8.5). The median FI was 3 (IQR 1-15) for general surgery (n = 7), 2 (IQR 0.5-3.5) for gynecology (n = 4), and 0 (IQR 0-8.5) for urology RCTs (n = 4). Correlation was found between increasing FI and decreasing p-value, but not sample size, number of outcome events, journal impact factor, loss to follow-up, or risk of bias. CONCLUSION RCTs comparing laparoscopic and robotic abdominal surgery did not prove to be very robust. While possible advantages of robotic surgery may be emphasized, it remains novel and requires further concrete RCT data.
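The reverse fragility index mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes one common convention: for a non-significant result, events are removed one at a time from the arm with the lower event proportion until a two-sided Fisher exact test gives P < 0.05. The function names `fisher_two_sided` and `reverse_fragility_index`, the 0.05 threshold, and the example counts are all hypothetical choices for illustration.

```python
from math import comb


def fisher_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x events in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more likely than the observed one.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def reverse_fragility_index(events_a, total_a, events_b, total_b, alpha=0.05):
    """Event -> non-event flips needed to reach P < alpha (assumed convention).

    Returns None if the result is already significant, or if significance
    is never reached before the modified arm runs out of events.
    """
    def p_value(ea, eb):
        return fisher_two_sided(ea, total_a - ea, eb, total_b - eb)

    if p_value(events_a, events_b) < alpha:
        return None
    ea, eb = events_a, events_b
    # Assumption: modify the arm with the lower event proportion.
    shrink_a = ea * total_b <= eb * total_a
    flips = 0
    while (ea if shrink_a else eb) > 0:
        if shrink_a:
            ea -= 1
        else:
            eb -= 1
        flips += 1
        if p_value(ea, eb) < alpha:
            return flips
    return None


# Hypothetical trial: 4/10 events versus 9/10 events (P is about 0.057).
print(reverse_fragility_index(4, 10, 9, 10))
```

Other conventions exist, for example choosing the arm by raw event count or using a different significance test, and they can give different values for the same trial.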
Affiliation(s)
- Yung Lee
  - Division of General Surgery, McMaster University, Hamilton, ON, Canada
  - Harvard T.H. Chan School of Public Health, Harvard University, Boston, MA, USA
- Lucy H Chen
  - Division of General Surgery, McMaster University, Hamilton, ON, Canada
- Audrey Jong
  - Temerty Faculty of Medicine, University of Toronto, Toronto, ON, Canada
- Akithma Hapugall
  - Division of General Surgery, McMaster University, Hamilton, ON, Canada
- Arshia Javidan
  - Division of Vascular Surgery, University of Toronto, Toronto, ON, Canada
- Tyler McKechnie
  - Division of General Surgery, McMaster University, Hamilton, ON, Canada
  - Department of Health Research Methods and Evidence, McMaster University, Hamilton, ON, Canada
- Dennis Hong
  - Division of General Surgery, McMaster University, Hamilton, ON, Canada.
  - Division of General Surgery, St. Joseph's Healthcare, 50 Charlton Avenue East, Hamilton, ON, L8N 4A6, Canada.
|
26
|
Murad MH, Kara Balla A, Khan MS, Shaikh A, Saadi S, Wang Z. Thresholds for interpreting the fragility index derived from sample of randomised controlled trials in cardiology: a meta-epidemiologic study. BMJ Evid Based Med 2023; 28:133-136. [PMID: 35264405 DOI: 10.1136/bmjebm-2021-111858] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Accepted: 02/26/2022] [Indexed: 11/03/2022]
Abstract
The fragility index (FI) was proposed as a simplified way to communicate robustness of statistically significant results and their susceptibility to a change of a handful of events. While this index is intuitive, it is not anchored by a cut-off or a guide for interpretation. We identified cardiovascular trials published in six high impact journals from 2007 to 2021 (500 or more participants and a dichotomous statistically significant primary outcome). We estimated area under curve (AUC) to determine the FI value that best predicts whether the treatment effect was precise, defined as adequately powered for a plausible relative risk reduction (RRR) of 25% or 30% or having a CI that is sufficiently narrow to exclude a risk reduction that is too small (close to the null, <0.05). The median FI of 201 included cardiovascular trials was 13 (range 1-172). FI exceeded the number of patients lost to follow-up in 46/201 (22.89%) trials. FI values of 19 and 22 predicted that trials would be precise (powered for RRR of 30% and 25%, respectively, combined with a CI that excluded risk reduction <0.05). AUC for meeting these precision criteria was 0.90 (0.86-0.94). In conclusion, FI values in the range 19-22 may meet various definitions of precision and can be used as a rule of thumb to suggest that a treatment effect is likely precise and less susceptible to random error. The number of patients lost to follow-up should be presented alongside FI to better illustrate fragility.
Affiliation(s)
- Mohammad Hassan Murad
  - Division of Public Health, Infectious Diseases and Occupational Medicine, Mayo Clinic, Rochester, MN, USA
  - Evidence-based Practice Center, Mayo Clinic, Rochester, Minnesota, USA
- Muhammad Shahzeb Khan
  - Division of Cardiology, Duke University School of Medicine, Durham, North Carolina, USA
- Asim Shaikh
  - Department of Internal Medicine, Dow University of Health Sciences, Karachi, Pakistan
- Samer Saadi
  - Evidence-based Practice Center, Mayo Clinic, Rochester, Minnesota, USA
- Zhen Wang
  - Evidence-based Practice Center, Mayo Clinic, Rochester, Minnesota, USA
|
27
|
Lee Y, Samarasinghe Y, Chen LH, Hapugall A, Javidan A, McKechnie T, Doumouras A, Hong D. Fragility of statistically significant outcomes in randomized trials comparing bariatric surgeries. Int J Obes (Lond) 2023:10.1038/s41366-023-01298-1. [PMID: 37005473 DOI: 10.1038/s41366-023-01298-1] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 01/10/2023] [Revised: 03/08/2023] [Accepted: 03/14/2023] [Indexed: 04/04/2023]
Abstract
BACKGROUND Randomized controlled trials (RCTs) are regarded as high-level evidence, but the strength of their P values can be difficult to ascertain. The Fragility Index (FI) is a novel metric that evaluates the frailty of trial findings. It is defined as the minimum number of patients required to change from a non-event to event for the findings to lose statistical significance. This study aims to characterize the robustness of bariatric surgery RCTs by examining their FIs. METHODS A search was conducted in MEDLINE, EMBASE, and CENTRAL from January 2000 to February 2022 for RCTs comparing two bariatric surgeries with statistically significant dichotomous outcomes. Bivariate correlation was conducted to identify associations between FI and trial characteristics. RESULTS A total of 35 RCTs were included with a median sample size of 80 patients (Interquartile range [IQR] 58-109). The median FI was 2 (IQR 0-5), indicating that altering the status of two patients in one treatment arm would overturn the statistical significance of results. Subgroup analyses of RCTs evaluating diabetes-related outcomes produced a FI of 4 (IQR 2-6.5), while RCTs comparing Roux-en-Y gastric bypass and sleeve gastrectomy had an FI of 2 (IQR 0.5-5). Increasing FI was found to be correlated with decreasing P value, increasing sample size, increasing number of events, and increasing journal impact factor. CONCLUSION Bariatric surgery RCTs are fragile, with only a few patients required to change from non-events to events to reverse the statistical significance of most trials. Future research should examine the use of FI in trial design.
Affiliation(s)
- Yung Lee
  - Division of General Surgery, McMaster University, Hamilton, ON, Canada
  - Harvard T.H. Chan School of Public Health, Harvard University, Boston, MA, USA
- Lucy H Chen
  - Division of General Surgery, McMaster University, Hamilton, ON, Canada
- Akithma Hapugall
  - Division of General Surgery, McMaster University, Hamilton, ON, Canada
- Arshia Javidan
  - Division of Vascular Surgery, University of Toronto, Toronto, ON, Canada
- Tyler McKechnie
  - Division of General Surgery, McMaster University, Hamilton, ON, Canada
  - Department of Health Research Methods and Evidence, McMaster University, Hamilton, ON, Canada
- Dennis Hong
  - Division of General Surgery, McMaster University, Hamilton, ON, Canada.
|
28
|
Lee Y, Samarasinghe Y, Javidan A, Tahir U, Samarasinghe N, Shargall Y, Finley C, Hanna W, Agzarian J. The fragility of significant results from randomized controlled trials in esophageal surgeries. Esophagus 2023; 20:195-204. [PMID: 36689016 DOI: 10.1007/s10388-023-00985-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/08/2022] [Accepted: 01/05/2023] [Indexed: 01/24/2023]
Abstract
While randomized controlled trials (RCTs) are regarded as one of the highest forms of clinical research, the robustness of their P values can be difficult to ascertain. Defined as the minimum number of patients in a study arm that would need to be changed from a non-event to event for the findings to lose significance, the Fragility Index is a method for evaluating results from these trials. This study aims to calculate the Fragility Index for trials evaluating perioperative esophagectomy-related interventions to determine the strength of RCTs in this field. MEDLINE and EMBASE were searched for RCTs related to esophagectomy that reported a significant dichotomous outcome. Two reviewers independently screened articles and performed the data extractions with risk of bias assessment. The Fragility Index was calculated using a two-tailed Fisher's exact test. Bivariate correlation was conducted to evaluate associations between the Fragility Index and study characteristics. 41 RCTs were included, and the median sample size was 80 patients [Interquartile range (IQR) 60-161]. Of the included outcomes, 29 (71%) were primary, and 12 (29%) were secondary. The median Fragility Index was 1 (IQR 1-3), meaning that by changing one patient from a non-event to event, the results would become non-significant. Fragility Index was correlated with P value, number of events, and journal impact factor. The RCTs related to esophagectomy did not prove to be robust, as the significance of their results could be changed by altering the outcome status of a handful of patients in one study arm.
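The Fragility Index calculation described in this abstract (non-event to event changes in one arm, tested with a two-tailed Fisher's exact test) can be sketched in Python as follows. This is a minimal illustration, not the authors' code. The function names `fisher_two_sided` and `fragility_index`, the 0.05 threshold, the choice to modify the arm with the lower event proportion, and the example counts are assumptions.

```python
from math import comb


def fisher_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x events in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more likely than the observed one.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def fragility_index(events_a, total_a, events_b, total_b, alpha=0.05):
    """Non-event -> event flips needed before P >= alpha.

    Returns None when the starting result is not significant.
    """
    def p_value(ea, eb):
        return fisher_two_sided(ea, total_a - ea, eb, total_b - eb)

    if p_value(events_a, events_b) >= alpha:
        return None
    ea, eb = events_a, events_b
    # Assumption: modify the arm with the lower event proportion.
    modify_a = ea * total_b <= eb * total_a
    flips = 0
    while p_value(ea, eb) < alpha:
        if modify_a and ea < total_a:
            ea += 1
        elif not modify_a and eb < total_b:
            eb += 1
        else:
            return None  # arm exhausted without losing significance
        flips += 1
    return flips


# Hypothetical trial: 1/10 events versus 9/10 events.
print(fragility_index(1, 10, 9, 10))  # prints 3
```

In this made-up example the P value stays below 0.05 after one, two, and three flips (about 0.005, 0.020 after the second and third) and rises to about 0.057 after the fourth table is reached, so three changed outcomes are enough to lose significance.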
Affiliation(s)
- Yung Lee
  - Division of General Surgery, McMaster University, Hamilton, ON, Canada
- Yasith Samarasinghe
  - Division of Thoracic Surgery, Department of Surgery, McMaster University, 50 Charlton Avenue East T-2105, Hamilton, ON, L8N 4A6, Canada
- Arshia Javidan
  - Division of Vascular Surgery, University of Toronto, Toronto, ON, Canada
- Umair Tahir
  - Division of Thoracic Surgery, Department of Surgery, McMaster University, 50 Charlton Avenue East T-2105, Hamilton, ON, L8N 4A6, Canada
- Yaron Shargall
  - Division of Thoracic Surgery, Department of Surgery, McMaster University, 50 Charlton Avenue East T-2105, Hamilton, ON, L8N 4A6, Canada
- Christian Finley
  - Division of Thoracic Surgery, Department of Surgery, McMaster University, 50 Charlton Avenue East T-2105, Hamilton, ON, L8N 4A6, Canada
- Wael Hanna
  - Division of Thoracic Surgery, Department of Surgery, McMaster University, 50 Charlton Avenue East T-2105, Hamilton, ON, L8N 4A6, Canada
- John Agzarian
  - Division of Thoracic Surgery, Department of Surgery, McMaster University, 50 Charlton Avenue East T-2105, Hamilton, ON, L8N 4A6, Canada.
|
29
|
Axelrod D, Comeau-Gauthier M, Prada C, Bzovsky S, Heels-Ansdell D, Petrisor B, Jeray K, Bhandari M, Schemitsch E, Sprague S. Change in Gustilo-Anderson classification at time of surgery does not increase risk for surgical site infection in patients with open fractures: A secondary analysis of a multicenter, prospective randomized controlled trial. OTA Int 2023; 6:e231. [PMID: 36760661 PMCID: PMC9904191 DOI: 10.1097/oi9.0000000000000231] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/13/2022] [Accepted: 11/13/2022] [Indexed: 02/05/2023]
Abstract
Introduction Open fractures represent a major source of morbidity. Surgical site infections (SSIs) after open fractures are associated with a high rate of reoperations and hospitalizations, which are associated with a lower health-related quality of life. Early antibiotic delivery, typically chosen through an assessment of the size and contamination of the wound, has been shown to be an effective technique to reduce the risk of SSI in open fractures. The Gustilo-Anderson classification (GAC) was devised as a grading system of open fractures after a complete operative debridement of the wound had been undertaken but is commonly used preoperatively to help with the choice of initial antibiotics. Incorrect preoperative GAC, leading to less aggressive initial management, may influence the risk of SSI after open fracture. The objectives of this study were to determine (1) how often the GAC changed from the initial to definitive grading, (2) the injury and patient characteristics associated with increases and decreases of the GAC, and (3) whether a change in GAC was associated with an increased risk of SSI. Methods Using data from the FLOW trial, a large multicenter randomized study, we used descriptive statistics to quantify how frequently the GAC changed from the initial to definitive grading. We used regression models to determine which injury and patient characteristics were associated with increases and decreases in GAC and whether a change in GAC was associated with SSI. Results Of the 2420 participants included, 305 participants had their preoperative GAC change (12.6%). The factors associated with upgrading the GAC (from preoperative score to the definitive assessment) included fracture sites other than the tibia, bone loss at presentation, width of wound, length of wound, and skin loss at presentation. However, initial misclassification of type III fractures as type II fractures was not associated with an increased risk of SSI (P = 0.14). 
Conclusions When treating patients with open fracture wounds, surgeons should consider that 12% of all injuries may initially be misclassified when using the GAC, particularly fractures that have bone loss at presentation or those located in sites different than the tibia. However, even in misclassified fractures, it did not seem to increase the risk of SSI.
Affiliation(s)
- Daniel Axelrod
  - Division of Orthopaedic Surgery, Department of Surgery, McMaster University, Hamilton, ON, Canada
- Marianne Comeau-Gauthier
  - Division of Orthopaedic Surgery, Department of Surgery, McMaster University, Hamilton, ON, Canada
- Carlos Prada
  - Division of Orthopaedic Surgery, Department of Surgery, McMaster University, Hamilton, ON, Canada
- Sofia Bzovsky
  - Division of Orthopaedic Surgery, Department of Surgery, McMaster University, Hamilton, ON, Canada
- Diane Heels-Ansdell
  - Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Canada
- Brad Petrisor
  - Division of Orthopaedic Surgery, Department of Surgery, McMaster University, Hamilton, ON, Canada
- Kyle Jeray
  - Department of Orthopaedic Surgery, Prisma Health-Upstate, Greenville, SC
- Mohit Bhandari
  - Division of Orthopaedic Surgery, Department of Surgery, McMaster University, Hamilton, ON, Canada
  - Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Canada
- Emil Schemitsch
  - Department of Surgery, Western University, London, ON, Canada
- Sheila Sprague
  - Division of Orthopaedic Surgery, Department of Surgery, McMaster University, Hamilton, ON, Canada
  - Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Canada
|
30
|
Davis JD, Sanchez-Ramos L, McKinney JA, Lin L, Kaunitz AM. Intrapartum amnioinfusion reduces meconium aspiration syndrome and improves neonatal outcomes in patients with meconium-stained fluid: a systematic review and meta-analysis. Am J Obstet Gynecol 2023; 228:S1179-S1191.e19. [PMID: 37164492 DOI: 10.1016/j.ajog.2022.07.047] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 06/02/2022] [Revised: 07/26/2022] [Accepted: 07/27/2022] [Indexed: 03/20/2023]
Abstract
OBJECTIVE This study aimed to reassess the effect of prophylactic transcervical amnioinfusion for intrapartum meconium-stained amniotic fluid on meconium aspiration syndrome and other adverse neonatal and maternal outcomes. DATA SOURCES From inception to November 2021, a systematic search of the literature was performed in PubMed, Embase, Web of Science, and Scopus databases and gray literature sources. STUDY ELIGIBILITY CRITERIA We identified randomized controlled trials of patients with intrapartum moderate to thick meconium-stained amniotic fluid that evaluated the effect of amnioinfusion on adverse neonatal and maternal outcomes. METHODS Of note, 2 reviewers independently abstracted data and gauged study quality by assigning a modified Jadad score. Meconium aspiration syndrome constituted the primary outcome. The secondary outcomes were meconium below the cords, Apgar scores of <7 at 5 minutes, neonatal acidosis, cesarean delivery, cesarean delivery for fetal heart rate abnormalities, neonatal intensive care unit admission, and postpartum endometritis. This study calculated the odds ratios with 95% confidence intervals for categorical outcomes and weighted mean differences with 95% confidence intervals for continuous outcomes. RESULTS A total of 24 randomized studies with 5994 participants met the inclusion criteria. The overall odds of meconium aspiration syndrome was reduced by 67% in the amnioinfusion group (pooled odds ratio, 0.33; 95% confidence interval, 0.21-0.51). Except for postpartum endometritis, amnioinfusion was associated with a significant reduction in all secondary outcomes. CONCLUSION Our study found that the use of intrapartum amnioinfusion in the setting of meconium-stained amniotic fluid significantly reduces the odds of meconium aspiration syndrome and other adverse neonatal outcomes.
|
31
|
Demarquette A, Perrault T, Alapetite T, Bouizegarene M, Bronnert R, Fouré G, Masson C, Nicolas V, Lasocki S, Léger M. Spin and fragility in randomised controlled trials in the anaesthesia literature: a systematic review. Br J Anaesth 2023; 130:528-535. [PMID: 36759291 DOI: 10.1016/j.bja.2023.01.001] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 10/31/2022] [Revised: 12/22/2022] [Accepted: 01/02/2023] [Indexed: 02/10/2023] Open
Abstract
BACKGROUND Given variable frequency of misleading reports and the potential for spin (a way of describing results that can mislead readers) to influence interpretation of randomised controlled trials (RCTs), we have undertaken a spin reassessment. We evaluated the quality of recent literature in anaesthesia journals by assessing the presence of spin and calculating the fragility index. METHODS This systematic review of randomised trials was performed according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement. We searched via PubMed® from January 1, 2019 to January 1, 2021 to identify all RCTs published in one of the 20 anaesthesia journals with the highest journal impact factors during this time. Four pairs of reviewers assessed articles independently for eligibility using a piloted electronic data extraction form. They assessed the presence of spin in statistically negative RCTs and calculated the fragility index for statistically positive RCTs. RESULTS Of the 802 screened records, 162 (20%) articles were analysed for spin, and 65 (8%) trials were analysed for fragility index. For the statistically negative studies, 66 articles (40%) presented spin; 89% of these occurrences of spin were described in the conclusion of the abstract. The primary type of spin was the highlight of secondary outcomes (67%). For statistically positive trials, the median fragility index was 4 [1-8]. CONCLUSIONS This systematic review showed that 40% of statistically negative trials in high-impact anaesthesia journals could mislead readers. For statistically positive RCTs, the results relied on few subjects, with a median fragility index of 4 [1-8]. Efforts must be continued to reduce spin and fragility in the medical literature.
Affiliation(s)
- Achille Demarquette
  - Anaesthesiology and Critical Care Department, Angers University Hospital, Angers, France
- Tristan Perrault
  - Anaesthesiology and Critical Care Department, Angers University Hospital, Angers, France
- Thomas Alapetite
  - Anaesthesiology and Critical Care Department, Angers University Hospital, Angers, France
- Madjid Bouizegarene
  - Anaesthesiology and Critical Care Department, Angers University Hospital, Angers, France
- Romain Bronnert
  - Anaesthesiology and Critical Care Department, Angers University Hospital, Angers, France
- Gaël Fouré
  - Anaesthesiology and Critical Care Department, Angers University Hospital, Angers, France
- Charline Masson
  - Anaesthesiology and Critical Care Department, Angers University Hospital, Angers, France
- Vivian Nicolas
  - Anaesthesiology and Critical Care Department, Angers University Hospital, Angers, France
- Sigismond Lasocki
  - Anaesthesiology and Critical Care Department, Angers University Hospital, Angers, France
- Maxime Léger
  - Anaesthesiology and Critical Care Department, Angers University Hospital, Angers, France; INSERM UMR 1246, SPHERE, Nantes University, Tours University, Nantes, France
|
32
|
Liu Q, Chen H, Gao Y, Zhu C. Robustness of Significant Dichotomous Outcomes in Randomized Controlled Trials in the Treatment of Patients with COVID-19: A Systematic Analysis. Intensive Care Research 2023; 3:38-49. [PMID: 36687387 PMCID: PMC9836340 DOI: 10.1007/s44231-022-00027-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/02/2022] [Accepted: 12/23/2022] [Indexed: 01/13/2023]
Abstract
PURPOSE Significant results of randomized controlled trials (RCTs) should be properly weighed. This study used the fragility index (FI) to evaluate the robustness of significant dichotomous outcomes from RCTs on coronavirus disease 2019 (COVID-19) treatment. MATERIALS AND METHODS ClinicalTrials.gov and PubMed were searched from inception to July 31, 2021. FIs were calculated and their distribution was depicted. Categorical factors influencing the FI were analyzed. The Spearman correlation coefficient (r_s) was reported for the relationship between FI and the continuous characteristics of RCTs. RESULTS Fifty RCTs with 120 outcomes in 7869 patients were included. The FI distribution was non-normal, with a median of 3 (interquartile range 1-7, P = 0.0001). The FIs and robustness were affected by the outcomes of interest, patient populations, and interventions (T = 18.215, 16.667, 23.107; P = 0.02, 0.0001, 0.001, respectively). A cubic relationship between the FIs and the absolute difference of events between groups with R square of 0.848 (T = 215.828, P = 0.0001, R square = 0.865) was observed. A strong negative logarithmic relationship existed between FI and the P value with R square = -0.834. CONCLUSION The significant dichotomous outcomes of COVID-19 treatments were fragile, and their robustness was affected by the outcomes of interest, patients, interventions, P value, and absolute difference of events between the groups. FI was a useful quantitative metric for significant binary outcomes of COVID-19 treatments. REGISTRATION PROSPERO (CRD42021272455). SUPPLEMENTARY INFORMATION The online version contains supplementary material available at 10.1007/s44231-022-00027-y.
Affiliation(s)
- Qi Liu
  - Emergency Department, The First Affiliated Hospital of Zhengzhou University, Zhengzhou University, No. 1st, Jianshe Eastern Road, Zhengzhou, Henan Province, People's Republic of China
  - Department of Translational Medicine Center, The First Affiliated Hospital of Zhengzhou University, Zhengzhou University, Zhengzhou, Henan Province, People's Republic of China
- Hong Chen
  - Emergency Department, The First Affiliated Hospital of Zhengzhou University, Zhengzhou University, No. 1st, Jianshe Eastern Road, Zhengzhou, Henan Province, People's Republic of China
  - Department of Translational Medicine Center, The First Affiliated Hospital of Zhengzhou University, Zhengzhou University, Zhengzhou, Henan Province, People's Republic of China
- Yonghua Gao
  - Department of Respiratory and Critical Care Medicine, Shanghai Pulmonary Hospital, Tongji University School of Medicine, Shanghai, People's Republic of China
- Changju Zhu
  - Emergency Department, The First Affiliated Hospital of Zhengzhou University, Zhengzhou University, No. 1st, Jianshe Eastern Road, Zhengzhou, Henan Province, People's Republic of China
  - Henan Medical Key Laboratory of Emergency and Trauma Research, The First Affiliated Hospital of Zhengzhou University, Zhengzhou University, Zhengzhou, Henan Province, People's Republic of China
|
33
|
Taouktsi N, Papageorgiou ST, Tousinas G, Papanikolopoulou S, Grammatikopoulou MG, Giannakoulas G, Goulis DG. Fragility of cardiovascular outcome trials (CVOTs) examining nutrition interventions among patients with diabetes mellitus: a systematic review of randomized controlled trials. Hormones (Athens) 2022; 21:665-681. [PMID: 36129664 PMCID: PMC9712353 DOI: 10.1007/s42000-022-00396-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2022] [Accepted: 08/29/2022] [Indexed: 11/04/2022]
Abstract
PURPOSE There is controversy regarding the optimal statistical method for judging how robust a statistically significant result is. The fragility index (FI) and the reverse fragility index (RFI) are quantitative measures that can facilitate the appraisal of a clinical trial's robustness. This study was performed to evaluate the FI and RFI of randomized controlled trials (RCTs) examining nutritional interventions in patients with diabetes mellitus (DM), focusing on cardiovascular outcomes. METHODS A systematic search was conducted and relevant RCTs were identified in three databases. RCTs examining nutritional interventions (supplements or dietary patterns) in patients with DM with dichotomous primary endpoints involving cardiovascular outcomes were eligible. Data were extracted to compose 2 × 2 event tables and the FI and RFI were calculated for each comparison, using Fisher's exact test. Risk of bias (RoB) of the included RCTs was assessed with the Cochrane RoB 2.0 tool. RESULTS A total of 14,315 records were screened and 10 RCTs were included in the analyses. The median FI of the paired comparisons was 3 (IQR: 2-4) and the median RFI was 8 (IQR: 4.5-17). RoB and heterogeneity were low. CONCLUSIONS RCTs examining nutritional interventions and cardiovascular outcomes among patients with diabetes mellitus appear to be statistically fragile. The FI and the RFI can be reported and interpreted as an additional perspective on a trial's robustness.
HIGHLIGHTS
- In the evidence-healthcare era, assessing how robust statistically significant results are remains a matter of controversy.
- Recently, the fragility index (FI) and reverse fragility index (RFI) were proposed to assess the robustness of randomized controlled trials (RCTs) with 2 × 2 comparisons.
- When applying the FI and RFI, RCTs examining nutritional interventions and cardiovascular outcomes among patients with diabetes mellitus (DM) appear to be statistically fragile.
- The FI and the RFI can be reported and interpreted as an additional perspective on a trial's robustness.
- RCTs implementing nutrition interventions among patients with DM can improve their methodology.
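The FI and RFI calculation described in this abstract (a 2 × 2 event table tested with Fisher's exact test) can be sketched as follows. This is a minimal illustration and not the authors' code: the function names are hypothetical, the two-sided P value uses the common "sum of tables no more likely than the observed one" rule, and events are modified one at a time in the arm with fewer events, which is one widely used convention that the abstract itself does not spell out.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x events in row 1 with margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more likely than the observed one.
    total = sum(prob(x) for x in range(lo, hi + 1)
                if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, total)


def _p(e1, n1, e2, n2):
    # Arms are given as events / participants (hypothetical input layout).
    return fisher_exact_two_sided(e1, n1 - e1, e2, n2 - e2)


def fragility_index(e1, n1, e2, n2, alpha=0.05):
    """Non-events turned into events (arm with fewer events) until P >= alpha."""
    if _p(e1, n1, e2, n2) >= alpha:
        return None  # result is not significant, so the FI is undefined
    if e1 > e2:
        e1, n1, e2, n2 = e2, n2, e1, n1
    flips = 0
    while e1 < n1:
        e1 += 1
        flips += 1
        if _p(e1, n1, e2, n2) >= alpha:
            return flips
    return None


def reverse_fragility_index(e1, n1, e2, n2, alpha=0.05):
    """Events turned into non-events (arm with fewer events) until P < alpha."""
    if _p(e1, n1, e2, n2) < alpha:
        return None  # result is already significant, so the RFI is undefined
    if e1 > e2:
        e1, n1, e2, n2 = e2, n2, e1, n1
    flips = 0
    while e1 > 0:
        e1 -= 1
        flips += 1
        if _p(e1, n1, e2, n2) < alpha:
            return flips
    return None


# Made-up trial: 1/20 events versus 10/20 events.
print(fragility_index(1, 20, 10, 20))          # 3
print(reverse_fragility_index(4, 20, 10, 20))  # 1
```

In the made-up example, three extra events in the smaller-event arm are enough to push the P value above 0.05, which is the same order of magnitude as the median FI of 3 reported above.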
Affiliation(s)
- Niki Taouktsi
  - Medical School, Faculty of Health Sciences, Aristotle University of Thessaloniki, Thessaloniki, Greece
- Stefanos T Papageorgiou
  - Medical School, Faculty of Health Sciences, Aristotle University of Thessaloniki, Thessaloniki, Greece
- Georgios Tousinas
  - Unit of Reproductive Endocrinology, 1st Department of Obstetrics and Gynecology, Medical School, Aristotle University of Thessaloniki, Papageorgiou General Hospital, Thessaloniki, GR-56429, Greece
- Maria G Grammatikopoulou
  - Unit of Reproductive Endocrinology, 1st Department of Obstetrics and Gynecology, Medical School, Aristotle University of Thessaloniki, Papageorgiou General Hospital, Thessaloniki, GR-56429, Greece
  - Department of Rheumatology and Clinical Immunology, Medical School, University of Thessaly, Larissa, Greece
- George Giannakoulas
  - Department of Cardiology, AHEPA University Hospital, Aristotle University of Thessaloniki, Thessaloniki, Greece
- Dimitrios G Goulis
  - Unit of Reproductive Endocrinology, 1st Department of Obstetrics and Gynecology, Medical School, Aristotle University of Thessaloniki, Papageorgiou General Hospital, Thessaloniki, GR-56429, Greece
|
34
|
Constant M, Trofa DP, Saltzman BM, Ahmad CS, Li X, Parisien RL. The Fragility of Statistical Significance in Patellofemoral Instability Research: A Systematic Review. Am J Sports Med 2022; 50:3714-3718. [PMID: 34633219 DOI: 10.1177/03635465211039202] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Indexed: 01/31/2023]
Abstract
BACKGROUND Fragility analysis is increasingly utilized to evaluate the robustness of results within the orthopaedic literature and has frequently revealed instability of reported outcomes. PURPOSE/HYPOTHESIS The purpose of this investigation was to utilize a fragility analysis to evaluate the stability of reported results in the patellofemoral instability (PFI) literature. We hypothesized that patellofemoral research would demonstrate fragility similar to that identified throughout other areas of the orthopaedic literature. STUDY DESIGN Systematic review; Level of evidence, 4. METHODS The PubMed database was queried from January 1, 2000, to October 10, 2020, for comparative trials in 10 prominent orthopaedic journals that reported dichotomous outcomes related to the management of PFI. The fragility index (FI) and the fragility quotient (FQ) were calculated for each individual outcome event, and the overall FI and FQ were determined for all included studies. RESULTS A total of 22 comparative studies comprising 11 randomized controlled trials and 11 nonrandomized trials were included for the analysis. A total of 75 outcome events underwent a fragility analysis and revealed a median FI and FQ of 3 (interquartile range [IQR], 1-5) and 0.043 (IQR, 0.018-0.081), respectively. Also, 27% of included studies reported loss to follow-up greater than the overall FI, suggesting that maintaining follow-up may have reversed significance. CONCLUSION The results of the comprehensive fragility analysis demonstrated a lack of robustness in PFI research, with the alteration of only a few outcome events required to reverse statistical significance. We therefore recommend the triple reporting of the P value, the FI, and the FQ to aid in the interpretation of the statistical integrity of future comparative trials in the PFI literature.
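The relationship between the FI, the FQ, and loss to follow-up reported here can be illustrated with a short sketch. It assumes the usual definition of the fragility quotient as the FI divided by the total sample size, which the abstract does not state explicitly; the function names and the 70-patient study are hypothetical.

```python
def fragility_quotient(fragility_index: int, sample_size: int) -> float:
    """FQ under the assumed definition: FI divided by total sample size."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    if fragility_index < 0:
        raise ValueError("fragility_index must be non-negative")
    return fragility_index / sample_size


def loss_exceeds_fragility(lost_to_follow_up: int, fragility_index: int) -> bool:
    """True when more patients were lost than events needed to flip significance."""
    return lost_to_follow_up > fragility_index


# Hypothetical study: FI of 3 in 70 patients, with 5 patients lost to follow-up.
print(round(fragility_quotient(3, 70), 3))  # 0.043
print(loss_exceeds_fragility(5, 3))         # True
```

Dividing by the sample size lets outcomes from trials of different sizes be compared on one scale, which is why the FQ is reported alongside the FI.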
Affiliation(s)
- Michael Constant
  - Department of Orthopaedics, New York-Presbyterian Hospital, Columbia University Medical Center, New York, New York, USA
- David P Trofa
  - Department of Orthopaedics, New York-Presbyterian Hospital, Columbia University Medical Center, New York, New York, USA
- Bryan M Saltzman
  - OrthoCarolina Sports Medicine Center, Charlotte, North Carolina, USA
- Christopher S Ahmad
  - Department of Orthopaedics, New York-Presbyterian Hospital, Columbia University Medical Center, New York, New York, USA
- Xinning Li
  - Department of Orthopaedics, Boston University Medical Center, Boston, Massachusetts, USA
- Robert L Parisien
  - Department of Orthopaedic Surgery & Sports Medicine, Mount Sinai, New York, New York, USA
|
35
|
Capuano I, Buonanno P, Riccio E, Bianco A, Pisani A. Randomized Controlled Trials on Renin Angiotensin Aldosterone System Inhibitors in Chronic Kidney Disease Stages 3-5: Are They Robust? A Fragility Index Analysis. J Clin Med 2022; 11:6184. [PMID: 36294504 PMCID: PMC9605379 DOI: 10.3390/jcm11206184] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/15/2022] [Revised: 10/12/2022] [Accepted: 10/13/2022] [Indexed: 11/17/2022] Open
Abstract
Inhibition of the renin-angiotensin-aldosterone system (RAAS) is broadly recommended in many nephrological guidelines to prevent chronic kidney disease (CKD) progression. This work aimed to analyze the robustness of randomized controlled trials (RCTs) investigating the renal and cardiovascular outcomes in CKD stages 3-5 patients treated with RAAS inhibitors (RAASi). We searched for RCTs in MEDLINE (PubMed), EMBASE databases, and the Cochrane register. Fragility indexes (FIs) for every primary and secondary outcome were calculated according to Walsh et al., who first described this novel metric, suggesting 8 as the cut-off to consider a study robust. The Spearman coefficient was calculated to correlate FI with the p value and sample size of statistically significant primary and secondary outcomes. Twenty-two studies met the inclusion criteria, including 80,455 patients. Sample size varied considerably among the studies (median: 1693.5, range: 73-17,276). The median follow-up was 38 months (range 24-58). The overall median FI for both primary and secondary outcomes was 0 (range 0-117 and range 0-55, respectively). The median FI for primary and secondary outcomes with a p value lower than 0.05 was 6 (range: 1-117) and 7.5 (range: 1-55), respectively. The medians of the FI for primary outcomes with a p value lower than 0.05 in patients with and without CKD were 5.5 (range 1-117) and 22 (range 1-80), respectively. Only a few RCTs have been shown to be robust. Our analysis underlined the need for further research with appropriate sample sizes and study design to explore the real potential of RAASi in the progression of CKD.
Affiliation(s)
- Ivana Capuano
  - Department of Public Health, University of Naples “Federico II”, 80131 Naples, Italy
- Pasquale Buonanno
  - Department of Neurosciences, Reproductive and Odontostomatological Sciences, University of Naples “Federico II”, 80131 Naples, Italy
- Eleonora Riccio
  - Institute for Biomedical Research and Innovation, National Research Council of Italy, 80125 Palermo, Italy
- Antonio Bianco
  - Interdepartmental Research Center for Arterial Hypertension and Associated Pathologies (CIRIAPA)-Hypertension Research Center, University of Naples “Federico II”, 80131 Naples, Italy
- Antonio Pisani
  - Department of Public Health, University of Naples “Federico II”, 80131 Naples, Italy
|
36
|
Grimes DR. The ellipse of insignificance, a refined fragility index for ascertaining robustness of results in dichotomous outcome trials. eLife 2022; 11:e79573. [PMID: 36125120 PMCID: PMC9586556 DOI: 10.7554/elife.79573] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/19/2022] [Accepted: 09/13/2022] [Indexed: 11/29/2022] Open
Abstract
There is increasing awareness throughout biomedical science that many results do not withstand the trials of repeat investigation. The growing abundance of medical literature has only increased the urgent need for tools to gauge the robustness and trustworthiness of published science. Dichotomous outcome designs are vital in randomized clinical trials, cohort studies, and observational data for ascertaining differences between experimental and control arms. It has however been shown with tools like the fragility index (FI) that many ostensibly impactful results fail to materialize when even small numbers of patients or subjects in either the control or experimental arms are recoded from event to non-event. Critics of this metric counter that there is no objective means to determine a meaningful FI. As currently used, FI is not multidimensional and is computationally expensive. In this work, a conceptually similar geometrical approach is introduced, the ellipse of insignificance. This method yields precise deterministic values for the degree of manipulation or miscoding that can be tolerated simultaneously in both control and experimental arms, allowing for the derivation of objective measures of experimental robustness. More than this, the tool is intimately connected with sensitivity and specificity of the event/non-event tests, and is readily combined with knowledge of test parameters to reject unsound results. The method is outlined here, with illustrative clinical examples.
Affiliation(s)
- David Robert Grimes
  - School of Physical Sciences, Dublin City University, Dublin, Ireland
  - Discipline of Radiation Therapy, Trinity College Dublin, Dublin, Ireland
|
37
|
Morris SC, Gowd AK, Agarwalla A, Phipatanakul WP, Amin NH, Liu JN. Fragility of statistically significant findings from randomized clinical trials of surgical treatment of humeral shaft fractures: A systematic review. World J Orthop 2022; 13:825-836. [PMID: 36189338 PMCID: PMC9516622 DOI: 10.5312/wjo.v13.i9.825] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 10/26/2021] [Revised: 02/28/2022] [Accepted: 08/17/2022] [Indexed: 02/06/2023] Open
Abstract
BACKGROUND Despite recent meta-analyses of randomized controlled trials (RCTs), there remains no consensus regarding the preferred surgical treatment for humeral shaft fractures. The fragility index (FI) is an emerging tool used to evaluate the robustness of RCTs by quantifying the number of participants in a study group that would need to switch outcomes in order to reverse the study conclusions.
AIM To investigate the fragility index of randomized controlled trials assessing outcomes of operative fixation in humeral shaft fractures.
METHODS We completed a systematic review of RCTs evaluating the surgical treatment of humeral shaft fractures. Inclusion criteria included: articles published in English; patients randomized and allotted in 1:1 ratio to 2 parallel arms; and dichotomous outcome variables. The FI was calculated for total complications, each complication individually, and secondary surgeries using the Fisher exact test, as previously published.
RESULTS Fifteen RCTs were included in the analysis comparing open reduction plate osteosynthesis with dynamic compression plate or locking compression plate, intramedullary nail, and minimally invasive plate osteosynthesis. The median FI was 0 for all parameters analyzed. Regarding individual outcomes, the FI was 0 for 81/91 (89%) of outcomes. The FI exceeded the number lost to follow up in only 2/91 (2%) outcomes.
CONCLUSION The FI shows that data from RCTs regarding operative treatment of humeral shaft fractures are fragile and do not demonstrate superiority of any particular surgical technique.
Affiliation(s)
- Stephen Craig Morris
  - Department of Orthopaedic Surgery, Loma Linda University, Loma Linda, CA 92354, United States
- Anirudh K Gowd
  - Department of Orthopaedic Surgery, Wake Forest University Baptist Medical Center, Winston-Salem, NC 27157, United States
- Avinesh Agarwalla
  - Department of Orthopaedic Surgery, Westchester Medical Center, Valhalla, NY 10595, United States
- Wesley P Phipatanakul
  - Department of Orthopaedic Surgery, Loma Linda University, Loma Linda, CA 92354, United States
- Nirav H Amin
  - Department of Orthopaedic Surgery, Premier Orthopaedic and Trauma Specialists, Pomona, CA 91767, United States
- Joseph N Liu
  - Department of Orthopedic Surgery, USC Epstein Family Center for Sports Medicine, Los Angeles, CA 90089, United States
|
38
|
Carroll AH, Rigor P, Wright MA, Murthi AM. Fragility of randomized controlled trials on treatment of proximal humeral fracture. J Shoulder Elbow Surg 2022; 31:1610-1616. [PMID: 35240302 DOI: 10.1016/j.jse.2022.01.141] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 11/19/2021] [Revised: 01/21/2022] [Accepted: 01/23/2022] [Indexed: 02/01/2023]
Abstract
BACKGROUND Proximal humeral fracture represents an increasingly common pathology with evaluation and treatment often guided by evidence from randomized controlled trials (RCTs), but the strength of an RCT must be considered in this process. The purpose of this study was to evaluate the strength of outcomes in RCTs on the management of proximal humeral fractures using the fragility index (FI), a method used with statistically significant dichotomous outcomes to assess the number of patients that would change an outcome measure from significant (P ≤ .05) to nonsignificant if the patient outcome changed. We also aimed to correlate the FI with other measures of study strength. METHODS A systematic review was performed using Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) guidelines to evaluate RCTs on the management of proximal humeral fractures. The PubMed, Ovid MEDLINE, Web of Science, and Embase databases were searched from database inception to May 2021. RCTs with at least 1 statistically significant (P ≤ .05) dichotomous outcome were included. The FI was calculated for each included trial using the Fisher exact test. The FI was correlated with the study sample size and journal impact factor. RESULTS Ten RCTs reporting on 656 patients and published between 2011 and 2020 were included. The median patient sample size was 67 (mean, 65.6; range, 40-86). Complications were the most commonly reported dichotomous statistically significant outcome. The median FI was 1 (mean, 2.6; range, 0-18), with 4 studies having an FI of 0. A median FI of 1 indicates that 1 patient experiencing an alternative outcome or having not been lost to follow-up could have changed the pertinent conclusions of the trial for a given outcome. The median number of patients lost to follow-up was 3 (mean, 4.9; range, 0-16) and exceeded the FI in 50% of studies. 
There was no correlation between the FI and sample size (Spearman coefficient = 0.0592, P = .865) or between the FI and journal impact factor (Spearman coefficient = -0.0229, P = .522). CONCLUSION In most studies of proximal humeral fractures, only 1 or 2 patients experiencing an alternative outcome or lost to follow-up would change the conclusions for the dichotomous outcome studied. Although the FI cannot be used to assess continuous variables, which are often the primary outcome variables of RCTs, it does offer an additional unique measure of study strength that surgeons should consider when evaluating RCTs.
Affiliation(s)
- Paolo Rigor
  - Department of Orthopaedic Surgery, MedStar Union Memorial Hospital, Baltimore, MD, USA
- Melissa A Wright
  - Department of Orthopaedic Surgery, MedStar Union Memorial Hospital, Baltimore, MD, USA
- Anand M Murthi
  - Department of Orthopaedic Surgery, MedStar Union Memorial Hospital, Baltimore, MD, USA
|
39
|
Lee ZY, Chin Han Lew C, Stoppe C, Hill A, Ortiz-Reyes A, Dhaliwal R, Heyland DK, Patel JJ. The authors reply. Crit Care Med 2022; 50:e691-e693. [PMID: 35838267 DOI: 10.1097/ccm.0000000000005573] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2022]
Affiliation(s)
- Zheng-Yii Lee
  - Department of Anaesthesiology, Faculty of Medicine, University of Malaya, Kuala Lumpur, Malaysia
- Charles Chin Han Lew
  - Department of Dietetics and Nutrition, Ng Teng Fong General Hospital, Singapore, Singapore
- Christian Stoppe
  - Department of Anesthesiology, Intensive Care, Emergency and Pain Medicine, University Hospital Wuerzburg, Wuerzburg, Germany
- Aileen Hill
  - Departments of Anesthesiology and Intensive Care Medicine, University Hospital Rheinisch-Westfälische Technische Hochschule Aachen, Aachen, Germany
- Alfonso Ortiz-Reyes
  - Clinical Evaluation Research Unit, Department of Critical Care Medicine, Queen's University, KGH Research Institute, Kingston Health Sciences Centre, Kingston, ON, Canada
- Rupinder Dhaliwal
  - Clinical Evaluation Research Unit, Department of Critical Care Medicine, Queen's University, KGH Research Institute, Kingston Health Sciences Centre, Kingston, ON, Canada
- Daren K Heyland
  - Clinical Evaluation Research Unit, Department of Critical Care Medicine, Queen's University, KGH Research Institute, Kingston Health Sciences Centre, Kingston, ON, Canada
- Jayshil J Patel
  - Department of Medicine, Division of Pulmonary & Critical Care Medicine, Medical College of Wisconsin, Milwaukee, WI
|
40
|
Schröder A, Muensterer OJ, Oetzmann von Sochaczewski C. Paediatric surgical trials, their fragility index, and why to avoid using it to evaluate results. Pediatr Surg Int 2022; 38:1057-1066. [PMID: 35524787 PMCID: PMC9162995 DOI: 10.1007/s00383-022-05133-y] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Accepted: 04/24/2022] [Indexed: 12/30/2022]
Abstract
BACKGROUND The fragility index has been gaining ground in the evaluation of comparative clinical studies. Many scientists evaluated trials in their fields and deemed them to be fragile, although there is no consensus on the definition of fragility. We aimed to calculate the fragility index and its permutations for paediatric surgical trials. METHODS We searched PubMed for prospectively conducted paediatric surgical trials with intervention and control group without limitations and calculated their (reverse) fragility indices and respective quotients along with post hoc power. Relationships between variables were evaluated using Spearman's ρ. We also calculated S values as the negative base-2 logarithm of P values. RESULTS Of 516 retrieved records, we included 87. The median fragility index was 1.5 (interquartile range: 0-4) and the median reverse fragility index was 3 (interquartile range: 2-4), although they were not statistically different (Mood's test: χ² = 0.557, df = 1, P = 0.4556). P values and fragility indices were strongly inversely correlated (ρ = -0.71, 95% confidence interval: -0.53 to -0.85, P < 0.0001), while reverse fragility indices were moderately correlated to P values (ρ = 0.5, 95% confidence interval: 0.37-0.62, P < 0.0001). A fragility index of 1 resulted from P values between 0.039 and 0.003, which resulted in S values between 4 and 8. CONCLUSIONS Fragility indices, reverse fragility indices, and their respective fragility quotients of paediatric surgical trials are low. The fragility index can be viewed as no more than a transformed P value with even more substantial limitations. Its inherent penalisation of small studies irrespective of their clinical relevance is particularly harmful for paediatric surgery. Consequently, the fragility index should be avoided.
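The S value transformation mentioned in the methods (the negative base-2 logarithm of a P value) is simple enough to sketch directly. The function name is hypothetical; the two P values are the bounds quoted in the results for a fragility index of 1.

```python
from math import log2


def s_value(p: float) -> float:
    """Surprisal in bits: the negative base-2 logarithm of a P value."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    return -log2(p)


# Bounds reported for trials with a fragility index of 1.
for p in (0.039, 0.003):
    print(p, round(s_value(p), 2))
```

A P value of 0.5 corresponds to exactly 1 bit, so the two bounds above land a little over 4 and a little over 8 bits, in line with the range the abstract reports.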
Affiliation(s)
- Arne Schröder
  - Klinik für Kinder- und Jugendmedizin, Klinikum Dortmund, Dortmund, Germany
- Oliver J Muensterer
  - Kinderchirurgische Klinik und Poliklinik im Dr. von Haunerschen Kinderspital, Ludwig-Maximilians-Universität München, München, Germany
  - Klinik und Poliklinik für Kinderchirurgie, Universitätsmedizin der Johannes-Gutenberg-Universität Mainz, Mainz, Germany
- Christina Oetzmann von Sochaczewski
  - Klinik und Poliklinik für Kinderchirurgie, Universitätsmedizin der Johannes-Gutenberg-Universität Mainz, Mainz, Germany
  - Sektion Kinderchirurgie der Klinik und Poliklinik für Allgemein, Viszeral, Thorax- und Gefäßchirurgie, Universitätsklinikum Bonn, Venusberg-Campus 1, 53127, Bonn, Germany
|
41
|
Lin L, Chu H. Assessing and visualizing fragility of clinical results with binary outcomes in R using the fragility package. PLoS One 2022; 17:e0268754. [PMID: 35648746 PMCID: PMC9159630 DOI: 10.1371/journal.pone.0268754] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 10/09/2021] [Accepted: 05/02/2022] [Indexed: 12/01/2022] Open
Abstract
With growing concerns about research reproducibility and replicability, the assessment of scientific results' fragility (or robustness) has been of increasing interest. The fragility index was proposed to quantify the robustness of statistical significance of clinical studies with binary outcomes. It is defined as the minimal number of event status modifications that can alter statistical significance. It helps clinicians evaluate the reliability of the conclusions. Many factors may affect the fragility index, including the treatment groups in which event status is modified, the statistical methods used for testing for the association between treatments and outcomes, and the pre-specified significance level. In addition to assessing the fragility of individual studies, the fragility index was recently extended to both conventional pairwise meta-analyses and network meta-analyses of multiple treatment comparisons. It is not straightforward for clinicians to calculate these measures and visualize the results. We have developed an R package called "fragility" to offer user-friendly functions for such purposes. This article provides an overview of methods for assessing and visualizing the fragility of individual studies as well as pairwise and network meta-analyses, introduces the usage of the "fragility" package, and illustrates the implementations with several worked examples.
Affiliation(s)
- Lifeng Lin
  - Department of Statistics, Florida State University, Tallahassee, FL, United States of America
- Haitao Chu
  - Statistical Research and Innovation, Global Biometrics and Data Management, Pfizer Inc., New York, NY, United States of America
  - Division of Biostatistics, University of Minnesota School of Public Health, Minneapolis, MN, United States of America
|
42
|
Davey MS, Hurley ET, Doyle TR, Dashti H, Gaafar M, Mullett H. The Fragility Index of Statistically Significant Findings From Randomized Controlled Trials Comparing the Management Strategies of Anterior Shoulder Instability. Am J Sports Med 2022:3635465221077268. [PMID: 35414266 DOI: 10.1177/03635465221077268] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Indexed: 01/31/2023]
Abstract
BACKGROUND Debate centering on the management of anterior shoulder instability (ASI) in recent years has led to many randomized controlled trials (RCTs) being published on the topic. The fragility index (FI) has subsequently emerged as a novel method of assessing significant findings reported in RCTs, particularly those with small sample sizes. PURPOSE To evaluate the FI of statistically significant findings in RCTs that reported the outcomes of management strategies of patients with ASI. STUDY DESIGN Systematic review; Level of evidence, 1. METHODS Using PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines, 2 independent reviewers performed a systematic review of RCTs focusing on the outcomes of management strategies of patients with ASI. There were 3 main categories of RCTs included: (1) nonoperative management in internal rotation (IR) versus external rotation (ER), (2) nonoperative management versus a surgical intervention, and (3) surgical management with arthroscopic Bankart repair versus open Bankart repair. The Fisher exact test was utilized to calculate the FI for the reversal of statistical significance in all statistically significant dichotomous outcomes. RESULTS A total of 21 RCTs were included, including 1589 shoulders (mean age, 29.4 years) with a mean follow-up of 26.8 months. There were 10 RCTs (831 shoulders) that reported outcomes after the nonoperative management of ASI in IR versus ER, with a mean FI of 6.8. There were 5 RCTs (324 shoulders) that reported outcomes comparing the nonoperative and operative management of ASI, with a mean FI of 3.5. There were 6 RCTs (434 shoulders) that reported outcomes after the operative management of ASI with either arthroscopic Bankart repair or open Bankart repair, with a mean FI of 9.6. 
CONCLUSION The overall FI of RCTs reporting the outcomes of management strategies for patients with ASI was moderate, suggesting a moderate fragility of statistically significant outcomes including recurrence, revision stabilization, and return to play.
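The calculation named in the methods above (Fisher exact test, repeated until significance is lost) can be sketched as follows. This is a minimal illustration and not the authors' code: the function names, the choice to modify the arm with the lower event rate, and the example counts are all assumptions.

```python
from math import comb


def fisher_exact_p(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x events in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is as likely as, or less likely than, the observed one.
    tail = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-7))
    return min(1.0, tail)


def fragility_index(e1, n1, e2, n2, alpha=0.05):
    """Number of non-events converted to events before p >= alpha.

    Assumption: events are added to the arm with the lower event rate.
    Returns 0 when the result is not significant to begin with.
    """
    if e1 * n2 > e2 * n1:  # make arm 1 the lower-rate arm
        e1, n1, e2, n2 = e2, n2, e1, n1
    flips = 0
    while e1 < n1 and fisher_exact_p(e1, n1 - e1, e2, n2 - e2) < alpha:
        e1 += 1
        flips += 1
    return flips


# Hypothetical trial: 1/20 events versus 9/20 events.
print(fragility_index(1, 20, 9, 20))  # -> 2
```

In this hypothetical trial, two changed outcomes are enough to move the result from significant to non-significant.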
|
43
|
Itaya T, Isobe Y, Suzuki S, Koike K, Nishigaki M, Yamamoto Y. The Fragility of Statistically Significant Results in Randomized Clinical Trials for COVID-19. JAMA Netw Open 2022; 5:e222973. [PMID: 35302631 PMCID: PMC8933746 DOI: 10.1001/jamanetworkopen.2022.2973] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Indexed: 12/24/2022] Open
Abstract
IMPORTANCE Interpreting results from randomized clinical trials (RCTs) for COVID-19, which have been published rapidly and in vast numbers, is challenging during a pandemic. OBJECTIVE To evaluate the robustness of statistically significant findings from RCTs for COVID-19 using the fragility index. DESIGN, SETTING, AND PARTICIPANTS This cross-sectional study included COVID-19 trial articles that randomly assigned patients 1:1 into 2 parallel groups and reported at least 1 binary outcome as significant in the abstract. A systematic search was conducted using PubMed to identify RCTs on COVID-19 published until August 7, 2021. EXPOSURES Trial characteristics, such as type of intervention (treatment drug, vaccine, or others), number of outcome events, and sample size. MAIN OUTCOMES AND MEASURES Fragility index. RESULTS Of the 47 RCTs for COVID-19 included, 36 (77%) were studies of the effects of treatment drugs, 5 (11%) were studies of vaccines, and 6 (13%) were of other interventions. A total of 138 235 participants were included in these trials. The median (IQR) fragility index of the included trials was 4 (1-11). The medians (IQRs) of the fragility indexes of RCTs of treatment drugs, vaccines, and other interventions were 2.5 (1-6), 119 (61-139), and 4.5 (1-18), respectively. The fragility index among more than half of the studies was less than 1% of each sample size, although the fragility index as a proportion of events needing to change would be much higher. CONCLUSIONS AND RELEVANCE This cross-sectional study found a relatively small number of events (a median of 4) would be required to change the results of COVID-19 RCTs from statistically significant to not significant. These findings suggest that health care professionals and policy makers should not rely heavily on individual results of RCTs for COVID-19.
Affiliation(s)
- Takahiro Itaya: Department of Healthcare Epidemiology, Graduate School of Medicine and Public Health, Kyoto University, Kyoto, Japan
- Yotsuha Isobe: Department of Human Health Sciences, Graduate School of Medicine, Kyoto University, Kyoto, Japan
- Sayoko Suzuki: Department of Human Health Sciences, Graduate School of Medicine, Kyoto University, Kyoto, Japan
- Kanako Koike: Department of Medical Genetics, International University of Health and Welfare Graduate School, Tokyo, Japan
- Masakazu Nishigaki: Department of Medical Genetics, International University of Health and Welfare Graduate School, Tokyo, Japan
- Yosuke Yamamoto: Department of Healthcare Epidemiology, Graduate School of Medicine and Public Health, Kyoto University, Kyoto, Japan
|
44
|
Nostedt S, Joffe AR. Critical Care Randomized Trials Demonstrate Power Failure: A Low Positive Predictive Value of Findings in the Critical Care Research Field. J Intensive Care Med 2022; 37:1082-1093. [PMID: 35179408 DOI: 10.1177/08850666221077203] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
Abstract
BACKGROUND We aimed to determine the post-hoc power of randomized controlled trials (RCTs) in critical care, and describe the implications for long-term positive (PPV) and negative predictive value (NPV) of statistically significant and non-significant findings respectively in the research field. METHODS We reviewed three cohorts of RCTs. "Adult-RCTs" were 216 multicenter RCTs with a mortality outcome from a published systematic review. "Pediatric-RCTs" were 120 RCTs with a mortality outcome, obtained by search of picutrials.net. "Consecutive-RCTs" were 90 recent RCTs obtained by screening publications in 6 journals. Post-hoc power for each study was calculated at α 0.05 and 0.005, for measures of small, medium, and large effect-size, using G*Power software. Long-run expected PPV and NPV of critical care research field findings were then calculated. RESULTS With α 0.05, post-hoc power for small effect-size was very low in all RCT-cohorts (eg, median 24% in Adult-RCTs). For medium effect-size, post-hoc power was low, except for Adult-RCTs (eg, median 9% in Pediatric-RCTs). For large effect-size, post-hoc power for non-human-animal Consecutive-RCTs was low (median 32%). With α 0.005, post-hoc power was even lower. The corollary was that both PPV and NPV were poor for small effect-size, unless α 0.005 was used. Even with α 0.005, with realistic (vs. optimistic) prior probability of the alternative hypothesis, the PPV was low (eg, in Adult-RCTs 57.1% vs. 92.3%). Adding mild bias (0.1) reduced the PPV even further. For medium effect-size both PPV and NPV were better; nevertheless, with α 0.05 and realistic prior probability of the alternative hypothesis the PPV was poor, and with α 0.005 and mild bias (0.1) the PPV was very low (eg, Adult-RCTs median 44.1%). CONCLUSIONS To improve the predictive value of findings in the critical care research field, RCTs should be designed to have 80% power for realistic effect-size at α 0.005.
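The long-run PPV and NPV described above depend on power, α, the prior probability that the alternative hypothesis is true, and bias. The abstract does not state which formulas were used; the sketch below assumes the widely cited formulation in which R is the pre-study odds of a true effect and u is the proportion of analyses affected by bias. All input values are illustrative and are not taken from the study.

```python
def ppv(power, alpha, prior, bias=0.0):
    """Long-run positive predictive value of a significant finding.

    Assumed formula: ((1 - beta)R + u*beta*R) /
                     (R + alpha - beta*R + u - u*alpha + u*beta*R)
    with R = prior / (1 - prior), beta = 1 - power, u = bias.
    """
    odds = prior / (1 - prior)
    beta = 1 - power
    numerator = (1 - beta) * odds + bias * beta * odds
    denominator = (odds + alpha - beta * odds
                   + bias - bias * alpha + bias * beta * odds)
    return numerator / denominator


def npv(power, alpha, prior):
    """Long-run negative predictive value of a non-significant finding (no bias term)."""
    odds = prior / (1 - prior)
    beta = 1 - power
    return (1 - alpha) / ((1 - alpha) + beta * odds)


# Illustrative inputs only: 24% power, alpha 0.05, prior probability 0.5.
print(round(ppv(0.24, 0.05, 0.5), 3))            # no bias
print(round(ppv(0.24, 0.05, 0.5, bias=0.1), 3))  # mild bias lowers the PPV
print(round(npv(0.24, 0.05, 0.5), 3))
```

Under these assumptions, lowering power or adding bias reduces the PPV, which is the direction of effect the abstract reports.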
Affiliation(s)
- Sarah Nostedt: Department of Pediatrics, Division of Critical Care Medicine, University of Alberta, Edmonton, Alberta, Canada; Stollery Children's Hospital, Edmonton, Alberta, Canada
- Ari R Joffe: Department of Pediatrics, Division of Critical Care Medicine, University of Alberta, Edmonton, Alberta, Canada; Stollery Children's Hospital, Edmonton, Alberta, Canada
|
45
|
When the p Value Doesn't Cut It: The Fragility Index Applied to Randomized Controlled Trials in Colorectal Surgery. Dis Colon Rectum 2022; 65:276-283. [PMID: 34990426 DOI: 10.1097/dcr.0000000000002146] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Indexed: 02/01/2023]
Abstract
BACKGROUND The American Statistical Association, among others, has called for the use of statistical methods beyond p ≤ 0.05. The fragility index is a statistical metric defined as the minimum number of patients for whom if an event rather than a nonevent occurred, then the p value would increase to ≥0.05. Previous reviews have demonstrated that many randomized controlled trials have a low fragility index, suggesting they may not be robust. OBJECTIVE The purpose of this study was to review the fragility indices of randomized controlled trials in colorectal surgery. DATA SOURCES A PubMed search was performed. STUDY SELECTION Colorectal surgery randomized controlled trials with a dichotomous primary outcome p ≤ 0.05 and publication between 2016 and 2018 were systematically identified. INTERVENTIONS All procedural interventions related to colorectal surgery were included. MAIN OUTCOME MEASURES The main measures were the fragility index and the number of patients lost to follow-up for each trial. The percentage of trials with the number of patients lost to follow-up greater than the fragility index was calculated. RESULTS In total, 712 abstracts were reviewed, with 90 trials meeting the inclusion criteria. The median fragility index was 3 (interquartile range of 1 to 10). In 51 of the 90 trials (57%), the number of patients lost to follow-up was greater than the fragility index. LIMITATIONS The fragility index is only one measure of the robustness of a randomized clinical trial. CONCLUSIONS Most colorectal surgery randomized controlled trials have a low fragility index. In 57% of trials, more patients were lost to follow-up than would be required to change the outcome of the trial from "significant" to "nonsignificant" based on the p value. This emphasizes the importance of assessing the robustness of clinical trials when considering their clinical application, rather than relying solely on the p value. 
See Video Abstract at http://links.lww.com/DCR/B741.
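The comparison reported in this abstract, the share of trials in which patients lost to follow-up outnumber the fragility index, reduces to a simple count. A minimal sketch, assuming each trial is represented as a hypothetical `(fragility_index, lost_to_follow_up)` pair; the sample data are invented for illustration.

```python
def share_ltf_exceeds_fi(trials):
    """Fraction of trials whose loss to follow-up is strictly greater than their FI."""
    exceeding = sum(1 for fi, lost in trials if lost > fi)
    return exceeding / len(trials)


# Hypothetical (FI, lost to follow-up) pairs, not data from the study.
example_trials = [(3, 5), (1, 0), (10, 12), (2, 2)]
print(share_ltf_exceeds_fi(example_trials))  # -> 0.5

# The proportion reported in the abstract: 51 of 90 trials.
print(round(51 / 90 * 100))  # -> 57
```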
|
46
|
Ho AK. The Fragility Index for Assessing the Robustness of the Statistically Significant Results of Experimental Clinical Studies. J Gen Intern Med 2022; 37:206-211. [PMID: 34357573 PMCID: PMC8739402 DOI: 10.1007/s11606-021-06999-9] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 02/13/2021] [Accepted: 06/23/2021] [Indexed: 01/03/2023]
Affiliation(s)
- Adrienne K Ho: Department of Thoracic Oncology, Wythenshawe Hospital, Manchester, UK. Present address: Department of Public Health Sciences (Epidemiology), Queen's University, Kingston, Ontario, Canada.
|
47
|
Li H, Liang Z, Meng Q, Huang X. The Fragility Index of Randomized Controlled Trials for Preterm Neonates. Front Pediatr 2022; 10:876366. [PMID: 35615631 PMCID: PMC9124941 DOI: 10.3389/fped.2022.876366] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2022] [Accepted: 04/04/2022] [Indexed: 11/28/2022] Open
Abstract
BACKGROUND As a metric of the robustness of trial results, the fragility index (FI) is the number of patients whose outcome would need to change to reverse a statistically significant result. This study aimed to calculate the FI in randomized controlled trials (RCTs) involving preterm neonates. METHODS Trials were included if they had a 1:1 study design, reported statistically significant dichotomous outcomes, and had an explicitly stated sample size or power calculation. The FI was calculated for binary outcomes using Fisher's exact test, and the FIs of subgroups were compared. Spearman's correlation was applied to determine correlations between the FI and study characteristics. RESULTS A total of 66 RCTs were included in the analyses. The median FI for these trials was 3.00 (interquartile range [IQR]: 1.00-5.00), with a median fragility quotient of 0.014 (IQR: 0.008-0.028). The FI was ≤ 3 in 42 of these 66 RCTs (63.6%), and in 42.4% (28/66) of the studies, the number of patients lost to follow-up was greater than the FI. Significant differences were found in the FI among journals (p = 0.011). The FI was associated with the sample size, total number of events, and reported p-values (r_s = 0.437, 0.495, and -0.857, respectively; all p < 0.001). CONCLUSION For RCTs in the preterm population, a median of only three patients needed to change from a "non-event" to an "event" to render a significant result non-significant, indicating that the significance may hinge on a small number of events.
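The Spearman correlation used above to relate the FI to study characteristics is a Pearson correlation computed on ranks. A minimal standard-library sketch follows; the function names and the example data are assumptions for illustration, and only the coefficient is computed, not its p-value.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the tie-averaged ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


# Hypothetical sample sizes and fragility indices for five trials.
sample_sizes = [40, 80, 120, 200, 400]
fragility = [1, 2, 3, 5, 3]
print(round(spearman_rho(sample_sizes, fragility), 3))  # -> 0.821
```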
Affiliation(s)
- Huiyi Li: Department of Pediatrics, Guangdong Second Provincial General Hospital, Guangzhou, China
- Zhenyu Liang: Department of Pediatrics, Guangdong Second Provincial General Hospital, Guangzhou, China
- Qiong Meng: Department of Pediatrics, Guangdong Second Provincial General Hospital, Guangzhou, China
- Xin Huang: Department of Pediatrics, Guangdong Second Provincial General Hospital, Guangzhou, China; Center for Clinical Epidemiology and Methodology (CCEM), Guangdong Second Provincial General Hospital, Guangzhou, China
|
48
|
Pascoal E, Liu M, Lin L, Luketic L. The fragility of statistically significant results in gynecologic surgery: A systematic review. J Obstet Gynaecol Can 2021; 44:508-514. [PMID: 34954411 DOI: 10.1016/j.jogc.2021.11.016] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 09/12/2021] [Revised: 11/20/2021] [Accepted: 11/22/2021] [Indexed: 11/17/2022]
Abstract
OBJECTIVE To use the fragility index (FI) to evaluate the robustness of gynaecologic surgery trials that report statistically significant results. The FI defines the minimum number of patients who must have an alternative outcome to alter statistical significance. DATA SOURCES We searched MEDLINE, Web of Science, Embase, and ClinicalTrials.gov from 2011 to 2021 to identify gynaecologic surgery randomized controlled trials (RCTs). STUDY SELECTION A total of 4775 trials were screened for eligibility. All included studies evaluated benign gynaecologic surgery interventions or peri-operative medical interventions. Only two-arm RCTs with statistically significant dichotomous primary outcomes were included. Ninety-three trials were ultimately included for analysis. DATA EXTRACTION AND SYNTHESIS Data from the included studies, including sample size, loss to follow-up, and number of events, were recorded. The FI of each study was calculated using a predefined technique. The overall FI and FIs by subgroup (clinical subspecialty, country of origin, and statistical test used) are reported as medians alongside their interquartile ranges (IQRs). The Kruskal-Wallis test was applied to find possible statistically significant relationships between FI and the nominal subgroups. Among this cohort, the median FI was 3 (IQR 1-7). The FI was 0 in 13 trials (14%), and in 39 trials (42%), the number of patients lost to follow-up was greater than the FI. The median FI within clinical subspecialty groups (general gynaecology, anaesthesia, urogynaecology, and fertility) did not differ (P = 0.122). CONCLUSION Statistically significant results of RCTs in gynaecologic surgery are fragile, suggesting that clinicians should interpret results with caution. This is particularly true when the number of patients lost to follow-up is greater than the FI. 
The FI serves as a quality metric that can be used to evaluate robustness of results when applying the outcomes of RCTs to clinical practice or guideline development.
Affiliation(s)
- Erica Pascoal: Department of Obstetrics and Gynecology, McMaster University, Hamilton, ON
- Marina Liu: Michael G. DeGroote School of Medicine, McMaster University, Hamilton, ON
- Lauren Lin: Michael G. DeGroote School of Medicine, McMaster University, Hamilton, ON
- Lea Luketic: Department of Obstetrics and Gynecology, McMaster University, Hamilton, ON
|
49
|
Parisien RL, Trofa DP, Cronin PK, Dashe J, Curry EJ, Eichinger JK, Levine WN, Tornetta P, Li X. Comparative Studies in the Shoulder Literature Lack Statistical Robustness: A Fragility Analysis. Arthrosc Sports Med Rehabil 2021; 3:e1899-e1904. [PMID: 34977646 PMCID: PMC8689245 DOI: 10.1016/j.asmr.2021.08.017] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 09/04/2020] [Accepted: 08/30/2021] [Indexed: 01/29/2023] Open
Abstract
Purpose Evidence-based decision-making is rooted in comparative clinical studies; however, a small number of outcome event reversals have the potential to change study significance. The purpose of this study was to determine the utility of applying fragility analysis to comparative studies in the published orthopaedic shoulder literature. Methods Comparative clinical shoulder research studies reporting 1:1 dichotomous categorical data were analyzed in 6 leading orthopaedic journals between 2006 and 2016. Statistical significance was defined as a P value of less than .05. The fragility index (FI) for each study outcome was determined by the number of event reversals required to change the P value to either greater or less than .05, thus changing the study conclusions. The associated fragility quotient (FQ) was determined by dividing the FI by the total population comprising a particular outcome. Results Of the 23,897 studies screened, 3,591 met search criteria, with 198 comparative studies ultimately included for analysis, 67 of which were randomized controlled trials. There were 357 total outcome events with 74 reported as significant and 283 as not significant. The FI was 4 (interquartile range [IQR] 2-6) with an associated FQ of 0.066 (IQR 0.038-0.102). There was no difference in statistical fragility between randomized and nonrandomized trials, with both revealing an FI of 4 and FQs of 0.068 (IQR 0.044-0.107) and 0.065 (IQR 0.031-0.101), respectively. Conclusions This current analysis reveals that comparative shoulder studies published in 6 leading orthopaedic journals are at risk of statistical fragility. As such, contemporary clinical shoulder literature may not be as robust as traditionally perceived with the reversal of only a few outcome events required to change study significance. 
Therefore, we advocate the reporting of both FI and FQ in addition to the P value as statistical complements to all comparative investigations to provide a more comprehensive understanding of trial stability and significance in the published shoulder literature. Clinical Relevance Comparative study designs are commonly employed in shoulder research. Several studies in both the general medical and orthopaedic literature have identified a lack of statistical robustness through comprehensive fragility analysis. Our findings demonstrate the P value may be an inadequate independent statistical metric requiring the complement of an FI and FQ to aid in the interpretation and understanding of study significance for clinical decision-making.
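The methods above apply fragility analysis in both directions (significant results that become non-significant, and the reverse) and divide the FI by the total population to obtain the FQ. A minimal sketch of that idea follows. The direction rules (move outcomes in the lower-rate arm first) and all names and example counts are assumptions, not the authors' procedure.

```python
from math import comb


def fisher_exact_p(e1, n1, e2, n2):
    """Two-sided Fisher exact p-value for e1/n1 events versus e2/n2 events."""
    events = e1 + e2
    total = comb(n1 + n2, events)

    def prob(x):
        return comb(n1, x) * comb(n2, events - x) / total

    p_obs = prob(e1)
    lo, hi = max(0, events - n2), min(n1, events)
    tail = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-7))
    return min(1.0, tail)


def fragility(e1, n1, e2, n2, alpha=0.05):
    """Return (FI, FQ): event reversals needed to cross alpha, and FI / total n.

    Assumed rules: a significant result is weakened by adding events to the
    lower-rate arm; a non-significant result is strengthened by removing events
    from that arm, then adding events to the other arm. Returns None if the
    threshold cannot be crossed.
    """
    if e1 * n2 > e2 * n1:  # make arm 1 the lower-rate arm
        e1, n1, e2, n2 = e2, n2, e1, n1
    significant = fisher_exact_p(e1, n1, e2, n2) < alpha
    flips = 0
    while (fisher_exact_p(e1, n1, e2, n2) < alpha) == significant:
        if significant:
            if e1 >= n1:
                return None
            e1 += 1
        elif e1 > 0:
            e1 -= 1
        elif e2 < n2:
            e2 += 1
        else:
            return None
        flips += 1
    return flips, flips / (n1 + n2)


print(fragility(1, 20, 9, 20))  # significant outcome     -> (2, 0.05)
print(fragility(3, 20, 9, 20))  # non-significant outcome -> (1, 0.025)
```

Dividing by the total sample puts outcomes from studies of different sizes on a common scale, which is why the abstract reports the FQ alongside the FI.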
Affiliation(s)
- Patrick K. Cronin: Harvard-Combined Orthopaedic Residency Program, Boston, Massachusetts
- Jesse Dashe: Boston University Medical Center, Boston, Massachusetts
- Emily J. Curry: Boston University School of Public Health, Boston, Massachusetts
- Paul Tornetta: Boston University Medical Center, Boston, Massachusetts
- Xinning Li: Boston University Medical Center, Boston, Massachusetts. Address correspondence to Xinning Li, M.D., Boston University School of Medicine, 850 Harrison Avenue – Dowling 2 North, Boston, MA 02115.
|
50
|
Parisien RL, Constant M, Saltzman BM, Popkin CA, Ahmad CS, Li X, Trofa DP. The Fragility of Statistical Significance in Cartilage Restoration of the Knee: A Systematic Review of Randomized Controlled Trials. Cartilage 2021; 13:147S-155S. [PMID: 33969744 PMCID: PMC8808853 DOI: 10.1177/19476035211012458] [Citation(s) in RCA: 29] [Impact Index Per Article: 7.3] [Indexed: 11/16/2022] Open
Abstract
OBJECTIVE The purpose of this study was to utilize fragility analysis to assess the robustness of randomized controlled trials (RCTs) evaluating the management of articular cartilage defects of the knee. We hypothesize that the cartilage restorative literature will be fragile with the reversal of only a few outcome events required to change statistical significance. DESIGN RCTs from 11 orthopedic journals indexed on PubMed from 2000 to 2020 reporting dichotomous outcome measures relating to the management of articular cartilage defects of the knee were included. The Fragility Index (FI) for each outcome was calculated through the iterative reversal of a single outcome event until significance was reversed. The Fragility Quotient (FQ) was calculated by dividing each FI by study sample size. Additional statistical analysis was performed to provide median FI and FQ across subgroups. RESULTS Nineteen RCTs containing 60 dichotomous outcomes were included for analysis. The median FI and FQ of all outcomes were 4 (IQR 2-7) and 0.067 (IQR 0.034-0.096), respectively. The average number of patients lost to follow-up (LTF) was 3.9, with 15.8% of the included studies reporting LTF greater than or equal to 4, the overall FI of all included outcomes. CONCLUSIONS The orthopedic literature evaluating articular cartilage defects of the knee is fragile as the reversal of relatively few outcome events may alter the significance of statistical findings. We therefore recommend comprehensive fragility analysis and triple reporting of the P value, FI, and FQ to aid in the interpretation and contextualization of clinical findings reported in the cartilage restoration literature.
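Summary values such as "4 (IQR 2-7)" are a median with first and third quartiles taken over the per-outcome FIs. A minimal sketch using Python's standard library follows; the quartile convention (`method="inclusive"`) is an assumption, since statistical packages differ on this point, and the data are invented.

```python
from statistics import median, quantiles


def median_iqr(values):
    """Return (median, Q1, Q3) using the 'inclusive' quartile convention."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return median(values), q1, q3


# Hypothetical fragility indices for nine outcomes, not data from the study.
fi_values = [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(median_iqr(fi_values))  # -> (5, 3.0, 7.0)
```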
Affiliation(s)
- Robert L. Parisien: Department of Orthopaedics, Harvard Medical School & Boston Children’s Hospital, Boston, MA, USA
- Michael Constant: Department of Orthopaedics, Columbia University Irving Medical Center, New York, NY, USA
- Bryan M. Saltzman: Ortho Carolina, Sports Medicine, Knee & Shoulder/Elbow, Charlotte, NC, USA
- Charles A. Popkin: Department of Orthopaedics, Columbia University Irving Medical Center, New York, NY, USA
- Christopher S. Ahmad: Department of Orthopaedics, Columbia University Irving Medical Center, New York, NY, USA
- Xinning Li: Department of Orthopaedics, Boston University Medical Center, Boston, MA, USA
- David P. Trofa: Department of Orthopaedics, Columbia University Irving Medical Center, New York, NY, USA
|