Retrospective Cohort Study Open Access
©Author(s) (or their employer(s)) 2026. No commercial re-use. See Permissions. Published by Baishideng Publishing Group Inc.
World J Crit Care Med. Mar 9, 2026; 15(1): 113684
Published online Mar 9, 2026. doi: 10.5492/wjccm.v15.i1.113684
Use of radiograph scoring systems to assess pulmonary disease severity in patients with COVID-19 pneumonia
Hayder Mohammed, Khalid Y Fadul, Alhady Alfian Yusof, Shabbir Ahmad, Munawar Farooq, Sasha Javid, Department of Emergency Medicine, Hamad Medical Corporation, Qatar 3050, Qatar
Hayder Mohammed, Department of Emergency Medicine, Leeds Teaching Hospital NHS Trust, Leeds LS9 7TF, United Kingdom
Syed G A Naqvi, Department of Radiology, Hamad Medical Corporation, Qatar 3050, Qatar
Nadir Kharma, Department of Intensive Care Medicine, Hamad Medical Corporation, Qatar 3050, Qatar
Munawar Farooq, Department of Internal Medicine, Emergency Medicine Section, College of Medicine and Health Sciences, Al Ain 15551, United Arab Emirates
Ahmed Mohamed, Department of Orthopaedics, Burjeel Medical City, Abu Dhabi 92510, United Arab Emirates
Manar E Abdel-Rahman, Department of Public Health, Professor of Biostatistics, Qatar University, Qatar 2713, Qatar
Tim Harris, Department of Emergency Medicine, Queen Mary University of London, London E1 4NS, United Kingdom
ORCID number: Hayder Mohammed (0009-0006-3048-2081); Khalid Y Fadul (0000-0001-8057-0182); Syed G A Naqvi (0000-0002-7401-8129); Nadir Kharma (0000-0003-4451-8218); Shabbir Ahmad (0009-0002-2804-2555); Ahmed Mohamed (0009-0004-4894-6767); Manar E Abdel-Rahman (0000-0001-9968-9853).
Author contributions: Mohammed H, Fadul KY, Naqvi SGA, Kharma N, Alfian Yusof A, Ahmad S, Farooq M, and Javid S contributed to the study conception and data collection; Mohammed H, Fadul KY, Naqvi SGA, Kharma N, Alfian Yusof A, Ahmad S, Farooq M, Javid S, and Abdel-Rahman ME contributed to data interpretation; Mohamed A contributed to drafting, organizing, and writing the Discussion section; Abdel-Rahman ME performed the statistical analysis; Harris T provided supervision, critical revision, and oversight of the study. All authors reviewed and approved the final version of the manuscript.
Supported by Hamad General Hospital, Qatar, No. MRC-05-233.
Institutional review board statement: This study was approved by the Institutional Medical Research Council at HMC, Qatar (No. MRC-05-233).
Informed consent statement: Our Institutional Review Board waived the requirement for signed informed consent, as the study involved a retrospective chart review and all patient data were de-identified during collection.
Conflict-of-interest statement: All the authors report no relevant conflicts of interest for this article.
STROBE statement: The authors have read the STROBE Statement-checklist of items, and the manuscript was prepared and revised according to the STROBE Statement-checklist of items.
Data sharing statement: The datasets generated during and/or analyzed during the current study are available from the corresponding author upon reasonable request.
Corresponding author: Ahmed Mohamed, Department of Orthopaedics, Burjeel Medical City, 28th Street, Abu Dhabi 92510, United Arab Emirates. ahmedtom11@hotmail.com
Received: September 8, 2025
Revised: October 8, 2025
Accepted: January 7, 2026
Published online: March 9, 2026
Processing time: 175 Days and 18.9 Hours

Abstract
BACKGROUND

Severe acute respiratory syndrome coronavirus 2 causes pneumonia in most hospitalized patients, often leading to hypoxemia and the need for supplemental oxygen. While chest computed tomography is highly sensitive, chest radiographs (CXR) offer a practical alternative in high-volume settings. Scoring systems like Radiographic Assessment of Lung Edema (RALE) and BRIXIA standardize CXR interpretation and quantify severity, but their relationship with oxygen delivery requirements in coronavirus disease 2019 (COVID-19) patients remains unclear.

AIM

To evaluate whether the initial emergency department (ED) radiograph could predict subsequent oxygen support requirements. The secondary aim was to assess inter- and intra-rater agreement of the scoring systems.

METHODS

This retrospective cohort study examined consecutive COVID-19 patients presenting to a large tertiary hospital ED (May-June 2020) who required admission and underwent CXR within 24 hours of arrival. Infiltrate severity on ED radiographs was scored using the BRIXIA and RALE systems. Oxygen support was categorized by delivery device, and associations were examined using logistic regression.

RESULTS

Data were analyzed from 950 COVID-19 patients (90.6% male, mean age: 48.4 ± 12.3 years). Predictive performance showed notable variation: At ED admission, both BRIXIA and RALE scores had the highest discriminatory ability [area under the curve (AUC) = 0.74; 95% confidence interval (CI): 0.69-0.79] for predicting oxygen delivery via high flow nasal cannula/continuous positive airway pressure/Bi-level positive airway pressure. Prediction for non-rebreather mask yielded lower AUCs (BRIXIA: 0.65; RALE: 0.62), with nasal cannula use showing limited discrimination (BRIXIA: 0.56; RALE: 0.54). During hospitalization, predictive performance remained modest across all modalities. The AUCs for intubation were 0.63 (BRIXIA) and 0.62 (RALE), while for high flow nasal cannula/continuous positive airway pressure/Bi-level positive airway pressure, values dropped slightly to 0.62 and 0.59, respectively. Non-rebreather mask prediction maintained an AUC of 0.62 for both scores, and nasal cannula predictions remained low (BRIXIA: 0.56; RALE: 0.52). Inter- and intra-rater percent agreement was excellent for both scores, with inter-rater agreement at 95% (95%CI: 94%-96%) and intra-rater agreement at 97% (95%CI: 96%-98%) for BRIXIA and 98% (95%CI: 97%-98%) for RALE.

CONCLUSION

Both RALE and BRIXIA scores effectively predicted the need for advanced respiratory support in ED COVID-19 patients and demonstrated excellent inter-rater and intra-rater reliability. While their predictive power diminished during hospitalization, both scores remain valuable for initial triage, with BRIXIA particularly useful for ruling out the need for high-level oxygen support.

Key Words: COVID-19; SARS-CoV-2; BRIXIA; Radiographic Assessment of Lung Edema; Radiographic scoring; Reliability; Emergency department

Core Tip: Radiographic Assessment of Lung Edema and BRIXIA chest X-ray scoring systems are effective in predicting the need for advanced respiratory support in coronavirus disease 2019 patients presenting to the emergency department. Both methods show strong prognostic value for identifying patients who require advanced oxygen delivery devices in the emergency department and during their hospital stay. In addition, the two scoring systems demonstrated excellent inter- and intra-rater reliability among emergency medicine physicians, intensivists, and radiologists. Given their comparable performance in clinical practice, the choice between the Radiographic Assessment of Lung Edema and the BRIXIA scores can be determined by institutional preference.



INTRODUCTION

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) was originally described in Wuhan, China, in late 2019[1,2]. The World Health Organization (WHO) declared the outbreak a pandemic on March 11, 2020[3]. Hospitals diagnosed pneumonia in around 91% of the admitted patients, with 5% requiring admission to intensive care facilities[4]. The most common presentation requiring hospital admission was hypoxemia, and the majority of hospitalized patients required oxygen therapy[5]. Chest computed tomography (CT) is the most sensitive modality for detecting early coronavirus disease 2019 (COVID-19)-related lung abnormalities[6,7]. Serial CT scans at 3-day to 7-day intervals effectively track disease progression from diagnosis through discharge[8-10]. However, high emergency department (ED) volumes limit its use. Chest radiographs (CXR) offer a practical alternative for pneumonia diagnosis[11], though with variable diagnostic accuracy[12-14]. This variability underscores the potential utility of structured scoring systems to standardize CXR interpretation and severity quantification.

The Radiographic Assessment of Lung Edema (RALE) score is a semi-quantitative scoring system[15] that evaluates lung edema severity by assessing the extent and density of alveolar opacities in each of the four lung quadrants on CXR, with scores ranging from 0 to 48. It has primarily been studied in patients with acute respiratory distress syndrome and serves as a standardized tool for quantifying pulmonary opacification. The BRIXIA score, developed by Borghesi and Maroldi[16] for COVID-19 assessment, quantifies lung involvement on chest X-rays through a six-zone system (upper: I/IV; middle: II/V; lower: III/VI), with each zone scored 0-3 based on abnormality severity. The total score (0-18) correlates with parenchymal damage extent, disease severity, and mortality risk[16]. There are currently no published data correlating BRIXIA and RALE scores with the use of various oxygen delivery devices in COVID-19 patients, either at the time of admission to the ED or during the course of hospitalization. This study aims to investigate the ability of radiographs to predict oxygen support requirements and to assess the inter- and intra-rater agreement of BRIXIA and RALE scores among clinicians.
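The arithmetic behind the two totals can be sketched as follows. This is an illustrative sketch only, not the scoring software used in the study. The function names are hypothetical, and the per-quadrant RALE grading (extent 0-4 multiplied by density 1-3) is taken from the original RALE description rather than from this article, which only states the 0-48 range.

```python
from typing import Sequence


def brixia_score(zone_scores: Sequence[int]) -> int:
    """Sum six lung-zone scores (each 0-3) into a BRIXIA total of 0-18."""
    if len(zone_scores) != 6:
        raise ValueError("BRIXIA expects exactly six lung zones")
    if any(s not in (0, 1, 2, 3) for s in zone_scores):
        raise ValueError("each zone score must be 0, 1, 2, or 3")
    return sum(zone_scores)


def rale_score(extent: Sequence[int], density: Sequence[int]) -> int:
    """Sum extent x density over four quadrants into a RALE total of 0-48.

    Assumption: extent is graded 0-4 and density 1-3 per quadrant; a
    quadrant with extent 0 contributes 0 regardless of density.
    """
    if len(extent) != 4 or len(density) != 4:
        raise ValueError("RALE expects exactly four quadrants")
    total = 0
    for e, d in zip(extent, density):
        if e not in (0, 1, 2, 3, 4):
            raise ValueError("extent must be an integer from 0 to 4")
        if e > 0 and d not in (1, 2, 3):
            raise ValueError("density must be 1, 2, or 3 when extent > 0")
        total += e * d if e > 0 else 0
    return total
```

For example, `brixia_score([1, 2, 0, 3, 1, 0])` gives a total of 7, and a radiograph with maximal extent and density in all four quadrants reaches the RALE ceiling of 48.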

MATERIALS AND METHODS
Study design and setting

This retrospective cohort study analyzed COVID-19 patients presenting to the 390-bed ED of a tertiary referral center in the Gulf region (Qatar) during the initial pandemic wave (May 15, 2020-June 30, 2020).

Study population

The study included consecutive adult patients who: (1) Underwent chest radiography within 24 hours of ED assessment; (2) Had RT-PCR-confirmed SARS-CoV-2 infection; and (3) Required supplemental oxygen on admission or during hospitalization. Exclusion criteria consisted of: (1) Patients requiring pre-existing home oxygen supplementation; (2) Those who underwent endotracheal intubation or intercostal drain placement before initial radiography; and (3) Cases with suboptimal radiographic quality precluding accurate interpretation.

Data collection

Trained research assistants systematically extracted clinical data from the Cerner® electronic health record system (Cerner Corporation, Kansas City, MO, United States), including demographic information, physiological parameters, biochemical markers, and oxygen delivery device specifications. To maintain data quality and accuracy, a randomly selected 20% subset of all records underwent independent dual review by senior researchers, and any discrepancy was resolved by the leader of the data collection team. All collected data were stored in password-protected Excel® spreadsheets (Microsoft, Redmond, WA, United States), with access limited to authorized study personnel throughout the research period.

Variables

The primary outcome was the requirement for supplemental oxygen, defined as the use of a nasal cannula, face mask, non-rebreather mask, high-flow nasal cannula, non-invasive ventilation, or invasive mechanical ventilation during the ED stay or hospitalization. Secondary outcomes included escalation in oxygen delivery mode. The main exposures were chest radiographic severity scores, specifically RALE and BRIXIA. Predictors included demographic characteristics (age and sex), comorbidities [no comorbidities, diabetes mellitus, hypertension, coronary artery disease/congestive heart failure, asthma/chronic obstructive pulmonary disease, chronic kidney disease, and others], physiological parameters (temperature, heart rate, respiratory rate, systolic blood pressure, and oxygen saturation), and laboratory biomarkers (C-reactive protein and white blood cell count).

Minimizing bias and standardizing radiograph interpretation

To reduce selection bias, all eligible patients meeting the inclusion criteria were included. Radiographic assessment was standardized: Eight senior clinicians (five emergency medicine physicians, two critical care physicians, and one radiologist; Supplementary Table 1) received a 3-hour training program on the RALE and BRIXIA scoring systems. Each clinician interpreted 50 radiographs, with 30 reviewed twice to assess reliability. Cases were randomly assigned to clinicians, and scoring was blinded to reduce observer bias. Furthermore, all clinicians adhered to the standardized oxygen therapy guidelines from Hamad General Hospital (Supplementary Figure 1), which promoted consistency in clinical decision-making and limited bias arising from variable treatment protocols. Multivariable models were adjusted for potential confounders such as age and baseline oxygen delivery devices.

Statistical analysis

All analyses were conducted using STATA version 16.0 (StataCorp LLC, College Station, TX, United States), with continuous variables expressed as mean ± SD or median (interquartile range) based on distributional characteristics, and categorical variables reported as n (%). The predictive performance of radiographic scores for clinical outcomes was evaluated using receiver operating characteristic curve analysis, with area under the curve (AUC) calculations using the DeLong method[17]. Liu’s method was used to find the optimal CXR score cut-point for predicting oxygen delivery use by identifying the cut-point that maximizes the product of sensitivity and specificity[18]. Bootstrap resampling with 1000 replicates was used to generate confidence intervals for the selected cut-point. For categorical outcomes, including oxygen requirements, we performed logistic regression analyses, presenting both crude and adjusted odds ratios with 95% confidence intervals (CIs). Multivariable models incorporated sequential adjustment for potential confounders, beginning with a demographic factor (age) and then adding a clinical variable (initial oxygen delivery device in the ED on admission) to isolate the independent predictive value of radiographic findings.
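The cut-point selection described above can be illustrated with a short sketch. The study used STATA, so the Python below is only an illustration under stated assumptions: the function names are hypothetical, a score at or above the cut-point is treated as a positive test, and the bootstrap interval is a simple percentile interval (the article does not state which bootstrap interval was used).

```python
import numpy as np


def sens_spec(scores, outcome, cut):
    """Sensitivity and specificity when 'score >= cut' counts as positive."""
    scores = np.asarray(scores)
    outcome = np.asarray(outcome, dtype=bool)
    predicted = scores >= cut
    sensitivity = float(np.mean(predicted[outcome]))
    specificity = float(np.mean(~predicted[~outcome]))
    return sensitivity, specificity


def liu_cutpoint(scores, outcome):
    """Return the observed score that maximizes sensitivity x specificity."""
    candidates = np.unique(scores)
    products = [np.prod(sens_spec(scores, outcome, c)) for c in candidates]
    return candidates[int(np.argmax(products))]


def bootstrap_cutpoint_ci(scores, outcome, n_boot=1000, seed=0):
    """Percentile 95% interval for the cut-point (assumed interval type)."""
    rng = np.random.default_rng(seed)
    scores = np.asarray(scores)
    outcome = np.asarray(outcome, dtype=bool)
    n = len(scores)
    cuts = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample patients with replacement
        s, y = scores[idx], outcome[idx]
        if y.all() or not y.any():
            continue  # skip resamples that lack one of the outcome classes
        cuts.append(liu_cutpoint(s, y))
    lower, upper = np.percentile(cuts, [2.5, 97.5])
    return float(lower), float(upper)
```

With `scores` holding the BRIXIA or RALE totals and `outcome` a 0/1 indicator for a given oxygen delivery device, `liu_cutpoint(scores, outcome)` returns the candidate cut-point and `bootstrap_cutpoint_ci(scores, outcome)` an interval around it.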

To assess the consistency of radiographic scoring among clinicians, we evaluated both inter- and intra-rater reliability using complementary statistical measures. Weighted Scott’s/Fleiss’ Kappa[19] and Gwet’s AC[20] coefficients were calculated to quantify agreement. As a sensitivity analysis, we additionally computed agreement coefficients treating the scores as continuous variables with both linear and quadratic weighting schemes. All reliability estimates were reported with their corresponding 95%CIs. Bland-Altman analysis[21] was performed to evaluate individual clinician scoring consistency. Agreement was interpreted according to Gwet’s probabilistic benchmarking method: < 0 = poor, 0-0.2 = slight, 0.2-0.4 = fair, 0.4-0.6 = moderate, 0.6-0.8 = substantial, and 0.8-1.0 = almost perfect[20].
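As a simplified illustration of how these coefficients differ, the sketch below computes percent agreement, Scott's pi, and Gwet's AC1 for two raters on unweighted categories. This is not the weighted, multi-rater analysis performed in the study. The function names are hypothetical, and the `benchmark` helper is a plain lookup on the intervals listed above (with boundary values assigned to the lower band as an assumption), not Gwet's probabilistic procedure.

```python
from collections import Counter


def agreement_coefficients(rater_a, rater_b):
    """Percent agreement, Scott's pi, and Gwet's AC1 for two raters."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must score the same non-empty set")
    n = len(rater_a)
    categories = sorted(set(rater_a) | set(rater_b))
    q = len(categories)
    if q < 2:
        raise ValueError("at least two categories must be observed")
    # Observed agreement: share of subjects given the same category.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Average marginal proportion of each category across both raters.
    counts = Counter(rater_a) + Counter(rater_b)
    pi = [counts[k] / (2 * n) for k in categories]
    # Chance agreement under Scott's pi and under Gwet's AC1.
    pe_scott = sum(p * p for p in pi)
    pe_gwet = sum(p * (1 - p) for p in pi) / (q - 1)
    scott = (p_o - pe_scott) / (1 - pe_scott)
    ac1 = (p_o - pe_gwet) / (1 - pe_gwet)
    return p_o, scott, ac1


def benchmark(coefficient):
    """Map a coefficient to the agreement labels listed in the text."""
    if coefficient < 0:
        return "poor"
    bands = [(0.2, "slight"), (0.4, "fair"), (0.6, "moderate"),
             (0.8, "substantial")]
    for upper, label in bands:
        if coefficient <= upper:
            return label
    return "almost perfect"
```

When one category dominates, the chance-agreement term in Scott's pi approaches the observed agreement and the coefficient can fall well below percent agreement, whereas AC1 is less affected by skewed marginals. This is one reason the measures are reported side by side.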

Ethical considerations

The Institutional Medical Research Council approved this study (No. MRC-05-233). Informed consent was waived, as the study involved a retrospective analysis of anonymized clinical data. All procedures complied with institutional and international ethical standards (Declaration of Helsinki, Good Clinical Practice, and Qatar Ministry of Public Health regulations). Protected health information remained secured in password-encrypted databases accessible only to study investigators.

RESULTS
Baseline characteristics

A final cohort of 950 participants with complete clinical data was obtained following exclusion of 137 datasets from an initial screening of 1087 (Figure 1). These participants comprised the study population, which exhibited a strong male predominance (90.6%), reflecting Qatar’s demographic profile. Patients demonstrated significantly higher oxygen support requirements during hospitalization than at initial ED presentation. Upon their arrival at the ED, the majority of patients (50.6%) were on room air and were able to maintain oxygen saturation levels of 90%-94% without the need for supplemental oxygen. Nasal cannulae were used in 35.6% of cases, while the use of non-rebreather mask (NRBM) (9.4%) and advanced support, such as high flow nasal cannula (HFNC), continuous positive airway pressure (CPAP), or Bi-level positive airway pressure (BiPAP) (2.1%), was limited. During hospitalization, oxygen support intensified: Nasal cannula 93.7%, NRBM 40.3%, and HFNC/CPAP/BiPAP 19.2%. Furthermore, 14.0% required intubation/tracheostomy. Table 1 displays the demographic, clinical, radiographic, and oxygen therapy characteristics of COVID-19 patients.

Figure 1 Patient selection. COVID-19: Coronavirus disease 2019.
Table 1 Summary of demographic characteristics, BRIXIA and Radiographic Assessment of Lung Edema scores, and oxygen use via oxygen delivery device, n (%).

| Category | Variable | Summary |
| --- | --- | --- |
| Total sample size | n | 950 (100.0) |
| Demographics | Age, mean (SD) | 48.4 (13.6) |
| | Male | 861 (90.6) |
| Chest X-ray scores, median (IQR) | BRIXIA | 8.0 (6.0) |
| | RALE | 12.0 (16.0) |
| Comorbidities | No comorbidities | 475 (50.0) |
| | Diabetes mellitus | 306 (32.2) |
| | Hypertension | 293 (30.8) |
| | Coronary artery disease/CHF | 59 (6.2) |
| | Asthma/COPD | 35 (3.7) |
| | Chronic kidney disease | 36 (3.7) |
| | Others | 68 (7.1) |
| Clinical variables, median (IQR) | Temperature | 38.0 (1.3) |
| | Respiratory rate | 22.0 (6.0) |
| | Heart rate | 102.0 (24.3) |
| | Systolic blood pressure | 129.0 (23.0) |
| | White blood cells | 6.8 (4.0) |
| | C-reactive protein | 71.5 (97.9) |
| Oxygen delivery device use in ED | Room air | 481 (50.6) |
| | Nasal cannula | 338 (35.6) |
| | Hudson mask | 22 (2.3) |
| | NRBM | 89 (9.4) |
| | HFNC/CPAP/BiPAP | 20 (2.1) |
| Oxygen device used during hospitalization | Nasal cannula | 890 (93.7) |
| | Hudson mask | 14 (1.5) |
| | NRBM | 383 (40.3) |
| | HFNC/CPAP/BiPAP | 182 (19.2) |
| | Oral intubation/tracheostomy | 133 (14.0) |

BRIXIA and RALE scores

As shown in Table 2, the diagnostic performance of the BRIXIA and RALE scores varied according to the type of oxygen delivery device used both on ED arrival and during hospitalization.

Table 2 Performance of BRIXIA and Radiographic Assessment of Lung Edema scores for predicting oxygen delivery device and discharge outcomes in the emergency department and during hospitalization.

| Score and oxygen delivery device | Cut-off point (95%CI) | Sensitivity | Specificity | AUC at cut-point (95%CI) |
| --- | --- | --- | --- | --- |
| BRIXIA score | | | | |
| Oxygen delivery device in ED | | | | |
| Room air | 7 (6-8) | 0.43 | 0.39 | 0.41 (0.38-0.44) |
| Nasal cannula | 8 (7-9) | 0.51 | 0.60 | 0.56 (0.52-0.59) |
| Hudson mask | 9 (6-12) | 0.36 | 0.61 | 0.49 (0.38-0.59) |
| NRBM | 9 (8-10) | 0.66 | 0.64 | 0.65 (0.60-0.70) |
| HFNC/CPAP/BiPAP | 12 (9-15) | 0.65 | 0.84 | 0.74 (0.64-0.85) |
| Oxygen delivery device during hospitalization | | | | |
| Nasal cannula | 6 (4-8) | 0.60 | 0.48 | 0.54 (0.47-0.61) |
| Hudson mask | 13 (6-20) | 0.29 | 0.86 | 0.57 (0.42-0.72) |
| NRBM | 8 (7-9) | 0.58 | 0.66 | 0.62 (0.59-0.65) |
| HFNC/CPAP/BiPAP | 8 (7-9) | 0.63 | 0.61 | 0.62 (0.58-0.66) |
| Oral intubation/tracheostomy | 8 (6-10) | 0.65 | 0.60 | 0.63 (0.58-0.67) |
| RALE score | | | | |
| Oxygen delivery device in ED | | | | |
| Room air | 12 (8-16) | 0.49 | 0.57 | 0.42 (0.39-0.45) |
| Nasal cannula | 10 (7-13) | 0.58 | 0.57 | 0.54 (0.51-0.57) |
| Hudson mask | 14 (8-20) | 0.41 | 0.61 | 0.54 (0.43-0.64) |
| NRBM | 16 (10-22) | 0.54 | 0.72 | 0.62 (0.57-0.68) |
| HFNC/CPAP/BiPAP | 17 (12-22) | 0.65 | 0.77 | 0.74 (0.65-0.83) |
| Oxygen delivery device during hospitalization | | | | |
| Nasal cannula | 12 (6-18) | 0.29 | 0.85 | 0.52 (0.46-0.59) |
| Hudson mask | 13 (0-29) | 0.58 | 0.65 | 0.55 (0.42-0.69) |
| NRBM | 13 (10-16) | 0.64 | 0.61 | 0.60 (0.57-0.63) |
| HFNC/CPAP/BiPAP | 16 (11-21) | 0.66 | 0.59 | 0.59 (0.55-0.64) |
| Oral intubation/tracheostomy | 16 (12-20) | 0.52 | 0.36 | 0.62 (0.57-0.66) |

Oxygen delivery device in ED on arrival: For patients on room air, the BRIXIA score had a cut-off value of 7 (95%CI: 6-8), with a sensitivity of 0.43, specificity of 0.39, and an AUC of 0.41 (95%CI: 0.38-0.44). The RALE score for this group had a cut-off of 12 (95%CI: 8-16), showing slightly better specificity (0.57) and an AUC of 0.42 (95%CI: 0.39-0.45). Among patients receiving oxygen via nasal cannula in the ED, the BRIXIA score cut-off was 8 (95%CI: 7-9), yielding improved sensitivity (0.51), specificity (0.60), and an AUC of 0.56 (95%CI: 0.52-0.59). In the same group, the RALE score cut-off was 10 (95%CI: 7-13), with a sensitivity of 0.58, a specificity of 0.57, and an AUC of 0.54 (95%CI: 0.51-0.57). For patients using a Hudson mask, the BRIXIA score cut-off was 9 (95%CI: 6-12), with a sensitivity of 0.36, specificity of 0.61, and an AUC of 0.49 (95%CI: 0.38-0.59). The RALE score cut-off in this group was 14 (95%CI: 8-20), with an AUC of 0.54 (95%CI: 0.43-0.64). In patients requiring an NRBM, the BRIXIA score cut-off was 9 (95%CI: 8-10), showing a sensitivity of 0.66, a specificity of 0.64, and an AUC of 0.65 (95%CI: 0.60-0.70). The RALE score cut-off was 16 (95%CI: 10-22), with a sensitivity of 0.54, a specificity of 0.72, and an AUC of 0.62 (95%CI: 0.57-0.68).

Oxygen delivery device during hospitalization: For hospitalized patients requiring a nasal cannula, the BRIXIA score had a cut-off of 6 (95%CI: 4-8), with a sensitivity of 0.60, specificity of 0.48, and an AUC of 0.54 (95%CI: 0.47-0.61). The RALE score in this group had a cut-off of 12 (95%CI: 6-18), with a sensitivity of 0.29, a specificity of 0.85, and an AUC of 0.52 (95%CI: 0.46-0.59). For patients needing a Hudson mask, the BRIXIA score had a cut-off of 13 (95%CI: 6-20), with a sensitivity of 0.29, a specificity of 0.86, and an AUC of 0.57 (95%CI: 0.42-0.72). The RALE score had a cut-off of 13 (95%CI: 0-29), a sensitivity of 0.58, a specificity of 0.65, and an AUC of 0.55 (95%CI: 0.42-0.69). For patients requiring oral intubation/tracheostomy, the BRIXIA score had a cut-off of 8 (95%CI: 6-10), with a sensitivity of 0.65, a specificity of 0.60, and an AUC of 0.63 (95%CI: 0.58-0.67). The RALE score had a cut-off of 16 (95%CI: 12-20), a sensitivity of 0.52, a specificity of 0.36, and an AUC of 0.62 (95%CI: 0.57-0.66). Receiver operating characteristic curves for BRIXIA and RALE scores, in relation to oxygen device use and length of hospital stay, are shown in Supplementary Figures 2 and 3.

Adjusted logistic regression outcomes: BRIXIA and RALE scores were also analyzed using three logistic regression models: Unadjusted (crude odds ratio), adjusted for age (a-OR1), and adjusted for both age and the initial oxygen delivery device used in the ED (a-OR2) (Table 3).

Table 3 Logistic regression analysis of oxygen delivery requirements by BRIXIA and Radiographic Assessment of Lung Edema scores.

| Model | Crude OR (95%CI) | P value | a-OR1 (95%CI) | P value | a-OR2 (95%CI) | P value |
| --- | --- | --- | --- | --- | --- | --- |
| BRIXIA score | | | | | | |
| Oxygen delivery device in ED | | | | | | |
| Model 1: Room air | 0.90 (0.87-0.92) | < 0.001 | 0.90 (0.87-0.93) | < 0.001 | | |
| Model 2: Nasal cannula | 1.05 (1.02-1.08) | 0.001 | 1.05 (1.02-1.08) | 0.002 | | |
| Model 3: Hudson mask | 0.92 (0.84-1.02) | 0.108 | 0.93 (0.84-1.02) | 0.134 | | |
| Model 4: NRBM | 1.15 (1.10-1.21) | < 0.001 | 1.15 (1.09-1.21) | < 0.001 | | |
| Model 5: HFNC/CPAP/BiPAP | 1.27 (1.14-1.42) | < 0.001 | 1.27 (1.14-1.41) | < 0.001 | | |
| Oxygen delivery device during hospitalization | | | | | | |
| Model 6: Nasal cannula | 1.01 (0.95-1.07) | 0.784 | 1.00 (0.95-1.06) | 0.868 | 1.04 (0.97-1.10) | 0.249 |
| Model 7: Hudson mask | 1.02 (0.91-1.14) | 0.761 | 1.01 (0.90-1.14) | 0.830 | 0.99 (0.88-1.12) | 0.898 |
| Model 8: NRBM | 1.13 (1.10-1.16) | < 0.001 | 1.12 (1.09-1.16) | < 0.001 | 1.09 (1.06-1.13) | < 0.001 |
| Model 9: HFNC/CPAP/BiPAP | 1.12 (1.08-1.16) | < 0.001 | 1.11 (1.07-1.15) | < 0.001 | 1.05 (1.01-1.09) | 0.018 |
| Model 10: Oral intubation/tracheostomy | 1.11 (1.07-1.16) | < 0.001 | 1.10 (1.06-1.15) | < 0.001 | 1.04 (1.00-1.09) | 0.049 |
| RALE score | | | | | | |
| Oxygen delivery device in ED | | | | | | |
| Model 14: Room air | 0.96 (0.95-0.97) | < 0.001 | 0.96 (0.95-0.97) | < 0.001 | | |
| Model 15: Nasal cannula | 1.01 (1.00-1.02) | 0.040 | 1.01 (1.00-1.02) | 0.065 | | |
| Model 16: Hudson mask | 0.98 (0.94-1.02) | 0.417 | 0.99 (0.95-1.03) | 0.485 | | |
| Model 17: NRBM | 1.05 (1.04-1.07) | < 0.001 | 1.05 (1.03-1.07) | < 0.001 | | |
| Model 18: HFNC/CPAP/BiPAP | 1.07 (1.04-1.11) | < 0.001 | 1.07 (1.04-1.11) | < 0.001 | | |
| Oxygen delivery device during hospitalization | | | | | | |
| Model 19: Nasal cannula | 1.00 (0.97-1.02) | 0.842 | 1.00 (0.97-1.02) | 0.760 | 1.01 (0.99-1.04) | 0.358 |
| Model 20: Hudson mask | 1.02 (0.98-1.07) | 0.316 | 1.02 (0.98-1.07) | 0.356 | 1.01 (0.96-1.06) | 0.733 |
| Model 21: NRBM | 1.05 (1.04-1.06) | < 0.001 | 1.05 (1.03-1.06) | < 0.001 | 1.04 (1.02-1.05) | < 0.001 |
| Model 22: HFNC/CPAP/BiPAP | 1.04 (1.03-1.06) | < 0.001 | 1.04 (1.02-1.05) | < 0.001 | 1.02 (1.00-1.03) | 0.037 |
| Model 23: Oral intubation/tracheostomy | 1.04 (1.03-1.06) | < 0.001 | 1.04 (1.02-1.05) | < 0.001 | 1.02 (1.00-1.04) | 0.042 |

Reliability of radiographic scores: The reliability analysis demonstrated a distinct pattern between inter- and intra-rater agreement. Inter-rater reliability for both BRIXIA and RALE scores was moderate (κ = 0.57-0.58), indicating notable variability between clinicians. The complete inter-rater statistical results are presented in Table 4, while Figure 2 displays score distributions across raters. Conversely, intra-rater reliability was substantially higher, ranging from substantial (BRIXIA κ = 0.77) to almost perfect (RALE κ = 0.85), demonstrating strong self-consistency over time. These intra-rater results are detailed in Table 5, with corresponding Bland-Altman plots shown in Figure 3.
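For readers unfamiliar with Bland-Altman analysis, the quantities it summarizes can be sketched as below. This is an illustrative sketch with a hypothetical function name and invented example readings, not the study data. It assumes the conventional 95% limits of agreement, defined as the mean difference ± 1.96 standard deviations of the paired differences.

```python
import statistics


def bland_altman(first_read, second_read):
    """Return (bias, lower limit, upper limit) for paired score readings."""
    if len(first_read) != len(second_read) or len(first_read) < 2:
        raise ValueError("need at least two paired readings")
    diffs = [a - b for a, b in zip(first_read, second_read)]
    bias = statistics.mean(diffs)   # mean difference between the two reads
    sd = statistics.stdev(diffs)    # sample SD of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical example: one clinician scoring five radiographs twice.
bias, lower, upper = bland_altman([8, 10, 6, 12, 9], [7, 10, 7, 11, 9])
```

A bias close to zero with narrow limits indicates that a clinician's repeat scores rarely differ by more than a point or two from the first read.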

Figure 2 BRIXIA and Radiographic Assessment of Lung Edema scores. A: Inter-rater distribution of BRIXIA scores; B: Inter-rater distribution of Radiographic Assessment of Lung Edema scores. RALE: Radiographic Assessment of Lung Edema.
Figure 3 BRIXIA and Radiographic Assessment of Lung Edema scores. A: Intra-rater agreement for BRIXIA scores; B: Intra-rater agreement of Radiographic Assessment of Lung Edema scores. RALE: Radiographic Assessment of Lung Edema.
Table 4 Inter-rater agreement coefficients for BRIXIA and Radiographic Assessment of Lung Edema scores (n = 44).

| Weight type | Measure | BRIXIA (95%CI) | RALE (95%CI) | Benchmarking interval | Extent of agreement |
| --- | --- | --- | --- | --- | --- |
| Ordinal weights | Percent agreement | 0.95 (0.94-0.96) | 0.95 (0.94-0.96) | (0.80-1.00) | Almost perfect |
| | Scott/Fleiss’ Kappa | 0.57 (0.46-0.67) | 0.58 (0.46-0.70) | (0.40-0.60) | Moderate |
| | Gwet’s AC | 0.74 (0.69-0.80) | 0.75 (0.70-0.80) | (0.60-0.80) | Substantial |
| Linear weights | Percent agreement | 0.83 (0.81-0.85) | 0.85 (0.84-0.87) | (0.80-1.00) | Almost perfect |
| | Scott/Fleiss’ Kappa | 0.38 (0.03-0.47) | 0.37 (0.27-0.46) | (0.20-0.40) | Fair |
| | Gwet’s AC | 0.54 (0.48-0.59) | 0.54 (0.50-0.59) | (0.40-0.60) | Moderate |
| Quadratic weights | Percent agreement | 0.95 (0.94-0.96) | 0.96 (0.96-0.97) | (0.80-1.00) | Almost perfect |
| | Scott/Fleiss’ Kappa | 0.59 (0.48-0.70) | 0.59 (0.46-0.71) | (0.40-0.60) | Moderate |
| | Gwet’s AC | 0.76 (0.71-0.82) | 0.77 (0.72-0.83) | (0.60-0.80) | Substantial |

Table 5 Intra-rater agreement coefficients for BRIXIA and Radiographic Assessment of Lung Edema scores (n = 220).

| Score | Measure | Agreement coefficient (95%CI) | Benchmarking interval | Extent of agreement |
| --- | --- | --- | --- | --- |
| BRIXIA | Percent agreement | 0.97 (0.96-0.98) | (0.80-1.00) | Almost perfect |
| | Scott/Fleiss’ Kappa | 0.77 (0.71-0.83) | (0.60-0.80) | Substantial |
| | Gwet’s AC | 0.85 (0.81-0.89) | (0.80-1.00) | Almost perfect |
| RALE | Percent agreement | 0.98 (0.97-0.98) | (0.80-1.00) | Almost perfect |
| | Scott/Fleiss’ Kappa | 0.85 (0.81-0.90) | (0.80-1.00) | Almost perfect |
| | Gwet’s AC | 0.88 (0.85-0.91) | (0.80-1.00) | Almost perfect |

DISCUSSION

This study investigated the utility of two radiographic severity scores (RALE and BRIXIA) in quantifying pulmonary infiltrates on radiographs taken within 24 hours of admission for predicting oxygen support requirements in patients with COVID-19. The findings suggest that these scores are moderately predictive of patients requiring high levels of oxygen support in the ED and, to a lesser extent, in the subsequent hospital admission. Higher scores on either scoring system reflected increasingly dense and widespread infiltrates, which predicted higher levels of oxygen requirements. However, lower levels of infiltrate density did not reliably predict lower levels of oxygen support, especially during admission, and radiographs would not help identify patients suitable for transfer to facilities with less oxygen support.

The predictive accuracy of both scores, as measured by the AUC, increased with the intensity of required respiratory support. The AUC was modest for lower levels of support (BRIXIA AUC: 0.56; RALE AUC: 0.54) but notably higher for predicting the need for HFNC, CPAP, or BiPAP, with an identical AUC of 0.74. This indicates these scores are particularly valuable for identifying patients who will require advanced respiratory interventions rather than distinguishing between mild disease states. Additionally, a notable finding is the BRIXIA score’s high specificity (0.84) at its optimal cut-off of 12 for predicting HFNC/CPAP/BiPAP, suggesting a score below 12 can help exclude the immediate need for high-level support. The predictive power of both systems decreased during hospitalization (e.g., BRIXIA AUC dropped to 0.62), likely because clinical deterioration in inpatients is influenced by factors beyond the initial lung injury, such as secondary infections and comorbidities. Logistic regression confirmed a significant association between higher radiographic scores and the need for intensive respiratory support. Each one-point increase in the BRIXIA and RALE scores was associated with 27% and 7% higher odds of requiring HFNC/CPAP/BiPAP in the ED, respectively. This association held for patients requiring NRBMs and mechanical ventilation, confirming these scores accurately reflect the severity of pulmonary disease at a granular level.
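Because the two scores have different ranges (BRIXIA 0-18, RALE 0-48), per-point odds ratios are not directly comparable. A small sketch of how a per-point odds ratio compounds over several points may help. It assumes the log-linear relationship implied by the logistic model holds across the score range, which the study did not test explicitly. The function name is hypothetical.

```python
def odds_multiplier(or_per_point: float, points: int) -> float:
    """Multiplicative change in odds for a given score increase.

    Assumes a constant odds ratio per point (log-linear logistic model),
    so a k-point increase multiplies the odds by OR ** k.
    """
    if or_per_point <= 0:
        raise ValueError("an odds ratio must be positive")
    return or_per_point ** points


# Crude ED odds ratios for HFNC/CPAP/BiPAP reported in Table 3.
brixia_five_points = odds_multiplier(1.27, 5)  # roughly a 3.3-fold increase
rale_five_points = odds_multiplier(1.07, 5)    # roughly a 1.4-fold increase
```

These figures describe odds, not probabilities, and they inherit the uncertainty of the underlying confidence intervals.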

The findings on the predictive value of CXR scores are strongly supported by existing literature. A study by Toussie et al[22] confirmed that a CXR severity score at initial ED presentation was a powerful predictor of critical outcomes, with a score of ≥ 2 associated with 6.2 times higher odds of hospitalization and a score of ≥ 3 among admitted patients associated with 4.7 times higher odds of intubation. Furthermore, the prognostic value of the specific scoring system we employed is well-established. The original BRIXIA score validation study by Borghesi et al[16] demonstrated that it was a significant independent predictor of in-hospital mortality, with each point increase associated with a 27% increase in the odds of death. This is further reinforced by recent research from Shaima et al[23], which validated specific CXR score cut-offs, finding that a score of ≥ 3 was an independent predictor of severe/critical disease and a score of ≥ 5 was a powerful predictor of in-hospital mortality.

The high inter-rater percent agreement (95%) achieved in this study cannot be disentangled from the standardized 3-hour training program undertaken by the clinicians. This finding suggests that with a structured, concise training intervention, frontline clinicians across specialities (emergency medicine, critical care, and radiology) can achieve a strong consensus, making these scores viable for rapid implementation in clinical practice. The high intra-rater agreement (97%-98%) further indicates that once trained, clinicians can apply these systems with high self-consistency, making them reliable for both initial triage and serial monitoring.

This study demonstrates that CXR-based scoring provides a rapid, pragmatic method for predicting a specific, pressing outcome in the ED: Oxygen requirement. This approach stands in contrast to other powerful but more resource-intensive approaches. For instance, Yucal et al[24] introduced the Clopidogrel for High Atherothrombotic Risk and Ischemic Stabilization, Management, and Avoidance score, which integrates CT imaging and the biomarker surfactant protein D for mortality prediction, representing a comprehensive but less readily available prognostic model. Their study found a strong correlation between serum surfactant protein D levels and the quantitative volume of pulmonary infiltration on CT (r = 0.960), and the Clopidogrel for High Atherothrombotic Risk and Ischemic Stabilization, Management, and Avoidance score demonstrated excellent predictive power for in-hospital mortality (AUC: 0.977). Similarly, purely clinical scores like the Colonoscopy Progression Score[25] effectively predict survival, demonstrating the prognostic value of synthesizing clinical and laboratory data. In this context, the primary advantage of RALE and BRIXIA scoring lies in its balance of speed, availability, and objective quantification. Unlike CT-based models or specialized biomarker assays, it uses a universally available, standard-of-care test to provide immediate, low-cost risk stratification. While not directly comparable in predicting long-term outcomes like mortality, its performance in predicting immediate respiratory support needs makes it a useful tool for frontline clinical decision-making.

The study is limited by its single-center, retrospective design and exclusion of chronic oxygen-dependent patients. Lack of standardized radiograph positioning and incomplete documentation of symptom onset and clinical severity may have introduced measurement bias. Mortality data were unavailable, and only age and the initial ED oxygen device were considered as predictors, omitting other potentially relevant variables. Future multicenter studies with standardized imaging, comprehensive outcomes including mortality, and integration of scoring systems with clinical and biomarker data are recommended.

CONCLUSION

This study validates the clinical utility of both RALE and BRIXIA scoring systems as prognostic tools for COVID-19 patients in the ED. Both scores showed excellent inter-rater and intra-rater reliability and effectively predicted the need for advanced respiratory support, with higher scores strongly correlating with increased oxygen support requirements. The BRIXIA score showed particular value in ruling out the need for high-level support due to its high specificity. However, the predictive power of both systems diminished during hospitalization. For emergency practice, either scoring system can provide reliable and valuable prognostic information, and the choice between them can be guided by local preference and workflow considerations. Future research could explore the integration of these radiographic scores with clinical biomarkers and severity scales to enhance predictive accuracy across the care continuum.

ACKNOWLEDGEMENTS

The authors would like to express their deepest gratitude to all frontline workers in Qatar who dedicated their time, expertise, and unwavering commitment during the COVID-19 pandemic. Their courage, resilience, and selfless service played a vital role in protecting the community and sustaining essential services throughout this challenging period.

References
1. Zhou P, Yang XL, Wang XG, Hu B, Zhang L, Zhang W, Si HR, Zhu Y, Li B, Huang CL, Chen HD, Chen J, Luo Y, Guo H, Jiang RD, Liu MQ, Chen Y, Shen XR, Wang X, Zheng XS, Zhao K, Chen QJ, Deng F, Liu LL, Yan B, Zhan FX, Wang YY, Xiao GF, Shi ZL. A pneumonia outbreak associated with a new coronavirus of probable bat origin. Nature. 2020;579:270-273.
2. Zhu N, Zhang D, Wang W, Li X, Yang B, Song J, Zhao X, Huang B, Shi W, Lu R, Niu P, Zhan F, Ma X, Wang D, Xu W, Wu G, Gao GF, Tan W; China Novel Coronavirus Investigating and Research Team. A Novel Coronavirus from Patients with Pneumonia in China, 2019. N Engl J Med. 2020;382:727-733.
3. World Health Organization. Listings of WHO’s response to COVID-19 2020. [cited 3 August 2025]. Available from: https://www.who.int/news/item/29-06-2020-covidtimeline.
4. Eastin C, Eastin T. Clinical Characteristics of Coronavirus Disease 2019 in China. J Emerg Med. 2020;58:711-712.
5. Palazzuoli A, Ruberto F, De Ferrari GM, Forleo G, Secco GG, Ruocco GM, D'Ascenzo F, Mojoli F, Monticone S, Paggi A, Vicenzi M, Corcione S, Palazzo AG, Landolina M, Taravelli E, Tavazzi G, Blasi F, Mancone M, Birtolo LI, Alessandri F, Infusino F, Pugliese F, Fedele F, De Rosa FG, Emmett M, Schussler JM, McCullough PA, Tecson KM. Inpatient Mortality According to Level of Respiratory Support Received for Severe Acute Respiratory Syndrome Coronavirus 2 (Coronavirus Disease 2019) Infection: A Prospective Multicenter Study. Crit Care Explor. 2020;2:e0220.
6. Zu ZY, Jiang MD, Xu PP, Chen W, Ni QQ, Lu GM, Zhang LJ. Coronavirus Disease 2019 (COVID-19): A Perspective from China. Radiology. 2020;296:E15-E25.
7. Li Y, Xia L. Coronavirus Disease 2019 (COVID-19): Role of Chest CT in Diagnosis and Management. AJR Am J Roentgenol. 2020;214:1280-1286.
8. Bernheim A, Mei X, Huang M, Yang Y, Fayad ZA, Zhang N, Diao K, Lin B, Zhu X, Li K, Li S, Shan H, Jacobi A, Chung M. Chest CT Findings in Coronavirus Disease-19 (COVID-19): Relationship to Duration of Infection. Radiology. 2020;295:200463.
9. Pan F, Ye T, Sun P, Gui S, Liang B, Li L, Zheng D, Wang J, Hesketh RL, Yang L, Zheng C. Time Course of Lung Changes at Chest CT during Recovery from Coronavirus Disease 2019 (COVID-19). Radiology. 2020;295:715-721.
10. Wei J, Xu H, Xiong J, Shen Q, Fan B, Ye C, Dong W, Hu F. 2019 Novel Coronavirus (COVID-19) Pneumonia: Serial Computed Tomography Findings. Korean J Radiol. 2020;21:501-504.
11. Jackson CD, Burroughs-Ray DC, Summers NA. Clinical Guideline Highlights for the Hospitalist: 2019 American Thoracic Society/Infectious Diseases Society of America Update on Community-Acquired Pneumonia. J Hosp Med. 2020;15:743-745.
12. Liapikou A, Cillóniz C, Gabarrús A, Amaro R, De la Bellacasa JP, Mensa J, Sánchez M, Niederman M, Torres A. Multilobar bilateral and unilateral chest radiograph involvement: implications for prognosis in hospitalised community-acquired pneumonia. Eur Respir J. 2016;48:257-261.
13. Claessens YE, Debray MP, Tubach F, Brun AL, Rammaert B, Hausfater P, Naccache JM, Ray P, Choquet C, Carette MF, Mayaud C, Leport C, Duval X. Early Chest Computed Tomography Scan to Assist Diagnosis and Guide Treatment Decision for Suspected Community-acquired Pneumonia. Am J Respir Crit Care Med. 2015;192:974-982.
14. Rees CA, Basnet S, Gentile A, Gessner BD, Kartasasmita CB, Lucero M, Martinez L, O'Grady KF, Ruvinsky RO, Turner C, Campbell H, Nair H, Falconer J, Williams LJ, Horne M, Strand T, Nisar YB, Qazi SA, Neuman MI; World Health Organization PREPARE study group. An analysis of clinical predictive values for radiographic pneumonia in children. BMJ Glob Health. 2020;5:e002708.
15. Warren MA, Zhao Z, Koyama T, Bastarache JA, Shaver CM, Semler MW, Rice TW, Matthay MA, Calfee CS, Ware LB. Severity scoring of lung oedema on the chest radiograph is associated with clinical outcomes in ARDS. Thorax. 2018;73:840-846.
16. Borghesi A, Zigliani A, Golemi S, Carapella N, Maculotti P, Farina D, Maroldi R. Chest X-ray severity index as a predictor of in-hospital mortality in coronavirus disease 2019: A study of 302 patients from Italy. Int J Infect Dis. 2020;96:291-293.
17. DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics. 1988;44:837-845.
18. Liu X. Classification accuracy and cut point selection. Stat Med. 2012;31:2676-2686.
19. Fleiss JL. Measuring nominal scale agreement among many raters. Psychol Bull. 1971;76:378-382.
20. Gwet KL. Handbook of Inter-Rater Reliability: The Definitive Guide to Measuring the Extent of Agreement among Raters. 4th ed. Gaithersburg, MD: Advanced Analytics, LLC, 2014: 104-112.
21. Bland JM, Altman DG. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet. 1986;1:307-310.
22. Toussie D, Voutsinas N, Finkelstein M, Cedillo MA, Manna S, Maron SZ, Jacobi A, Chung M, Bernheim A, Eber C, Concepcion J, Fayad ZA, Gupta YS. Clinical and Chest Radiography Features Determine Patient Outcomes in Young and Middle-aged Adults with COVID-19. Radiology. 2020;297:E197-E206.
23. Nahar Shaima S, Haque MA, Sarmin M, Nuzhat S, Jahan Y, Bushra Matin F, Shahrin L, Afroze F, Saha H, Timu RT, Kamal M, Shahid ASMSB, Sultana N, Mamun GMS, Chisti MJ, Ahmed T. Performance of chest X-ray scoring in predicting disease severity and outcomes of patients hospitalised with COVID-19 in Bangladesh. SAGE Open Med. 2024;12:20503121231222325.
24. Yucal A, Burak Sayhan M, Salt Ö, Dıbırdık İ, Çalın S. Novel tools for evaluating COVID-19 at the emergency department: Surfactant protein D level and CHARISMA score. Heliyon. 2024;10:e39976.
25. Cho SY, Park SS, Song MK, Bae YY, Lee DG, Kim DW. Prognosis Score System to Predict Survival for COVID-19 Cases: a Korean Nationwide Cohort Study. J Med Internet Res. 2021;23:e26257.
Footnotes

Provenance and peer review: Unsolicited article; Externally peer reviewed.

Peer-review model: Single blind

Specialty type: Critical care medicine

Country of origin: United Arab Emirates

Peer-review report’s classification

Scientific Quality: Grade A, Grade B

Novelty: Grade B, Grade B

Creativity or Innovation: Grade B, Grade B

Scientific Significance: Grade B, Grade B

Open Access: This article is an open-access article that was selected by an in-house editor and fully peer-reviewed by external reviewers. It is distributed in accordance with the Creative Commons Attribution NonCommercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: https://creativecommons.org/Licenses/by-nc/4.0/

P-Reviewer: Yucal A, MD, Post Doctoral Researcher, Türkiye; S-Editor: Bai SR; L-Editor: A; P-Editor: Wang WB