Published online Jun 18, 2026. doi: 10.13105/wjma.v14.i2.121918
Revised: April 21, 2026
Accepted: May 13, 2026
Processing time: 69 Days and 8.2 Hours
Heart failure with preserved ejection fraction (HFpEF) is a growing global health burden with historically limited disease-modifying therapies. Patient-reported outcomes (PROs) and health-related quality of life (QoL) are central therapeutic targets, particularly in obesity-related HFpEF where symptom burden is substan
To systematically review and quantitatively pool the evidence on PRO measure
Following PRISMA 2020 guidelines, MEDLINE, EMBASE, the Cochrane Central Register of Controlled Trials, and ClinicalTrials.gov were searched from inception through January 2026. Eligible studies included randomised controlled trials (RCTs) and prospective cohort studies enrolling adults with HFpEF receiving semaglutide that reported validated QoL or PRO instruments. Risk of bias was assessed using the Cochrane Risk of Bias 2 tool for RCTs and the ROBINS-I framework for the cohort study. A fixed-effects meta-analysis using inverse-variance weighting was performed for the two design-homogeneous RCTs (identical dose, endpoint, and follow-up). A qualitative synthesis additionally appraised the methodological comparability of PRO measurement across all included studies.
Three studies met inclusion criteria: Two double-blind, placebo-controlled, multicentre RCTs [semaglutide treatment effect in people with obesity and HFpEF (STEP-HFpEF), n = 529; STEP-HFpEF and diabetes mellitus (STEP-HFpEF DM), n = 616] and one prospective propensity score-matched cohort study (n = 406 after matching). Fixed-effects meta-analysis of the two RCTs yielded a pooled KCCQ-Clinical Summary Score (CSS) treatment effect of 7.36 points (95%CI: 5.32-9.40; P < 0.001; I² = 0%), exceeding the 5-point minimal clinically important difference (MCID). The open-label observational cohort reported a 21-point absolute KCCQ-Total Symptom Score improvement (approximately 14 points greater than in matched controls), larger than the RCT estimates; this difference is most plausibly attributable to expectation bias, residual confounding, subscale non-equivalence, and dose heterogeneity. Methodological evaluation revealed important gaps in KCCQ psychometric reporting, missing data handling, MCID justification, and standardisation of PRO data collection procedures. Generalisability is constrained by exclusive enrolment of obesity-related HFpEF phenotypes, predominantly White Western populations, and industry-sponsored trial designs.
Semaglutide produces consistent, clinically meaningful PRO improvements in obesity-related HFpEF. The pooled RCT effect of 7.36 KCCQ-CSS points constitutes the most methodologically reliable estimate of pharmacologic benefit. Significant methodological gaps remain across PRO instrument standardisation, psychometric reporting, population diversity, and prospective trial registration, which future HFpEF trials must address.
Core Tip: This systematic review demonstrates that semaglutide improves patient-reported quality of life in obesity-related heart failure with preserved ejection fraction, with blinded randomised trials showing approximately 7-8 Kansas City Cardiomyopathy Questionnaire points of benefit over placebo. The larger effect in the open-label observational cohort (approximately 14 points between groups, about twice the pooled randomised estimate) most plausibly reflects expectation bias and residual confounding rather than genuine superiority. Methodological heterogeneity across instruments, doses, and study designs limits cross-study comparisons and highlights the urgent need for standardised patient-reported outcome sets and prospective trial registration in this field.
- Citation: Abdulaal R, Khalil LM, Hteit A, Al Mashtoub E, Allaw M, Hajj L, Taki A, Tlais M. Patient-reported outcome assessment in heart failure with preserved ejection fraction: A systematic review of semaglutide trials using the Kansas City Cardiomyopathy Questionnaire. World J Meta-Anal 2026; 14(2): 121918
- URL: https://www.wjgnet.com/2308-3840/full/v14/i2/121918.htm
- DOI: https://dx.doi.org/10.13105/wjma.v14.i2.121918
Heart failure with preserved ejection fraction (HFpEF) accounts for approximately half of all heart failure (HF) cases globally and is increasing in prevalence as populations age alongside rising rates of obesity, type 2 diabetes (T2D), and hypertension[1]. Despite its growing burden, HFpEF has historically lacked effective disease-modifying therapies, and prognosis in terms of morbidity and quality of life (QoL) is often comparable to, or worse than, HF with reduced ejection fraction[1]. Symptom relief, functional improvement, and enhancement of health-related QoL have therefore become central therapeutic goals.
Obesity defines a clinically distinct and highly prevalent HFpEF phenotype. Epidemiologic and mechanistic evidence suggests that excess adiposity contributes to expanded blood volume, increased cardiac output, systemic inflammation, and abnormal ventricular–arterial coupling[2]. Patients with obesity-related HFpEF frequently experience marked exercise intolerance, elevated filling pressures during exertion, and a high symptom burden despite preserved left ventricular systolic function[2]. These patients report substantial limitations in daily activities and diminished well-being, rendering patient-reported outcomes (PROs) particularly relevant as endpoints.
Semaglutide, a once-weekly GLP-1 receptor agonist, produces approximately 15% weight loss in adults with obesity[3] alongside favourable metabolic, haemodynamic, and anti-inflammatory effects. The convergence of an obesity-driven HFpEF phenotype with semaglutide’s mechanism of action has prompted dedicated clinical trials examining its effects on PROs and QoL in HFpEF populations. The Kansas City Cardiomyopathy Questionnaire (KCCQ) is the most widely used disease-specific PRO instrument in HFpEF trials, capturing symptoms, physical limitations, social function, and perceived QoL across several subscales and summary scores.
Despite the availability of individual trial reports, a rigorous methodological appraisal comparing how PROs are measured, operationalised, and interpreted across semaglutide studies in HFpEF has not been published. Key methodological questions remain unanswered: Are different KCCQ subscales [Clinical Summary Score (CSS) vs Total Symptom Score (TSS)] capturing equivalent constructs? What explains the large discrepancy in effect sizes between blinded randomised controlled trial (RCT) and open-label observational data? How do differences in semaglutide dose, follow-up duration, and study design affect the validity and comparability of reported PRO findings? Addressing these questions is essential for correctly interpreting the evidence base and for designing future trials. Furthermore, the feasibility and appropriateness of quantitatively pooling the two design-homogeneous RCTs, a question not addressed in prior methodological appraisals, requires formal evaluation.
The primary aim of this systematic review and meta-analysis is to synthesise and quantitatively pool the evidence on semaglutide’s impact on health-related QoL and PROs in adults with HFpEF. A secondary, methodologically focused aim is to critically appraise the comparability of PRO measurement approaches across included studies [including instrument psychometric properties, KCCQ subscale selection, blinding status, missing data handling, rationale for the minimal clinically important difference (MCID), standardised data collection, and study design] and to examine how these methodological factors influence observed effect sizes and their clinical interpretation.
This systematic review and meta-analysis followed PRISMA 2020 guidelines and used the KCCQ as the primary PRO instrument of interest. Risk of bias was assessed using Risk of Bias 2 (RoB 2) (for RCTs) and ROBINS-I (for the observational cohort). A fixed-effects meta-analysis using inverse-variance weighting was performed for the two RCTs, which were judged to be sufficiently homogeneous to permit quantitative pooling. A supplementary qualitative synthesis appraised methodological heterogeneity across all included studies.
This systematic review and meta-analysis was conducted and reported in accordance with the PRISMA 2020 statement (Figure 1). The review question, eligibility criteria, data extraction framework, and statistical analysis plan were specified a priori in a written protocol document finalised before the systematic search was initiated. The protocol was not prospectively registered with PROSPERO or an equivalent publicly accessible registry prior to data collection, an acknowledged methodological limitation detailed in the Limitations section. To enhance transparency, the protocol document is available from the corresponding author upon request, and the a priori written specification of all planned analyses, including the planned fixed-effects meta-analysis for design-homogeneous RCTs, is documented therein. Prospective registration is strongly recommended for all future updates of this review.
We searched MEDLINE (via PubMed), EMBASE, and the Cochrane Central Register of Controlled Trials, and screened ClinicalTrials.gov from database inception to January 2026. The full MEDLINE search strategy is provided in the Supplementary material. Search terms combined controlled vocabulary (MeSH) and free-text keywords relating to HFpEF, semaglutide, and QoL/PROs. Filters for adult human studies, English language, and peer-reviewed articles were applied. Reference lists of included studies and relevant systematic reviews were manually screened for additional records.
Studies were eligible if they fulfilled all of the following Population, Intervention, Comparison, Outcomes and Study criteria.
Population: Adults (≥ 18 years) with HFpEF, defined as left ventricular ejection fraction (LVEF) ≥ 50% or by contemporary guideline-based criteria. Mixed HF populations were eligible only if HFpEF-specific data were clearly separable.
Intervention: Semaglutide at any dose or formulation, alone or added to standard HFpEF therapy, as the primary pharmacologic intervention.
Comparator: Placebo, standard of care, another active treatment, or baseline values.
Outcomes: At least one validated QoL or PRO measure, including but not limited to the KCCQ (any subscale or summary score), Minnesota Living with Heart Failure Questionnaire, 36-Item Short Form Health Survey (SF-36), and EuroQoL-5-dimension (EQ-5D).
Study design: RCTs, non-randomised controlled trials, or prospective cohort studies published as peer-reviewed, full-text articles in English.
Exclusion criteria: Case reports and case series with fewer than 10 patients; cross-sectional studies; narrative or systematic reviews; editorials; conference abstracts without full published data; and studies in which HFpEF results could not be distinguished from other HF phenotypes or where semaglutide was not the primary intervention.
Two reviewers independently screened titles and abstracts. Full texts were retrieved and assessed against eligibility criteria in duplicate. Disagreements were resolved by discussion and consensus. Companion publications (e.g., prespecified subgroup analyses) were linked to their parent RCTs and used to supplement data but were not counted as separate primary studies. Standardised data extraction forms captured: Study design, setting, and eligibility criteria; HFpEF definition and LVEF threshold; sample size; baseline characteristics; semaglutide regimen; PRO instrument(s), subscale, and MCID threshold applied; all reported QoL/PRO outcomes with 95%CIs and P values; and key secondary outcomes.
Risk of bias in RCTs was assessed using the Cochrane RoB 2 tool across five domains. For the prospective cohort study, the ROBINS-I framework was applied across seven domains. Discrepancies were resolved by consensus.
Beyond standard risk of bias assessment, a dedicated methodological appraisal of PRO measurement quality was conducted across all included studies. This appraisal addressed the following pre-specified domains:
Instrument selection and psychometric properties: KCCQ subscale used, its validation status, internal consistency (Cronbach’s α), test-retest reliability, construct validity, and responsiveness to clinical change as documented in the HFpEF literature.
MCID rationale: The basis for the MCID threshold applied in each study, whether it was pre-specified, and whether the derivation method (anchor-based vs distribution-based) was reported.
Missing data handling: The statistical approach to missing PRO data (e.g., multiple imputation, last-observation-carried-forward, mixed-effects model for repeated measures), and whether sensitivity analyses were performed.
Blinding status and expectation bias: Open-label vs double-blind PRO collection and the implications for performance and detection bias.
PRO data collection standardisation: Timing of assessments, mode of administration (paper vs electronic), and whether standardised procedures for PRO collection were documented.
Dose comparability: Semaglutide dose and titration schedule across studies.
Follow-up duration adequacy: Whether follow-up was sufficient to capture sustained PRO change.
A fixed-effects meta-analysis using inverse-variance weighting was performed for the two RCTs [semaglutide treatment effect in people with obesity and HFpEF (STEP-HFpEF) and STEP-HFpEF and diabetes mellitus (STEP-HFpEF DM)], which shared identical study design (double-blind, placebo-controlled RCT), identical semaglutide dose (2.4 mg subcutaneous weekly), identical primary PRO endpoint (KCCQ-CSS), and identical follow-up duration (52 weeks). These design features justified quantitative pooling under a fixed-effects framework, on the assumption that both trials estimated the same underlying treatment effect. The between-group mean difference in KCCQ-CSS and its 95%CI were extracted from each trial. Statistical heterogeneity was quantified using the I² statistic and Cochran’s Q test. The observational cohort study was excluded from formal pooling due to its fundamentally different design (open-label, non-randomised), PRO subscale (KCCQ-TSS vs KCCQ-CSS), semaglutide dose (0.5-1.0 mg/week), and follow-up duration (24 months). All analyses were performed using standard inverse-variance fixed-effects formulae.
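For reference, the standard inverse-variance fixed-effects formulae can be written as below. The notation is introduced here for illustration only: $\hat{\theta}_i$ is the between-group mean difference in trial $i$, $L_i$ and $U_i$ are its reported 95%CI limits, and $k$ is the number of trials. Back-calculating each standard error from the CI width is an assumption about how the extraction was done, not a statement taken from the trial reports.

```latex
\begin{align}
SE_i &= \frac{U_i - L_i}{2 \times 1.96}, &
w_i &= \frac{1}{SE_i^{2}} \\
\hat{\theta}_{\mathrm{pooled}} &= \frac{\sum_{i=1}^{k} w_i \hat{\theta}_i}{\sum_{i=1}^{k} w_i}, &
SE_{\mathrm{pooled}} &= \frac{1}{\sqrt{\sum_{i=1}^{k} w_i}} \\
Q &= \sum_{i=1}^{k} w_i \left(\hat{\theta}_i - \hat{\theta}_{\mathrm{pooled}}\right)^{2}, &
I^{2} &= \max\!\left(0,\; \frac{Q - (k - 1)}{Q}\right) \times 100\%
\end{align}
```

With $k = 2$, $Q$ is referred to a chi-square distribution with one degree of freedom.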
The systematic search across all databases yielded 185 records before duplicate removal. After removal of 42 duplicates and sequential screening, three primary studies met the eligibility criteria: Two RCTs (STEP-HFpEF[4] and STEP-HFpEF DM[5]) and one prospective cohort study[6]. Seven companion publications reporting prespecified subgroup and pooled analyses from the same randomised populations were identified[7-10] and used for supplemental data; these were not counted as primary studies. The PRISMA 2020 flow diagram is presented in Figure 1.
RCT: Both RCTs were global, multicentre, double-blind, placebo-controlled studies evaluating once-weekly subcutaneous semaglutide 2.4 mg for 52 weeks in adults with HFpEF and obesity. STEP-HFpEF enrolled 529 patients with HFpEF [LVEF ≥ 45%, New York Heart Association class II–III, body mass index (BMI) ≥ 30 kg/m²] without diabetes[4], while STEP-HFpEF DM enrolled 616 patients with HFpEF and T2D[5]. Key baseline characteristics were similar: Mean age approximately 69 years, 45%-56% women, mean LVEF 55%-57%, mean BMI approximately 37 kg/m², and mean KCCQ-CSS approximately 60 points, indicating markedly impaired baseline health status. The dual co-primary endpoints were change from baseline in KCCQ-CSS and body weight at 52 weeks[8]. Background guideline-directed HFpEF therapy was continued and balanced between groups.
Observational cohort study: Pérez-Velasco et al[6] reported a prospective, multicentre cohort study from Spain in 632 adults with HFpEF (LVEF ≥ 50%), obesity, and T2D. Semaglutide (0.5-1.0 mg/week) was initiated in 358 patients; 274 patients not receiving any GLP-1 receptor agonist served as controls. After 1:1 propensity score matching, 203 patients remained in each group (mean age approximately 76 years, 59% female, mean BMI approximately 35 kg/m², mean KCCQ-TSS approximately 54). Follow-up was 24 months. Key study characteristics are summarised in Table 1.
| Ref. | Study design | n (semaglutide/comparator) | Semaglutide dose | Follow-up | Mean age (years) | Women (%) | Mean BMI (kg/m²) | Diabetes at baseline | Mean LVEF (%) | Primary PRO endpoint | Baseline KCCQ score | KCCQ treatment effect | Pooled RCT effect (fixed-effects) | Blinding status | Risk of bias (tool) | Population | Sponsorship |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Kosiborod et al[4] | Double-blind RCT | 263/266 | 2.4 mg SC weekly | 52 weeks | 69 | 56 | 37 | Excluded | 57 | KCCQ-CSS | CSS 60 pts | +7.8 pts (95%CI: 4.8-10.9) | 7.36 pts (95%CI: 5.32-9.40; I² = 0%) | Double-blind | Low (RoB 2) | White, Western, BMI ≥ 30 | Industry (Novo Nordisk) |
| Kosiborod et al[5] | Double-blind RCT | 310/306 | 2.4 mg SC weekly | 52 weeks | 69 | 45 | 37 | Required (T2D) | 55 | KCCQ-CSS | CSS 60 pts | +7.0 pts (95%CI: 4.3-9.8) | 7.36 pts (95%CI: 5.32-9.40; I² = 0%) | Double-blind | Low (RoB 2) | White, Western, BMI ≥ 30 | Industry (Novo Nordisk) |
| Pérez-Velasco et al[6] | Prospective cohort (PSM) | 203/203 | 0.5-1.0 mg SC weekly | 24 months | 76 | 59 | 35 | Required (T2D) | ≥ 50 | KCCQ-TSS | TSS 54 pts | +14 pts between-group | NA (excluded from pooling) | Open-label | Moderate-to-high (ROBINS-I) | Spanish, BMI ≥ 30, T2D | Not explicitly reported |
Given the design homogeneity of STEP-HFpEF and STEP-HFpEF DM (identical dose, blinding, primary endpoint, and follow-up duration), a fixed-effects meta-analysis was performed to derive a pooled KCCQ-CSS treatment effect estimate. The individual trial estimates were: STEP-HFpEF, +7.8 points (95%CI: 4.8-10.9; SE = 1.56); and STEP-HFpEF DM, +7.0 points (95%CI: 4.3-9.8; SE = 1.40). Using inverse-variance weighting, the pooled fixed-effects estimate was 7.36 KCCQ-CSS points (95%CI: 5.32-9.40; P < 0.001). No statistical heterogeneity was detected (I² = 0%; Cochran’s Q = 0.15; P = 0.70), which is consistent with the fixed-effects assumption and with the close agreement in design and effect size between the two trials, although heterogeneity tests have limited power when only two studies are pooled. The pooled estimate exceeds the widely accepted 5-point MCID for the KCCQ-CSS and falls within the range associated with moderate clinical improvement. Weight contributions were approximately 45% (STEP-HFpEF) and 55% (STEP-HFpEF DM), reflecting the slightly larger sample size and correspondingly smaller standard error of the latter trial (Figure 2).
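The pooled figures above can be checked with a short calculation. The following is a minimal sketch, not the authors' analysis script: it assumes each standard error is back-calculated from the reported 95%CI, and the function and variable names (`fixed_effect_pool`, `trials`) are hypothetical.

```python
import math

# Reported between-group differences in KCCQ-CSS (points) with 95% CI limits.
trials = {
    "STEP-HFpEF": (7.8, 4.8, 10.9),
    "STEP-HFpEF DM": (7.0, 4.3, 9.8),
}

Z = 1.96  # normal quantile for a two-sided 95% CI


def fixed_effect_pool(data):
    """Inverse-variance fixed-effects pooling of mean differences."""
    estimates, weights = {}, {}
    for name, (md, low, high) in data.items():
        se = (high - low) / (2 * Z)      # assumption: SE recovered from CI width
        estimates[name] = md
        weights[name] = 1.0 / se ** 2    # inverse-variance weight
    w_sum = sum(weights.values())
    pooled = sum(weights[n] * estimates[n] for n in data) / w_sum
    se_pooled = math.sqrt(1.0 / w_sum)
    q = sum(weights[n] * (estimates[n] - pooled) ** 2 for n in data)
    df = len(data) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    # With df = 1 the chi-square tail probability equals erfc(sqrt(Q / 2)).
    p_q = math.erfc(math.sqrt(q / 2)) if df == 1 else float("nan")
    return {
        "pooled": pooled,
        "ci": (pooled - Z * se_pooled, pooled + Z * se_pooled),
        "Q": q,
        "I2": i2,
        "p_Q": p_q,
        "weights_pct": {n: 100 * w / w_sum for n, w in weights.items()},
    }


result = fixed_effect_pool(trials)
print(f"pooled = {result['pooled']:.2f}, "
      f"95% CI = {result['ci'][0]:.2f} to {result['ci'][1]:.2f}")
print(f"Q = {result['Q']:.2f}, I2 = {result['I2']:.0f}%, p = {result['p_Q']:.2f}")
print({n: round(w) for n, w in result["weights_pct"].items()})
```

Under these assumptions the sketch returns values that round to the reported pooled estimate, confidence interval, Q statistic, and weights. The closed-form p-value shortcut applies only to the two-trial case.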
KCCQ psychometric properties and instrument validity: The KCCQ is a 23-item, self-administered, disease-specific health status instrument developed and validated specifically for patients with HF. Published psychometric studies have demonstrated high internal consistency (Cronbach’s α ≥ 0.87 across subscales), acceptable test-retest reliability (intraclass correlation coefficients 0.73-0.90), and strong convergent validity with functional class, exercise capacity [six-minute walk distance (6MWD)], and echocardiographic parameters. Responsiveness to clinical change is well-established: KCCQ-CSS scores track New York Heart Association functional class transitions, hospitalisation events, and mortality with discriminative ability superior to generic QoL instruments in HF populations. Crucially, both RCTs employed the KCCQ-CSS as the primary PRO endpoint, a selection that is methodologically defensible because it captures both symptom burden (frequency and severity) and physical function limitation. However, neither the STEP-HFpEF nor STEP-HFpEF DM publication provided a formal citation to the KCCQ validation literature or an explicit justification for CSS over Overall Summary Score (OSS), which includes the social function and QoL subscales. Future trials should explicitly document the psychometric rationale for subscale selection in the methods section, with reference to validation studies conducted specifically in HFpEF populations.
The 5-point MCID for KCCQ-CSS is consistently applied across the included studies as the threshold for clinical meaningfulness, and all placebo-corrected treatment effects in the RCTs exceed this threshold. However, there is methodological heterogeneity in how the MCID was derived and applied. The 5-point threshold was originally derived using anchor-based and distribution-based methods in broader HF populations; its direct applicability to obesity-related HFpEF, a phenotypically distinct subgroup with specific baseline symptom profiles and comorbidity burden, has not been formally validated. Distribution-based methods such as 0.5 SD of baseline scores would yield MCID estimates of approximately 7.5-9 points given a baseline KCCQ-CSS SD of approximately 15-18 points (mean approximately 60 points), which is somewhat higher than the anchor-based 5-point threshold. Neither trial reported which specific method was used to establish the MCID or conducted sensitivity analyses with alternative thresholds. The observational cohort study applied the same 5-point threshold without adjustment for the KCCQ-TSS subscale, for which separate MCID estimates in HFpEF populations have not been published. Future trial protocols should prespecify the MCID with explicit anchor-based derivation in the target population, and sensitivity analyses should evaluate results across a range of MCID thresholds (e.g., 5, 7, and 10 points).
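A threshold sensitivity check of the kind recommended here can be sketched in a few lines. This is an illustrative calculation only: it applies the 0.5 SD convention to the baseline SD range of 15-18 points quoted above and compares the pooled estimate and its 95%CI from this review against the 5-, 7-, and 10-point thresholds. The helper names (`half_sd_mcid`, `classify`) are hypothetical.

```python
# Pooled KCCQ-CSS estimate and 95% CI as reported in this review (points).
POOLED, CI_LOW, CI_HIGH = 7.36, 5.32, 9.40


def half_sd_mcid(baseline_sd):
    """Distribution-based MCID using the 0.5 SD convention."""
    return 0.5 * baseline_sd


def classify(threshold):
    """Describe where the pooled effect and its CI sit relative to a threshold."""
    if CI_LOW > threshold:
        return "CI entirely above threshold"
    if POOLED > threshold:
        return "point estimate above threshold, CI includes it"
    if CI_HIGH > threshold:
        return "point estimate at or below threshold, CI includes it"
    return "CI entirely below threshold"


# Baseline SD range of 15-18 points taken from the text above.
sd_based = [half_sd_mcid(sd) for sd in (15, 18)]

# Candidate thresholds suggested in the text for sensitivity analysis.
sensitivity = {t: classify(t) for t in (5, 7, 10)}

print("0.5 SD thresholds:", sd_based)
for threshold, verdict in sensitivity.items():
    print(f"MCID = {threshold} points: {verdict}")
```

On these inputs, the conclusion of clinical meaningfulness holds unambiguously only at the 5-point threshold, which illustrates why the choice of MCID should be prespecified and justified.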
Missing PRO data were handled using a mixed-effects model for repeated measures in both RCTs, a method that is statistically principled under the missing-at-random assumption and is consistent with regulatory guidance for clinical trials with continuous endpoints. However, neither STEP-HFpEF nor STEP-HFpEF DM published a formal sensitivity analysis under the missing-not-at-random assumption (e.g., tipping-point analysis or pattern-mixture models), which is particularly important given that approximately 10%-13% of semaglutide recipients discontinued treatment due to gas
Standardised procedures for PRO data collection were described inconsistently across included studies. Both RCTs administered the KCCQ electronically at pre-specified timepoints (baseline, week 20, and week 52 for STEP-HFpEF; baseline and week 52 for STEP-HFpEF DM), with assessments conducted prior to clinical evaluations to minimise information contamination. The observational cohort study did not specify the mode of KCCQ administration (paper vs electronic), the personnel responsible for administration, or whether assessments occurred before or after clinical consultations, all of which are factors known to affect PRO score distributions. In open-label settings, proximity to clinical encounters and knowledge of treatment group may systematically inflate or deflate self-reported symptoms. Future HFpEF trials and real-world studies should adopt standardised PRO administration protocols: Electronic, patient-initiated assessment prior to clinical encounter, with blinding of assessors to clinical and biomarker data where feasible.
The two RCTs used the KCCQ-CSS as the primary PRO endpoint, while the observational cohort used the KCCQ-TSS. The KCCQ-CSS averages the total symptom and physical limitation domains and therefore reflects both symptom experience and physical function. The KCCQ-TSS averages the symptom frequency and symptom burden subscales, emphasising symptom experience without capturing physical function. These subscale differences preclude direct numeric comparison of reported effect sizes across studies, and there are no established cross-walk or conversion equations between KCCQ-CSS and KCCQ-TSS that would allow pooled analysis across all three included studies.
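The non-equivalence of the two scores can be made concrete with a small sketch. It assumes the commonly described KCCQ scoring structure (TSS as the mean of the symptom frequency and symptom burden domains, CSS as the mean of TSS and physical limitation, OSS as the mean of physical limitation, TSS, QoL, and social limitation); this should be verified against the official scoring instructions. The patient values are hypothetical and chosen only to show the arithmetic.

```python
from statistics import mean


def kccq_summaries(physical_limitation, symptom_frequency, symptom_burden,
                   quality_of_life, social_limitation):
    """Summary scores from 0-100 domain scores (assumed scoring structure)."""
    tss = mean([symptom_frequency, symptom_burden])
    css = mean([physical_limitation, tss])
    oss = mean([physical_limitation, tss, quality_of_life, social_limitation])
    return {"TSS": tss, "CSS": css, "OSS": oss}


# Hypothetical patient: both symptom domains improve by 20 points while
# physical limitation, QoL, and social limitation stay unchanged.
before = kccq_summaries(50.0, 50.0, 50.0, 50.0, 50.0)
after = kccq_summaries(50.0, 70.0, 70.0, 50.0, 50.0)
change = {score: after[score] - before[score] for score in before}

print(change)
```

In this illustration the same symptomatic change registers as 20 points on the TSS, 10 points on the CSS, and 5 points on the OSS, so a TSS effect cannot be read on the CSS scale without patient-level domain data.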
Both RCTs maintained rigorous double-blinding of patients, investigators, and outcome assessors[4,5]. The observational cohort was open-label by design[6]. Because the KCCQ is entirely patient-reported, open-label studies are inherently susceptible to expectation bias: Patients who know they are receiving a well-publicised, effective weight-loss therapy may unconsciously report greater perceived improvement. The approximately twofold discrepancy in effect sizes between the open-label cohort (14 points between-group) and the blinded RCT estimates (7-8 points) is consistent with the inflation from performance and detection bias that is well documented in open-label PRO studies.
Based on the methodological gaps identified in this review, we propose a minimum core outcome set (COS) for PRO assessment in future HFpEF trials evaluating GLP-1 receptor agonists.
Primary disease-specific instrument: KCCQ-OSS, encompassing all subscales, as the most comprehensive capture of HFpEF-specific health status. If KCCQ-CSS is used, pre-registration of the subscale selection rationale is mandatory.
Generic health-related QoL instrument: EQ-5D-5L (or EQ-5D-3L for older populations), providing utility values for health economic analysis and cross-disease comparisons.
Functional status: 6MWD as an objective PRO-adjacent measure.
Symptom-specific measure: Patient global impression of change or patient global assessment as a clinically anchored global PRO.
Mental health domain: Patient Health Questionnaire-9 or Patient Reported Outcome Measurement Information System anxiety/depression short forms, to capture the psychosocial burden of HFpEF.
All five instruments should be pre-specified in the trial protocol and registration with explicit MCID thresholds, administration procedures, and missing data handling strategies documented before the trial begins.
Semaglutide was dosed at 2.4 mg subcutaneous weekly (obesity indication) in both RCTs vs 0.5-1.0 mg subcutaneous weekly (diabetes indication) in the cohort. These doses differ approximately 2.4-4.8-fold in magnitude and are associated with meaningfully different degrees of weight loss and metabolic effects, adding a fundamental reason why direct effect size comparisons across all three studies are methodologically inappropriate.
Follow-up was 52 weeks in the RCTs vs 24 months in the cohort. Longer follow-up in the observational study may partly account for greater cumulative QoL improvement, but may also introduce additional confounding from time-varying factors not controlled by baseline propensity score matching.
RCT: Both RCTs were judged at low overall risk of bias using RoB 2[4,5]. A minor concern was raised in the outcome measurement domain: Semaglutide’s distinctive effects (prominent weight loss, gastrointestinal symptoms) may have allowed some participants and investigators to infer treatment assignment, potentially introducing performance and expectation bias even within the blinded setting. However, objective improvements in 6MWD and measured body weight corroborate the patient-reported findings, supporting the internal validity of the PRO results.
Observational cohort study: The cohort study was rated at moderate-to-high risk of bias using ROBINS-I[6], primarily due to residual confounding, selection bias, and the open-label design. Propensity score matching controlled for many measured covariates, but unmeasured confounders, including health-seeking behaviour, concomitant lifestyle modifications, and differential follow-up intensity in the semaglutide group, cannot be excluded. Results from this study should be interpreted as hypothesis-generating and supportive rather than confirmatory evidence.
RCT evidence: Across both STEP-HFpEF trials, semaglutide produced consistent, clinically meaningful improvements in KCCQ-CSS: Approximately 7.8 points (95%CI: 4.8-10.9; P < 0.001) in STEP-HFpEF[4] and approximately 7.0 points (95%CI: 4.3-9.8; P < 0.001) in STEP-HFpEF DM[5]. The pooled fixed-effects estimate is 7.36 KCCQ-CSS points (95%CI: 5.32-9.40; P < 0.001; I² = 0%). All estimates exceed the 5-point MCID. Objective corroboration was provided by significant improvements in 6MWD (approximately 17-21 m greater than placebo) and reductions in high-sensitivity C-reactive protein.
Observational cohort evidence and effect size interpretation: In the propensity score-matched cohort, semaglutide was associated with a 21-point absolute KCCQ-TSS improvement over 24 months vs approximately 7 points in matched controls, yielding a between-group difference of approximately 14 points[6]. This is approximately twice the pooled RCT estimate. From a methodological standpoint, this discrepancy is most plausibly explained by the combination of expectation bias (open-label design), residual confounding, subscale non-equivalence (TSS vs CSS), lower semaglutide dose (0.5-1.0 mg vs 2.4 mg/week), longer follow-up with accumulating confounding, and a generally older, higher-risk population. The pooled RCT estimate of 7.36 KCCQ points should be regarded as the most methodologically reliable estimate of semaglutide’s true pharmacologic effect on PROs in HFpEF.
Companion and pooled analyses from the STEP-HFpEF programme demonstrated consistent QoL benefits across obesity classes, sex, age strata (≥ 75 years included), and diabetes status, with no subgroup demonstrating loss of effect[4-9]. Patients with more advanced HFpEF (those receiving loop diuretics, with higher NT-proBNP, or with atrial fibrillation) tended to derive larger absolute KCCQ improvements[7]. Background SGLT2 inhibitor therapy did not abolish the QoL advantage of semaglutide[5,7], supporting its potential as an adjunct to contemporary HFpEF management.
Gastrointestinal adverse events (nausea, diarrhoea, vomiting) were more frequent with semaglutide and led to discontinuation in approximately 10%-13% of participants in the RCTs[4,5]. Despite higher gastrointestinal intolerance, serious adverse events, including HF hospitalisations, were significantly less frequent with semaglutide than with placebo or matched controls[4,6,10]. The overall safety and tolerability profile appears clinically acceptable.
This systematic review and meta-analysis confirms that semaglutide produces consistent, clinically meaningful improvements in HF-specific QoL in adults with obesity-related HFpEF, with or without T2D. The two rigorously conducted RCTs provide the highest-quality evidence, and their quantitative pooling, supported by the absence of detectable heterogeneity (I² = 0%), yields a precise pooled placebo-corrected KCCQ-CSS gain of 7.36 points (95%CI: 5.32-9.40; P < 0.001), exceeding the MCID and robust across a broad range of prespecified subgroups[4,5]. These findings are clinically important given the historically limited treatment options for improving patient-reported health status in HFpEF.
A central contribution of this review is its explicit methodological appraisal of how PROs are measured and interpreted across semaglutide studies in HFpEF. Several important observations emerge beyond those of prior systematic reviews.
First, the fixed-effects meta-analysis found no statistical heterogeneity between the two RCTs (I² = 0%), supporting the pooled estimate as a more precise synthesis than either individual trial alone. Prior systematic reviews on this topic either did not perform meta-analysis[11-13] or pooled datasets across heterogeneous study designs; the present analysis restricts pooling to design-homogeneous RCTs and provides a transparent methodological rationale for this decision.
Second, the KCCQ psychometric appraisal reveals that while the instrument’s validity and reliability are well-established in broad HF populations, its specific psychometric performance in the obesity-related HFpEF phenotype, including responsiveness to weight-loss-mediated symptomatic improvement, has not been formally validated. It remains unclear whether KCCQ score changes in these patients reflect improvements in HF symptoms specifically or broader improvements in cardiometabolic well-being attributable to the significant weight loss and metabolic benefits of semaglutide. This distinction has implications for instrument interpretation and for the generalisability of KCCQ findings to HFpEF patients with different obesity trajectories.
Third, the absence of sensitivity analyses under missing-not-at-random assumptions in both RCTs is a significant methodological gap. Given that approximately 10%-13% of semaglutide recipients discontinued due to gastrointestinal adverse events, a non-random mechanism favouring retention of better-tolerating participants, the reported PRO estimates may carry upward bias. Tipping-point analyses should be incorporated into the statistical analysis plans of future trials.
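To indicate what such an analysis asks, the following is a deliberately simplified, summary-level sketch of a delta-adjusted tipping-point calculation. It is not a substitute for a patient-level analysis with multiple imputation. It assumes that the fraction of the semaglutide arm with missing week-52 data equals the discontinuation fraction, that outcomes imputed under missing-at-random are shifted downward by a penalty `delta`, and that the standard error is unchanged; all function names are hypothetical.

```python
# Pooled KCCQ-CSS estimate, lower 95% CI limit, and MCID from this review.
POOLED, CI_LOW, MCID = 7.36, 5.32, 5.0


def delta_adjusted_effect(effect, missing_fraction, delta):
    """Effect after shifting imputed active-arm outcomes down by `delta` points.

    Assumption: only the active arm is penalised, so the arm mean (and hence
    the between-group difference) falls by missing_fraction * delta.
    """
    return effect - missing_fraction * delta


def tipping_delta(effect, missing_fraction, target):
    """Penalty `delta` at which the adjusted effect falls to `target`."""
    return (effect - target) / missing_fraction


for fraction in (0.10, 0.13):  # discontinuation range quoted in the text
    to_mcid = tipping_delta(POOLED, fraction, MCID)
    to_null = tipping_delta(CI_LOW, fraction, 0.0)
    print(f"missing fraction {fraction:.2f}: point estimate reaches the MCID "
          f"at delta = {to_mcid:.1f}; lower CI limit reaches 0 at "
          f"delta = {to_null:.1f}")
```

The clinical question is then whether penalties of that size are plausible for participants who stopped treatment; a full analysis would answer it with patient-level data and a range of deltas in both arms.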
Fourth, the proposal of a COS addresses the field’s lack of standardisation. The current variability in PRO endpoint selection (CSS vs TSS), MCID application, missing data methods, and administration timing across semaglutide studies in HFpEF prevents meaningful evidence synthesis and limits the clinical utility of the growing evidence base. Adoption of the proposed COS would ensure that future HFpEF trials generate data that are both comparable and sufficient for health technology assessment.
A critical and underappreciated limitation of the current evidence base is its restricted generalisability across HFpEF phenotypes, geographic populations, and healthcare settings.
All three included studies enrolled exclusively patients with obesity-related HFpEF (BMI ≥ 30 kg/m² in the RCTs; obesity present in the cohort). Non-obese HFpEF, which accounts for a substantial proportion of the HFpEF population (particularly in Asian and elderly cohorts), was entirely excluded. The mechanistic rationale for semaglutide in HFpEF is closely linked to weight loss and adipose tissue-mediated inflammatory pathways; whether GLP-1 receptor agonism confers clinically meaningful PRO benefits in non-obese HFpEF patients (in whom these mechanisms are less active) is entirely unanswered by the present evidence. Extrapolation of the pooled KCCQ-CSS estimate of 7.36 points to non-obese HFpEF populations is not supported by the available data.
Ethnic and geographic homogeneity constitutes a second major generalisability constraint. The STEP-HFpEF and STEP-HFpEF DM trials enrolled predominantly White patients from North American and European centres; detailed ethnic breakdown data were not prominently reported, and representation of Asian, African, and Latin American populations appears minimal based on the geographic distribution of trial sites. The observational cohort was conducted exclusively in Spain. HFpEF prevalence, phenotype, and clinical outcomes differ substantially across ethnic groups: Asian HFpEF patients more commonly present without obesity, with greater diastolic dysfunction, and at younger ages, suggesting a distinct underlying biology from the obesity-driven phenotype studied. Whether semaglutide’s PRO benefits generalise to these populations cannot be determined from the current evidence.
Low-resource settings represent a third dimension of unaddressed generalisability. All included studies were conducted in high-income country healthcare systems, where access to subcutaneous semaglutide 2.4 mg weekly (an obesity-indication medication with significant cost) is substantially different from that in low- and middle-income countries (LMICs). In LMICs, where the burden of HFpEF is rising and healthcare infrastructure for complex chronic disease management is limited, the logistical, economic, and systemic barriers to implementing semaglutide therapy are profound. The PRO benefits demonstrated in resource-rich trial settings may not be achievable in LMIC contexts where suboptimal background therapy, irregular follow-up, and nutritional disparities are prevalent. Future research should prioritise effectiveness studies in LMIC settings and assess whether cost-effective dose regimens or generic formulations might deliver comparable PRO benefits.
All three included studies carried significant industry involvement. Both RCTs (STEP-HFpEF and STEP-HFpEF DM) were sponsored by Novo Nordisk, the manufacturer of semaglutide, and were conducted with industry involvement in design, oversight, and data management. The observational cohort received no explicit industry funding declaration, though it was conducted in a context where semaglutide was prescribed as part of routine clinical care. While industry sponsorship does not per se invalidate trial findings, meta-epidemiological evidence consistently demonstrates that industry-sponsored trials report larger treatment effects than independently funded trials, are more likely to report statistically significant primary endpoints, and are subject to publication and outcome reporting biases favouring the sponsor’s product. In the context of the present review, the absence of independently funded replication studies means that the pooled RCT estimate of 7.36 KCCQ-CSS points must be interpreted with awareness of potential sponsorship-related effect size inflation, selective reporting of favourable subgroup analyses, and the possibility that negative or neutral PRO findings from trial substudy analyses may not have been published. The risk of bias assessment using RoB 2, although it rated both trials as ‘low risk’ at the level of study conduct, does not capture sponsorship-related reporting bias. Independent replication of semaglutide’s PRO benefits in HFpEF through investigator-initiated or publicly funded trials is a research priority.
From a clinical perspective, semaglutide’s RCT-derived PRO benefits are consistent, meaningful, and generalisable across the range of obesity-related HFpEF phenotypes studied. The concordance between KCCQ improvement, increased 6MWD, and substantial weight reduction supports a genuine mechanistic effect mediated through weight loss and cardiometabolic improvement[4,5,8]. Benefits were observed regardless of diabetes status, suggesting that the predominant mechanism is weight-loss-mediated rather than glycaemia-mediated[4,5,7]. Background SGLT2 inhibitor therapy did not abolish the QoL benefit of semaglutide, suggesting additive or complementary mechanisms.
Several overlapping systematic reviews and meta-analyses have been published. Sobral et al[11] confirmed KCCQ-CSS improvements consistent with the RCT data. Mylavarapu et al[12] similarly concluded that semaglutide significantly reduces body weight and HF events in obesity-related HFpEF. Otmani et al[13] reported a pooled KCCQ-CSS mean difference of 7.72 points across five RCTs. The present review extends these analyses by providing a transparent fixed-effects meta-analysis restricted to design-homogeneous trials, a comprehensive KCCQ psychometric appraisal, a detailed analysis of missing data methodology, an articulated MCID discussion, a proposed COS for future trials, and a systematic critique of generalisability and sponsorship bias. These methodological contributions were not addressed by prior reviews.
Several limitations must be explicitly acknowledged:
Absence of prospective protocol registration. The review protocol was not prospectively registered on PROSPERO or an equivalent platform prior to data collection. This is a significant methodological shortcoming that limits the review’s adherence to the highest standards of systematic review conduct. Although the eligibility criteria, outcomes, and statistical analysis plan (including the planned fixed-effects meta-analysis for design-homogeneous RCTs) were specified a priori in a written document prior to the search, the absence of public registration reduces transparency and increases the theoretical risk of selective reporting. We acknowledge this limitation unambiguously and strongly recommend that all future updates of this review proceed with full prospective PROSPERO registration.
Small number of primary studies. Only three primary studies were eligible, limiting the statistical power of the meta-analysis and the scope of subgroup analyses. The pooled estimate from two trials, while precise within the study population, may not be stable as additional RCT evidence accumulates.
Homogeneous study populations. All enrolled populations had obesity (BMI ≥ 30 kg/m²) and were predominantly White and from high-income Western countries. Non-obese HFpEF patients, Asian populations, elderly patients from LMIC settings, and those not enrolled in industry-sponsored programmes are entirely unrepresented. Generalisability of the findings to these groups cannot be assumed.
Industry-sponsored trials. All RCTs were sponsored by the drug manufacturer, raising the possibility of reporting bias and sponsorship-related effect size inflation that standard risk of bias tools do not fully capture.
Limited follow-up duration. RCT follow-up was 52 weeks; durability of PRO benefits beyond this horizon is uncertain and cannot be assessed from the available data.
Incomplete PRO standardisation across studies. The exclusive reliance on KCCQ, absence of generic QoL instruments (SF-36, EQ-5D), and inconsistent subscale selection precluded utility value estimation and cross-disease health status comparisons.
Missing data sensitivity analyses. Neither RCT provided sensitivity analyses under missing-not-at-random assumptions, introducing potential upward bias in reported PRO estimates given non-random treatment discontinuation.
Future HFpEF PRO research should prioritise: (1) Prospectively registered, long-term, independently funded trials in diverse populations (including non-obese, Asian, and LMIC cohorts) incorporating the proposed COS; (2) Pre-specification of KCCQ subscale selection and MCID thresholds with anchor-based justification; (3) Sensitivity analyses for missing PRO data under missing-not-at-random assumptions; (4) Blinded or electronically administered PRO collection in observational studies; (5) Explicit dose documentation for dose-response analyses; and (6) Cost-effectiveness analyses incorporating EQ-5D utility values to inform reimbursement decisions.
Semaglutide produces consistent, clinically meaningful improvements in health-related QoL and PROs in adults with obesity-related HFpEF, with or without T2D. Fixed-effects meta-analysis of the two design-homogeneous RCTs yields a pooled placebo-corrected KCCQ-CSS improvement of 7.36 points (95%CI: 5.32-9.40; P < 0.001; I² = 0%), exceeding the MCID and providing a more precise and methodologically rigorous summary of the evidence than either trial alone. The substantially larger effect in the open-label observational cohort is best explained by expectation bias, residual confounding, subscale non-equivalence, and dose heterogeneity. Significant methodological gaps persist across KCCQ psychometric reporting, MCID rationale, missing data handling, PRO data collection standardisation, prospective trial registration, population diversity, and independence from industry sponsorship. These gaps must be addressed in future trials to generate evidence that is both methodologically robust and applicable to the full breadth of patients with HFpEF worldwide.
1. Dunlay SM, Roger VL, Redfield MM. Epidemiology of heart failure with preserved ejection fraction. Nat Rev Cardiol. 2017;14:591-602.
2. Obokata M, Reddy YNV, Pislaru SV, Melenovsky V, Borlaug BA. Evidence Supporting the Existence of a Distinct Obese Phenotype of Heart Failure With Preserved Ejection Fraction. Circulation. 2017;136:6-19.
3. Wilding JPH, Batterham RL, Calanna S, Davies M, Van Gaal LF, Lingvay I, McGowan BM, Rosenstock J, Tran MTD, Wadden TA, Wharton S, Yokote K, Zeuthen N, Kushner RF; STEP 1 Study Group. Once-Weekly Semaglutide in Adults with Overweight or Obesity. N Engl J Med. 2021;384:989-1002.
4. Kosiborod MN, Abildstrøm SZ, Borlaug BA, Butler J, Rasmussen S, Davies M, Hovingh GK, Kitzman DW, Lindegaard ML, Møller DV, Shah SJ, Treppendahl MB, Verma S, Abhayaratna W, Ahmed FZ, Chopra V, Ezekowitz J, Fu M, Ito H, Lelonek M, Melenovsky V, Merkely B, Núñez J, Perna E, Schou M, Senni M, Sharma K, Van der Meer P, von Lewinski D, Wolf D, Petrie MC; STEP-HFpEF Trial Committees and Investigators. Semaglutide in Patients with Heart Failure with Preserved Ejection Fraction and Obesity. N Engl J Med. 2023;389:1069-1084.
5. Kosiborod MN, Petrie MC, Borlaug BA, Butler J, Davies MJ, Hovingh GK, Kitzman DW, Møller DV, Treppendahl MB, Verma S, Jensen TJ, Liisberg K, Lindegaard ML, Abhayaratna W, Ahmed FZ, Ben-Gal T, Chopra V, Ezekowitz JA, Fu M, Ito H, Lelonek M, Melenovský V, Merkely B, Núñez J, Perna E, Schou M, Senni M, Sharma K, van der Meer P, Von Lewinski D, Wolf D, Shah SJ; STEP-HFpEF DM Trial Committees and Investigators. Semaglutide in Patients with Obesity-Related Heart Failure and Type 2 Diabetes. N Engl J Med. 2024;390:1394-1407.
6. Pérez-Velasco MA, Bernal-López MR, Trenas A, Ricci M, López-Carmona MD, García de Lucas MD, Gómez-Huelgas R, Pérez-Belmonte LM. Efficacy of once-weekly semaglutide in patients with heart failure with preserved ejection fraction, obesity and type 2 diabetes. Med Clin (Barc). 2025;165:107019.
7. Mikhail N, Wali S. Semaglutide for Treatment of Obesity-Related Heart Failure with Preserved Ejection Fraction in Patients with and without Diabetes. J Diabetes Clin Res. 2024;6:18-23.
8. Borlaug BA, Kitzman DW, Davies MJ, Rasmussen S, Barros E, Butler J, Einfeldt MN, Hovingh GK, Møller DV, Petrie MC, Shah SJ, Verma S, Abhayaratna W, Ahmed FZ, Chopra V, Ezekowitz J, Fu M, Ito H, Lelonek M, Melenovsky V, Núñez J, Perna E, Schou M, Senni M, van der Meer P, Von Lewinski D, Wolf D, Kosiborod MN. Semaglutide in HFpEF across obesity class and by body weight reduction: a prespecified analysis of the STEP-HFpEF trial. Nat Med. 2023;29:2358-2365.
9. Verma S, Butler J, Borlaug BA, Davies M, Kitzman DW, Shah SJ, Petrie MC, Barros E, Rönnbäck C, Vestergaard LS, Schou M, Ezekowitz JA, Sharma K, Patel S, Chinnakondepalli KM, Kosiborod MN; STEP-HFpEF Trial Committees and Investigators. Efficacy of Semaglutide by Sex in Obesity-Related Heart Failure With Preserved Ejection Fraction: STEP-HFpEF Trials. J Am Coll Cardiol. 2024;84:773-785.
10. Kosiborod MN, Deanfield J, Pratley R, Borlaug BA, Butler J, Davies MJ, Emerson SS, Kahn SE, Kitzman DW, Lingvay I, Mahaffey KW, Petrie MC, Plutzky J, Rasmussen S, Rönnbäck C, Shah SJ, Verma S, Weeke PE, Lincoff AM; SELECT, FLOW, STEP-HFpEF, and STEP-HFpEF DM Trial Committees and Investigators. Semaglutide versus placebo in patients with heart failure and mildly reduced or preserved ejection fraction: a pooled analysis of the SELECT, FLOW, STEP-HFpEF, and STEP-HFpEF DM randomised trials. Lancet. 2024;404:949-961.
11. Sobral MVS, Rodrigues LK, Barbosa AMP, da Rocha NC, Moulaz IR, Dos Santos JPP, Oliveira BHC, Moreira JLML, Pacagnelli FL, Guida CM. Cardiovascular Effects of Semaglutide in Patients with Heart Failure with Preserved Ejection Fraction: A Systematic Review and Meta-Analysis. Am J Cardiovasc Drugs. 2025;25:461-467.
12. Mylavarapu M, Obi O, Abarca Y, Fatima H, Roshni P, Huda NU, Lysak Y, Gandapur A, Vazquez SC, Siddiqui MA, Mowo-Wale A. Semaglutide in patients with obesity and heart failure with preserved ejection fraction: A systematic review and meta-analysis. World J Cardiol. 2026;18:112189.
13. Otmani Z, Elsayed HA, Yassin MNA, Saihi MJ, Aldemerdash MA, Alzawahreh A, Hassan A, Alahmed FB, Gonnah AR, Abdelaziz A. Semaglutide in Patients with Obesity and Heart Failure Irrespective of Their Baseline Ejection Fraction: An Efficacy and Safety Meta-analysis of Randomized Controlled Trials. Cardiol Rev. 2025;.