High-dimensional Iterative Causal Forest (hdiCF) for Subgroup Identification Using Health Care Claims Data

Tiansheng Wang¹, Virginia Pate¹, Richard Wyss²

  • ¹Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, NC.

American Journal of Epidemiology | June 13, 2025

Summary

A new high-dimensional method improved the detection of heterogeneous treatment effects (HTE) of SGLT2 inhibitors and GLP-1 RAs on heart failure risk. It identified patients with frequent loop diuretic use as a key subgroup, outperforming the standard method.

Area of Science:

  • Pharmacovigilance and Pharmacoepidemiology
  • Biostatistics and Health Data Science
  • Cardiovascular Disease Research

Background:

  • Identifying patient subgroups that benefit differently from medications (heterogeneous treatment effects, HTE) is crucial for personalized medicine.
  • Standard high-dimensional propensity score (hdPS) methods face challenges in accurately capturing complex patient characteristics.
  • Novel high-dimensional approaches are needed to improve the detection of HTE in real-world data.

Purpose of the Study:

  • To compare a novel high-dimensional approach with the standard hdPS method for detecting HTE.
  • To identify subgroups of patients experiencing different treatment effects from sodium-glucose cotransporter-2 (SGLT2) inhibitors and glucagon-like peptide-1 receptor agonists (GLP-1 RAs) regarding heart failure risk.
  • To assess the performance of these methods in a large Medicare cohort.

Main Methods:

  • A novel high-dimensional approach using ordinal variables was developed and compared against the standard hdPS method (binary variables).
  • The iterative causal forest (iCF) subgrouping algorithm was applied to a Medicare cohort (2015-2019) of patients treated with SGLT2 inhibitors (N=8,075) or GLP-1 RAs (N=7,313).
  • Conditional average treatment effects (CATEs) for 2-year risk differences in hospitalized heart failure were estimated using inverse-probability treatment weighting.
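
The weighting step in the last bullet can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `iptw_risk_difference` and its arguments are hypothetical, propensity scores are assumed to be estimated elsewhere, and censoring before 2 years is ignored.

```python
import numpy as np

def iptw_risk_difference(treated, outcome, propensity, subgroup=None):
    """Weighted risk difference (treated minus comparator).

    treated    : 1/0 flags for the two drug classes being compared
    outcome    : 1/0 flags for the event (e.g. hospitalized heart failure)
    propensity : estimated probability of being in the treated group
    subgroup   : optional boolean mask restricting the estimate to one subgroup
    """
    treated = np.asarray(treated, dtype=bool)
    outcome = np.asarray(outcome, dtype=float)
    propensity = np.asarray(propensity, dtype=float)
    if subgroup is None:
        subgroup = np.ones(treated.shape, dtype=bool)
    subgroup = np.asarray(subgroup, dtype=bool)

    t, y, ps = treated[subgroup], outcome[subgroup], propensity[subgroup]
    # Inverse-probability-of-treatment weights:
    # 1/ps for treated patients, 1/(1 - ps) for comparators.
    w = np.where(t, 1.0 / ps, 1.0 / (1.0 - ps))
    risk_treated = np.sum(w[t] * y[t]) / np.sum(w[t])
    risk_comparator = np.sum(w[~t] * y[~t]) / np.sum(w[~t])
    return risk_treated - risk_comparator

# Toy data (made up for illustration only).
rd = iptw_risk_difference(
    treated=[1, 1, 0, 0],
    outcome=[1, 0, 1, 1],
    propensity=[0.5, 0.5, 0.5, 0.5],
)
print(rd)
```

Passing a boolean `subgroup` mask restricts the estimate to one candidate subgroup, which is how a subgroup-specific (conditional) risk difference would be read off in this sketch.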

Main Results:

  • The novel high-dimensional approach identified patients with ≥2 loop diuretic prescriptions as the subgroup with the largest CATE for reduced heart failure risk (adjusted risk difference [aRD]: -2.6%).
  • The standard hdPS method identified patients with chronic kidney disease as a subgroup with a smaller CATE (aRD: -1.7%).
  • Sensitivity analyses confirmed the novel approach's superior accuracy in identifying clinically relevant subgroups with HTE.
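
The "≥2 loop diuretic prescriptions" subgroup is defined by a frequency threshold, which an any-use indicator cannot express on its own. The sketch below is a deliberately simplified contrast between a binary and an ordinal coding of the same prescription counts. The function names and the cutpoints `(1, 2)` are hypothetical, and this summary does not describe how the paper actually constructed its variables.

```python
import numpy as np

def binary_indicator(counts):
    """1 if the code appears at least once during baseline, else 0."""
    return (np.asarray(counts) >= 1).astype(int)

def ordinal_level(counts, cutpoints=(1, 2)):
    """Ordinal level = number of cutpoints the count reaches.

    With the assumed cutpoints (1, 2): 0 fills -> 0, 1 fill -> 1, 2+ fills -> 2.
    """
    counts = np.asarray(counts)
    return np.searchsorted(np.asarray(cutpoints), counts, side="right")

# Made-up prescription counts for four patients.
fills = [0, 1, 2, 5]
binary = binary_indicator(fills)    # patients with 1, 2, or 5 fills look identical
ordinal = ordinal_level(fills)      # patients with 1 fill and 2+ fills are separated
frequent_users = ordinal >= 2       # a ">= 2 prescriptions" subgroup mask
print(binary, ordinal, frequent_users)
```

In this toy coding, a tree-based subgrouping algorithm could split on `ordinal >= 2`, whereas the binary column offers only the any-use split.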

Conclusions:

  • The novel high-dimensional method demonstrates enhanced capability in detecting HTE compared to the standard hdPS approach.
  • This improved detection can lead to more precise identification of patient subgroups benefiting from SGLT2 inhibitors and GLP-1 RAs.
  • The findings support the clinical relevance of identifying specific patient characteristics, such as loop diuretic use, for optimizing heart failure risk management.

Keywords:

  • Iterative causal forest
  • causal machine learning
  • claims data
  • heterogeneous treatment effect
  • high-dimensional
  • pharmacoepidemiology
  • precision medicine
  • subgroup identification
