Big Data/Machine Learning/AI
Evaluating the Performance of the Parametric G-formula Relative to Causal Forest in Identifying Effect Modifiers and Estimating Subgroup Effects: A Simulation and Application using Look AHEAD trial
Yulin Gong*, Roch Nianogo Nianogo
UCLA Fielding School of Public Health
Causal machine learning algorithms such as causal forest have gained popularity for estimating heterogeneous treatment effects (HTEs) at the individual level. These models are highly flexible but can be difficult to interpret. Parametric and semi-parametric frameworks, such as the generalized heterogeneous treatment effect (G-HTE) approach based on the G-computation algorithm, emphasize interpretability but have limited flexibility in capturing complex interactions. We compared G-computation and causal forest to examine the trade-offs between the two methods, using simulation studies and data from the Look AHEAD trial, which evaluated the effectiveness of an intensive lifestyle intervention versus education among adults with type 2 diabetes. G-computation identified two candidate effect modifiers (EMs), baseline fasting glucose (FG) and HbA1c, and four subgroups. For example, the estimated risk difference (RD) was -9% (95% CI: -28% to 10%) for HbA1c < 8.5% versus 28% (95% CI: -14% to 69%) for HbA1c ≥ 8.5% (P for interaction = 0.10). In contrast, causal forest revealed a more complex pattern, with three EMs (HbA1c, systolic blood pressure [SBP], and cardiovascular disease [CVD] history) and six subgroups. Individuals with lower HbA1c (< 8%) and no prior CVD derived the greatest benefit from the intervention (RD = -2%, 95% CI: -4% to -1%), whereas those with higher HbA1c (≥ 8%) and elevated SBP (≥ 140 mmHg) experienced the least benefit (RD = 7%, 95% CI: 2% to 11%). G-computation emphasized simplicity and interpretability by identifying a small number of single-variable EMs with a clear conclusion: the less advanced the diabetes, the more effective the intensive lifestyle intervention tends to be in reducing FG/HbA1c (Figure). Causal forest prioritized flexibility and accuracy, revealing multivariable HTEs but yielding more complex conclusions. The choice of method substantially influences how HTEs are characterized and should align with the analytic objectives and clinical context.
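To make the G-computation side of the comparison concrete, the following is a minimal sketch of how a subgroup risk difference could be obtained by standardization. It is not the authors' implementation and does not use Look AHEAD data: the data are simulated, and the variable names (`a` for treatment, `m` for a binary effect modifier such as a high-HbA1c indicator, `c` for a baseline covariate), the coefficients, and the helper functions are all hypothetical assumptions chosen for illustration.

```python
import numpy as np

def fit_logistic(X, y, n_iter=50):
    """Fit a logistic regression by Newton-Raphson; X must contain an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        w = p * (1.0 - p)
        grad = X.T @ (y - p)
        hess = X.T @ (X * w[:, None])
        step = np.linalg.solve(hess, grad)
        beta += step
        if np.max(np.abs(step)) < 1e-10:
            break
    return beta

def design(a, m, c):
    """Outcome-model design matrix with a treatment-by-modifier product term."""
    return np.column_stack([np.ones_like(a, dtype=float), a, m, a * m, c])

def subgroup_risk_differences(a, m, c, y):
    """G-computation: predict each person's risk under a=1 and a=0, then average by subgroup."""
    beta = fit_logistic(design(a, m, c), y)

    def predicted_risk(a_set):
        X = design(np.full(len(a), a_set), m, c)
        return 1.0 / (1.0 + np.exp(-X @ beta))

    diff = predicted_risk(1.0) - predicted_risk(0.0)
    return {level: float(diff[m == level].mean()) for level in (0, 1)}

# Simulated trial (all values hypothetical): the treatment lowers risk when m = 0
# and raises it when m = 1.
rng = np.random.default_rng(0)
n = 20000
a = rng.binomial(1, 0.5, n).astype(float)   # randomized treatment
m = rng.binomial(1, 0.4, n).astype(float)   # binary effect modifier
c = rng.normal(size=n)                      # baseline covariate
logit = -0.5 - 0.8 * a + 0.3 * m + 1.2 * a * m + 0.4 * c
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit))).astype(float)

rd = subgroup_risk_differences(a, m, c, y)
print(rd)
```

In this sketch each subgroup effect is standardized over that subgroup's own covariate distribution, and effect modification enters only through the single product term `a * m`. That restriction is what keeps the output easy to read, and it is also why a model of this form cannot recover multivariable interactions unless they are specified in advance. Confidence intervals, which the sketch omits, could be obtained by bootstrapping the whole procedure.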

