LATEBREAKER
Causal Inference
The Utility of Missing Data Graphs in Recovering Causal Effects from Selection Bias
Haidong Lu*, Maya Mathur, Fan Li
Selection bias remains a central concern in methodological research in epidemiology and other social sciences. Traditional causal diagrams with a single selection node are sufficient for detecting many forms of selection bias and work well when only complete-case estimation is considered. However, this approach falls short in showing whether causal effects can be recovered from selection bias due to missing data in more general settings, because a single selection node does not distinguish between variables that are fully observed and those that are partially missing. For instance, when selection bias arises from differential loss to follow-up, recovering the causal effect in the entire sample often requires that the exposure and the covariates connecting the selection node and the outcome be fully observed for both selected and unselected individuals. To address this limitation, we review the link between selection bias and missing data and introduce missing data graphs (also known as m-graphs), a form of causal diagram, to better characterize selection bias scenarios. Furthermore, we demonstrate the utility of missing data graphs for selection bias scenarios within several special causal structures, and describe simple identification rules that specify which variables' full distributions are needed to recover causal effects from selection bias.
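The loss-to-follow-up example can be made concrete with a small exact calculation. The sketch below is an illustration only, not the authors' method: it assumes a hypothetical structure with a randomized binary exposure A, a binary covariate L that affects both the outcome Y and selection S, and A also affecting S, so that Y is independent of S given A and L. All probabilities are made-up values. It compares the complete-case risk difference with one that standardizes the complete-case outcome means over the distribution of L given A in the full sample, which requires A and L to be observed for unselected individuals as well.

```python
from itertools import product

# Hypothetical data-generating model; every number here is an illustrative assumption.
P_A1 = 0.5  # P(A = 1), exposure randomized
P_L1 = 0.5  # P(L = 1), covariate independent of A


def p_y1(a, l):
    """P(Y = 1 | A = a, L = l); the true risk difference for A is 0.2."""
    return 0.2 + 0.2 * a + 0.4 * l


# P(S = 1 | A, L): retention depends on both A and L (differential loss to follow-up).
P_S1 = {(0, 0): 0.9, (0, 1): 0.9, (1, 0): 0.9, (1, 1): 0.3}


def bern(p, v):
    """Probability that a Bernoulli(p) variable takes value v."""
    return p if v == 1 else 1 - p


# Exact joint distribution over (A, L, Y, S).
joint = {
    (a, l, y, s): bern(P_A1, a) * bern(P_L1, l)
    * bern(p_y1(a, l), y) * bern(P_S1[(a, l)], s)
    for a, l, y, s in product((0, 1), repeat=4)
}


def prob(cond):
    """Probability of the event described by cond(a, l, y, s)."""
    return sum(p for k, p in joint.items() if cond(*k))


def mean_y(cond):
    """E[Y | cond]."""
    return prob(lambda a, l, y, s: y == 1 and cond(a, l, y, s)) / prob(cond)


def standardized_mean(a0):
    """Sum over l of E[Y | A=a0, L=l, S=1] * P(L=l | A=a0).

    The weights P(L=l | A=a0) use the full sample (selected and unselected).
    """
    total = 0.0
    for l0 in (0, 1):
        ey = mean_y(lambda a, l, y, s: a == a0 and l == l0 and s == 1)
        pl = prob(lambda a, l, y, s: a == a0 and l == l0) / prob(
            lambda a, l, y, s: a == a0
        )
        total += ey * pl
    return total


true_rd = mean_y(lambda a, l, y, s: a == 1) - mean_y(lambda a, l, y, s: a == 0)
complete_case_rd = mean_y(lambda a, l, y, s: a == 1 and s == 1) - mean_y(
    lambda a, l, y, s: a == 0 and s == 1
)
recovered_rd = standardized_mean(1) - standardized_mean(0)

print(f"true RD          = {true_rd:.3f}")
print(f"complete-case RD = {complete_case_rd:.3f}")
print(f"recovered RD     = {recovered_rd:.3f}")
```

Under these assumed values the complete-case contrast understates the true risk difference, while the standardized contrast reproduces it. If L were missing for the unselected individuals, the weights P(L | A) could not be computed from the data, which is the distinction a single selection node does not capture.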