Challenges and Opportunities for Causal Inference in Molecular Epidemiology

Jonathan Huang and Brian Whitcomb

While no definitive, go-to pedagogical text yet exists to teach the ins and outs of causal inference in molecular epidemiology, we’ve put together a list of seven papers that we think address critical aspects of both the promise and the challenges of this task.

The first three (Mehta et al.; Yang et al.; Gadbury et al.) present an overview of the basic challenge of causal inference in the high-dimensional, -omics-wide space, with a focus on the statistical inferential problems that exist even in experimental settings. Underlying this is a recognized need to evaluate methods using scalable techniques that respect the complex structure and size of genomic data, notably “plasmode”-based simulation.

Towards sound epistemological foundations of statistical methods for high-dimensional biology
Critical reasoning on causal inference in genome-wide linkage and association studies
Evaluating Statistical Methods Using Plasmode Data Sets in the Age of Massive Public Databases: An Illustration Using False Discovery Rates
Nature as a Trialist?: Deconstructing the Analogy Between Mendelian Randomization and Randomized Trials
Robust Mendelian randomization in the presence of residual population stratification, batch effects and horizontal pleiotropy
A Selective Review of Negative Control Methods in Epidemiology
The emerging landscape of health research based on biobanks linked to electronic health records: Existing resources, statistical challenges, and potential opportunities