Methods/Statistics
The power of mixtures
Alexander Keil*, Maria E Kamenetsky, Giehae Choi, Rena R Jones, Jessie P Buckley
Epidemiologic analysis of exposures within a mixture poses numerous challenges, which have inspired many strategies for identifying mixtures as health hazards. We have observed three basic strategies, which estimate: 1) an overall association with all exposures, 2) independent associations with each exposure, and 3) an association with a reduced-dimension exposure. However, gaps remain in our understanding of how accurately these strategies classify whether a mixture is harmful to human health.
Using two published examples of exposure mixtures, we performed a plasmode simulation study of the power, type-I error, and type-S error (error in the sign of the effect) of these strategies under scenarios that varied features of the mixture (effect size, number of causal exposures, and confounding strength). The statistical approaches were: 1) overall effect estimation via quantile-based g-computation (QGC), 2) independent effect estimation for all exposures via multiple linear regression (MLR), and 3) linear regression on a 1-dimensional exposure derived by principal component analysis (PCA+LR). We assessed overall effect sizes ranging from null to 0.5 outcome standard deviations per quartile increase in all exposures. We also estimated “power gain” (the probability of detecting an association with QGC but not with MLR).
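The performance metrics named above can be made concrete with a minimal sketch. It is not the authors' code: the `SimResult` fields, the `summarize` function, the 0.05 threshold, and the use of a Bonferroni correction as the multiplicity adjustment are all illustrative assumptions, as is the choice to compute type-S error only among statistically significant estimates. The sketch takes per-simulation estimates and p-values as given and does not fit QGC or MLR itself.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class SimResult:
    """Results from one simulated data set (field names are hypothetical)."""
    qgc_estimate: float   # overall mixture effect estimated by QGC
    qgc_p: float          # p-value for the QGC overall effect
    mlr_p: List[float]    # one p-value per exposure from MLR


def sign(x: float) -> int:
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def mlr_detects(p_values: List[float], alpha: float, bonferroni: bool) -> bool:
    """MLR 'detects' the mixture if any exposure-specific test rejects."""
    threshold = alpha / len(p_values) if bonferroni else alpha
    return any(p < threshold for p in p_values)


def summarize(results: List[SimResult], true_effect: float,
              alpha: float = 0.05, bonferroni: bool = True) -> Dict[str, Optional[float]]:
    """Summarize rejection rates, type-S error, and power gain.

    A rejection rate is power when true_effect != 0 and type-I error
    when true_effect == 0.
    """
    n = len(results)
    qgc_sig = [r.qgc_p < alpha for r in results]
    mlr_sig = [mlr_detects(r.mlr_p, alpha, bonferroni) for r in results]

    # Type-S error (assumed definition): among significant QGC estimates,
    # the share whose sign disagrees with the true effect.
    wrong_sign = sum(
        1 for r, s in zip(results, qgc_sig)
        if s and sign(r.qgc_estimate) != sign(true_effect)
    )
    n_qgc = sum(qgc_sig)
    type_s = wrong_sign / n_qgc if (n_qgc and true_effect != 0) else None

    # Power gain: QGC detects an association but MLR detects none.
    gain = sum(1 for q, m in zip(qgc_sig, mlr_sig) if q and not m)

    return {
        "qgc_rejection_rate": n_qgc / n,
        "mlr_rejection_rate": sum(mlr_sig) / n,
        "qgc_type_s": type_s,
        "power_gain": gain / n,
    }


# Toy inputs for illustration only; these are not results from the study.
toy = [
    SimResult(0.30, 0.01, [0.20, 0.04, 0.50]),
    SimResult(0.25, 0.03, [0.001, 0.30, 0.60]),
    SimResult(-0.10, 0.04, [0.50, 0.60, 0.70]),
    SimResult(0.05, 0.40, [0.30, 0.20, 0.90]),
]
summary = summarize(toy, true_effect=0.25)
```

Setting `bonferroni=False` in this sketch mimics uncorrected MLR testing, which raises the MLR rejection rate and therefore shrinks the computed power gain.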
QGC yielded higher power and lower type-S error than the other methods and frequently (up to 90% of the time) found true associations when MLR found none (Figure). MLR had the lowest power when multiplicity corrections were applied and high type-I error otherwise. PCA+LR had negligible type-S error and intermediate power.
When decisions rest on hypothesis testing, MLR is problematic for determining whether a mixture is associated with a health outcome. Dimension reduction via PCA and overall effect estimation are two more powerful options when used appropriately, though overall effect estimation is more interpretable.