Methods/Statistics

Statistical Methods to Address Sparse Data at Low PM2.5 Concentrations

Alesia Jung*, Ethan Roubenoff, Abeer Hasan, Megan Leonhard, Jessica Lin, Tiffani Fordyce

Background: Several epidemiologic studies have reported an association between mortality and exposure to fine particulate matter (PM2.5) at concentrations below the EPA’s standard of 12 μg/m3. A literature review was conducted to examine study-specific factors that may explain the inconsistent findings about the concentration-response function (CRF) at low PM2.5 concentrations, and to identify statistical methods for sparse data that could improve the accuracy of modeling the CRF at low exposure levels.

Methods: PubMed was searched for original human epidemiologic studies published in the past ten years that evaluated the relationship between long-term PM2.5 exposure and individual-level, all-cause or non-accidental mortality and the CRF or dose-response function.

Results: Of 5,512 articles, 353 were identified as potentially relevant. Full-text review resulted in 42 included articles, many of which did not have sufficient data at concentrations below 12 μg/m3. Studies that focused on modeling CRF curves for low exposure reported conflicting results regarding the shape of the curve and whether the relationship was linear or non-linear. We identified strengths and limitations of each study and potential sources of bias.

Conclusions: Variations in methods made it difficult to compare and generalize findings. Statistically, there are important limitations when a model built for high levels of exposure is used to study the relationship between mortality and exposure at levels outside the range of the data; data sparsity may explain the steeper CRF slope reported at lower exposure levels. More research is needed to understand the impact of the models used to quantify exposure and the optimal way to impute missing exposure data.
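
The sparsity concern in the conclusions can be illustrated with a minimal sketch. It is not the method of any reviewed study. It assumes a hypothetical cohort of 1,000 subjects of whom only 50 have exposures below 12 μg/m3, a simple linear model, and an arbitrary residual standard deviation (`sigma`). Under those assumptions it compares the standard error of an ordinary least squares slope fitted to the full exposure range with one fitted to the sparse low-exposure subset. It shows only loss of precision, not bias: a slope estimated from few, narrowly spread observations is far less stable, so an apparently steeper slope at low exposure could partly reflect noise.

```python
import numpy as np

def slope_standard_error(x, sigma):
    """Standard error of an OLS slope for exposures x and residual SD sigma."""
    x = np.asarray(x, dtype=float)
    return sigma / np.sqrt(np.sum((x - x.mean()) ** 2))

rng = np.random.default_rng(0)

# Hypothetical exposure distribution (ug/m3): few subjects below 12, many above.
exposure = np.concatenate([
    rng.uniform(4.0, 12.0, size=50),    # sparse low-exposure region
    rng.uniform(12.0, 30.0, size=950),  # well-populated region
])

sigma = 1.0  # assumed residual SD of the outcome (arbitrary units)
low = exposure[exposure < 12.0]

se_full = slope_standard_error(exposure, sigma)  # slope fitted to all data
se_low = slope_standard_error(low, sigma)        # slope fitted below 12 only

print(f"n (full) = {exposure.size}, SE of slope = {se_full:.4f}")
print(f"n (low)  = {low.size}, SE of slope = {se_low:.4f}")
print(f"SE ratio (low / full) = {se_low / se_full:.1f}")
```

The ratio printed at the end depends entirely on the assumed sample sizes and exposure ranges; changing them shows how quickly precision degrades as the low-exposure region thins out.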