COVID-19 Pandemic

Small Area Estimation of County-Level COVID-19 Cases by Incorporating Social Determinants of Health and Spatial Correlation

Kathryn Shea*, J. Sunil Rao, Lorena Garcia, Jiming Jiang

Background: In April 2020, COVID-19 became a nationally notifiable condition, and jurisdictions were asked to report cases to the CDC, though some did not report throughout the entire public health emergency. Even for reported cases, accuracy can be improved by "borrowing strength" via statistical models. Small area estimation (SAE) is a potentially valuable statistical approach for more accurate estimation in areas where cases were underreported or unreported. Despite its potential, SAE has been used only sparsely in the context of infectious diseases.

Objective: This research aims to assess the efficacy of SAE methods in estimating COVID-19 cases using auxiliary information, such as social determinants of health (SDoH) and spatial correlation, to provide robust case estimates for all counties in the contiguous US.

Methods: COVID-19 case data are sourced from the February 2023 release of the CDC COVID-19 Case Surveillance Restricted Access Detailed Data, which contains over 95 million reported cases in the US and territories from 2020 to 2023. SDoH data are sourced from the Agency for Healthcare Research and Quality (AHRQ) Social Determinants of Health Database, which compiles county-level administrative data from several federal sources for all five SDoH domains. Specifically, we focus on cases in 2021 and 2022 in the contiguous US, using Poisson mixed-effects models with a population offset to model the case counts in each year.

Results: Preliminary results suggest that several candidate SDoH variables are significantly associated with the case count outcome but are correlated with each other. We use elastic net regularization to identify variable clusters and select a final model. Global Moran's I, computed with a 500 km distance band from county centroids, indicates residual spatial autocorrelation (p = 0.00167).

Conclusion: These results suggest that SDoH and spatial correlation contain valuable auxiliary information for SAE models to estimate COVID-19 cases in counties in the contiguous US.
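As an illustration of the residual autocorrelation check mentioned in the Results, the following is a minimal sketch of Global Moran's I with binary distance-band weights, written with NumPy only. It is not the authors' implementation: the function names, the synthetic grid of "centroids", the planar coordinates in kilometres, and the one-sided permutation p-value are all assumptions made for this example (the abstract does not state how its p-value was obtained). Real county centroids would first need to be projected, or great-circle distances used instead.

```python
import numpy as np


def distance_band_weights(coords_km, band_km):
    """Binary spatial weights: 1 if two distinct points lie within band_km."""
    diff = coords_km[:, None, :] - coords_km[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    w = (dist <= band_km).astype(float)
    np.fill_diagonal(w, 0.0)  # a location is not its own neighbour
    return w


def morans_i(values, w):
    """Global Moran's I for a vector of values and a weights matrix w."""
    z = values - values.mean()
    return (len(values) / w.sum()) * (z @ w @ z) / (z @ z)


def morans_i_permutation_test(values, w, n_perm=999, seed=0):
    """Observed Moran's I and a one-sided pseudo p-value from random permutations."""
    rng = np.random.default_rng(seed)
    observed = morans_i(values, w)
    sims = np.array([morans_i(rng.permutation(values), w) for _ in range(n_perm)])
    p_value = (np.sum(sims >= observed) + 1) / (n_perm + 1)
    return observed, p_value


# Hypothetical data: a 10 x 10 grid of points spaced 100 km apart, with
# "residuals" that follow an east-west trend plus noise (so they are
# spatially autocorrelated by construction).
rng = np.random.default_rng(42)
coords = np.array([(x * 100.0, y * 100.0) for x in range(10) for y in range(10)])
residuals = coords[:, 0] / 100.0 + rng.normal(0.0, 0.5, size=len(coords))

weights = distance_band_weights(coords, band_km=150.0)
observed_i, p_value = morans_i_permutation_test(residuals, weights)
print(f"Moran's I = {observed_i:.3f}, permutation p-value = {p_value:.4f}")
```

In this sketch, a clearly positive Moran's I with a small p-value would indicate that nearby locations have similar residuals, which is the kind of pattern that motivates adding a spatial correlation component to the model.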