Methods/Statistics
A Latent Variable Model for Predicting Case Counts
Kathryn Shea*
University of California, Davis
Objective:
Reliable estimation of true disease burden is essential for identifying vulnerable populations, allocating resources, and guiding public health strategies. During the COVID-19 pandemic, reported case counts often misrepresented the true number of infections, and their reliability varied across time and space due to factors such as the availability of testing and the differing symptomatic rates of variants. To address this challenge, we explore a latent-variable modeling approach that uses auxiliary data to improve estimation of true case counts.
Methods:
We propose a Poisson-Gamma model that links reported and true case counts through a latent variable. An empirical best predictor (EBP) is used to estimate the true case counts across geographic areas. Preliminary work includes the development of a baseline model and simulation studies that evaluate the EBP’s performance relative to unbiased estimators and to the naïve approach of using reported counts directly. Ongoing work extends the model to incorporate spatial autocorrelation, capturing relationships between neighboring areas. We also plan to assess performance under more realistic data-generating processes and to explore a potential application to real COVID-19 data.
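As an illustration of the general idea only, the sketch below implements a textbook Poisson-Gamma setup, not the model proposed here, whose specification is not given in this abstract. It assumes each area has a latent intensity drawn from a Gamma(shape, rate) distribution and an observed count that is Poisson given that intensity; the latent intensity stands in for the true burden. The function names (`fit_gamma_moments`, `ebp`, `simulate`), the method-of-moments fit, and all parameter values are hypothetical choices made for this example.

```python
import numpy as np


def fit_gamma_moments(y):
    """Method-of-moments estimates of Gamma (shape, rate) from counts y.

    Assumes y_i | lam_i ~ Poisson(lam_i) and lam_i ~ Gamma(shape, rate),
    so marginally E[y] = shape/rate and Var[y] = shape/rate + shape/rate**2.
    Returns None when the counts show no overdispersion.
    """
    m = y.mean()
    v = y.var(ddof=1)
    excess = v - m  # estimates shape / rate**2
    if excess <= 0:
        return None
    rate = m / excess
    shape = m * rate
    return shape, rate


def ebp(y):
    """Empirical best predictor of the latent intensities.

    The best predictor under this toy model is the posterior mean
    (shape + y) / (rate + 1); the EBP plugs in estimated parameters.
    """
    params = fit_gamma_moments(y)
    if params is None:
        # No estimated between-area variance: shrink fully to the mean.
        return np.full(y.shape, y.mean(), dtype=float)
    shape, rate = params
    return (shape + y) / (rate + 1.0)


def simulate(n_areas=2000, shape=5.0, rate=0.5, seed=0):
    """Compare the MSPE of the EBP with that of the raw counts (hypothetical settings)."""
    rng = np.random.default_rng(seed)
    lam = rng.gamma(shape, 1.0 / rate, size=n_areas)  # latent intensities
    y = rng.poisson(lam)                              # observed counts
    mspe_ebp = np.mean((ebp(y) - lam) ** 2)
    mspe_naive = np.mean((y - lam) ** 2)
    return mspe_ebp, mspe_naive


if __name__ == "__main__":
    mspe_ebp, mspe_naive = simulate()
    print(f"MSPE (EBP):   {mspe_ebp:.3f}")
    print(f"MSPE (naive): {mspe_naive:.3f}")
```

In this toy setting the EBP shrinks each observed count toward the overall mean, which is why it can achieve a lower MSPE than using the counts directly. The sketch omits auxiliary data, under-reporting mechanisms, and spatial structure.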
Results:
Simulation results suggest that, in simple settings, the EBP yields lower mean squared prediction error (MSPE) than the alternative estimators. Further evaluation under more complex and spatially correlated conditions is in progress.
Conclusion:
Early results indicate the potential of a Poisson-Gamma latent-variable framework for improving estimates of true case counts. Future extensions incorporating spatial structure and applied analyses will provide insight into its utility for public health surveillance and epidemic modeling.
