COVID-19 Pandemic
Integrating Wastewater-Based Epidemiology and Socio-Behavioral Data to Enhance Predictive Modeling of SARS-CoV-2 Spike Risk in Virginia
Isaiah Omari*, NYU
Background: Wastewater-based epidemiology (WBE) is an increasingly important tool for monitoring SARS-CoV-2 transmission at the population level. However, models relying solely on viral load trends may inadequately capture heterogeneity in population vulnerability. We evaluated whether integrating WBE with socio-behavioral indicators improves prediction of SARS-CoV-2 spike risk.
Methods: We constructed a county-month panel dataset spanning January 2021 through December 2022 that combined SARS-CoV-2 viral concentrations from Virginia wastewater treatment plants with Behavioral Risk Factor Surveillance System (BRFSS) indicators. Spike risk was defined using quartiles of the county-level viral load distribution. Predictive performance was assessed using multinomial logistic regression, random forest, and extreme gradient boosting (XGBoost), with stratified hold-out validation. Predictor importance was evaluated using permutation methods and partial dependence. Temporal patterns were assessed using time-series decomposition and forecasting.
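The quartile-based outcome and the stratified hold-out split described in the Methods can be illustrated with a minimal Python sketch. It is not the study's code. It assumes four outcome classes (one per quartile) and cut points taken from a single pooled list of viral loads; the abstract does not state either choice, and the function names `quartile_labels` and `stratified_holdout`, the 25% test fraction, and the seed are hypothetical.

```python
import random
from bisect import bisect_left
from collections import defaultdict
from statistics import quantiles


def quartile_labels(values):
    """Assign each value a class from 0 (lowest quartile) to 3 (highest).

    Assumption: cut points come from the pooled values passed in.
    """
    # Three cut points: Q1, median, Q3.
    cuts = quantiles(values, n=4, method="inclusive")
    # bisect_left counts the cut points strictly below v, so a value
    # equal to a cut point is placed in the lower class.
    return [bisect_left(cuts, v) for v in values]


def stratified_holdout(labels, test_fraction=0.25, seed=0):
    """Split row indices into train and test sets, class by class,
    so each class appears in the test set in roughly the same proportion.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, test = [], []
    for label in sorted(by_class):
        members = by_class[label]
        rng.shuffle(members)
        # Hold out at least one row per class.
        n_test = max(1, round(len(members) * test_fraction))
        test.extend(members[:n_test])
        train.extend(members[n_test:])
    return sorted(train), sorted(test)


if __name__ == "__main__":
    # Made-up viral loads for illustration only.
    loads = [120.0, 85.5, 430.2, 610.9, 55.1, 300.0, 980.4, 210.7]
    labels = quartile_labels(loads)
    train_idx, test_idx = stratified_holdout(labels)
    print(labels, train_idx, test_idx)
```

Stratifying on the outcome class keeps the hold-out set from being dominated by one risk level, which matters when accuracy is compared across models.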
Results: Models integrating WBE and BRFSS indicators outperformed models using WBE alone. The XGBoost model demonstrated the highest predictive performance (AUC=0.98; accuracy=94%), followed by random forest (accuracy=92%) and multinomial logistic regression (accuracy=90%). Key predictors included prior-month viral load, healthcare access, disability prevalence, and poor self-rated health. Temporal analyses revealed consistent seasonal patterns in viral shedding, and spatial analyses identified higher spike risk in counties with greater social vulnerability and underinsurance.
Conclusions: Integrating wastewater surveillance with socio-behavioral data substantially improves prediction of SARS-CoV-2 spike risk and enhances interpretability of surveillance models. Hybrid approaches may strengthen public health decision-making by identifying communities at elevated risk for transmission surges.
Keywords: wastewater-based epidemiology; SARS-CoV-2; BRFSS; predictive modeling; health
