Infectious Disease
From association to prediction: a scoping review reveals methodological and reporting gaps in human Lyme disease forecasting, prediction and projection
Danny Szaroz*, University of Montreal
Environmental change has become a dominant driver of Lyme disease (LD) incidence in the Northern Hemisphere. Despite noteworthy progress in understanding the effects of climate change on infectious disease, predictive models of LD vary widely in methodological rigor, ecological grounding, and reporting transparency, which limits their usefulness for public health interventions.
We conducted a geographically unrestricted scoping review to assess how predictive models for environmentally driven human LD are constructed, validated, and reported, with particular attention to predictor selection and performance evaluation.
From 5,318 initial search returns, 19 studies met the inclusion criteria, highlighting both a substantive gap in predictive literature relative to other infectious disease domains and the conflation of association with prediction, a common reporting problem in epidemiology. Most included studies employed frequentist or Bayesian statistical frameworks (n=13, 68%), followed by machine learning approaches (n=5, 26%), with only one mechanistic model (n=1, 5%). Global Climate Model outputs were used to project LD incidence under future emissions scenarios in nine studies (47%).
Predictor selection was dominated by meteorological (n=56, 33%) and geographic (n=59, 35%) categories, while wildlife host factors and acarological parameters represented fewer than 10% of predictors, signaling that variables critical to transmission are often omitted. Validation practices were inconsistent; the most commonly reported performance metrics were RMSE (n=6) and R² (n=6), and many studies lacked rigorous out-of-sample validation.
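To illustrate what out-of-sample validation with RMSE and R² can look like, the following is a minimal sketch in Python. It is not drawn from any of the reviewed studies: the temperature and incidence values are hypothetical, the model is a single-predictor least-squares line chosen only for brevity, and the temporal split point is an assumption. The model is fit on the earlier years and scored on the later, held-out years.

```python
import math

def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def r_squared(observed, predicted):
    """Coefficient of determination, using the mean of `observed` as baseline."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot

# Hypothetical yearly data (illustrative only, not from the review):
# mean annual temperature (deg C) and LD incidence per 100,000.
temperature = [5.1, 5.4, 5.9, 6.2, 6.0, 6.8, 7.1, 7.3, 7.9, 8.2]
incidence = [2.0, 2.3, 2.9, 3.4, 3.1, 4.0, 4.6, 4.4, 5.5, 5.9]

# Temporal hold-out: fit on the first 7 years, evaluate on the last 3.
split = 7
slope, intercept = fit_line(temperature[:split], incidence[:split])

train_pred = [intercept + slope * t for t in temperature[:split]]
test_pred = [intercept + slope * t for t in temperature[split:]]

in_sample_rmse = rmse(incidence[:split], train_pred)
out_of_sample_rmse = rmse(incidence[split:], test_pred)
out_of_sample_r2 = r_squared(incidence[split:], test_pred)

print(f"in-sample RMSE:     {in_sample_rmse:.3f}")
print(f"out-of-sample RMSE: {out_of_sample_rmse:.3f}")
print(f"out-of-sample R^2:  {out_of_sample_r2:.3f}")
```

Reporting the held-out metrics alongside the in-sample ones makes it visible when a model describes past associations well but predicts poorly. Note that out-of-sample R² can be negative when the model does worse than the held-out mean.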
This review suggests that LD prediction efforts remain methodologically heterogeneous, ecologically incomplete, and inconsistently reported. Researchers can benefit from integrating multi-host and vector data, combining modeling frameworks, and adopting standardized reporting guidelines. Strengthening ecological realism and methodological rigor in predictive models is essential for anticipating climate-driven shifts in LD transmission and informing evidence-based public health action.
