Methods/Statistics
Enhancing Transportability of Predictive Models: A Novel Quantile Regression Approach to Improve Prediction in the Tails
Ariana Mora*, Emory University
Background: Identifying elevated A1c among people without a diabetes diagnosis via predictive modeling presents a fundamental methodologic challenge. Standard regression optimizes prediction of the conditional mean, and because most individuals have truly non-diabetic A1c, it predicts poorly in the tails (those with undiagnosed elevated A1c). We encountered this challenge when attempting to transport models from NHANES to predict undiagnosed pre-pregnancy diabetes among those in NVSS (US births).
Objective: Compare three predictive approaches for identifying individuals with undiagnosed diabetes: survey-weighted GLM, averaged quantile regression, and a novel Adaptive Detection Window (ADW) method.
Methods: Survey-weighted GLM and quantile regression models (Q5-Q95) were fit to females 18-44 years old (NHANES 2013-2023; n=5,013), predicting A1c from variables shared by NHANES and NVSS. ADW identifies the first quantile model at which an individual’s predicted A1c crosses a non-diabetic threshold, then aggregates the A1c predictions from additional quantile models within an adaptive window around this detection point to yield a final predicted A1c. Window parameters can be tuned to favor model sensitivity (SE) or specificity (SP).
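The detection-and-aggregation step described in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the threshold value, the aggregation rule, or how the window adapts, so the function name `adw_predict`, the 5.7% threshold, the mean aggregation, the `below`/`above` window parameters, and the fallback to a middle quantile when no model crosses the threshold are all assumptions made for illustration.

```python
import numpy as np

def adw_predict(quantile_preds, threshold=5.7, below=1, above=2, fallback_col=None):
    """Illustrative ADW-style aggregation (hypothetical, not the published method).

    quantile_preds: array of shape (n_individuals, n_quantiles), columns ordered
                    from the lowest to the highest fitted quantile model.
    threshold:      assumed non-diabetic A1c cut-off (5.7 is an assumption).
    below, above:   assumed window size, in quantile models, on each side of
                    the detection point.
    """
    preds = np.asarray(quantile_preds, dtype=float)
    n, q = preds.shape
    if fallback_col is None:
        fallback_col = q // 2  # assumed fallback: the middle quantile model

    crossed = preds >= threshold
    has_cross = crossed.any(axis=1)
    first = crossed.argmax(axis=1)  # index of first crossing (0 if none)

    # Individuals who never cross keep the fallback prediction.
    out = preds[:, fallback_col].copy()
    for i in np.flatnonzero(has_cross):
        lo = max(first[i] - below, 0)      # clip the window to the quantile grid
        hi = min(first[i] + above, q - 1)
        out[i] = preds[i, lo:hi + 1].mean()  # assumed aggregation: simple mean
    return out

# Toy predictions from five quantile models for two individuals.
toy = np.array([[5.0, 5.2, 5.4, 5.6, 5.8],   # crosses 5.7 at the last model
                [5.0, 5.1, 5.2, 5.3, 5.4]])  # never crosses
final = adw_predict(toy, threshold=5.7, below=1, above=2)
```

In this sketch, shrinking `below` or widening `above` pulls the final prediction toward higher quantiles, which would be expected to raise sensitivity at the cost of specificity; the reverse would be expected to favor specificity.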
Results: Based on NHANES data, an estimated 4.67 million (9.7%) reproductive-age US women have undiagnosed prediabetes/diabetes. Survey GLM achieved only 7.9% SE/99.3% SP, missing >90% of cases. Quantile averaging (Q50-Q95) yielded 29.7% SE/94.8% SP. When optimized for detection, ADW achieved 78.2% SE/66.8% SP, a roughly ten-fold improvement in sensitivity over GLM. Alternative ADW parameterizations achieved 66.2% SE/75.9% SP (balanced) and 38.1% SE/91.8% SP (specificity-optimized), demonstrating tunable performance.
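For readers who want to reproduce this kind of comparison, the sketch below shows one way sensitivity and specificity could be computed from measured and predicted A1c. The function name `sens_spec`, the 5.7% threshold, and the optional `weights` argument (standing in for survey weights) are assumptions; the abstract does not state how the reported metrics were weighted.

```python
import numpy as np

def sens_spec(a1c_true, a1c_pred, threshold=5.7, weights=None):
    """Sensitivity and specificity for detecting A1c at or above a threshold.

    The threshold and the optional weights are illustrative assumptions.
    """
    a1c_true = np.asarray(a1c_true, dtype=float)
    a1c_pred = np.asarray(a1c_pred, dtype=float)
    w = np.ones_like(a1c_true) if weights is None else np.asarray(weights, dtype=float)

    is_case = a1c_true >= threshold   # truly elevated A1c
    flagged = a1c_pred >= threshold   # predicted elevated A1c

    se = w[is_case & flagged].sum() / w[is_case].sum()      # TP / (TP + FN)
    sp = w[~is_case & ~flagged].sum() / w[~is_case].sum()   # TN / (TN + FP)
    return se, sp

# Toy data: two true cases (one detected), two non-cases (one falsely flagged).
se, sp = sens_spec([5.0, 5.8, 6.0, 5.5], [5.1, 5.9, 5.4, 5.8])
```

Applied to the figures reported above, the ratio of ADW to GLM sensitivity is 78.2/7.9, or about 9.9.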
Conclusions: Standard regression fundamentally under-detects cases in the tails. ADW leverages quantile regression to substantially improve identification of individuals with undiagnosed diabetes, with flexible tuning between sensitivity and specificity. ADW enables transportable prediction in the tails and could be applied in the future to identify undiagnosed disease in maternal health surveillance.

