Injuries/Violence
Predicting Fatal and Non-Fatal Suicide Attempts Among Privately Insured Residents of North Carolina Using Machine Learning
Matthew Turnure*, University of North Carolina at Chapel Hill
Statement of purpose: Suicide is a major public health problem. In North Carolina, age-adjusted suicide mortality rates increased by 31% from 1999 to 2022. While decades of research have identified many risk factors for suicide attempt, interventions in healthcare settings can be strengthened by using risk prediction algorithms to identify high-risk individuals and link them to appropriate care. We used machine learning to predict the risk of suicide attempt in the next month in a privately insured population in North Carolina.
Methods: We conducted a nested case-control study of covered people aged 12 and older. Administrative claims data were linked to North Carolina death certificate records for 2006 to 2020. Cases (n=5,276) were all fatal and non-fatal suicide attempts, and controls (n=226,310) were identified using an incidence-density risk-set sampling approach. We used super learning, an ensemble machine-learning approach that forms a weighted average of several algorithms to achieve the best possible predictions given our performance metric of interest, negative log likelihood. To estimate population-based risks, we used inverse probability of sampling weights. Model covariates included diagnoses, procedures, and prescriptions known to be associated with suicide.
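The weighting and ensembling ideas in the Methods can be illustrated with a minimal sketch. This is not the study's implementation: the toy outcomes, the two candidate prediction vectors, the control sampling fraction of 0.1, and all function names are hypothetical. It also simplifies heavily, since a real super learner combines cross-validated predictions from many algorithms and weights under incidence-density sampling are more involved than a single fraction.

```python
import math

def sampling_weights(y, control_sampling_fraction):
    """Inverse probability of sampling weights (simplified assumption):
    cases are all kept (weight 1), controls are up-weighted by 1/fraction."""
    return [1.0 if yi == 1 else 1.0 / control_sampling_fraction for yi in y]

def weighted_nll(y, p, w):
    """Weighted negative log likelihood for binary outcomes."""
    eps = 1e-12  # guard against log(0)
    total = 0.0
    for yi, pi, wi in zip(y, p, w):
        pi = min(max(pi, eps), 1.0 - eps)
        total -= wi * (yi * math.log(pi) + (1 - yi) * math.log(1.0 - pi))
    return total / sum(w)

def best_convex_weight(y, p_a, p_b, w, steps=100):
    """Grid search for the convex blend alpha*p_a + (1-alpha)*p_b
    that minimizes the weighted negative log likelihood."""
    best_alpha, best_loss = 0.0, float("inf")
    for k in range(steps + 1):
        alpha = k / steps
        blend = [alpha * a + (1.0 - alpha) * b for a, b in zip(p_a, p_b)]
        loss = weighted_nll(y, blend, w)
        if loss < best_loss:
            best_alpha, best_loss = alpha, loss
    return best_alpha, best_loss

# Hypothetical toy data: 1 = case, 0 = sampled control.
y = [1, 0, 0, 1, 0, 0]
p_a = [0.9, 0.2, 0.1, 0.3, 0.4, 0.1]  # predictions from candidate algorithm A
p_b = [0.5, 0.1, 0.3, 0.8, 0.1, 0.2]  # predictions from candidate algorithm B

w = sampling_weights(y, control_sampling_fraction=0.1)
alpha, ensemble_loss = best_convex_weight(y, p_a, p_b, w)
loss_a = weighted_nll(y, p_a, w)
loss_b = weighted_nll(y, p_b, w)
print(f"alpha={alpha:.2f}, ensemble={ensemble_loss:.4f}, A={loss_a:.4f}, B={loss_b:.4f}")
```

Because the grid includes alpha = 0 and alpha = 1, the blended loss on these data can never be worse than either candidate alone, which is the intuition behind comparing an ensemble against its best single algorithm.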
Results: In predicting the 1-month risk of suicide attempt, the super learner improved on the best single algorithm by 27%. At 95% specificity, the model had 20.6% sensitivity and 8.8% positive predictive value.
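The relationship between the reported operating characteristics can be checked with Bayes' rule. The sketch below is only arithmetic on the rounded figures in the Results (95% specificity, 20.6% sensitivity, 8.8% positive predictive value); the function names are hypothetical, and the outcome prevalence it backs out is an implied quantity under this formula, not a figure reported by the study.

```python
def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value from Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)

def implied_prevalence(sensitivity, specificity, target_ppv):
    """Prevalence at which the given sensitivity and specificity
    yield target_ppv (obtained by inverting the PPV formula on the odds scale)."""
    ppv_odds = target_ppv / (1.0 - target_ppv)
    prev_odds = ppv_odds * (1.0 - specificity) / sensitivity
    return prev_odds / (1.0 + prev_odds)

# Rounded values taken from the Results.
sens, spec, reported_ppv = 0.206, 0.95, 0.088

prev = implied_prevalence(sens, spec, reported_ppv)
check = ppv(sens, spec, prev)  # should reproduce the reported PPV
print(f"implied prevalence={prev:.4f}, recomputed PPV={check:.4f}")
```

The same `ppv` function shows why high specificity alone does not guarantee a high positive predictive value: when the outcome is rare, even a 5% false positive rate produces many more false positives than true positives.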
Significance: Although the super learner ensemble model was substantially better at predicting suicide attempt than individual algorithms, risk categorization thresholds that achieved high specificity resulted in low sensitivity. This suggests that future models could benefit from tailoring to subpopulations by age, rurality, and other demographics, and based on suicide subtypes with firearm or intimate partner violence involvement.
