Big Data/Machine Learning/AI

Machine Learning Algorithms to Predict Subclinical Small Vessel Brain Lesions Based on Cardiovascular Health: Insights from the ELSA-Brasil Study

Carine Savalli* Marianna Leite Carine Savalli Arão O Belitardo Carlos Leandro S dos Prazeres Paulo A Lotufo Itamar S Santos Isabela Benseñor Claudia C Leite Maria CG Otaduy Adriana B Conforto Alexandre Chiavegatto Alessandra C Goulart

Neuroimaging markers of cerebral small-vessel disease (CSVD) are associated with an
increased risk of cognitive decline, stroke, and death. We applied machine learning (ML)
methods to assess whether lower adherence to the “Life’s Essential 8” proposed by the
American Heart Association (LE8: diet, physical activity, nicotine exposure, BMI, blood
lipids, blood glucose, blood pressure, and sleep) can predict CSVD on 3T brain magnetic
resonance imaging (3T-MRI) in the ELSA-Brasil study, a prospective cohort. For 233
participants (66.1±9.2 years, 59% women), we considered cardiovascular health information
from the three study waves (2008 to 2019) and a 3T-MRI scan (2022-2024). The predictors
were LE8 adherence (1-3; higher=better), LE8 trajectory change from Wave 1 (2008-2010)
to Wave 3 (2017-2019), and baseline demographics (age, income, education, race, and
marital status). The binary MRI outcomes (yes/no) were the individual components of the CSVD
score: lacunes (LAC), enlarged perivascular spaces (EPS), white matter hyperintensities (WMH),
and microhemorrhages and calcifications (MH/C). Four ML algorithms (Random Forest,
XGBoost, LightGBM, and CatBoost) were trained for each outcome using a 70%/30%
training/testing split, with hyperparameter tuning via repeated 5-fold cross-validation.
Performance was evaluated using AUC-ROC. The contribution of each variable to the prediction
was assessed using SHAP (SHapley Additive exPlanations). Frequencies of the CSVD components were:
LAC, 11.6%; WMH, 28.3%; MH/C, 12.1%; EPS, 64.8%. Predictive performance for
EPS was good (AUC=0.760); the variables that contributed most to this prediction were age,
blood lipids in Wave 3, and improved sleep from Wave 1 to Wave 3. Performance was
moderate for WMH (AUC=0.693) and MH/C (AUC=0.615) and poor for LAC
(AUC=0.492, no better than chance). ML showed modest predictive capacity for CSVD. XGBoost for EPS was
the most promising model, but these preliminary findings highlight the need for larger
datasets to improve prediction accuracy.
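
The evaluation protocol described above (a 70%/30% split, hyperparameter tuning by repeated 5-fold cross-validation, and AUC-ROC on the held-out data) can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the study's code. The data are synthetic, and the stratified split, the number of repeats, the parameter grid, and the random seeds are all assumptions that the abstract does not specify. Only Random Forest is shown; the other three algorithms would slot into the same search.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import (
    GridSearchCV,
    RepeatedStratifiedKFold,
    train_test_split,
)

# Synthetic stand-in for the cohort: 233 rows and a binary outcome with
# roughly 65% positives (similar to the EPS frequency). Not real data.
X, y = make_classification(
    n_samples=233,
    n_features=12,
    n_informative=4,
    weights=[0.35, 0.65],
    random_state=0,
)

# 70%/30% training/testing split. Stratifying on the outcome is an assumption.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=0
)

# Repeated 5-fold cross-validation on the training set only.
# The number of repeats (2) is hypothetical.
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=2, random_state=0)

# Hypothetical, deliberately small grid to keep the sketch fast.
param_grid = {"max_depth": [3, None], "min_samples_leaf": [1, 5]}

search = GridSearchCV(
    RandomForestClassifier(n_estimators=100, random_state=0),
    param_grid,
    scoring="roc_auc",
    cv=cv,
)
search.fit(X_train, y_train)  # refits the best configuration on all training data

# AUC-ROC on the untouched 30% test set, using predicted probabilities.
test_proba = search.predict_proba(X_test)[:, 1]
test_auc = roc_auc_score(y_test, test_proba)

print("best parameters:", search.best_params_)
print("cross-validated AUC:", round(search.best_score_, 3))
print("test AUC:", round(test_auc, 3))
```

With 233 participants the test set holds only 70 observations, so a test AUC estimated this way carries wide uncertainty, which is in line with the call for larger datasets.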