Big Data/Machine Learning/AI
Identifying patterns of tobacco use and associated cardiovascular disease risk through machine learning analysis of urine biomarkers
Noah A. Siegel, BS*; Andrew C. Stokes, PhD
Tobacco use is a pressing public health challenge and the leading cause of disability-adjusted life years lost in the United States. The cardiovascular damage caused by tobacco is complex and nonlinear, and there is currently no dependable method for comparing exposure and harm across emerging tobacco product categories. There is a critical need to elucidate how multidimensional biomarkers of tobacco exposure influence the risk of cardiovascular harm. We applied cluster analysis, an unsupervised machine learning technique, to tobacco use biomarkers drawn from the Population Assessment of Tobacco and Health (PATH) dataset, which includes six waves of data collected from 2013 to 2021. We identified two distinct clusters of individuals with similar patterns of tobacco exposure biomarkers. One cluster was closely associated with e-cigarette use, and the other with cigarette use and dual use. We introduced a reference group composed of individuals who did not use any tobacco products and examined the association between these clusters and clinical and subclinical cardiovascular outcomes. The clusters are visualized in Figure 1. The cigarette and dual use cluster exhibited significantly higher levels of harm-related biomarkers than the e-cigarette cluster, which in turn displayed elevated biomarkers of harm relative to the reference group. Individuals in the cigarette and dual use cluster had an increased risk of conditions such as myocardial infarction, stroke, congestive heart failure, or other cardiovascular diseases compared with the reference group, whereas those in the e-cigarette cluster did not. A reliable biochemical signature of tobacco product use could enable responsive, multidimensional research into the physiological effects of tobacco. Such a signature could also support the evaluation of emerging tobacco products before their effects are comprehensively understood.
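The abstract does not state which clustering algorithm, biomarkers, or preprocessing steps were used. As a rough illustration of the general idea only, the sketch below standardizes a small synthetic table of biomarker values and groups the rows with k-means (Lloyd's algorithm). The choice of k-means, the three unnamed biomarker columns, the group sizes, and all numeric values are assumptions made for this example; none of them come from the PATH data or from the authors' analysis.

```python
import math
import random

def standardize(rows):
    """Z-score each column so no single biomarker dominates the distance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
           for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(r, means, sds)] for r in rows]

def kmeans(rows, k, iters=50):
    """Lloyd's algorithm with a deterministic farthest-point initialization."""
    centers = [rows[0]]
    while len(centers) < k:
        # Next center: the row farthest from all centers chosen so far.
        centers.append(max(rows, key=lambda r: min(math.dist(r, c) for c in centers)))
    labels = [0] * len(rows)
    for _ in range(iters):
        # Assignment step: each row goes to its nearest center.
        labels = [min(range(k), key=lambda j: math.dist(r, centers[j])) for r in rows]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            new_centers.append([sum(c) / len(c) for c in zip(*members)]
                               if members else centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers

# Synthetic stand-in data (assumption): 30 rows per hypothetical exposure
# pattern, three unnamed log-scale biomarker columns per row.
rng = random.Random(0)
low_exposure = [[rng.gauss(1.0, 0.3) for _ in range(3)] for _ in range(30)]
high_exposure = [[rng.gauss(6.0, 0.3) for _ in range(3)] for _ in range(30)]
data = standardize(low_exposure + high_exposure)

labels, centers = kmeans(data, k=2)
print("cluster sizes:", [labels.count(j) for j in range(2)])
```

In a real analysis the number of clusters would be chosen and validated rather than fixed in advance, and survey weights and covariates would need separate handling. This sketch addresses none of that.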