Big Data/Machine Learning/AI
Exploring Unsupervised Machine Learning for psychometric properties assessment: a 24-hour movement behavior questionnaire validity
Shirley Cunha Feuerstein*, Lorrane Cristine Conceição da Silva, Helen Ferreira de Brito Souza, Evellyn Ravena da Silva Gomes, Letícia Ribeiro Borges, Kliver Antonio Marin, Francisco Winter Figueiredo, Luis Gracia-Marco, Heraclito Barbosa de Carvalho, Augusto César Ferreira de Moraes, Marcus Vinícius Nascimento-Ferreira
Objective: To test the structural solution of an online 24-hour movement behavior questionnaire (24h-MBQ) in college students using unsupervised machine learning.
Methods: We invited 195 college students (68.7% female; 44.6% aged 21 to 25 years; 65.8% enrolled in undergraduate health sciences courses; 24.5% in early semesters). We developed a 19-item questionnaire drawn from previously validated tools, comprising physical activity (6 items), sedentary behavior (10 items), and sleep duration (3 items). We performed exploratory factor analysis with Varimax rotation and retained factors according to the Kaiser criterion (eigenvalues greater than 1). Additionally, we carried out unsupervised machine learning, determining the number of clusters with the Calinski-Harabasz index. We retrieved these index values based on the number of factors observed in the exploratory factor analysis. After identifying the number of clusters, we applied the k-median method to create them. Differences among clusters were assessed using the Kruskal-Wallis test with Dunn’s post hoc test.
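The cluster-selection step described above can be illustrated with a minimal sketch. It is not the authors' code: the data are synthetic two-dimensional points rather than questionnaire responses, the candidate range of 2 to 7 clusters is only one possible reading of "based on the number of factors", and the helper names (`k_medians`, `calinski_harabasz`) and the farthest-point initialisation are assumptions made for reproducibility.

```python
import random
from statistics import mean, median


def manhattan(p, q):
    """L1 distance, the metric conventionally paired with k-medians."""
    return sum(abs(a - b) for a, b in zip(p, q))


def k_medians(points, k, n_iter=100):
    """Assign each point to one of k clusters using coordinate-wise medians."""
    # Deterministic farthest-point initialisation (an assumption of this sketch).
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(manhattan(p, c) for c in centers)))
    labels = None
    for _ in range(n_iter):
        new_labels = [min(range(k), key=lambda j: manhattan(p, centers[j])) for p in points]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the previous center if a cluster empties
                centers[j] = tuple(median(col) for col in zip(*members))
    return labels


def calinski_harabasz(points, labels):
    """Ratio of between-cluster to within-cluster dispersion (higher is better)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    n, k = len(points), len(clusters)
    grand = [mean(col) for col in zip(*points)]
    between = within = 0.0
    for members in clusters.values():
        centroid = [mean(col) for col in zip(*members)]
        between += len(members) * sum((c - g) ** 2 for c, g in zip(centroid, grand))
        within += sum((a - c) ** 2 for p in members for a, c in zip(p, centroid))
    return (between / (k - 1)) / (within / (n - k))


# Synthetic data: three well-separated hypothetical behavior profiles.
rng = random.Random(42)
profiles = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
points = [(rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)) for cx, cy in profiles for _ in range(30)]

# Score each candidate number of clusters and keep the one with the highest index.
scores = {k: calinski_harabasz(points, k_medians(points, k)) for k in range(2, 8)}
best_k = max(scores, key=scores.get)
print(best_k, round(scores[best_k], 1))
```

The index is computed from cluster means, following its usual definition, even though the cluster assignments come from medians; whether the original analysis did the same is not stated in the abstract.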
Results: The exploratory factor analysis identified seven factors that together explained 66.80% of the variance. Unsupervised machine learning identified three clusters, and this structure distinguished differences in physical activity (physically active vs. long sleeper, p = 0.02), sedentary behavior (all cluster comparisons, p < 0.001), and sleep duration (all cluster comparisons, p < 0.001).
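The between-cluster comparisons reported above rely on the Kruskal-Wallis test. The sketch below shows how its H statistic is obtained from pooled ranks for three groups. The input values are made up for illustration and are not study data; the function name `kruskal_wallis_h` is hypothetical, and Dunn's post hoc comparisons are not included. The p-value uses the closed-form chi-square survival function that holds only for two degrees of freedom, that is, exactly three groups.

```python
import math


def kruskal_wallis_h(*groups):
    """Kruskal-Wallis H statistic with the standard correction for ties."""
    pooled = sorted((value, g) for g, values in enumerate(groups) for value in values)
    n = len(pooled)
    rank_sums = [0.0] * len(groups)
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        for _, g in pooled[i:j]:
            rank_sums[g] += avg_rank
        t = j - i
        tie_term += t ** 3 - t
        i = j
    h = 12 / (n * (n + 1)) * sum(r * r / len(g) for r, g in zip(rank_sums, groups)) - 3 * (n + 1)
    return h / (1 - tie_term / (n ** 3 - n))


def p_value_three_groups(h):
    """Chi-square survival function for 2 degrees of freedom: exp(-h / 2)."""
    return math.exp(-h / 2)


# Made-up values for three hypothetical clusters (not study data).
cluster_a = [1, 2, 3]
cluster_b = [4, 5, 6]
cluster_c = [7, 8, 9]

h = kruskal_wallis_h(cluster_a, cluster_b, cluster_c)
p = p_value_three_groups(h)
print(round(h, 3), round(p, 4))
```

With samples this small the chi-square approximation is rough; the example is meant only to show the rank-based calculation.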
Conclusion: The 24h-MBQ has a structural solution able to identify differences among clusters related to physical activity, sedentary behavior, and sleep duration. Employing machine learning to generate hierarchical clusters shows potential as a method for evaluating the structural solution of psychometric instruments in epidemiological research.
Keywords: Physical activity; sedentary behavior; sleep duration; subjective tool.