Big Data/Machine Learning/AI
Feasibility of smartphone-based digital phenotyping in the 3E cohort of emerging adults
Julianna C. Hsing*, Lucia E. Calderon, Lindsay T. Hoyt, Alison K. Cohen, Mathew V. Kiang
Background. Epidemiological research often relies on self-reported behavioral data, which can be prone to recall bias, low granularity, and participant fatigue. Digital phenotyping—using smartphone sensor data to capture human activity—offers a low-burden alternative for capturing objective near-real-time behavioral data, but the feasibility of such methods remains understudied in vulnerable populations. This study presents preliminary findings from an ongoing cohort study of emerging adults (aged 18-24) from two public Hispanic-Serving Institutions to assess the acceptability of smartphone data collection.
Methods. A customized smartphone app collected passive sensor data (e.g., GPS, accelerometer) and self-reported nightly survey data over a 9-day period. Participants answered feasibility questions on a 5-point Likert scale, covering comfort with, awareness of, and behavioral changes due to smartphone data collection.
Results. From September 2023 to December 2024, 318 participants were enrolled in a smartphone substudy from a larger cohort study of 871 college students. Most students identified as Latine (35%) or Asian (31%), and 55% identified as female. High-density, passive smartphone data collection yielded 2.1 million GPS observations, 1.5 million accelerometer observations, and 9.1 million activity observations across 6,338 participant-days. Overall, 76% of participants reported being comfortable with having their personal data collected via smartphones, 78% did not consider smartphone data collection to be burdensome, and only 3% felt it altered their actual behavior.
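To give a rough sense of data density, the reported totals can be converted into average observations per participant-day. The sketch below is illustrative only: it uses the rounded totals stated in the abstract, the variable and function names are hypothetical, and it is not the study's analysis code.

```python
# Rounded totals as reported in the abstract (not raw study data).
TOTAL_OBSERVATIONS = {
    "gps": 2.1e6,
    "accelerometer": 1.5e6,
    "activity": 9.1e6,
}
PARTICIPANT_DAYS = 6338


def per_participant_day(total: float, participant_days: int) -> float:
    """Average number of observations per participant-day."""
    if participant_days <= 0:
        raise ValueError("participant_days must be positive")
    return total / participant_days


# Hypothetical helper output: one average per sensor stream.
densities = {
    name: per_participant_day(total, PARTICIPANT_DAYS)
    for name, total in TOTAL_OBSERVATIONS.items()
}

for name, value in densities.items():
    print(f"{name}: about {value:.0f} observations per participant-day")
```

Because the inputs are rounded to the nearest 0.1 million, the resulting averages are approximate and say nothing about how observations were distributed across participants or days.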
Conclusion. Digital phenotyping from user-owned smartphones may provide a feasible method for collecting high-frequency, high-resolution behavioral data in socioeconomically and demographically diverse samples, which are typically underrepresented in studies using traditional data collection methods. Future directions include evaluating agreement and missingness between smartphone and self-reported data.