EHR
Unique diagnosis days as a proxy for healthcare visits in electronic health record data
Maxwell Salvatore*, University of Pennsylvania
Electronic health record (EHR) data, which are not collected for research purposes, are subject to clinical informative observation processes that can introduce bias into their analysis. Therefore, EHR data analyses need to consider measures of healthcare utilization.
The total number of visits (TNV) is a conceptually simple metric, but obtaining it can present logistical challenges. For example, visit data are often stored separately from the data of primary interest (e.g., diagnoses), and researchers may need to decide which visit types (e.g., outpatient, telehealth) are relevant. When data sharing is limited, researchers may have access only to diagnosis code data. One alternative metric used in published research is the number of days with a diagnosis (NDD).
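As an illustration of how NDD can be derived from diagnosis code data alone, the following minimal sketch counts the unique calendar days on which each person has at least one recorded diagnosis. The record layout (person identifier, diagnosis date, diagnosis code) and the function name are assumptions made for this example, not the structure of any particular EHR database.

```python
from collections import defaultdict
from datetime import date


def number_of_diagnosis_days(records):
    """Count unique diagnosis days (NDD) per person.

    `records` is an iterable of (person_id, diagnosis_date, code) tuples.
    This layout is hypothetical and chosen only for illustration.
    """
    days_by_person = defaultdict(set)
    for person_id, diagnosis_date, _code in records:
        # Several codes recorded on the same day count as a single day.
        days_by_person[person_id].add(diagnosis_date)
    return {pid: len(days) for pid, days in days_by_person.items()}


example_records = [
    ("p1", date(2020, 1, 5), "C67.9"),
    ("p1", date(2020, 1, 5), "I10"),    # same day as above, not counted twice
    ("p1", date(2020, 3, 2), "I10"),
    ("p2", date(2021, 7, 9), "E11.9"),
]
ndd = number_of_diagnosis_days(example_records)
```

In this toy input, person `p1` has three diagnosis records but only two distinct diagnosis days, which is the quantity NDD is meant to capture.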
Using data from the 340,390 NIH All of Us Research Program participants, I explored (a) the partial correlation coefficients (adjusted for demographics) between TNV and each of 92 defined visit types, (b) the Pearson correlation coefficient between TNV and NDD overall and by sex and race/ethnicity, and (c) whether adjusting for TNV versus NDD substantially changes the regression coefficient for sex in a model of bladder cancer.
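For readers less familiar with these quantities, the sketch below computes a Pearson correlation and a first-order partial correlation in plain Python. It is a simplification: it adjusts for a single covariate `z` using the standard first-order formula, whereas the analysis described above adjusts for several demographic variables, which would typically be done by correlating regression residuals. The function names and toy data are illustrative assumptions.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """Partial correlation of x and y given one covariate z (first-order formula)."""
    r_xy = pearson(x, y)
    r_xz = pearson(x, z)
    r_yz = pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Toy data: z is uncorrelated with both x and y, so adjusting for it
# leaves the correlation between x and y unchanged.
x = [1, 2, 3, 4]
y = [2, 1, 4, 3]
z = [1, -1, -1, 1]
r = pearson(x, y)
r_partial = partial_corr(x, y, z)
```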
Partial correlations between visit types and TNV varied substantially, ranging from less than 0.001 to 0.918 (outpatient visit).
The correlation between TNV and NDD was remarkably strong (0.877 [95% CI: 0.877, 0.878]). Some heterogeneity was observed across sex (0.885 among males and 0.873 among females) and race/ethnicity (Non-Hispanic (NH) Asian: 0.903; NH Black: 0.850; Hispanic/Latino: 0.854; NH White: 0.893).
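The abstract does not state how the confidence interval was obtained. One common approach for a Pearson correlation is the Fisher z-transformation, sketched below under that assumption; the function name and the example values are hypothetical and are not taken from the analysis.

```python
import math
from statistics import NormalDist


def fisher_ci(r, n, level=0.95):
    """Approximate confidence interval for a Pearson correlation r from n pairs.

    Uses the Fisher z-transformation; this is an assumed method, not
    necessarily the one used to produce the interval reported above.
    """
    z = math.atanh(r)                # transform r to the z scale
    se = 1.0 / math.sqrt(n - 3)      # approximate standard error on the z scale
    crit = NormalDist().inv_cdf(0.5 + level / 2)
    return math.tanh(z - crit * se), math.tanh(z + crit * se)


lo, hi = fisher_ci(0.5, 103)
```

Because the standard error shrinks with the square root of the sample size, intervals computed this way become very narrow in cohorts with hundreds of thousands of participants.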
The choice to adjust for TNV (male OR 2.63 [95% CI: 2.41, 2.86]) or NDD (2.69 [95% CI: 2.47, 2.93]) did not result in substantially different estimates of the association between sex and bladder cancer.
NDD is straightforward to obtain and can serve as a proxy for TNV in analyses. Future work should validate NDD as a proxy across healthcare settings and patient subgroups.
