Methods/Statistics
Mind the Gap: addressing missing person time when estimating time-fixed effects in longitudinal data
Jacqueline Rudolph*, Johns Hopkins Bloomberg School of Public Health
Longitudinal data often include gaps in observation during which outcomes (and other variables) are unmeasured because of missed study visits or dropout. Previously, we examined how to handle data gaps when estimating outcome incidence; here, we extend that work to the estimation of time-fixed effects in longitudinal data. We generated 1000 simulation iterations of 1000 individuals followed across t=5 study visits; individuals were allowed to miss visits or drop out of the study. We generated 3 outcome types: transient, repeated, and permanent. In each simulation, we had two target parameters: (1) the average causal risk difference (RD) at t=5 by baseline exposure, $E[Y(5)^{x(0)=1,\,\bar{m}(5)=0}] - E[Y(5)^{x(0)=0,\,\bar{m}(5)=0}]$, and (2) the time-specific average causal RD by concurrent exposure, $E[Y(t)^{x(t)=1,\,\bar{m}(t)=0}] - E[Y(t)^{x(t)=0,\,\bar{m}(t)=0}]$, where Y(t) denotes the outcome, X(t) denotes exposure, and M(t) denotes missingness at t. Here, we present results under a data generating mechanism in which missingness was informative (affected by a baseline outcome predictor and by exposure). We estimated parameter 1 as the contrast, by baseline exposure, in the complement of the Kaplan-Meier survival estimator; we estimated parameter 2 using linear binomial models fit with generalized estimating equations (exchangeable correlation), obtaining the information-weighted average of the time-specific estimates. We estimated bias and empirical standard errors for crude approaches (e.g., censor at missed visit, include all observed visits) and adjusted approaches (e.g., censor with censoring weights, multiple imputation) to accounting for missed visits. Results are summarized in the Figure. For parameter 1, most crude and adjusted approaches underestimated the RD; multiple imputation was least biased. For parameter 2, bias was minimal regardless of approach when the outcome was transient; for repeated and permanent outcomes, multiple imputation had the lowest bias.
In future work, we will examine additional data generating mechanisms, including those with time-varying confounders.
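
As an illustration of the crude estimator for parameter 1, the following minimal sketch computes the contrast in the complement of the Kaplan-Meier estimator by baseline exposure. It is not the code used in the simulations: the function names (`km_risk`, `km_risk_difference`), the toy data, and the coding convention (each person contributes the visit of their first event, or the visit at which they were censored at a first missed visit, dropout, or end of follow-up) are assumptions made for this example. Censoring weights and multiple imputation are not shown.

```python
import numpy as np

def km_risk(time, event, horizon):
    """Complement of the Kaplan-Meier survival estimator at `horizon`.

    time  : visit of the first event, or visit of censoring (hypothetical coding)
    event : 1 if the outcome was observed at `time`, 0 if censored
    """
    time = np.asarray(time)
    event = np.asarray(event).astype(bool)
    surv = 1.0
    # np.unique returns the distinct event times in increasing order
    for t in np.unique(time[event]):
        if t > horizon:
            break
        at_risk = np.sum(time >= t)           # still under observation at t
        n_events = np.sum((time == t) & event)
        surv *= 1.0 - n_events / at_risk
    return 1.0 - surv

def km_risk_difference(time, event, exposure, horizon=5):
    """Risk at `horizon` among the exposed minus risk among the unexposed."""
    time = np.asarray(time)
    event = np.asarray(event)
    exposed = np.asarray(exposure).astype(bool)
    risk_1 = km_risk(time[exposed], event[exposed], horizon)
    risk_0 = km_risk(time[~exposed], event[~exposed], horizon)
    return risk_1 - risk_0

# Toy data (made up for illustration): 5 exposed and 5 unexposed individuals
time     = [1, 2, 3, 5, 5, 2, 4, 5, 5, 5]
event    = [1, 0, 1, 0, 0, 1, 0, 0, 0, 0]
exposure = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

rd = km_risk_difference(time, event, exposure, horizon=5)
print(f"Crude Kaplan-Meier risk difference at t=5: {rd:.3f}")
```

Because this version simply censors at a missed visit, it would be expected to be biased when missingness is informative, as in the data generating mechanism described above; a weighted version would multiply each person's contribution to the risk sets by an estimated inverse probability of remaining uncensored.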

