Methods/Statistics
An Alternative to Bootstrapping When Estimating Confidence Interval Coverage in Simulation Studies
Michael Webster-Clark*
Wake Forest University School of Medicine
When contrasting analytic methods or study designs in simulation studies, calculating estimates’ mean bias and empirical standard error is straightforward. Confidence interval (CI) coverage (i.e., the proportion of simulation iterations where confidence intervals contain the true value) may also be of interest. While CI coverage is easy to calculate if there is a closed-form solution for the variance of an estimate, obtaining CI coverage when bootstrapping is necessary is time intensive. Even with a relatively low number of simulation iterations (e.g., 500), a small number of bootstrap replicates (e.g., 200) drastically increases computation time.
In some cases, using the empirical standard error across simulation iterations to “stand in” for within-iteration standard error to create CIs may allow us to skip bootstrapping. We conducted a simulation study estimating 1) the absolute-scale effect of a binary treatment X on the risk of a binary outcome Y conditional on a confounder C and 2) the marginal effect of X on Y when weighting by C. We examined three simulation scenarios with 100, 300, or 900 individuals per simulation iteration. Within each of 500 simulation iterations we calculated CIs A) obtaining the within-iteration standard error from 200 bootstrap iterations or B) treating the empirical standard error across the simulation iterations as the iteration’s standard error.
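The comparison of approaches A and B can be sketched roughly as follows. This is an illustrative reconstruction, not the code used in the study: the data-generating probabilities, the true risk difference of 0.1, the nonparametric propensity score, the 95% Wald-type intervals (z = 1.96), and all function names are assumptions chosen for the example. Only the weighted (marginal) estimand is shown.

```python
import numpy as np

TRUE_RD = 0.1  # assumed true risk difference (no effect modification by C)

def simulate(n, rng):
    """Draw one data set: binary confounder C, treatment X, outcome Y (assumed model)."""
    c = rng.binomial(1, 0.5, n)
    x = rng.binomial(1, 0.3 + 0.4 * c)
    y = rng.binomial(1, 0.2 + TRUE_RD * x + 0.2 * c)
    return c, x, y

def ipw_rd(c, x, y):
    """Marginal risk difference using inverse probability of treatment weights."""
    # Propensity score estimated as the treated proportion within each level of C.
    # Assumes every C-by-X cell is non-empty, which is very likely at these sizes.
    ps = np.where(c == 1, x[c == 1].mean(), x[c == 0].mean())
    w = np.where(x == 1, 1 / ps, 1 / (1 - ps))
    risk1 = np.average(y[x == 1], weights=w[x == 1])
    risk0 = np.average(y[x == 0], weights=w[x == 0])
    return risk1 - risk0

def bootstrap_se(c, x, y, n_boot, rng):
    """Within-iteration standard error from n_boot nonparametric bootstrap resamples."""
    n = len(y)
    ests = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, n)  # resample individuals with replacement
        ests[b] = ipw_rd(c[idx], x[idx], y[idx])
    return ests.std(ddof=1)

def coverage(est, se, truth=TRUE_RD, z=1.96):
    """Proportion of Wald-type intervals est +/- z*se that contain the truth."""
    return float(np.mean((est - z * se <= truth) & (truth <= est + z * se)))

def run(n=300, n_iter=500, n_boot=200, seed=0):
    rng = np.random.default_rng(seed)
    est = np.empty(n_iter)
    boot = np.empty(n_iter)
    for i in range(n_iter):
        c, x, y = simulate(n, rng)
        est[i] = ipw_rd(c, x, y)
        boot[i] = bootstrap_se(c, x, y, n_boot, rng)  # approach A
    emp_se = est.std(ddof=1)  # approach B: one standard error shared by all iterations
    return {
        "coverage_bootstrap": coverage(est, boot),
        "coverage_empirical": coverage(est, emp_se),
        "se_ratio": emp_se / boot,  # empirical SE over iteration-specific bootstrap SE
    }
```

Calling `run(n=100)`, `run(n=300)`, and `run(n=900)` would mirror the three sample sizes. Approach B needs only the point estimates that the simulation already produces, so dropping the `bootstrap_se` call removes the inner loop of `n_boot` re-estimations per iteration.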
Figure 1 consists of histograms and density plots of the ratio of the empirical standard error to the iteration-specific bootstrapped standard error for each estimand. Standard errors from the two approaches became more similar as sample size rose and variance decreased, with only minor differences in CI coverage between the two methods. In simulation studies where variance is relatively small and bootstrapping is not feasible (e.g., plasmode simulations), the standard error across iterations may serve as an approximate substitute for within-iteration bootstrapped standard errors.

