Methods/Statistics
Comparing methods for linking neighborhood-level estimates to longitudinal data from the Breast Cancer Surveillance Consortium (BCSC) Erin J Aiello Bowles* Kaiser Permanente Washington Health Research Institute
Methods for obtaining and linking neighborhood-level estimates of social determinants of health (SDOH) often focus on data from a single timepoint (e.g., one year of the US Decennial Census or American Community Survey) or a single level of aggregation (e.g., census tract). We present methods and results from two area-based approaches used to obtain SDOH estimates for people receiving multiple breast imaging exams over time in the Breast Cancer Surveillance Consortium (BCSC). We linked >3.6M exams performed at 135 breast imaging facilities between 2005 and 2023 among women ages 18+ years from eight regional registries in California, Illinois, New Hampshire, North Carolina, Vermont, and Washington state. Regional BCSC registries used ArcGIS to geocode addresses from self-report and administrative data at each exam. In the first method, registries linked geocodes to US census tracts from the corresponding period (e.g., exams from 2010-2019 used 2010 census tracts) and then linked census tracts to publicly available SDOH indices (e.g., Area Deprivation Index [ADI]) by exam year. The second method linked ZIP codes to SDOH indices to reduce the missingness (~40%) caused by PO boxes and addresses that could not be geocoded. ZIP-code-level indices were weighted by census tract using US Department of Housing and Urban Development residential ratios, which describe the portion of each census tract within a ZIP code. We compared raw ADI scores from the census tract and weighted ZIP code data to assess concordance between the two methods. Scatter plots of a random sample showed good agreement between the two methods, with some variation by registry (see figure). Analyses using area-level data should consider trade-offs between missingness and data variability when linking to different neighborhood-level estimates and SDOH indices.
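The ZIP code weighting step can be illustrated with a minimal sketch. It assumes the crosswalk is available as rows of (ZIP code, census tract, residential ratio) and that each ZIP-level index is the residential-ratio-weighted mean of the tract-level values; the abstract does not give the exact formula, so this is one plausible reading and not the BCSC implementation. The function name, tract identifiers, and scores below are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def zip_level_index(crosswalk, tract_index):
    """Residential-ratio-weighted mean of tract-level index values per ZIP code.

    crosswalk:   iterable of (zip_code, tract_id, res_ratio) rows (assumed layout)
    tract_index: dict mapping tract_id -> index value (e.g., ADI raw score)
    """
    weighted_sum = defaultdict(float)
    weight_total = defaultdict(float)
    for zip_code, tract_id, res_ratio in crosswalk:
        value = tract_index.get(tract_id)
        if value is None or res_ratio <= 0:
            continue  # skip tracts with no index value or no residential weight
        weighted_sum[zip_code] += res_ratio * value
        weight_total[zip_code] += res_ratio
    # Renormalize by the weights actually used, so missing tracts do not bias the mean
    return {z: weighted_sum[z] / weight_total[z] for z in weight_total}


# Hypothetical illustration data (not BCSC values)
crosswalk = [
    ("98101", "T1", 0.75),
    ("98101", "T2", 0.25),
    ("98102", "T2", 0.5),
    ("98102", "T3", 0.5),  # T3 has no index value and is skipped
]
tract_index = {"T1": 40.0, "T2": 80.0}

zip_index = zip_level_index(crosswalk, tract_index)

# Pair the two estimates for each exam, as one would for a concordance scatter plot
exams = [("T1", "98101"), ("T2", "98101"), ("T2", "98102")]  # (tract_id, zip_code)
pairs = [
    (tract_index[t], zip_index[z])
    for t, z in exams
    if t in tract_index and z in zip_index
]
```

Each element of `pairs` holds the census tract score and the weighted ZIP code score for one exam, so plotting the first value against the second gives the kind of scatter plot used to judge agreement between the two methods.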

