Study Design

Use of ZIP Codes and ZIP Code Tabulation Areas: Bias Analysis and Research Implications

Futu Chen*, Beau MacDonald, Yan Xu, Wilma Franco, Alberto Campos, Lawrence Palinkas, Jill Johnston, Sandrah P. Eckel, Erika Garcia

The U.S. Census Bureau’s American Community Survey (ACS) reports data for a geography called ZIP Code Tabulation Areas (ZCTAs), which resemble but are not identical to U.S. Postal Service ZIP Codes (ZIPs). Best practice for combining ZIP and ZCTA datasets is largely undiscussed in the epidemiology literature. Without a “crosswalk” linkage, if a ZCTA contains ≥1 ZIP without the same 5-digit identifier, the non-matching ZIPs are dropped. We compared standard crosswalk and non-crosswalk linkage results nationally and in a case study of California zero-emission vehicle (ZEV) adoption.

 

We obtained crosswalk files from the Uniform Data System Mapper. In non-crosswalk linkage, ZIPs that do not share a 5-digit identifier with the ZCTA containing them are dropped. Nationally, we related an indicator for a ZCTA containing non-matching ZIPs to 2019 ACS ZCTA population characteristics using logistic regression, adjusted for state. In California, we used linear regression to relate ZIP-level 2019 ZEV adoption per 1,000 population (California Energy Commission) to ZCTA neighborhood socioeconomic status (SES) index quintiles (California Neighborhoods Data System), using crosswalk and non-crosswalk linkages.
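To make the two linkage strategies concrete, the following is a minimal sketch on toy data, not the authors' code. All identifiers, counts, and populations are invented, the function names are hypothetical, and summing ZIP-level counts within each ZCTA is an assumed aggregation rule that the abstract does not specify.

```python
# Toy data: every ZIP, ZCTA, and value below is made up for illustration.
# Hypothetical crosswalk mapping each ZIP to the ZCTA that contains it.
crosswalk = {
    "00001": "00001",  # ZIP shares its ZCTA's identifier
    "00002": "00002",
    "00009": "00001",  # ZIP with no same-numbered ZCTA (non-matching)
}

# Hypothetical ZIP-level counts and ZCTA-level populations.
zip_counts = {"00001": 40, "00002": 10, "00009": 5}
zcta_population = {"00001": 50_000, "00002": 20_000}


def link_without_crosswalk(zip_counts, zcta_population):
    """Join on the raw 5-digit code; ZIPs with no same-numbered ZCTA drop out."""
    linked = {z: c for z, c in zip_counts.items() if z in zcta_population}
    dropped = sorted(set(zip_counts) - set(linked))
    return linked, dropped


def link_with_crosswalk(zip_counts, crosswalk):
    """Map each ZIP to its ZCTA, then sum counts within each ZCTA (assumed rule)."""
    totals = {}
    for zip_code, count in zip_counts.items():
        zcta = crosswalk[zip_code]
        totals[zcta] = totals.get(zcta, 0) + count
    return totals


def rate_per_1000(counts, population):
    """Convert ZCTA-level counts to a rate per 1,000 population."""
    return {z: 1000 * c / population[z] for z, c in counts.items()}


no_xw_counts, dropped_zips = link_without_crosswalk(zip_counts, zcta_population)
xw_counts = link_with_crosswalk(zip_counts, crosswalk)

no_xw_rates = rate_per_1000(no_xw_counts, zcta_population)
xw_rates = rate_per_1000(xw_counts, zcta_population)

print("Dropped without crosswalk:", dropped_zips)
print("Rates without crosswalk:", no_xw_rates)
print("Rates with crosswalk:", xw_rates)
```

In this toy case the non-crosswalk join silently loses the non-matching ZIP, so the rate for the ZCTA that contains it is understated relative to the crosswalk result.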

 

Nationally, 15% of ZCTAs (range 3%-100%, median=14%) contained non-matching ZIPs. ZCTAs with a higher % of population below the poverty level or a higher % of renters had higher odds of containing non-matching ZIPs (OR=2.0 [95% CI: 1.5, 2.7] and OR=26.0 [95% CI: 22.9, 31.8], respectively). For California ZEV data, 31.7% of 2,580 ZIPs were excluded without crosswalk linkage. The difference in ZEV adoption rate comparing low-SES (Q1-2) to high-SES (Q5) ZCTAs was -25.7 [-28.5, -22.8] without crosswalk versus -26.0 [-28.8, -23.1] with crosswalk, an underestimate of 1.15%.
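The reported 1.15% appears consistent with taking the gap between the two point estimates relative to the crosswalk estimate. The short check below assumes that definition, which the abstract does not state explicitly; the variable names are illustrative.

```python
# Point estimates reported in the abstract (ZEV adoption difference, low vs high SES).
diff_without_crosswalk = -25.7
diff_with_crosswalk = -26.0

# Assumed definition: shrinkage in magnitude relative to the crosswalk estimate.
relative_underestimate = (
    (abs(diff_with_crosswalk) - abs(diff_without_crosswalk))
    / abs(diff_with_crosswalk)
    * 100
)

print(round(relative_underestimate, 2))  # 1.15
```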

 

Non-crosswalk linkage may cause bias by differentially excluding ZIPs for disadvantaged populations. Crosswalk linkage is recommended with ZCTA as the final unit of analysis. Results are relevant to a wide range of ZIP data (e.g., business, health outcomes, transportation).