Propensity scores are a key tool in the design of non-experimental studies. They allow researchers to handle observed confounders at the design stage, usually through matching, weighting, or subclassification. To interpret differences in outcomes between “exposed” and “unexposed” (or “treatment” and “control”) groups as causal, the key assumption is that there are no unobserved confounders, given the observed covariates. Propensity score methods help deal with the observed confounders as well as possible by equating the exposed and unexposed groups on the observed baseline characteristics. (Sensitivity analyses can then assess the robustness of results to a potential unobserved confounder.)

This playlist aims to summarize the vast propensity score literature that has grown up since Rosenbaum and Rubin introduced propensity scores in 1983. Since then, propensity scores have become a common strategy for estimating causal effects in non-experimental studies in epidemiology and other fields, although many questions about their optimal use have arisen, some of them still unresolved. The papers listed below include some of the original work on and explanations of propensity scores, some recent methodological papers that examine practical questions regarding their use, and some examples of their use in practice.
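Before turning to the papers, a minimal sketch may help make the idea of "equating" the groups concrete. It is an illustration under stated assumptions, not a recommended analysis: the data are simulated, there is a single hypothetical confounder `x`, the propensity score (the probability of exposure given the observed covariates) is estimated with a simple logistic model, and weighting is used rather than matching or subclassification. All variable and function names are invented for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data (hypothetical): one observed confounder x that raises
# the probability of exposure z.
n = 5000
x = rng.normal(size=n)
true_ps = 1.0 / (1.0 + np.exp(-0.8 * x))
z = rng.binomial(1, true_ps)


def fit_logistic(X, y, n_iter=25):
    """Fit a logistic regression by Newton-Raphson; return the coefficients."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        gradient = X.T @ (y - p)
        hessian = (X * (p * (1.0 - p))[:, None]).T @ X
        beta = beta + np.linalg.solve(hessian, gradient)
    return beta


# Estimated propensity score: P(z = 1 | x) from a logistic model
# with an intercept and x as the only covariate.
X = np.column_stack([np.ones(n), x])
beta_hat = fit_logistic(X, z)
ps = 1.0 / (1.0 + np.exp(-X @ beta_hat))

# Inverse probability weights: 1/ps for exposed, 1/(1 - ps) for unexposed.
w = np.where(z == 1, 1.0 / ps, 1.0 / (1.0 - ps))


def smd(x, z, w):
    """Weighted standardized mean difference of x between the two groups."""
    m1 = np.average(x[z == 1], weights=w[z == 1])
    m0 = np.average(x[z == 0], weights=w[z == 0])
    pooled_sd = np.sqrt((x[z == 1].var(ddof=1) + x[z == 0].var(ddof=1)) / 2)
    return (m1 - m0) / pooled_sd


smd_raw = smd(x, z, np.ones(n))   # balance before weighting
smd_weighted = smd(x, z, w)       # balance after weighting
print(f"SMD before weighting: {smd_raw:.3f}")
print(f"SMD after weighting:  {smd_weighted:.3f}")
```

The standardized mean difference is one common balance diagnostic. In this sketch, the exposed and unexposed groups differ noticeably on `x` before weighting and much less afterward. This addresses only the observed confounder; it says nothing about unobserved ones.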