2020 Workshops


The following workshops are now full. Registration is no longer available.

  • Utilizing electronic health records for epidemiological analysis (Goldstein)
  • Machine Learning for Epidemiologists: A Statistical Learning Approach (Sippy)

You can register for workshops through the full meeting registration, or register separately here.

PRE-CONFERENCE WORKSHOPS - Times listed are Eastern Time (ET)

Estimating propensity scores for binary, multinomial, and continuous exposures using TWANG
Oct 22 @ 12:00 pm – 4:00 pm

Session Co-Chair: Donna L. Coffman, Temple University
Session Co-Chair: Megan S. Schuler, RAND Corporation

When randomized experiments are infeasible, analysts must rely on observational data in which treatment (or exposure) is not randomly assigned. Although randomized trials are the gold standard, there are many important epidemiological questions that can be addressed using observational data. Drawing unbiased inferences from such data relies on the use of appropriate statistical methods, such as causal inference methods, to account for the non-randomized design. This workshop will introduce the potential outcomes framework and the use of inverse probability (or propensity) of treatment weights (IPTW) to estimate causal effects. We will present step-by-step guidelines on how to estimate and perform diagnostic checks of the weights for settings with two or more treatment groups and for continuous exposures. We will provide an overview of how to implement omitted variable analyses, which are critical to any IPTW analysis because the robustness of causal effects depends on the assumption of no unobserved confounders. Attendees will gain hands-on experience estimating each type of weight using gradient (or generalized) boosting models (GBM), as well as estimating the causal effects of interest using the IPTW. These analyses can be run with the TWANG package/suite of commands in Stata, SAS, or R; code will be shared. We will showcase a new menu-driven free Shiny app. Attendees should be familiar with linear and logistic regression, but prior knowledge of IPTW and GBM is not necessary.
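
To make the weighting idea concrete, here is a minimal Python sketch; it is not TWANG code and is not taken from the workshop materials. It assumes a hypothetical cohort with one binary confounder x, and it estimates the propensity score as the treated fraction within each level of x rather than with GBM. All function names and data values are invented for illustration.

```python
from collections import defaultdict


def build_cohort():
    """Hypothetical cohort of (x, t, y) rows with y = 1 + 2*t + 3*x.

    Treatment is more common when x = 1, so x confounds the
    treatment-outcome association. The true treatment effect is 2.
    """
    counts = {(0, 1): 10, (0, 0): 30, (1, 1): 30, (1, 0): 10}
    rows = []
    for (x, t), n in counts.items():
        rows.extend([(x, t, 1 + 2 * t + 3 * x)] * n)
    return rows


def propensity_by_stratum(rows):
    """Estimate P(T=1 | X=x) as the treated fraction in each stratum.

    Assumption: this stands in for the GBM-based propensity model.
    """
    treated = defaultdict(int)
    total = defaultdict(int)
    for x, t, _ in rows:
        treated[x] += t
        total[x] += 1
    return {x: treated[x] / total[x] for x in total}


def naive_difference(rows):
    """Unweighted difference in mean outcome, treated minus untreated."""
    by_arm = {0: [], 1: []}
    for _, t, y in rows:
        by_arm[t].append(y)
    return sum(by_arm[1]) / len(by_arm[1]) - sum(by_arm[0]) / len(by_arm[0])


def iptw_ate(rows):
    """Weighted difference in means using inverse probability weights."""
    ps = propensity_by_stratum(rows)
    sums = {0: [0.0, 0.0], 1: [0.0, 0.0]}  # arm -> [sum of w*y, sum of w]
    for x, t, y in rows:
        # Treated rows get 1/ps, untreated rows get 1/(1 - ps).
        w = 1 / ps[x] if t == 1 else 1 / (1 - ps[x])
        sums[t][0] += w * y
        sums[t][1] += w
    return sums[1][0] / sums[1][1] - sums[0][0] / sums[0][1]


if __name__ == "__main__":
    cohort = build_cohort()
    print("naive difference:", naive_difference(cohort))
    print("IPTW estimate:", iptw_ate(cohort))
```

In this constructed example the unweighted comparison overstates the effect because treated people are more likely to have x = 1, while the weighted comparison recovers the effect built into the data. Real analyses would also need the balance diagnostics and sensitivity checks described above.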

Utilizing electronic health records for epidemiological analysis
Oct 23 @ 1:00 pm – 5:00 pm

Session Chair: Neal Goldstein, Drexel University

Increasingly, data mined from the electronic health record (EHR) are being used in epidemiological research. But more data does not equate to better quality research. In this workshop, we will cover the basics of working with EHRs and designing valid epidemiological analyses. The workshop will be a mix of didactic lecture and interactive group exercises. Participants are requested to provide planned or active research questions in advance, as these will form the basis of breakout group exercises.

Lecture topics will include:
1. Designing and analyzing epidemiological studies using EHR data for both inpatient and outpatient settings.
2. Obtaining data from the EHR, including data export, linkage, and variable manipulation (e.g. parsing data from free text).
3. Architecture of the EHR and terminology/data standards.
4. Understanding the clinical population and how this relates to a target/general population.
5. Common pitfalls in working with EHR data and resources for additional reference.

Audience: Researchers interested in EHR data, including those with proposed or active research projects; everyone from students and trainees to seasoned investigators is welcome.

Machine Learning for Epidemiologists: A Statistical Learning Approach
Oct 26 @ 12:00 pm – 4:00 pm

Session Chair: Rachel Sippy, University of Florida

Machine learning (ML) is a popular approach for prediction of outcomes, including forecasting and spatial predictions. It is well-suited to large datasets with many potential predictor variables and has been applied to many problems in public health and healthcare. This workshop is intended for participants with some statistical modeling background who are interested in using ML for prediction. In this hands-on workshop, you will learn to identify appropriate questions for ML, the principles of ML, and how it relates to other modeling approaches. We will apply ML methods to a sample dataset, review the tools available for using ML, and point to other resources for ML. This workshop assumes a working knowledge of R, and a laptop with R and RStudio installed will be required for the workshop.

Reproducible Research in Epidemiology: Why and How
Oct 30 @ 12:00 pm – 4:00 pm

Session Chair: Sam Harper, McGill University

Generating transparent and reproducible research is both ethical and necessary for making epidemiologic science useful. This workshop will provide participants with an overview of the rationale for why funders of epidemiologic research, as well as investigators and students of epidemiologic studies, should aim to make their research transparent and fully reproducible, along with hands-on experience with a selection of tools needed to do so. The workshop will provide: 1) an introductory, high-level overview of what it means to engage in reproducible research; 2) guidance on how to create a management plan for a research project and a structured workspace for the project that facilitates a reproducible workflow; 3) a discussion of pre-registration and pre-analysis plans for both experimental and observational research designs; 4) an introduction to version control and dynamic documents; and 5) tools and guidance for how to ethically and responsibly share the outputs of a research project, including data, code, and research reports. The format for the workshop will be a combination of short lecture material, collaborative group work, and hands-on exercises. The workshop will be conducted using both R and Stata, but will focus on general practices and core principles that can be adapted to any software platform. The aim is for participants to leave with a strong grasp of why and how to use transparent and reproducible practices throughout the research life cycle.

Transportability and Data Fusion in Causal Inference Studies
Nov 6 @ 12:00 pm – 4:00 pm

Session Co-Chair: Onyebuchi A. Arah, University of California, Los Angeles
Session Co-Chair: Elias Barenboim, Columbia University

It is becoming increasingly clear that producing causal estimates from studies with acceptable internal validity is not sufficient to guide interventions and policy analysis for population health. External validity is critical for applying internally valid results from a study population to a target population that may or may not have given rise to the study population. Novel developments in causal inference allow us to give the sufficient and necessary conditions for generalizability and transportability. This workshop will provide an accessible theoretical and practical introduction to the concepts of internal and external validity and show how to generalize or transport internally valid effect estimates from study populations to source or target populations. The concept of data fusion will be introduced to workshop participants for the purposes of generalizing or transporting data and effect estimates across populations and settings. The workshop will use structural and graphical language to make it accessible to epidemiologists interested in causal inference for informing interventions and policy. It will show how g-methods, particularly g-computation, inverse-probability weighting, and inverse-odds weighting with(out) augmentation, can be used to generalize or transport effect estimates. Ample applications using empirical datasets and software code will be provided in SAS, Stata and R.
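
As a rough illustration of inverse-odds weighting for transport, the Python sketch below reweights a hypothetical randomized study so that its distribution of a single binary effect modifier x matches a target population. It is not the workshop's SAS, Stata, or R code. The data, the function names, and the choice to estimate the odds from stratum counts (rather than from a fitted participation model) are all assumptions made for illustration.

```python
from collections import Counter


def build_data():
    """Hypothetical study and target samples.

    Study rows are (x, t, y). Treatment is randomized 1:1 within x.
    The treatment effect is 1 when x = 0 and 3 when x = 1.
    The target sample records only x, with x = 1 far more common.
    """
    study = []
    for x, n_per_arm in ((0, 40), (1, 10)):
        effect = 1 if x == 0 else 3
        for t in (0, 1):
            study.extend([(x, t, 2 * x + effect * t)] * n_per_arm)
    target = [0] * 30 + [1] * 70
    return study, target


def inverse_odds_weights(study, target):
    """Odds of being in the target rather than the study, given x.

    With the two samples stacked, this is the ratio of stratum counts.
    """
    n_study = Counter(x for x, _, _ in study)
    n_target = Counter(target)
    return {x: n_target[x] / n_study[x] for x in n_study}


def weighted_effect(study, weights):
    """Weighted difference in mean outcome, treated minus control."""
    sums = {0: [0.0, 0.0], 1: [0.0, 0.0]}  # arm -> [sum of w*y, sum of w]
    for x, t, y in study:
        w = weights[x]
        sums[t][0] += w * y
        sums[t][1] += w
    return sums[1][0] / sums[1][1] - sums[0][0] / sums[0][1]


if __name__ == "__main__":
    study, target = build_data()
    unit_weights = {0: 1.0, 1: 1.0}
    print("effect in study population:", weighted_effect(study, unit_weights))
    iow = inverse_odds_weights(study, target)
    print("effect transported to target:", weighted_effect(study, iow))
```

Because the effect differs by x and the target has more people with x = 1, the transported estimate is larger than the study estimate. The sketch relies on x being the only effect modifier that differs between the populations, which is a strong assumption in practice.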

How to make a picture worth a thousand words: Effectively communicating your research results using statistical graphics
Nov 9 @ 12:00 pm – 2:00 pm

Session Chair: Mike Jackson, Kaiser Permanente

Epidemiologists can use statistical graphics to understand our data and to guide us toward correct inferences. Well-designed graphics can also be powerful tools for communicating our study findings. However, while statistical software makes it easy to produce certain types of figures, the default options leave much to be desired. Too often, the result is figures that distract, confuse, or even distort data. In this workshop, participants will first learn the fundamentals of effective data visualization. This includes selecting appropriate chart types, drawing attention to the relevant data, using effective visual cues, and providing helpful context. We will discuss how to put these principles into practice, leading viewers to make comparisons, identify trends, and find meaningful correlations. Finally, we will walk through techniques for going beyond the default settings of various software packages to produce well-designed figures.

Creating Inclusive Classrooms and Curricula in Epidemiology
Nov 13 @ 12:00 pm – 4:00 pm

Session Co-Chair: Anjum Hajat, University of Washington
Session Co-Chair: Yvette Cozier, Boston University

Increased interactions with diverse peers enhance students’ educational experiences and bring measurable improvements in learning outcomes for all. Diversity also contributes to the scientific rigor of our scholarship and is necessary for the longevity and robustness of our discipline. Positive classroom climates and teaching practices have been shown to improve persistence and academic and emotional development among diverse students. As instructors, we have a responsibility to level the playing field, so that every student has an equal opportunity to master the learning objectives in our courses.
Building on the wealth of scholarship produced by our colleagues in the social sciences, this half-day workshop will employ an active learning approach to developing inclusive classrooms and curricula in Epidemiology (e.g. lectures followed by small group discussions and revising existing syllabi). The following domains are key to the infusion of inclusivity in our courses: 1) minding the privilege gap between our students and ourselves when developing our courses, 2) acknowledging and confronting implicit biases, and 3) mitigating stereotype threat in our classrooms. The workshop will feature several faculty serving as presenters and facilitators, including Yvette Cozier (Boston University), Sophie Godley (Boston University), Candice Belanoff (Boston University), and Anjum Hajat (University of Washington). It has been developed in conjunction with the SER Diversity and Inclusion committee.

Anjum Hajat, University of Washington
Yvette Cozier, Boston University
Candice Belanoff, Boston University
Sophie Godley, Boston University

Nov 16 @ 12:00 pm – 2:00 pm

Session Chair: Chuck Huber, Stata

Meta-analysis is a statistical technique for combining the results from multiple similar studies. The talk will provide a brief introduction to meta-analysis and will demonstrate how to perform meta-analysis in Stata 16. The -meta- command offers full support for meta-analysis, from computing various effect sizes and producing basic meta-analytic summaries and forest plots to accounting for between-study heterogeneity and potential publication bias. Examples demonstrating how to conduct meta-analysis within Stata will be provided. These examples will focus on the interpretation of meta-analysis under various models, meta-regression, subgroup analysis, small-study effects and publication bias, and various types of forest, funnel, and other plots.
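
For readers who want to see the arithmetic behind a basic meta-analytic summary, the following Python sketch pools study effects with inverse-variance weights and then applies the DerSimonian-Laird estimate of between-study variance. It is an illustration only and does not reproduce the Stata -meta- command; the function names and the two example studies are hypothetical.

```python
def fixed_effect(effects, variances):
    """Inverse-variance (common-effect) pooled estimate and its variance."""
    weights = [1 / v for v in variances]
    total = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total
    return pooled, 1 / total


def dersimonian_laird(effects, variances):
    """Random-effects summary using the DerSimonian-Laird tau^2 estimate."""
    weights = [1 / v for v in variances]
    total = sum(weights)
    pooled_fe, _ = fixed_effect(effects, variances)

    # Cochran's Q measures how far studies scatter around the pooled value.
    q = sum(w * (y - pooled_fe) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    c = total - sum(w * w for w in weights) / total

    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # I^2: share of total variation attributed to heterogeneity.
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0

    re_weights = [1 / (v + tau2) for v in variances]
    re_total = sum(re_weights)
    pooled_re = sum(w * y for w, y in zip(re_weights, effects)) / re_total
    return {
        "pooled": pooled_re,
        "variance": 1 / re_total,
        "tau2": tau2,
        "q": q,
        "i2": i2,
    }


if __name__ == "__main__":
    # Two hypothetical studies reporting effects on the same scale
    # (for example, log risk ratios) with their sampling variances.
    effects = [0.0, 0.4]
    variances = [0.01, 0.01]
    print(fixed_effect(effects, variances))
    print(dersimonian_laird(effects, variances))
```

When the studies disagree more than sampling error would suggest, the random-effects variance is larger than the common-effect variance, which widens the pooled confidence interval.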

Data manipulation, visualization, and reproducible documents with R and the Tidyverse
Nov 19 @ 12:00 pm – 4:00 pm

Session Co-Chair: Malcolm Barrett, University of Southern California
Session Co-Chair: Corinne Riddell, University of California, Berkeley

Recent developments by the R community have revolutionized the data analysis pipeline in R, from manipulating and visualizing data to communicating results. Our workshop will provide hands-on training in tools from the tidyverse ecosystem, using real epidemiologic data. In the first section, we will teach data manipulation with dplyr, a package that makes data cleaning easy, flexible, and enjoyable. In the next section, we will teach data visualization with ggplot2, the most popular plotting package in R, with a focus on creating publication-quality plots. We will then put these tools together to make reproducible documents. Using R Markdown, we will weave code and text together and learn to write papers and reports, exported to PDF, Word, or HTML, entirely in R. This workflow easily propagates upstream changes to data or analyses throughout a document and eliminates copy and paste errors. Together, these tools form a data analysis pipeline for reproducible, publication-ready work.

Algorithms, Bootstrapping and Cross-Validation: The ABCs of Machine Learning for Epidemiologists
Dec 4 @ 1:00 pm – 5:00 pm

Session Co-Chair: Jeanette Stingone, Columbia University
Session Co-Chair: Eric Lofgren, Washington State University

Machine learning, broadly defined as analytic techniques that fit models algorithmically by adapting to patterns in data, is growing in use within epidemiology. This workshop will explore how epidemiologists can use machine learning to advance their research and practice, while reflecting on some of the ethical and scientific considerations that arise from the use of data-driven techniques. The workshop will use a flipped classroom format to maximize time for discussion and programming activities during the SER workshop. Prior to the workshop, attendees will be sent two to three readings and links to two to three 30-minute videos. These videos will introduce key terms, commonly used algorithms, evaluation techniques and examples of epidemiologic studies that incorporated machine learning. During the workshop, these topics will be reinforced through a review of concepts, guided discussions, presentations of case studies and demonstrations of analytic pipelines using R/RStudio. Attendees will work individually and in small groups on hands-on programming exercises using publicly available data, while also discussing the ethical and scientific challenges presented by different research scenarios. At the conclusion of this workshop, attendees will be able to discuss scenarios where machine learning can benefit epidemiologic analysis, analyze public health data using commonly used algorithms, and feel empowered to pursue additional training or collaborate with scientists with expertise in machine learning.

Estimation and interpretation: Introduction to parametric and semi-parametric estimators for causal inference
Dec 7 @ 12:00 pm – 4:00 pm

Session Co-Chair: Laura B. Balzer, University of Massachusetts
Session Co-Chair: Jennifer Ahern, University of California, Berkeley

This workshop will introduce participants to the Causal Roadmap for epidemiologic questions: 1) clear statement of the scientific question, 2) definition of the causal model and parameter of interest, 3) assessment of identifiability – that is, linking the causal effect to a parameter estimable from the observed data distribution, 4) choice and implementation of estimators including parametric and semi-parametric, and 5) interpretation of findings. The focus will be on estimation with a simple substitution estimator (parametric G-computation), inverse probability of treatment weighting (IPTW), and targeted maximum likelihood estimation (TMLE) with Super Learner. Participants will work through the Roadmap using an applied example and implement these estimators in R during the workshop session.

Causal inference for multiple time-point (longitudinal) exposures
Dec 10 @ 12:00 pm – 4:00 pm

Session Co-Chair: Laura B. Balzer, University of Massachusetts
Session Co-Chair: Maya L. Petersen, University of California at Berkeley

This workshop applies the Causal Roadmap to estimate the causal effects of multiple intervention variables, such as the cumulative effect of an exposure over time, controlled direct effects, and effects on survival-type outcomes with right-censoring. We will cover longitudinal causal models, identification in the presence of time-dependent confounding, and estimation of joint treatment effects using G-computation, inverse probability weighting (IPW), and targeted maximum likelihood estimation (TMLE). During the workshop session, participants will work through the Roadmap using an applied example and implement these estimators with the ltmle R package. Prior training in causal inference in a single time-point setting is recommended, but not required.

E-values, Unmeasured Confounding, Measurement Error, and Selection Bias
Dec 11 @ 1:00 pm – 5:00 pm

Session Co-Chair: Maya Mathur, Stanford University
Session Co-Chair: Louisa Smith, Harvard University

The workshop will consider sensitivity analysis for different forms of bias in epidemiology. It will begin with confounding, focusing on a new metric to evaluate sensitivity to unmeasured confounding called the E-value. The E-value is the minimum strength of association, on the risk ratio scale, that an unmeasured confounder would need to have with both the exposure and the outcome, conditional on the measured covariates, to fully explain away the exposure-outcome association. E-value calculations for risk ratios, outcome differences, odds ratios, and hazard ratios will be discussed. The E-value can be calculated in a straightforward way from study results and its use could help unify assessment of unmeasured confounding. The workshop will proceed by describing very recent, analogous, easy-to-implement approaches that also address differential measurement error and selection bias. We will conclude by presenting recent extensions allowing sensitivity analysis for all three forms of bias. The methods, taken as a whole, will constitute a straightforward comprehensive approach to bias analysis.
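
The E-value for a risk ratio has a simple closed form, which the Python sketch below implements as it is commonly published: for RR >= 1, E = RR + sqrt(RR * (RR - 1)), and a risk ratio below 1 is inverted first. The function names are hypothetical and this is not code from the workshop; the conversions for odds ratios, hazard ratios, and outcome differences are not covered here.

```python
import math


def e_value(rr):
    """E-value for a risk ratio point estimate.

    Risk ratios below 1 are inverted so the formula always operates
    on a value at or above 1.
    """
    if rr <= 0:
        raise ValueError("risk ratio must be positive")
    if rr < 1:
        rr = 1 / rr
    return rr + math.sqrt(rr * (rr - 1))


def e_value_for_ci(rr, lower, upper):
    """E-value for the confidence limit closest to the null.

    Returns 1.0 when the interval already includes the null value 1,
    since no unmeasured confounding is needed to move it there.
    """
    if lower <= 1 <= upper:
        return 1.0
    limit = lower if rr > 1 else upper
    return e_value(limit)


if __name__ == "__main__":
    # Hypothetical study result: RR = 2.0 with 95% CI 1.5 to 3.0.
    print("E-value for the estimate:", round(e_value(2.0), 2))
    print("E-value for the CI limit:", round(e_value_for_ci(2.0, 1.5, 3.0), 2))
```

For the hypothetical result above, an unmeasured confounder would need risk-ratio associations of about 3.41 with both exposure and outcome to fully explain away the point estimate, and about 2.37 to move the confidence limit to the null.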

An introduction to transporting treatment effects from randomized clinical trials to clinical practice
Jan 8 @ 12:00 pm – 2:00 pm

Session Chair: Jennifer Lund, University of North Carolina at Chapel Hill

Randomized clinical trials (RCTs) are considered the gold standard for assessing efficacy of new therapies and are required for regulatory approval. However, patients enrolled in trials are often not representative of the patients to whom treatment will ultimately be delivered in clinical practice. When response to therapy varies across subgroups, differences between trial and clinical populations can contribute to the “efficacy-effectiveness gap”, where a treatment’s efficacy in a trial differs from its effectiveness in clinical practice. Methods for generalizability and transportability can help bridge this gap. These methods combine RCT and clinical practice data to generate evidence that directly addresses therapy effectiveness in target populations. Such approaches combine the internal validity of RCTs with the external validity of clinical practice data to better inform real-world decision-making.

In this workshop, we will provide an overview of methods for generalizing and transporting treatment effects from RCTs to defined target populations. Participants will receive SAS and R code to combine publicly available RCT and real-world data. Participants will gain an understanding of the theory underlying external validity. Using graphics and quantitative metrics, participants will evaluate the suitability of and compare effect estimates transported to various target populations.

This workshop requires an introductory level of epidemiology training and is relevant for all interested in expanding their epidemiological toolkit. This workshop may be of particular interest to those focused on causal inference methods, pharmacoepidemiology, and comparative effectiveness research.

An Introduction to R for Epidemiologists
Jan 11 @ 12:00 pm – 4:00 pm

Session Chair: Steve Mooney, University of Washington

This workshop will introduce participants to the R statistical computing platform for use in epidemiologic analysis. It is not intended to transform untested novices into R wizards in a mere half-day; rather, the goal will be to introduce the conceptual underpinnings, tools, and external resources that participants will need to overcome barriers to using R that they might encounter on their own, later. The material is designed for epidemiologists who are already familiar with performing analyses using other statistical software (e.g. SAS/Stata/SPSS) but who have no first-hand experience with the R language. More specifically, the course will cover 1) basic R syntax, 2) importing data, 3) constructing, cleaning, and manipulating data objects, 4) loading and using external packages, 5) simple statistical modeling, and 6) graphics. Participants must bring a laptop with R installed; the instructor will be available by email beforehand to assist with R installation if difficulties arise.

Scientific Manuscript Writing for Peer Review Journals: Communicating Results of Studies
Jan 15 @ 12:00 pm – 4:00 pm

Session Chair: Moyses Szklo, JHSPH

In this half-day workshop, participants will critically review a paper as initially submitted to the American Journal of Epidemiology, but not yet published. The paper will be sent to participants in advance of the workshop for their critical review. During the workshop, a presentation will be made regarding some of the main points to be considered when preparing or reviewing a manuscript. Small-group work will follow the presentation so that participants can compare their reviews and prepare a consolidated list of critical comments on the paper. Each group will designate a leader who will present the group’s review of the paper to the whole group of participants. At the end of the workshop, students will receive copies of the manuscript’s AJE reviews, the initial editorial decision, and the final accepted version of the paper.

Confounding control for estimating causal effects: Looking under the hood
Jan 22 @ 12:00 pm – 4:00 pm

Session Co-Chair: Nicolle Gatto, Pfizer Inc.
Session Co-Chair: Ulka Campbell, Pfizer Inc.

This workshop introduces concepts of causal inference and confounding control for causal effect estimation. We will introduce potential outcomes, and articulate the conceptual basis and assumptions for two g-methods – standardization via g-computation and inverse probability weighting. Starting with a simple point-treatment setting we will explore how these methods estimate a causal effect, comparing them to more conventional techniques such as multivariable regression and propensity score control. We will then build to the more complex scenario of time-dependent confounding. Participants will learn how to apply these methods in SAS and R using an observational dataset with the primary goal of unpacking any “black boxes” to clarify the links among the causal effect of interest, the mechanics of these g-methods, and the programming code.
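
To show the mechanics of standardization in the simplest point-treatment case, here is a small Python sketch; it is not the workshop's SAS or R code. It assumes a hypothetical cohort with one binary confounder x and uses stratum-specific outcome means in place of a fitted outcome regression. Names and data are invented for illustration.

```python
from collections import defaultdict


def build_cohort():
    """Hypothetical cohort of (x, t, y) rows with y = 1 + 2*t + 3*x.

    Treatment is more common when x = 1, so x confounds the comparison.
    The true treatment effect is 2.
    """
    counts = {(0, 1): 10, (0, 0): 30, (1, 1): 30, (1, 0): 10}
    rows = []
    for (x, t), n in counts.items():
        rows.extend([(x, t, 1 + 2 * t + 3 * x)] * n)
    return rows


def stratum_means(rows):
    """Mean outcome within each (x, t) cell: the 'outcome model'.

    Assumption: cell means stand in for a parametric regression of
    y on t and x.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for x, t, y in rows:
        totals[(x, t)] += y
        counts[(x, t)] += 1
    return {cell: totals[cell] / counts[cell] for cell in counts}


def standardized_mean(rows, t):
    """Average predicted outcome if everyone had treatment level t.

    Each person's prediction uses their own x but the assigned t,
    so the average is taken over the whole cohort's x distribution.
    """
    means = stratum_means(rows)
    return sum(means[(x, t)] for x, _, _ in rows) / len(rows)


def g_computation_ate(rows):
    """Standardized difference: everyone treated vs. everyone untreated."""
    return standardized_mean(rows, 1) - standardized_mean(rows, 0)


if __name__ == "__main__":
    cohort = build_cohort()
    print("E[Y] if all treated:", standardized_mean(cohort, 1))
    print("E[Y] if none treated:", standardized_mean(cohort, 0))
    print("standardized effect:", g_computation_ate(cohort))
```

In this constructed example, the standardized difference matches the effect built into the data, whereas a crude comparison of treated and untreated means would not. With a single time point and saturated models, inverse probability weighting would give the same answer.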