Quasi-Experimental Designs & Design Strategy

Design of Experiments — Lecture 4 of 4

Jonathan D. Stallings, PhD, MS

Data InDeed | dataindeed.org

2026-01-01

When randomization is impossible, nature sometimes runs the experiment for you. The discipline is in recognizing it — and exploiting it honestly.

What You’ll Learn Today

Post 10 — Quasi-Experimental Designs

  • Interrupted time series (ITS)
  • Segmented regression analysis
  • Difference-in-differences (DiD)
  • Parallel trends assumption
  • Regression discontinuity (RD)
  • Natural experiments

Series Synthesis — Design Strategy

  • Matching design to question
  • The design hierarchy
  • Internal vs. external validity tradeoffs
  • Choosing your design: a decision framework
  • Full series reading list

Part 1

Interrupted Time Series

Using time itself as the comparison

ITS: The Logic

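library(tidyverse)  # tibble, dplyr, ggplot2 are used throughout
# theme_di() is the series' custom ggplot theme (assumed defined in a setup chunk)

# Simulate: policy change at month 24 reduces outcome
# (rnorm() is unseeded here, so exact estimates vary between runs)
n_months <- 48; change_pt <- 24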
df_its <- tibble(
  month    = 1:n_months,
  post     = as.integer(month > change_pt),
  time_after = pmax(0, month - change_pt),
  # Pre-period: upward trend; post: level shift down + slope change
  outcome  = 60 + 0.4*month - 15*post - 0.5*time_after + rnorm(n_months, 0, 3)
)

# Segmented regression
fit_its <- lm(outcome ~ month + post + time_after, data=df_its)
df_its$fitted <- fitted(fit_its)

ggplot(df_its, aes(month, outcome)) +
  geom_point(color="#64748b", size=2, alpha=0.7) +
  geom_line(aes(y=fitted), color="#0891b2", linewidth=1.4) +
  geom_vline(xintercept=change_pt + 0.5, linetype=2, color="#e63946", linewidth=1) +
  annotate("text", x=change_pt+1.5, y=72, label="Protocol\nchange",
           color="#e63946", size=3.5, hjust=0) +
  labs(title="Interrupted time series: level shift + slope change after policy implementation",
       x="Month", y="Outcome rate") +
  theme_di()

ITS detects two types of effect: an immediate level shift (step change) and a slope change (trajectory change). Both can be estimated simultaneously with segmented regression.
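Written out, the segmented regression fitted above has one term per effect. The symbols below are notation introduced here for illustration, not taken from the slides: \(t\) is the month, \(T_0 = 24\) is the change point, and the three regressors correspond to `month`, `post`, and `time_after` in the code.

```latex
\[
Y_t = \beta_0 + \beta_1\, t + \beta_2\, \mathrm{post}_t + \beta_3 \max(0,\, t - T_0) + \varepsilon_t,
\qquad \mathrm{post}_t = \mathbf{1}\{t > T_0\}
\]
```

Here \(\beta_2\) is the level shift and \(\beta_3\) is the change in slope, so the post-intervention slope is \(\beta_1 + \beta_3\).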

Segmented Regression: Reading the Model

broom::tidy(fit_its, conf.int=TRUE) |>
  mutate(
    term = recode(term,
      "(Intercept)" = "Baseline (month 0)",
      "month"       = "Pre-intervention slope (per month)",
      "post"        = "Level shift at policy change",
      "time_after"  = "Change in slope post-intervention"
    ),
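    # Anonymous function: passing extra args through across(...) is deprecated in dplyr >= 1.1.0
    across(where(is.numeric), \(x) round(x, 3))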
  ) |>
  select(term, estimate, std.error, conf.low, conf.high, p.value)
# A tibble: 4 × 6
  term                             estimate std.error conf.low conf.high p.value
  <chr>                               <dbl>     <dbl>    <dbl>     <dbl>   <dbl>
1 Baseline (month 0)                 59.9       1.10    57.7      62.1     0    
2 Pre-intervention slope (per mon…    0.39      0.077    0.235     0.545   0    
3 Level shift at policy change      -16.2       1.51   -19.3     -13.2     0    
4 Change in slope post-interventi…   -0.392     0.109   -0.611    -0.172   0.001

Interpretation:

  • post coefficient: immediate drop in outcome at the policy change point
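  • time_after coefficient: change in the monthly slope after the policy, relative to the pre-intervention slope (negative = the trajectory bends downward; the post-policy slope is the month coefficient plus this one)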
  • Counterfactual: extend the pre-intervention line forward to estimate what would have happened without the policy
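The counterfactual step can be sketched in base R as follows. This is a minimal illustration under stated assumptions: the data are re-simulated with the same formula as the ITS slide, the seed is arbitrary, and the names `df_cf`, `post_df`, and `cf_df` are hypothetical. The counterfactual prediction comes from switching `post` and `time_after` off in the post-period rows.

```r
set.seed(1)  # arbitrary seed, for reproducibility of this sketch only
n_months <- 48; change_pt <- 24

# Re-simulate the ITS data with the same structure as the slide
df_cf <- data.frame(month = 1:n_months)
df_cf$post       <- as.integer(df_cf$month > change_pt)
df_cf$time_after <- pmax(0, df_cf$month - change_pt)
df_cf$outcome    <- 60 + 0.4 * df_cf$month - 15 * df_cf$post -
  0.5 * df_cf$time_after + rnorm(n_months, 0, 3)

fit <- lm(outcome ~ month + post + time_after, data = df_cf)

# Post-period rows as observed, and the same rows with the policy "switched off"
post_df <- df_cf[df_cf$post == 1, ]
cf_df   <- post_df
cf_df$post       <- 0
cf_df$time_after <- 0

post_df$fitted         <- predict(fit, newdata = post_df)
post_df$counterfactual <- predict(fit, newdata = cf_df)  # pre-trend extended forward
post_df$effect         <- post_df$fitted - post_df$counterfactual

# Estimated effect at the first and last post-policy months
print(post_df[c(1, nrow(post_df)), c("month", "fitted", "counterfactual", "effect")])
```

The estimated effect at any post-policy month equals the `post` coefficient plus the `time_after` coefficient times the number of months since the change, so it grows over time when both are negative.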

ITS Strengths and Threats

Strengths:

  • Uses routinely collected data
  • No control group required
  • Detects both immediate and gradual effects
  • Visual — easy to communicate to leadership
  • Well-suited to policy and protocol changes in registry data

Threats to validity:

  • Concurrent events (another change at the same time)
  • Secular trends (outcome was already changing)
  • Regression to the mean
  • Autocorrelation in monthly data requires correction
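One way to handle the autocorrelation threat is to check the residuals of the ordinary segmented regression and then refit with an AR(1) error structure. The slides do not say which correction they intend, so treat this as one common option rather than the author's method. The simulated AR coefficient (0.5), the seed, and the object names are assumptions made for illustration; `gls()` and `corAR1()` come from the `nlme` package.

```r
library(nlme)  # recommended package shipped with standard R installations

set.seed(42)  # arbitrary seed, for reproducibility of this sketch only
n_months <- 48; change_pt <- 24
month      <- 1:n_months
post       <- as.integer(month > change_pt)
time_after <- pmax(0, month - change_pt)

# Errors follow an AR(1) process with an assumed coefficient of 0.5
e <- as.numeric(arima.sim(model = list(ar = 0.5), n = n_months, sd = 3))
df_ar <- data.frame(month, post, time_after,
                    outcome = 60 + 0.4 * month - 15 * post - 0.5 * time_after + e)

# Step 1: ordinary segmented regression, then inspect residual autocorrelation
fit_ols <- lm(outcome ~ month + post + time_after, data = df_ar)
rho_hat <- acf(resid(fit_ols), plot = FALSE)$acf[2]  # lag-1 autocorrelation

# Step 2: same mean model, but with AR(1) errors indexed by month
fit_gls <- gls(outcome ~ month + post + time_after, data = df_ar,
               correlation = corAR1(form = ~ month), method = "ML")
phi_hat <- coef(fit_gls$modelStruct$corStruct, unconstrained = FALSE)

# Compare standard errors under the two error assumptions
se_ols <- summary(fit_ols)$coefficients[, "Std. Error"]
se_gls <- summary(fit_gls)$tTable[, "Std.Error"]
print(round(cbind(OLS = se_ols, GLS_AR1 = se_gls), 3))
```

With positively autocorrelated errors, ordinary least squares tends to understate the standard errors, which is why the comparison in the last line is worth looking at before reporting confidence intervals.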

Department of Defense Trauma Registry (DoDTR) application:

ITS is well suited to evaluating clinical practice guideline (CPG) implementation:

  • When was the CPG issued?
  • Mark that month as the intervention point
  • Model compliance or outcome trajectory before vs. after

Concurrent deployment cycles or Tactical Combat Casualty Care (TCCC) curriculum changes are the main threats. Document their dates so they can be distinguished from the CPG effect.

Part 2

Difference-in-Differences

Adding a control group to handle secular trends

DiD: The Core Idea

# Treated and control groups, pre/post policy
df_did <- tibble(
  time  = rep(c("Pre","Post"), each=2),
  group = rep(c("Treated","Control"), 2),
  mean  = c(55, 50, 45, 48)   # treated drops 10, control drops 2
) |> mutate(
  time = factor(time, levels=c("Pre","Post")),
  # Row 3 is Treated/Post: what the treated group would have been under parallel trends (53)
  counterfactual = c(NA, NA, 55 - (50-48), NA)
)

ggplot(df_did, aes(time, mean, group=group, color=group)) +
  geom_line(linewidth=1.4) +
  geom_point(size=5) +
  # Counterfactual dashed line for treated
  annotate("segment", x=1, xend=2, y=55, yend=53,
           linetype=2, color="#0891b2", linewidth=1) +
  annotate("text", x=2.05, y=53, label="Counterfactual\n(if no treatment)",
           color="#0891b2", size=3.2, hjust=0) +
  annotate("segment", x=2, xend=2, y=45, yend=53,
           arrow=arrow(length=unit(0.2,"cm"), ends="both"), color="#f59e0b", linewidth=1) +
  annotate("text", x=2.05, y=49, label="DiD\nestimate", color="#f59e0b", size=3.5, hjust=0) +
  scale_color_manual(values=c("#94a3b8","#e63946")) +
  labs(title="DiD: change in treated − change in control = causal effect (under parallel trends)",
       x=NULL, y="Outcome mean", color=NULL) +
  theme_di()

\[\hat\delta_{DiD} = (\bar{Y}^{Treated}_{Post} - \bar{Y}^{Treated}_{Pre}) - (\bar{Y}^{Control}_{Post} - \bar{Y}^{Control}_{Pre})\]
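Plugging the four cell means from the plot into this formula gives the estimate directly, and the same number falls out of a regression with a group-by-period interaction. The sketch below uses only those four means; the names `cells`, `did_manual`, and `did_reg` are hypothetical. With real data each row would be a patient or a facility-period rather than a cell mean.

```r
# Cell means from the DiD slide
y_t_pre <- 55; y_t_post <- 45   # treated group
y_c_pre <- 50; y_c_post <- 48   # control group

# Formula: (treated change) - (control change)
did_manual <- (y_t_post - y_t_pre) - (y_c_post - y_c_pre)
print(did_manual)  # -8

# Same estimate as the interaction coefficient in a 2x2 regression
cells <- data.frame(
  treated = c(1, 0, 1, 0),
  post    = c(0, 0, 1, 1),
  y       = c(y_t_pre, y_c_pre, y_t_post, y_c_post)
)
fit_did <- lm(y ~ treated * post, data = cells)
did_reg <- unname(coef(fit_did)["treated:post"])
print(did_reg)
```

The regression form is the one that scales: it accepts covariates and gives standard errors once there is more than one observation per cell.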

Regression Discontinuity: Threshold-Based Assignment

RD logic: When treatment is assigned by a threshold on a continuous variable, patients just above and below the cutoff are comparable — as if randomly assigned near the boundary.

Example: ISS ≥ 25 triggers a different care protocol. Patients with ISS = 24 vs. ISS = 26 are nearly identical — but receive different treatment. The discontinuity at ISS = 25 identifies the causal effect.

Assumptions:

  • Running variable (ISS) cannot be precisely manipulated
  • No other threshold or policy change at the same cutoff
  • Continuity of potential outcomes at the threshold

DoDTR opportunity:

Any triage threshold is an RD design candidate:

  • ISS cutoffs for evacuation priority
  • GCS thresholds for intubation protocol
  • SBP thresholds for transfusion trigger
  • Age-based protocol differences

The key: no one who determines the score (patient, clinician, or coder) can precisely manipulate it near the threshold.

# Regression discontinuity: ISS threshold at 25
n <- 500
df_rd <- tibble(
  iss     = runif(n, 10, 45),
  treated = as.integer(iss >= 25),
  outcome = 20 - 0.3*iss - 4*treated + rnorm(n, 0, 3)
)

ggplot(df_rd, aes(iss, outcome)) +
  geom_point(aes(color=factor(treated)), alpha=0.4, size=1.5) +
  geom_smooth(data=filter(df_rd, iss < 25),  method="lm", se=TRUE,
              color="#2563eb", fill="#2563eb", alpha=0.2) +
  geom_smooth(data=filter(df_rd, iss >= 25), method="lm", se=TRUE,
              color="#e63946", fill="#e63946", alpha=0.2) +
  geom_vline(xintercept=25, linetype=2, color="#f59e0b", linewidth=1) +
  scale_color_manual(values=c("#2563eb","#e63946"),
                     labels=c("Control (ISS < 25)","Treated (ISS ≥ 25)")) +
  labs(title="Regression discontinuity: gap at ISS = 25 estimates the local average treatment effect",
       x="ISS (running variable)", y="Outcome", color=NULL) +
  theme_di()

The discontinuity at ISS = 25 estimates the local average treatment effect (LATE): the causal effect for patients near the threshold, not the average effect for all patients.
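The gap in the plot can be put into numbers with a local linear regression on either side of the cutoff. This is a minimal sketch, not a full RD analysis. The bandwidth `h = 10`, the seed, and the sample size (2000 rather than the slide's 500, to stabilize the estimate) are assumptions made for illustration. As on the slide, ISS is simulated as continuous, although real ISS values are integers.

```r
set.seed(2026)  # arbitrary seed, for reproducibility of this sketch only
n <- 2000
cutoff <- 25

# Same data-generating process as the RD slide: true jump at the cutoff is -4
iss     <- runif(n, 10, 45)
treated <- as.integer(iss >= cutoff)
outcome <- 20 - 0.3 * iss - 4 * treated + rnorm(n, 0, 3)
df_rd2  <- data.frame(iss, treated, outcome, centered = iss - cutoff)

# Keep only observations within the bandwidth around the cutoff
h <- 10  # assumed bandwidth; results should be checked across several values
local_df <- df_rd2[abs(df_rd2$centered) <= h, ]

# Separate slopes on each side; the 'treated' coefficient is the jump at the cutoff
fit_rd   <- lm(outcome ~ treated * centered, data = local_df)
late_hat <- unname(coef(fit_rd)["treated"])
late_ci  <- confint(fit_rd)["treated", ]

print(round(c(estimate = late_hat, late_ci), 2))
```

Centering the running variable at the cutoff is what makes the `treated` coefficient interpretable as the discontinuity. Narrower bandwidths trade precision for less reliance on the linear form away from the threshold.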

Part 3

Design Strategy

Choosing the right design for the right question

The Design Decision Framework

When to Use Which Design

| Question type | Best design | When randomization is impossible |
|---|---|---|
| “Does X cause Y?” | RCT | Propensity score cohort, IV |
| “What’s the prevalence?” | Cross-sectional | Cross-sectional |
| “How does Y change over time?” | Longitudinal cohort | Longitudinal registry |
| “What happened after the policy?” | Pragmatic RCT | ITS or DiD |
| “Is rare outcome caused by X?” | Case-control | Nested case-control |
| “Does threshold assignment matter?” | Not applicable | Regression discontinuity |
| “Does treatment work in practice?” | Pragmatic RCT | ITS with control arm |

The DoDTR is most compatible with: Retrospective cohort (comparative effectiveness), ITS (CPG implementation), DiD (policy changes across facilities), and cross-sectional (prevalence of care practices). RCTs are rare but possible for non-urgent protocol comparisons.

Design of Experiments Series — Complete

Lecture 1: Design Foundations

  • RCT: randomization breaks confounding
  • Observational: cohort vs. case-control
  • Cross-sectional: prevalence, not causation

Lecture 2: Planning

  • Longitudinal: trajectories + attrition
  • Power: 4 inputs, simulation, sensitivity
  • Randomization: blocks, strata, minimization

Lecture 3: Trial Integrity

  • Blinding: expectation bias by design
  • Adaptive: pre-specified interim rules
  • Pragmatic: effectiveness in real settings

Lecture 4: Quasi-Experimental

  • ITS: time as comparison structure
  • DiD: parallel trends + control group
  • RD: threshold assignment as instrument

The series thesis: Design is the upstream decision that determines what questions can be answered. Choose the design before choosing the analysis.

Full Series Reading List