Quasi-Experimental Designs & Design Strategy

Design of Experiments — Lecture 4 of 4

Jonathan D. Stallings, PhD, MS

Data InDeed | dataindeed.org

2026-01-01

When randomization is impossible, nature sometimes runs the experiment for you. The discipline is in recognizing it — and exploiting it honestly.

What You’ll Learn Today

Post 10 — Quasi-Experimental Designs

Interrupted time series (ITS)
Segmented regression analysis
Difference-in-differences (DiD)
Parallel trends assumption
Regression discontinuity (RD)
Natural experiments

Series Synthesis — Design Strategy

Matching design to question
The design hierarchy
Internal vs. external validity tradeoffs
Choosing your design: a decision framework
Full series reading list

Part 1

Interrupted Time Series

Using time itself as the comparison

ITS: The Logic

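# Setup (assumed): tidyverse supplies tibble(), the dplyr verbs, and ggplot2.
# theme_di() is the series' custom ggplot2 theme, defined outside this section;
# substitute theme_minimal() if it is not available.
library(tidyverse)

# Simulate: policy change at month 24 reduces outcome
n_months <- 48; change_pt <- 24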
df_its <- tibble(
  month    = 1:n_months,
  post     = as.integer(month > change_pt),
  time_after = pmax(0, month - change_pt),
  # Pre-period: upward trend; post: level shift down + slope change
  outcome  = 60 + 0.4*month - 15*post - 0.5*time_after + rnorm(n_months, 0, 3)
)

# Segmented regression
fit_its <- lm(outcome ~ month + post + time_after, data=df_its)
df_its$fitted <- fitted(fit_its)

ggplot(df_its, aes(month, outcome)) +
  geom_point(color="#64748b", size=2, alpha=0.7) +
  geom_line(aes(y=fitted), color="#0891b2", linewidth=1.4) +
  geom_vline(xintercept=change_pt + 0.5, linetype=2, color="#e63946", linewidth=1) +
  annotate("text", x=change_pt+1.5, y=72, label="Protocol\nchange",
           color="#e63946", size=3.5, hjust=0) +
  labs(title="Interrupted time series: level shift + slope change after policy implementation",
       x="Month", y="Outcome rate") +
  theme_di()

ITS detects two types of effect: an immediate level shift (step change) and a slope change (trajectory change). Both can be estimated simultaneously with segmented regression.

Segmented Regression: Reading the Model

broom::tidy(fit_its, conf.int=TRUE) |>
  mutate(
    term = recode(term,
      "(Intercept)" = "Baseline (month 0)",
      "month"       = "Pre-intervention slope (per month)",
      "post"        = "Level shift at policy change",
      "time_after"  = "Change in slope post-intervention"
    ),
    # Anonymous function: passing extra arguments through across() is deprecated
    across(where(is.numeric), \(x) round(x, 3))
  ) |>
  select(term, estimate, std.error, conf.low, conf.high, p.value)

# A tibble: 4 × 6
  term                             estimate std.error conf.low conf.high p.value
  <chr>                               <dbl>     <dbl>    <dbl>     <dbl>   <dbl>
1 Baseline (month 0)                 59.9       1.10    57.7      62.1     0    
2 Pre-intervention slope (per mon…    0.39      0.077    0.235     0.545   0    
3 Level shift at policy change      -16.2       1.51   -19.3     -13.2     0    
4 Change in slope post-interventi…   -0.392     0.109   -0.611    -0.172   0.001

Interpretation:

post coefficient: immediate drop in outcome at the policy change point
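time_after coefficient: change in the monthly slope relative to the pre-intervention slope. The post-policy slope is the sum of the month and time_after coefficients (here 0.39 − 0.392 ≈ 0, so the earlier upward trend flattens rather than turning into a decline)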
Counterfactual: extend the pre-intervention line forward to estimate what would have happened without the policy
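
As a rough illustration of the counterfactual step, the sketch below re-simulates data with the same structure as the example and compares the fitted post-policy value with the extended pre-policy line. The seed and the object names (d, cf, effect_48) are assumptions made for this sketch, so the numbers will not match the table above.

```r
# Minimal sketch: counterfactual vs. fitted outcome at the final month.
# Data are re-simulated with the same structure as the slide example;
# the seed is arbitrary.
set.seed(1)
n_months <- 48; change_pt <- 24
d <- data.frame(month = 1:n_months)
d$post       <- as.integer(d$month > change_pt)
d$time_after <- pmax(0, d$month - change_pt)
d$outcome    <- 60 + 0.4 * d$month - 15 * d$post - 0.5 * d$time_after +
  rnorm(n_months, 0, 3)

fit <- lm(outcome ~ month + post + time_after, data = d)
b   <- coef(fit)

# Post-intervention slope = pre-intervention slope + change in slope
post_slope <- unname(b["month"] + b["time_after"])

# Counterfactual: same months, with both intervention terms switched off
cf <- transform(d, post = 0L, time_after = 0)
d$counterfactual <- predict(fit, newdata = cf)
d$fitted         <- fitted(fit)

# Estimated effect at month 48 = fitted value minus counterfactual value
effect_48 <- unname(d$fitted[n_months] - d$counterfactual[n_months])

# Same quantity in closed form: level shift + slope change * months elapsed
effect_48_closed <- unname(b["post"] + b["time_after"] * (n_months - change_pt))

c(post_slope = post_slope, effect_48 = effect_48)
```

The estimated effect grows with time since the change whenever the slope-change term is nonzero, so it helps to report the month at which the effect is evaluated.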

ITS Strengths and Threats

Strengths:

Uses routinely collected data
No control group required
Detects both immediate and gradual effects
Visual — easy to communicate to leadership
Well-suited to policy and protocol changes in registry data

Threats to validity:

Concurrent events (another change at the same time)
Secular trends that are not linear (segmented regression models only a straight-line pre-period trend)
Regression to the mean
Autocorrelation in monthly data requires correction
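
One possible way to check and correct for autocorrelation is sketched here. It assumes AR(1) errors and uses gls() from the nlme package; the simulated data, the AR coefficient of 0.6, and the seed are illustrative assumptions, and other corrections (for example robust standard errors) are equally reasonable.

```r
# Minimal sketch: detect residual autocorrelation, then refit with AR(1) errors.
library(nlme)  # recommended package, shipped with standard R distributions

set.seed(2)
n_months <- 48; change_pt <- 24
d <- data.frame(month = 1:n_months)
d$post       <- as.integer(d$month > change_pt)
d$time_after <- pmax(0, d$month - change_pt)

# Simulated AR(1) noise (rho = 0.6 is an assumption for illustration)
e <- as.numeric(arima.sim(list(ar = 0.6), n = n_months, sd = 3))
d$outcome <- 60 + 0.4 * d$month - 15 * d$post - 0.5 * d$time_after + e

# Ordinary segmented regression, ignoring autocorrelation
ols <- lm(outcome ~ month + post + time_after, data = d)

# Lag-1 autocorrelation of the OLS residuals (element 1 of $acf is lag 0)
r1 <- acf(resid(ols), plot = FALSE)$acf[2]

# Same mean model with AR(1) correlated errors
ar1 <- gls(outcome ~ month + post + time_after, data = d,
           correlation = corAR1(form = ~ month), method = "ML")
rho_hat <- coef(ar1$modelStruct$corStruct, unconstrained = FALSE)

# Side-by-side standard errors: OLS vs. AR(1)-adjusted
se_compare <- cbind(
  ols = summary(ols)$coefficients[, "Std. Error"],
  gls = summary(ar1)$tTable[, "Std.Error"]
)
se_compare
```

When residuals are positively autocorrelated, OLS standard errors tend to be too small, so the adjusted column is the safer one to report.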

DoDTR (Department of Defense Trauma Registry) application:

ITS is ideal for evaluating clinical practice guideline (CPG) implementation:

When was the CPG issued?
Mark that month as the intervention point
Model compliance or outcome trajectory before vs. after

Concurrent deployment cycles or TCCC curriculum changes are the main threats — document them.

Part 2

Difference-in-Differences

Adding a control group to handle secular trends

DiD: The Core Idea

# Treated and control groups, pre/post policy
df_did <- tibble(
  time  = rep(c("Pre","Post"), each=2),
  group = rep(c("Treated","Control"), 2),
  mean  = c(55, 50, 45, 48)   # treated drops 10, control drops 2
) |> mutate(
  time = factor(time, levels=c("Pre","Post")),
  counterfactual = c(NA, NA, 55 - (50-48), NA)  # treated-post row: what treated would have been
)

ggplot(df_did, aes(time, mean, group=group, color=group)) +
  geom_line(linewidth=1.4) +
  geom_point(size=5) +
  # Counterfactual dashed line for treated
  annotate("segment", x=1, xend=2, y=55, yend=53,   # x: 1 = Pre, 2 = Post
           linetype=2, color="#0891b2", linewidth=1) +
  annotate("text", x=2.05, y=53, label="Counterfactual\n(if no treatment)",
           color="#0891b2", size=3.2, hjust=0) +
  annotate("segment", x=2, xend=2, y=45, yend=53,
           arrow=arrow(length=unit(0.2,"cm"), ends="both"), color="#f59e0b", linewidth=1) +
  annotate("text", x=2.05, y=49, label="DiD\nestimate", color="#f59e0b", size=3.5, hjust=0) +
  scale_color_manual(values=c("#94a3b8","#e63946")) +
  labs(title="DiD: treated change − control change = estimated causal effect (under parallel trends)",
       x=NULL, y="Outcome mean", color=NULL) +
  theme_di()

\[\hat\delta_{DiD} = (\bar{Y}^{Treated}_{Post} - \bar{Y}^{Treated}_{Pre}) - (\bar{Y}^{Control}_{Post} - \bar{Y}^{Control}_{Pre})\]
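
To make the formula concrete, the sketch below plugs in the four cell means from the example (treated 55 to 45, control 50 to 48) and shows that the same number is the interaction coefficient in a regression with group and period indicators. The helper cell_mean() and the column names are hypothetical, introduced only for this illustration.

```r
# Minimal sketch: DiD by hand and as a regression interaction term.
cells <- data.frame(
  group = c("Treated", "Control", "Treated", "Control"),
  time  = c("Pre", "Pre", "Post", "Post"),
  mean  = c(55, 50, 45, 48)
)

# Hypothetical helper: look up one cell mean
cell_mean <- function(g, t) cells$mean[cells$group == g & cells$time == t]

# (Treated post - Treated pre) - (Control post - Control pre)
# = (45 - 55) - (48 - 50) = -8
did_by_hand <- (cell_mean("Treated", "Post") - cell_mean("Treated", "Pre")) -
               (cell_mean("Control", "Post") - cell_mean("Control", "Pre"))

# Regression form: the treated:post interaction is the DiD estimate
cells$treated <- as.integer(cells$group == "Treated")
cells$post    <- as.integer(cells$time == "Post")
fit_did   <- lm(mean ~ treated * post, data = cells)
did_by_lm <- unname(coef(fit_did)["treated:post"])

c(by_hand = did_by_hand, by_lm = did_by_lm)
```

With only four cell means the model is saturated, so it returns the estimate but no standard error; patient-level or facility-level rows are needed for inference.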

The Parallel Trends Assumption

n_months <- 36; change <- 18
df_pt <- tibble(
  month = rep(1:n_months, 2),
  group = rep(c("Treated","Control"), each=n_months),
  post  = as.integer(month > change)
) |> mutate(
  # Draw one noise value per row; n() is the row count inside mutate().
  # (rnorm(n_months) would be half the length of this 2 x n_months tibble.)
  noise = rnorm(n(), 0, 2),
  # Parallel pre-trends, diverge post
  outcome = case_when(
    group=="Control"  ~ 50 + 0.3*month,
    group=="Treated" & post==0 ~ 55 + 0.3*month,
    group=="Treated" & post==1 ~ 55 + 0.3*change + (month-change)*(0.3-0.5) - 8
  ) + noise
)

ggplot(df_pt, aes(month, outcome, color=group)) +
  geom_line(linewidth=1.1) +
  geom_smooth(data=filter(df_pt, month<=change), method="lm", se=FALSE,
              linewidth=0.7, linetype=3) +
  geom_vline(xintercept=change, linetype=2, color="#94a3b8") +
  scale_color_manual(values=c("#94a3b8","#e63946")) +
  labs(title="Parallel pre-trends (dotted) support DiD validity — test this, don't assume it",
       x="Month", y="Outcome", color=NULL) +
  theme_di()

Parallel trends assumption: in the absence of the intervention, treated and control groups would have followed the same trajectory. Because this concerns an unobserved counterfactual, it cannot be tested directly in the post-intervention period; similar pre-intervention trends are supporting evidence, not proof.
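
A simple pre-trend check can be sketched as a regression on pre-intervention data only, where the month-by-group interaction measures how much the two slopes differ. The data-generating values, the seed, and the simplification to a pure level shift after the change are assumptions made for this example.

```r
# Minimal sketch: test whether pre-intervention slopes differ by group.
set.seed(3)
n_months <- 36; change <- 18
d <- expand.grid(month = 1:n_months, group = c("Control", "Treated"),
                 stringsAsFactors = FALSE)
d$treated <- as.integer(d$group == "Treated")
d$post    <- as.integer(d$month > change)

# Common slope of 0.3 in both groups; treated group drops by 8 after the change
d$outcome <- 50 + 5 * d$treated + 0.3 * d$month -
  8 * d$treated * d$post + rnorm(nrow(d), 0, 2)

# Restrict to the pre-period so the intervention cannot drive the result
pre <- subset(d, month <= change)

# month:treated = difference in pre-period slopes between groups
fit_pre  <- lm(outcome ~ month * treated, data = pre)
pretrend <- summary(fit_pre)$coefficients["month:treated", ]
pretrend
```

An interaction estimate near zero with a tight confidence interval supports parallel pre-trends; a non-significant but imprecise estimate says little either way.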

Regression Discontinuity: Threshold-Based Assignment

RD logic: When treatment is assigned by a threshold on a continuous variable, patients just above and below the cutoff are comparable — as if randomly assigned near the boundary.

Example: ISS ≥ 25 triggers a different care protocol. Patients with ISS = 24 vs. ISS = 26 are nearly identical — but receive different treatment. The discontinuity at ISS = 25 identifies the causal effect.

Assumptions:

Running variable (ISS) cannot be precisely manipulated
No other threshold or policy change at the same cutoff
Continuity of potential outcomes at the threshold

DoDTR opportunity:

Any triage threshold is an RD design candidate:

ISS cutoffs for evacuation priority
GCS thresholds for intubation protocol
SBP thresholds for transfusion trigger
Age-based protocol differences

The key: neither patients nor the clinicians who record the score can steer it to one side of the threshold.
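
A crude screen for manipulation is to compare how many observations fall just below and just above the cutoff; a large imbalance suggests bunching. The sketch below assumes a smooth, continuous running variable and an arbitrary window of 2 points. Real ISS takes only certain integer values, so the window would need adapting, and dedicated density tests are the more standard tool.

```r
# Minimal sketch: count observations in narrow windows on each side of the cutoff.
set.seed(5)
cutoff <- 25
h      <- 2                     # window half-width (arbitrary assumption)
iss    <- runif(500, 10, 45)    # simulated running variable, no manipulation

below <- sum(iss >= cutoff - h & iss <  cutoff)
above <- sum(iss >= cutoff     & iss <  cutoff + h)

# Under no manipulation and a locally flat density, about half should fall above
bunching <- binom.test(above, above + below, p = 0.5)

c(below = below, above = above, p_value = bunching$p.value)
```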

# Regression discontinuity: ISS threshold at 25
n <- 500
df_rd <- tibble(
  iss     = runif(n, 10, 45),
  treated = as.integer(iss >= 25),
  outcome = 20 - 0.3*iss - 4*treated + rnorm(n, 0, 3)
)

ggplot(df_rd, aes(iss, outcome)) +
  geom_point(aes(color=factor(treated)), alpha=0.4, size=1.5) +
  geom_smooth(data=filter(df_rd, iss < 25),  method="lm", se=TRUE,
              color="#2563eb", fill="#2563eb", alpha=0.2) +
  geom_smooth(data=filter(df_rd, iss >= 25), method="lm", se=TRUE,
              color="#e63946", fill="#e63946", alpha=0.2) +
  geom_vline(xintercept=25, linetype=2, color="#f59e0b", linewidth=1) +
  scale_color_manual(values=c("#2563eb","#e63946"),
                     labels=c("Control (ISS < 25)","Treated (ISS ≥ 25)")) +
  labs(title="Regression discontinuity: gap at ISS = 25 estimates the local average treatment effect",
       x="ISS (running variable)", y="Outcome", color=NULL) +
  theme_di()

The discontinuity at ISS = 25 estimates the LATE (local average treatment effect): the causal effect for patients near the threshold, not the average effect for all patients.
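
The size of the gap can be estimated with a local linear regression, sketched below under stated assumptions: the running variable is centered at the cutoff, separate slopes are allowed on each side, and only observations within a bandwidth of the cutoff are used. The bandwidth of 8 and the seed are arbitrary choices for illustration; in practice the bandwidth is usually chosen by a data-driven rule and varied in sensitivity checks.

```r
# Minimal sketch: sharp RD estimate via local linear regression.
set.seed(4)
n <- 500; cutoff <- 25
d <- data.frame(iss = runif(n, 10, 45))
d$treated <- as.integer(d$iss >= cutoff)
d$outcome <- 20 - 0.3 * d$iss - 4 * d$treated + rnorm(n, 0, 3)  # true jump = -4

# Center the running variable so the 'treated' coefficient is the jump at the cutoff
d$iss_c <- d$iss - cutoff

bw    <- 8                              # bandwidth (arbitrary assumption)
local <- subset(d, abs(iss_c) <= bw)    # keep observations near the threshold

# Interaction lets the slope differ on each side of the cutoff
fit_rd <- lm(outcome ~ treated * iss_c, data = local)
rd_est <- unname(coef(fit_rd)["treated"])
rd_ci  <- confint(fit_rd)["treated", ]

c(estimate = rd_est, rd_ci)
```

A narrower bandwidth reduces bias from misspecifying the trend but widens the confidence interval, which is the central tradeoff in RD estimation.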

Part 3

Design Strategy

Choosing the right design for the right question

The Design Decision Framework

When to Use Which Design

Question type	Best design	When randomization is impossible
“Does X cause Y?”	RCT	Propensity score cohort, IV
“What’s the prevalence?”	Cross-sectional	Cross-sectional
“How does Y change over time?”	Longitudinal cohort	Longitudinal registry
“What happened after the policy?”	Pragmatic RCT	ITS or DiD
“Is rare outcome caused by X?”	Case-control	Nested case-control
“Does threshold assignment matter?”	Not applicable	Regression discontinuity
“Does treatment work in practice?”	Pragmatic RCT	ITS with control arm

The DoDTR is most compatible with retrospective cohort (comparative effectiveness), ITS (CPG implementation), DiD (policy changes across facilities), and cross-sectional (prevalence of care practices) designs. RCTs are rare but possible for non-urgent protocol comparisons.

Design of Experiments Series — Complete

Lecture 1: Design Foundations

RCT: randomization breaks confounding
Observational: cohort vs. case-control
Cross-sectional: prevalence, not causation

Lecture 2: Planning

Longitudinal: trajectories + attrition
Power: 4 inputs, simulation, sensitivity
Randomization: blocks, strata, minimization

Lecture 3: Trial Integrity

Blinding: expectation bias by design
Adaptive: pre-specified interim rules
Pragmatic: effectiveness in real settings

Lecture 4: Quasi-Experimental

ITS: time as comparison structure
DiD: parallel trends + control group
RD: threshold assignment as instrument

The series thesis: Design is the upstream decision that determines what questions can be answered. Choose the design before choosing the analysis.