Emulating Trials with Real Data: A Game-Changer for AI Evidence

Advanced Statistics

A practical introduction to target trial emulation, time zero, eligibility, estimands, and g-computation for observational causal inference.

Published: February 1, 2026

Modified: June 9, 2026

Executive Summary

One of the biggest challenges in real-world evidence is that observational studies are often asked to answer causal questions they were never explicitly designed to answer.

That creates familiar problems:

immortal time bias,
ambiguous time zero,
post-treatment selection,
inappropriate eligibility windows,
and treatment comparisons that do not reflect a well-defined intervention.

This is where target trial emulation comes in (Hernán and Robins 2016; Hernán, Wang, and Leaf 2022).

The core idea is simple but powerful:

before analyzing observational data, first define the randomized trial you wish you had run.

That hypothetical trial is the target trial (Hernán and Robins 2016).

Once it is specified clearly, the observational study is designed to emulate it as closely as possible.

This matters because many bad observational analyses fail not only because of poor modeling, but because they never defined the causal question with trial-level clarity in the first place.

This post introduces:

what a target trial is,
how to specify eligibility, treatment strategies, assignment, follow-up, outcomes, and estimands,
why target trial emulation improves causal credibility,
and how g-methods such as g-computation fit into the workflow (Hernán and Robins 2016; Hernán, Wang, and Leaf 2022).

Real-world data do not become causal evidence simply by being large. They become more credible when the observational analysis is designed to answer a trial-like question explicitly.

Observational Studies Often Fail Before the Model Even Begins

A common instinct in applied analytics is to start with the available dataset and ask:

what treatments are recorded?
what outcomes are present?
what covariates can we adjust for?

That seems practical, but it often leads to a poorly defined causal study.

For example:

when exactly does follow-up begin?
who is eligible at baseline?
what counts as treatment assignment?
are we comparing initiation strategies or observed exposure patterns?
what happens to people who switch, discontinue, or never fully adhere?

If these design questions are vague, the final model may still be elegant, but its causal interpretation will be weak.

That is why target trial emulation begins with design, not software (Hernán and Robins 2016).

The Target Trial Is the Ideal Randomized Trial You Wish You Had

A target trial is a hypothetical randomized trial that would answer the causal question of interest.

It is not an actual experiment. It is a design template.

The target trial should specify (Hernán and Robins 2016; Hernán, Wang, and Leaf 2022):

eligibility criteria,
treatment strategies,
treatment assignment,
start of follow-up,
outcome definitions,
causal estimand,
and analytic plan.

This structure forces clarity.

Instead of saying vaguely:

“we want to know whether Drug A is better than Drug B,”

the analyst must say something more precise, such as:

among eligible adults with condition X at treatment initiation,
what is the 180-day risk difference in hospitalization under a strategy of initiating Drug A versus initiating Drug B?

That level of specificity is a major improvement.

A Drug Study Example Makes the Idea Concrete

Suppose we want to compare two treatment strategies for a chronic condition using observational data.

The causal question is:

among newly eligible patients, what is the effect of initiating Drug A versus Drug B on 1-year adverse outcome risk?

That sounds straightforward, but many details matter.

For example:

Are prevalent users included or excluded?
Is time zero the prescription date, diagnosis date, or first clinic visit?
Are patients who start later still eligible?
How are treatment switches handled?
Is the estimand intention-to-treat-like or per-protocol-like?

The target trial framework forces us to answer these explicitly.

The First Step Is to Write the Trial Protocol You Wish You Had Run

A helpful way to think about target trial emulation is to write a protocol table with trial components.

Typical components include:

Eligibility criteria
Treatment strategies
Assignment procedures
Start of follow-up
Outcome
Causal contrast / estimand
Analysis plan

This is not bureaucratic overhead. It is one of the strongest ways to prevent design ambiguity.

A vague observational study becomes much clearer once rewritten as an explicit trial protocol.

Eligibility Criteria Define Who Enters the Trial at Time Zero

The target trial should define who is eligible at baseline.

This is often one of the most overlooked steps in observational work.

For example, eligibility may require:

a confirmed diagnosis,
no prior exposure to the study drugs,
a minimum enrollment history,
no outcome occurrence before baseline,
and sufficient measurement of baseline covariates.

These criteria matter because causal questions depend on who is considered at risk and comparable at the moment of assignment.

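Target trial emulation often favors a new-user design. Prevalent users have already survived and tolerated the early treatment period, and their covariates at cohort entry may themselves have been affected by prior treatment, so including them distorts baseline comparability.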
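As a rough sketch of how such criteria might be applied in code, the example below filters a small made-up cohort. The column names (`diagnosis_confirmed`, `prior_study_drug`, `enrollment_days`, `prior_outcome`, `baseline_complete`) and the 365-day enrollment threshold are illustrative assumptions, not fields from any real dataset.

```r
# Hypothetical candidate cohort: one row per patient, evaluated at time zero.
# All column names and thresholds are illustrative assumptions.
candidates <- data.frame(
  id                  = 1:6,
  diagnosis_confirmed = c(TRUE, TRUE, TRUE, FALSE, TRUE, TRUE),
  prior_study_drug    = c(FALSE, TRUE, FALSE, FALSE, FALSE, FALSE),
  enrollment_days     = c(400, 800, 120, 500, 730, 365),
  prior_outcome       = c(FALSE, FALSE, FALSE, FALSE, TRUE, FALSE),
  baseline_complete   = c(TRUE, TRUE, TRUE, TRUE, TRUE, TRUE)
)

min_enrollment_days <- 365  # assumed look-back requirement

# Apply every eligibility criterion using baseline information only
eligible <- subset(
  candidates,
  diagnosis_confirmed &
    !prior_study_drug &
    enrollment_days >= min_enrollment_days &
    !prior_outcome &
    baseline_complete
)

eligible$id  # patients 1 and 6 meet all criteria
```

Writing the criteria as one explicit filter makes it easy to report how many patients each rule excludes, much like a trial flow diagram.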

Treatment Strategies Must Be Defined as Interventions, Not Merely Exposures

A key distinction in causal design is the difference between:

an observed exposure pattern,
and a treatment strategy.

The target trial should define interventions such as:

initiate Drug A at baseline
initiate Drug B at baseline

or more complex dynamic rules.

This is important because observational data often contain messy real-world behavior:

delayed starts,
switching,
discontinuation,
partial adherence,
combination therapy,
or rescue medications.

If the strategy is not clearly defined, the comparison becomes ambiguous.

That is why target trial emulation asks what intervention is being compared, not merely what exposure happens to appear in the data.

Time Zero Is One of the Most Important Design Decisions

A common problem in observational studies is poorly aligned time zero.

Time zero should be the moment when:

eligibility is assessed,
treatment strategy is assigned,
and follow-up begins.

If those are not aligned, the study may suffer from serious biases, including immortal time bias.

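For example, if treatment status is defined using information that only becomes available after cohort entry, then the person-time between cohort entry and treatment start is wrongly credited to the treatment group. Patients classified as treated had to survive that interval to be classified at all, so the time is "immortal" by construction.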

Target trial emulation makes time zero explicit precisely to prevent this kind of design error.
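To see how misaligned time zero creates bias, here is a small simulation sketch in which treatment has no effect at all. The exponential survival times, the uniform treatment start day, and the seed are assumptions chosen only for illustration.

```r
set.seed(2026)  # arbitrary seed, for reproducibility of this sketch only

n <- 5000
# Days from cohort entry to death; treatment has NO effect by construction
death_time <- rexp(n, rate = 1 / 400)
# Day on which each patient would start treatment, if still alive
planned_start <- runif(n, min = 0, max = 365)

# Misaligned design: "treated" is defined with post-entry information,
# so a patient must survive until planned_start to be classified as treated
ever_treated <- planned_start < death_time

# Naive 1-year mortality with follow-up counted from cohort entry
naive_risk_treated   <- mean(death_time[ever_treated] <= 365)
naive_risk_untreated <- mean(death_time[!ever_treated] <= 365)

# Aligned design for the treated: follow-up starts at treatment initiation
aligned_risk_treated <- mean((death_time - planned_start)[ever_treated] <= 365)

c(
  naive_treated   = naive_risk_treated,
  naive_untreated = naive_risk_untreated,
  aligned_treated = aligned_risk_treated,
  true_1yr_risk   = 1 - exp(-365 / 400)
)
```

In this setup every "untreated" patient died before their planned start, so the naive comparison makes a useless treatment look protective. Once follow-up for treated patients starts at initiation, their 1-year risk should land near the true no-effect risk. That last step leans on the constant hazard assumed here; it is a demonstration of the bias, not a full emulation.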

Follow-Up Must Reflect the Question Being Asked

Once time zero is defined, the follow-up period must also be specified clearly.

Questions include:

how long are participants followed?
what ends follow-up?
do we censor at treatment discontinuation?
do we allow treatment switching?
what is the administrative end date?

These choices depend partly on the estimand.

For example:

an intention-to-treat-like analysis may follow patients regardless of later adherence
a per-protocol-like analysis may censor when treatment deviates from the assigned strategy

These are different causal questions. Target trial emulation makes that difference explicit.
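The difference between the two follow-up rules can be sketched on a few made-up patients. The column names (`event_day`, `deviation_day`, `admin_end_day`) are hypothetical, and all times are days since time zero.

```r
# Hypothetical follow-up data in days since time zero (NA = did not occur)
fu <- data.frame(
  id            = 1:4,
  event_day     = c(90, NA, 300, NA),
  deviation_day = c(NA, 120, 200, NA),  # first switch or discontinuation
  admin_end_day = c(365, 365, 365, 250)
)

# Intention-to-treat-like: deviations from the assigned strategy are ignored
fu$itt_end   <- pmin(fu$event_day, fu$admin_end_day, na.rm = TRUE)
fu$itt_event <- !is.na(fu$event_day) & fu$event_day <= fu$itt_end

# Per-protocol-like: additionally censor at the first deviation
fu$pp_end   <- pmin(fu$event_day, fu$deviation_day, fu$admin_end_day, na.rm = TRUE)
fu$pp_event <- !is.na(fu$event_day) & fu$event_day <= fu$pp_end

fu[, c("id", "itt_end", "itt_event", "pp_end", "pp_event")]
```

Patient 3 shows the contrast: the event on day 300 counts under the intention-to-treat-like rule but is censored on day 200 under the per-protocol-like rule. Censoring at deviation is usually informative, so a per-protocol-like analysis generally needs further adjustment (for example, inverse probability of censoring weights), which this sketch does not attempt.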

Outcomes Must Be Defined Without Post-Baseline Ambiguity

The outcome definition should be specified exactly as it would be in a trial.

For example:

hospitalization within 180 days
all-cause mortality within 1 year
first severe exacerbation after treatment initiation
treatment failure within follow-up

This sounds obvious, but observational studies often mix together:

multiple definitions,
poorly timed ascertainment,
or outcomes that occur before treatment assignment is fully defined.

The target trial framework reduces that ambiguity by forcing trial-style outcome specification at the design stage.
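One way to make the outcome window explicit is to derive it directly from time zero. The sketch below assumes hypothetical `time_zero` and `hosp_date` columns and a 180-day window.

```r
# Hypothetical per-patient dates; column names are illustrative assumptions
pts <- data.frame(
  id        = 1:4,
  time_zero = as.Date(c("2024-01-10", "2024-02-01", "2024-03-15", "2024-04-01")),
  hosp_date = as.Date(c("2024-03-01", NA, "2024-03-10", "2024-11-20"))
)

window_days <- 180

days_to_hosp <- as.numeric(difftime(pts$hosp_date, pts$time_zero, units = "days"))

# Events before time zero are not outcomes; they belong to eligibility assessment
pts$pre_baseline_event <- !is.na(days_to_hosp) & days_to_hosp < 0

# Outcome: hospitalization on days 0 through 180 after time zero
pts$hosp_180 <- !is.na(days_to_hosp) &
  days_to_hosp >= 0 &
  days_to_hosp <= window_days

pts[, c("id", "pre_baseline_event", "hosp_180")]
```

Only patient 1 has a qualifying outcome. Patient 3's hospitalization precedes time zero and patient 4's falls outside the window, which is exactly the kind of timing detail that gets blurred when the outcome is defined loosely.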

The Estimand Should Be Stated Clearly

A target trial is not fully specified until the estimand is defined.

That means clarifying what causal effect is being targeted.

Examples include:

risk difference at 1 year
risk ratio at 180 days
hazard ratio under an intention-to-treat-like contrast
per-protocol difference in cumulative incidence

This matters because different analytic choices target different estimands.

A major strength of target trial emulation is that it forces analysts to state the causal contrast clearly rather than treating it as an afterthought.
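In potential-outcome notation, the first two estimands can be written out explicitly. The symbols are introduced here for illustration and do not appear elsewhere in this post: $Y_t^{a}$ indicates whether the outcome would have occurred by time $t$ under strategy $a$, with $a = 1$ for one strategy and $a = 0$ for the comparator.

```latex
\mathrm{RD}(t) = \Pr\left[ Y_t^{a=1} = 1 \right] - \Pr\left[ Y_t^{a=0} = 1 \right],
\qquad
\mathrm{RR}(t) = \frac{\Pr\left[ Y_t^{a=1} = 1 \right]}{\Pr\left[ Y_t^{a=0} = 1 \right]}
```

Both quantities compare the same population under two different strategies, which is what distinguishes them from a comparison of whoever happened to receive each treatment.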

A Simple Target Trial Specification Table Helps Readers Immediately

A practical blog-friendly way to present the design is with a protocol-style table.

trial_spec <- tibble::tribble(
  ~Component, ~Specification,
  "Eligibility", "Adults with new diagnosis, no prior study drug use, no prior outcome, adequate baseline data",
  "Treatment strategies", "Initiate Drug A at baseline versus initiate Drug B at baseline",
  "Assignment", "Hypothetical random assignment in target trial; observed treatment initiation in emulation",
  "Time zero", "Date of treatment initiation / eligibility confirmation",
  "Outcome", "Hospitalization within 365 days",
  "Follow-up", "From baseline until outcome, censoring, death, or administrative end",
  "Estimand", "1-year risk difference under initiation of Drug A versus Drug B",
  "Analysis", "Adjustment with g-computation or weighting using baseline confounders"
)

trial_spec

# A tibble: 8 × 2
  Component            Specification                                            
  <chr>                <chr>                                                    
1 Eligibility          Adults with new diagnosis, no prior study drug use, no p…
2 Treatment strategies Initiate Drug A at baseline versus initiate Drug B at ba…
3 Assignment           Hypothetical random assignment in target trial; observed…
4 Time zero            Date of treatment initiation / eligibility confirmation  
5 Outcome              Hospitalization within 365 days                          
6 Follow-up            From baseline until outcome, censoring, death, or admini…
7 Estimand             1-year risk difference under initiation of Drug A versus…
8 Analysis             Adjustment with g-computation or weighting using baselin…

This kind of table often does more to clarify causal design than several paragraphs of narrative alone.

Simulating a Trial-Like Observational Dataset Helps Show the Logic

Let us build a simple observational dataset where treatment assignment depends on baseline severity, comorbidity, and age, so a naïve comparison will be confounded.

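# Note: no seed is shown here; call set.seed() first if you need reproducible draws.
n <- 1200

tte_df <- tibble::tibble(
  # Baseline covariates, measured at time zero
  age = rnorm(n, mean = 63, sd = 11),
  severity = rnorm(n, mean = 0, sd = 1),
  comorbidity = rnorm(n, mean = 0, sd = 1)
) |>
  dplyr::mutate(
    # Treatment depends on baseline covariates (confounding by indication)
    treat_prob = plogis(-0.7 + 0.9 * severity + 0.6 * comorbidity + 0.02 * age),
    treatment = rbinom(n, size = 1, prob = treat_prob),
    # True conditional treatment effect: log-odds ratio of 0.5
    outcome_prob = plogis(-2.2 + 0.5 * treatment + 1.0 * severity + 0.8 * comorbidity + 0.02 * age),
    outcome = rbinom(n, size = 1, prob = outcome_prob)
  )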

This is a typical real-world evidence situation:

treatment is not randomized
severity, comorbidity, and age drive both treatment and outcome
the crude comparison will be biased

The Naïve Analysis Often Answers the Wrong Question Badly

Here is the crude treatment comparison.

tte_df |>
  dplyr::group_by(treatment) |>
  dplyr::summarise(
    outcome_risk = mean(outcome),
    .groups = "drop"
  ) |>
  tidyr::pivot_wider(
    names_from = treatment,
    values_from = outcome_risk
  ) |>
  dplyr::mutate(
    crude_risk_difference = `1` - `0`
  )

# A tibble: 1 × 3
    `0`   `1` crude_risk_difference
  <dbl> <dbl>                 <dbl>
1 0.223 0.540                 0.316

Because treatment is preferentially given to sicker patients, the crude risk difference of 0.316 mixes the treatment effect with the effect of baseline severity. It does not isolate the treatment effect.

This is exactly why target trial emulation matters: it reframes the analysis around a trial-like comparison rather than a raw exposed-versus-unexposed contrast.
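A quick way to see the imbalance behind a crude comparison is to compute standardized mean differences for the baseline covariates. The sketch below re-simulates data with the same treatment model used in this post so that it runs on its own; the seed and the `smd` helper are assumptions for illustration, and the numbers will not match the output printed above.

```r
set.seed(1)  # arbitrary seed; does not reproduce the numbers printed in this post

n <- 1200
age         <- rnorm(n, mean = 63, sd = 11)
severity    <- rnorm(n)
comorbidity <- rnorm(n)
treatment   <- rbinom(
  n, size = 1,
  prob = plogis(-0.7 + 0.9 * severity + 0.6 * comorbidity + 0.02 * age)
)

# Standardized mean difference: difference in group means over the pooled SD
smd <- function(x, a) {
  (mean(x[a == 1]) - mean(x[a == 0])) /
    sqrt((var(x[a == 1]) + var(x[a == 0])) / 2)
}

balance <- sapply(
  list(age = age, severity = severity, comorbidity = comorbidity),
  smd, a = treatment
)
round(balance, 2)
```

Values well above 0.1 in absolute size are commonly read as meaningful imbalance. Here the treated group should look clearly sicker at baseline, which is why the crude contrast cannot be read as a treatment effect.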

G-Computation Is a Natural Analysis Tool for Trial Emulation

One of the core g-methods is g-computation.

The basic idea is:

model the outcome as a function of treatment and baseline confounders
predict each person’s outcome risk under the treatment strategy (treatment = 1)
predict each person’s outcome risk under the comparator strategy (treatment = 0)
average those counterfactual predictions over the study population

This directly targets the causal contrast implied by the target trial.

That makes g-computation especially natural in target trial emulation when the causal question is framed clearly.
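The steps above can be summarized in one formula for a binary point treatment. The notation is an assumption introduced here for illustration: $A$ is treatment, $L$ the vector of baseline confounders, $Y$ the outcome, and $\hat{m}$ the fitted outcome model.

```latex
\widehat{\mathrm{RD}}
  = \frac{1}{n} \sum_{i=1}^{n} \left[ \hat{m}(1, L_i) - \hat{m}(0, L_i) \right],
\qquad
\hat{m}(a, l) = \widehat{\Pr}\left[ Y = 1 \mid A = a, L = l \right]
```

The average runs over everyone in the cohort, not just those who actually received each treatment. Reading the result as a causal risk difference requires that $L$ captures all confounders and that the outcome model is correctly specified.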

A Simple G-Computation Example Shows the Workflow

We first fit an outcome model.

fit_gcomp <- glm(
  outcome ~ treatment + age + severity + comorbidity,
  data = tte_df,
  family = binomial()
)

summary(fit_gcomp)$coefficients

              Estimate  Std. Error   z value     Pr(>|z|)
(Intercept) -2.0461903 0.408309677 -5.011369 5.404427e-07
treatment    0.5689088 0.153680467  3.701894 2.139962e-04
age          0.0186471 0.006264698  2.976536 2.915252e-03
severity     0.9605031 0.085649775 11.214310 3.468826e-29
comorbidity  0.7399034 0.078233386  9.457643 3.149629e-21

Now create counterfactual datasets under each strategy.

# Copy the cohort twice, setting everyone to each strategy in turn
treated_df <- tte_df |>
  dplyr::mutate(treatment = 1)

untreated_df <- tte_df |>
  dplyr::mutate(treatment = 0)

# Average the predicted risks over the whole cohort (standardization)
risk_if_treated <- mean(predict(fit_gcomp, newdata = treated_df, type = "response"))
risk_if_untreated <- mean(predict(fit_gcomp, newdata = untreated_df, type = "response"))

tibble::tibble(
  risk_if_treated = risk_if_treated,
  risk_if_untreated = risk_if_untreated,
  risk_difference = risk_if_treated - risk_if_untreated
)

# A tibble: 1 × 3
  risk_if_treated risk_if_untreated risk_difference
            <dbl>             <dbl>           <dbl>
1           0.453             0.347           0.106

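This is a straightforward g-computation estimate of the target trial contrast.

The adjusted risk difference (0.106) is about one third of the crude difference (0.316), because most of the crude gap reflected sicker patients being more likely to receive treatment. The estimate is only as credible as its assumptions: no unmeasured confounding and a correctly specified outcome model. Both hold here by construction, which is rarely true of real data.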
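A point estimate alone says nothing about sampling uncertainty. One common way to attach an interval to a g-computation estimate is the nonparametric bootstrap. The sketch below is self-contained: it re-simulates data with the same structure as `tte_df`. The seed, the `gcomp_rd` helper, and the choice of 200 replicates are illustrative assumptions, so the numbers will differ from those printed above.

```r
set.seed(42)  # arbitrary seed; does not reproduce the estimates printed above

# Re-simulate data with the same structure as tte_df
n <- 1200
sim <- data.frame(
  age         = rnorm(n, mean = 63, sd = 11),
  severity    = rnorm(n),
  comorbidity = rnorm(n)
)
sim$treatment <- rbinom(n, 1, plogis(
  -0.7 + 0.9 * sim$severity + 0.6 * sim$comorbidity + 0.02 * sim$age
))
sim$outcome <- rbinom(n, 1, plogis(
  -2.2 + 0.5 * sim$treatment + 1.0 * sim$severity +
    0.8 * sim$comorbidity + 0.02 * sim$age
))

# G-computation risk difference for a single dataset
gcomp_rd <- function(d) {
  fit <- glm(outcome ~ treatment + age + severity + comorbidity,
             data = d, family = binomial())
  p1 <- predict(fit, newdata = transform(d, treatment = 1), type = "response")
  p0 <- predict(fit, newdata = transform(d, treatment = 0), type = "response")
  mean(p1) - mean(p0)
}

rd_hat <- gcomp_rd(sim)

# Bootstrap: resample patients, refit the model, recompute the contrast
n_boot  <- 200  # kept small for speed; more replicates are typical in practice
boot_rd <- replicate(n_boot, gcomp_rd(sim[sample(nrow(sim), replace = TRUE), ]))

ci <- quantile(boot_rd, probs = c(0.025, 0.975))
c(estimate = rd_hat, lower = ci[[1]], upper = ci[[2]])
```

Refitting the outcome model inside each replicate matters: it lets the interval reflect uncertainty in the model fit as well as in the averaging step.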

Target Trial Emulation Forces Alignment Between Design and Estimation

A major strength of target trial emulation is that it aligns:

eligibility,
treatment strategy,
time zero,
follow-up,
estimand,
and analytic method.

Without that alignment, observational analyses often drift into ambiguity.

For example:

using post-baseline information to define exposure,
comparing prevalent rather than incident users,
or starting follow-up before treatment status is well defined.

These issues are often not “minor technicalities.” They fundamentally change the question being answered.

Target trial emulation helps prevent this drift.

Actual RCTs Still Have Advantages, but Emulation Improves Observational Design

Target trial emulation is not a replacement for an actual randomized trial.

A real RCT still has major advantages:

genuine randomization,
prospective protocol control,
clearer adherence tracking,
and more direct causal identification.

But in many real-world settings:

the RCT does not exist,
would be too slow,
too expensive,
or ethically infeasible.

In those settings, target trial emulation is valuable because it makes the observational design more disciplined and more transparent.

It improves observational causal analysis by forcing it to behave more like a trial protocol.

AI and Healthcare Intervention Modeling Benefit from Trial Emulation Thinking

Target trial emulation also matters for AI.

Many healthcare AI systems are increasingly asked not just to predict outcomes, but to simulate or support interventions.

That requires causal thinking.

An ML model that predicts risk well is not automatically a model that predicts what will happen under treatment strategy A versus treatment strategy B.

That is why target trial emulation is so useful in healthcare AI:

it creates clearer intervention definitions,
supports causal estimands,
and makes simulation-based intervention analysis more credible.

This is one of the strongest bridges between real-world evidence and causal AI.

Common Pitfalls in Target Trial Emulation Are Usually Design Pitfalls

Some of the most common problems include:

vague eligibility criteria
misaligned time zero
immortal time bias
use of post-treatment variables as baseline covariates
unclear estimands
ambiguous censoring rules
poor handling of treatment switching

These are often not failures of modeling sophistication. They are failures of trial specification.

That is one reason target trial emulation is such a useful discipline: it catches design ambiguity before it hardens into biased analysis.

Trauma Registry Application: Emulating Trials with Registry Data

The target trial framework is especially valuable for trauma registry research because conducting true randomized trials in trauma is often logistically or ethically impossible.

Several high-stakes trauma questions are natural target trial candidates:

Damage control versus early definitive surgery: what would survival look like if all eligible patients received one strategy versus the other?
Resuscitation ratios: what is the effect of a 1:1:1 versus 1:1:2 blood product ratio, holding eligibility criteria and time zero constant?
Transfer protocols: what is the effect of direct transport to a Level I center versus closest facility stabilization and transfer?

Each of these requires explicit design decisions about eligibility criteria, time zero, follow-up rules, and the estimand before any model is fit (Hernán and Robins 2016; Hernán, Wang, and Leaf 2022).

Without those design choices, registry analyses often answer the wrong question, mix prevalent and incident cases, or allow immortal time bias to corrupt the comparison.

Target trial emulation turns the trauma registry into a more principled study design tool.

A Practical Checklist for Applied Work

Before claiming a target trial emulation analysis, ask:

What is the hypothetical trial being emulated?
Who is eligible, exactly?
What are the treatment strategies?
When is time zero?
How does follow-up begin and end?
What is the causal estimand?
Is the analysis intention-to-treat-like or per-protocol-like?
Are the data and adjustment strategy sufficient to support the emulation credibly?

These questions often matter more than the exact package used for estimation.

Where This Shows Up in AI/ML

The Department of Defense Trauma Registry (DoDTR), linked to the MHS GENESIS electronic health record, creates the observational infrastructure to emulate trials that could never be conducted prospectively in military populations, such as a randomized trial of whole blood versus component therapy resuscitation across deployed settings. The target trial framework forces analysts to specify eligibility criteria, treatment strategies, assignment mechanism, follow-up, and outcome before touching the data, which limits the garden-of-forking-paths analytic choices that can inflate observational effect estimates. MAVEN is positioned to operationalize this approach at scale across military treatment facilities (MTFs).

The main failure mode is immortal time bias: studies that do not carefully define time zero in the emulated trial allow person-time before treatment initiation to be credited to a treatment arm, inflating apparent treatment benefit.

Closing: Target Trial Emulation Turns Observational Data into a Better Causal Design

Target trial emulation matters because observational analyses often fail when the causal question is underspecified.

By first defining the trial we wish had been run, we improve clarity about:

eligibility,
interventions,
assignment,
follow-up,
outcomes,
and estimands.

That design clarity then guides the observational emulation and the analytic method, including tools such as g-computation.

Target trial emulation matters because better causal inference does not begin with a more complex model, but with a better-defined question and a design that makes the observational study look as much like the ideal trial as possible.

📚 Go Deeper: Real-World Evidence Toolkit

This post is part of the Real-World Evidence Toolkit — a companion reference with target trial specification templates, g-computation scaffolds, time-zero alignment code, and observational study design checklists.


Series Callout

This post is part of a broader Advanced Topics in Applied Statistics for AI and Clinical Decision-Making Series:

Missing data methods
Imputation techniques
Sensitivity analysis for missing data
Causal inference methods
Propensity score methods
Instrumental variables
Confounding and bias adjustment in RWE
Target trial emulation
Meta-analysis and evidence synthesis
External validity and generalizability in RWE

References

Hernán, Miguel A., and James M. Robins. 2016. “Using Big Data to Emulate a Target Trial When a Randomized Trial Is Not Available.” American Journal of Epidemiology 183 (8): 758–64. https://doi.org/10.1093/aje/kwv254.

Hernán, Miguel A., Wei Wang, and David E. Leaf. 2022. “Target Trial Emulation: A Framework for Causal Inference from Observational Data.” JAMA 328 (24): 2446–47. https://doi.org/10.1001/jama.2022.21383.