Medical Education Curriculum

This curriculum maps the Data InDeed blog series into a structured, clinically grounded learning pathway for physicians at every stage of training. It is not designed for PhD statisticians. It is designed for clinicians who need to read, evaluate, and act on data — during training, in practice, and when patients are affected by AI tools, registry benchmarks, or observational research.

No coding required. Every module emphasizes interpretation and critical evaluation, not derivation or implementation. Estimated time per module: 3–5 hours of reading + 1–2 hours for the practical exercise.

🎞️ Lecture slides are available for all six blog series — RevealJS slide decks with live R visualizations, speaker notes, and chalkboard mode. See each module below for direct slide links, or browse the full lecture library.

How to Use This Curriculum

Self-paced: Read one post per day, 30–45 minutes. Work through modules in any order within a tier. Complete the practical exercise before moving to the next module.

Course-based: Each tier maps to one semester of protected seminars (6 sessions for Foundation, 5 for Applied, 5 for Advanced). Assign readings in advance; use the exercise as the in-session case discussion anchor.

Tier	Audience	Modules	Est. Hours	Lecture Series
Foundation	MS1–MS2	6	25–35 hrs	Applied Stats (10 lectures) · DOE (4) · Advanced Stats (4)
Applied	MS3–MS4, PGY-1	5	20–30 hrs	Trauma Registry (5 lectures) · Ethics (4) · Advanced Stats (4)
Advanced	Fellows, clinician-researchers	5	25–35 hrs	Advanced Stats (4 lectures) · Trauma Registry (5) · OMOP (2) · Ethics (4)

Tier 1 — Foundation

For MS1–MS2 · Build statistical vocabulary and the habit of asking "how do we know this?" before accepting clinical evidence at face value.

F1 · How Evidence Is Made — Probability, Uncertainty, and the Limits of Knowing

When a test result comes back, what does it actually mean? Uncertainty is not a flaw in medicine — it is the medium clinicians work in.

~4 hrs · Applied Statistics series

Readings

Lecture Slides

Interactive Apps

L1 ① Probability — manipulate event probabilities and see joint/conditional/marginal relationships update live; anchors the diagnostic reasoning framing
L1 ② Bayes — adjust prior and likelihood; watch the posterior shift; directly maps to sensitivity/specificity and pre-test probability
L1 ③ Random Variables — vary distribution parameters and observe how expectation and variance change
L1 ⑤ Distributions — overlay clinical examples (LOS, lab values, mortality) on named distributions; use to illustrate shape → family choice
L2 ① CLT Explorer — sample repeatedly from a skewed population; watch the sampling distribution converge to normal as n grows
L2 ③ LLN Convergence — show how the running mean stabilizes as n increases; grounds expected value in simulation

Find a recent clinical trial. Identify how the sample was drawn, the reported confidence interval, and write two sentences about which patients are likely not represented in that evidence.

F2 · Reading a Study Without Being Fooled — p-Values, Confidence Intervals, and Statistical Significance

Journal clubs and drug reps invoke p-values constantly. Most clinicians accept them uncritically. This module builds the vocabulary to push back.

~4 hrs · Applied Statistics + Trauma Registry

Readings

Lecture Slides

Interactive Apps

L3 ① p-value Explorer — shift effect size, sample size, and variance; watch the p-value move; illustrates why large n produces small p even for trivial effects
L3 ② Type I / II Error — move the α threshold and see the trade-off between false positives and false negatives in real time; essential for the arbitrary α = 0.05 discussion
L3 ③ CI Coverage — repeatedly sample and observe how often the interval contains the true value; corrects the common misconception about what 95% means
L3 ④ CI Precision — show how n and variance drive interval width independent of the p-value
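
The point that a large n produces a small p even for a trivial effect can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not taken from the lecture apps: it assumes a two-sided one-sample z-test with a known standard deviation, and the function name and the effect size of 0.02 standard deviations are hypothetical.

```python
import math

def two_sided_p(effect, sd, n):
    """Two-sided p-value for a one-sample z-test of a mean difference
    `effect` against a null of 0, assuming a known standard deviation `sd`."""
    z = effect / (sd / math.sqrt(n))
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical trivial effect: 0.02 standard deviations, held fixed.
for n in (100, 10_000, 100_000):
    print(n, two_sided_p(effect=0.02, sd=1.0, n=n))
```

The effect never changes, yet the p-value falls from about 0.84 at n = 100 to about 0.046 at n = 10,000. The p-value is tracking sample size here, not clinical importance.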

Take a published RCT abstract. Rewrite the results section, replacing every "statistically significant" with a precise description of the effect size and its confidence interval. Does the conclusion change?

F3 · Study Design — Why the Architecture of Evidence Determines What You Can Conclude

Before reading results, ask: what kind of study is this, and what can that design actually prove? Not all study types answer the same question.

~5 hrs · Design of Experiments series

Readings

Lecture Slides

Interactive Apps

L2 ⑤ Power & Sample Size — vary effect size, α, and power; show how each design choice has a sample size consequence; connects study design decisions to resource constraints
L2 ④ Survivorship Bias — illustrate how selection at enrollment vs. follow-up distorts study findings; applicable to registry-based observational designs

Read one RCT and one observational study on the same clinical question. List the design differences and explain which question each can answer — and which it cannot.

F4 · Regression and Prediction — What Clinical Models Actually Do

Every clinical risk score — HEART, Wells, APACHE II — is a model. Understanding what models do (and fail to do) is a prerequisite for using them responsibly.

~5 hrs · Applied Statistics series

Readings

Lecture Slides

Interactive Apps

L5 ① OLS Explorer — drag data points; watch the regression line, residuals, and R² update; builds intuition for what "best fit" means and why outliers are influential
L5 ② LINE Diagnostics — toggle assumption violations (heteroscedasticity, non-linearity); show what "bad" residual plots look like vs. well-behaved ones
L5 ③ Logistic Regression — vary predictor values; see how the log-odds to probability transformation works; ground the odds ratio interpretation
L5 ④ Calibration — compare a well-calibrated model to one that systematically over- or under-predicts; show the calibration curve and Brier score
L8 ① Bias-Variance — move model complexity; watch training vs. test error diverge; the clearest possible illustration of overfitting
L9 ① ROC & AUC — shift the classification threshold; show the sensitivity/specificity trade-off live; explain why AUC 0.5 = chance and what 0.75 vs. 0.90 means clinically
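
For readers who want to see the log-odds to probability transformation written out, here is a minimal sketch. The intercept and coefficient are hypothetical values chosen for illustration, not estimates from any real risk score, and the function names are assumptions of this example.

```python
import math

def probability_from_log_odds(log_odds):
    """Logistic (inverse-logit) transformation."""
    return 1.0 / (1.0 + math.exp(-log_odds))

def odds(p):
    """Convert a probability to odds."""
    return p / (1.0 - p)

INTERCEPT = -2.0    # hypothetical baseline log-odds
COEFFICIENT = 0.7   # hypothetical log-odds change per one-unit increase in x

def predicted_risk(x):
    return probability_from_log_odds(INTERCEPT + COEFFICIENT * x)

# Compare each one-unit step in the predictor.
for x in range(4):
    p0, p1 = predicted_risk(x), predicted_risk(x + 1)
    print(f"x={x}->{x + 1}: risk {p0:.3f} -> {p1:.3f}, "
          f"odds ratio {odds(p1) / odds(p0):.3f}")
```

The odds ratio for a one-unit increase is the same at every step (exp(0.7), about 2.01), while the absolute change in risk depends on where the patient starts. This is why an odds ratio alone does not tell a clinician how much a patient's risk changes.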

Look up the derivation paper for a risk score used in your intended specialty. Identify: what type of model it uses, what outcome it predicts, how it was validated, and what populations it was not validated in.

F5 · Missing Data — The Patients Who Aren't in the Table

Missing data is not a statistical nuisance — it is often the most important signal in a dataset. In clinical settings, who is missing is rarely random.

~4 hrs · Advanced Statistics + Ethics

Readings

Lecture Slides

In a published paper, identify how missing data was handled. Was missingness described by patient characteristics? Was it tested for systematic patterns? Write a one-paragraph critique.

F6 · Survival Analysis — Reading Time-to-Event Evidence

Kaplan-Meier curves appear in nearly every oncology and cardiology trial. Understanding what they show — and what the thinning tail means — is essential literacy.

~3 hrs · Applied Statistics + Toolkit

Readings

Lecture Slides

Applied Stats L3 — Inference, Sampling & Survival Analysis

Interactive Apps

L6 ③ Censoring & KM — add and move censoring events; watch the KM curve update and the risk table change; illustrates why the right tail is unreliable
L6 ④ Cox Model — vary covariate values; show how the hazard ratio shifts and how proportional hazards can be violated; grounds interpretation of published Cox tables

Annotate a Kaplan-Meier curve from a landmark trial: where are the censoring marks, what is the median survival, what happens to the confidence band at the right tail, and what does the log-rank p-value tell you (and not tell you)?
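
To see where the steps of a Kaplan-Meier curve come from, the product-limit calculation can be sketched in a few lines. This is an optional illustration under stated assumptions, not a replacement for statistical software: the five-patient cohort is hypothetical, and patients censored at an event time are counted as still at risk at that time, which is the usual convention.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function S(t).

    times  : follow-up time for each patient
    events : 1 if the event was observed at that time, 0 if censored
    Returns a list of (time, number_at_risk, survival), one per event time.
    """
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        at_risk = sum(1 for x in times if x >= t)
        deaths = sum(1 for x, e in zip(times, events) if x == t and e == 1)
        if deaths:
            # Each event time multiplies S(t) by the fraction surviving it.
            survival *= 1 - deaths / at_risk
            curve.append((t, at_risk, survival))
    return curve

# Hypothetical toy cohort of five patients (months of follow-up).
times = [2, 3, 3, 5, 8]
events = [1, 1, 0, 1, 0]
for t, at_risk, s in kaplan_meier(times, events):
    print(f"t={t}  at risk={at_risk}  S(t)={s:.2f}")
```

In this toy cohort only two patients remain at risk at month 5, so a single event halves the estimate. The same arithmetic is why the right tail of a published curve, where the risk table shows few patients, deserves caution.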

Tier 2 — Applied

For MS3–MS4 and PGY-1 Residents · Connect statistical reasoning to clinical decision-making encountered on wards, in journal clubs, and when evaluating AI tools in practice.

A1 · Confounding and Causation — Why "Associated With" Is Not "Causes"

A study shows patients who received Drug X had worse outcomes. Does Drug X cause harm — or were sicker patients more likely to receive it? Confounding is the central threat to observational evidence.

~5 hrs · Advanced Statistics series

Readings

Lecture Slides

Read an observational study from your current rotation. Identify the main exposure, outcome, and at least three plausible confounders. Assess whether residual confounding is likely and in what direction.

A2 · Clinical Prediction Tools and AI — What to Ask Before You Trust a Score

The ED is full of decision tools. The ICU deploys sepsis algorithms. Radiology AI flags the chest X-ray before the attending sees it. How should a resident decide whether these tools are trustworthy?

~5 hrs · Trauma Registry + Toolkit

Readings

Lecture Slides

Interactive Apps

L9 ① ROC & AUC — shift prevalence of the rare outcome; watch how a fixed AUC translates into very different PPV; makes the rare-event problem viscerally clear
L9 ② Calibration — compare a model's predicted risk to observed outcomes by decile; shows when a model with high AUC still fails clinically due to miscalibration
L9 ③ Variable Importance — toggle predictors; observe how importance rankings shift with data; grounds the discussion of which features are driving a black-box score
L8 ⑤ k-Fold CV — show the variance in performance across folds; illustrates why a single train/test split gives an unstable, and sometimes optimistic, estimate of model performance
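
The rare-event problem can also be worked by hand. The sketch below is a simplified illustration rather than a reproduction of the app: instead of a full ROC curve it fixes one operating point (a hypothetical sensitivity and specificity of 0.90 each) and varies only prevalence.

```python
def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value from Bayes' theorem."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

# Same hypothetical test, three different populations.
for prevalence in (0.50, 0.10, 0.01):
    print(f"prevalence={prevalence:.2f}  PPV={ppv(0.90, 0.90, prevalence):.3f}")
```

With these assumed values, PPV is 0.90 at 50% prevalence, 0.50 at 10%, and about 0.08 at 1%. The discrimination of the tool has not changed; the population has.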

Identify one AI-assisted tool deployed at your training site. Find out how it was validated, in what patient population, and whether local performance is monitored. Write a one-page critical appraisal.

A3 · Real-World Evidence — What Registry Data and EHR Studies Can and Cannot Show

Much of the evidence shaping clinical guidelines comes not from RCTs but from registries, claims databases, and EHR studies. Understanding what real-world evidence can and cannot prove is essential for practice.

~5 hrs · Advanced Statistics + Design of Experiments

Readings

Lecture Slides

Find a clinical guideline in your specialty. For the two highest-weighted supporting studies, classify the design and note whether the evidence is from an RCT or real-world data. What does the balance tell you about confidence in the recommendation?

A4 · Ethics, Bias, and Accountability in Clinical AI

An algorithm flags Black patients as lower risk and doesn't trigger an alert. No one wrote code to discriminate — but the system discriminates. Where does the responsibility lie, and how does a clinician recognize it?

~4 hrs · Ethics series

Readings

Lecture Slides

Review a published case or news account of a clinical AI system that caused harm or produced biased outputs. Identify where bias entered, what safeguards were or weren't present, and where accountability should lie.

A5 · Bayesian Thinking at the Bedside — Updating Beliefs With Evidence

Bayesian reasoning is what clinicians do every time they order a test: start with a pre-test probability, observe a result, update. Making that process explicit improves diagnostic reasoning and protects against cognitive error.

~4 hrs · Applied Statistics + Trauma Registry

Readings

Lecture Slides

Interactive Apps

L4 ① Bayesian Updater — set a prior (flat, weakly informative, or strong), add observed data, watch the posterior shift; use with a clinical prior probability scenario
L4 ② Posterior Predictive — show how the posterior distribution generates predictions for new observations; grounds "what does this model say about the next patient?"
L4 ③ Monte Carlo — run repeated sampling to approximate intractable integrals; illustrates why simulation-based inference works when formulas don't
L4 ④ MCMC Explorer — visualize a Markov chain converging to the posterior; optional for clinical audiences but powerful for research fellows who want the "why it works" explanation

Choose a clinical scenario you have encountered. Estimate a pre-test probability, identify the likelihood ratio for a relevant test, and calculate the post-test probability. Write one sentence explaining how this changes your clinical decision — and compare this to how the decision was actually made.
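
The calculation in this exercise follows the odds form of Bayes' theorem: convert the pre-test probability to odds, multiply by the likelihood ratio, and convert back. A minimal sketch is given below for anyone who prefers to check the arithmetic by machine; the pre-test probability and likelihood ratios are hypothetical numbers, not values for any particular test.

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Update a pre-test probability with a likelihood ratio."""
    pre_odds = pre_test_probability / (1 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)

# Hypothetical scenario: pre-test probability 20%,
# positive likelihood ratio 10, negative likelihood ratio 0.1.
print(round(post_test_probability(0.20, 10), 3))   # after a positive result
print(round(post_test_probability(0.20, 0.1), 3))  # after a negative result
```

Under these assumptions a positive result raises the probability from 20% to about 71%, and a negative result lowers it to about 2%. A likelihood ratio of 1 leaves the probability unchanged, which is a useful sanity check.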

Tier 3 — Advanced

For Clinical Fellows and Clinician-Researchers · Develop analytical independence to critique, commission, and contribute to clinical research — including registry-based studies, AI validation, and real-world evidence generation.

Adv1 · Hierarchical Models, Clustering, and Why Healthcare Data Have Structure

Patients nest within providers, providers within hospitals, hospitals within systems. Outcomes differ by site. Ignoring that structure produces misleading conclusions — and misleading benchmarks.

~5 hrs · Trauma Registry + Applied Statistics

Readings

Lecture Slides

Interactive Apps

L7 ① PCA Explorer — rotate through PC axes; color points by severity, neuro injury, or shock; shows how PCA reveals clinical phenotypes without using the outcome as a label
L7 ② Scree & Loadings — move the "retain k PCs" slider; show cumulative variance captured; grounds the practical decision of how many dimensions to keep
L7 ③ k-Means — vary k and seed; show how the elbow plot guides selection; illustrates that cluster labels are arbitrary — their clinical meaning must be interpreted
L7 ④ Hierarchical — switch linkage methods (Ward vs. single vs. complete); show how the dendrogram shape changes; connects to the clustering-as-phenotyping narrative in the trauma registry context
L7 ⑤ Curse of Dims — increase p; watch the max/min distance ratio collapse toward 1; makes the case for dimensionality reduction before clustering high-dimensional registry data

Find a publicly available hospital quality or registry benchmarking report. Identify whether the risk-adjustment accounts for hospital-level clustering. Write a one-page methods critique.

Adv2 · Causal Inference for Clinician-Researchers — Moving Beyond Association

A fellow wants to know whether early tracheostomy reduces ventilator days. No RCTs exist. The registry data are available. How do you approach this without making causal claims the data cannot support?

~6 hrs · Advanced Statistics + Toolkit

Readings

Lecture Slides

Select a clinical question relevant to your fellowship. Draft a one-page target trial protocol: eligible population, intervention, comparator, outcome, follow-up, and assignment strategy. Then describe what observational data you would use and what threats to validity remain.

Adv3 · Missing Data, Sensitivity Analysis, and Analytic Honesty

In every registry and EHR study, data are missing. The question is not whether — it is how much, in what pattern, and what the analyst chose to do about it. Sensitivity analysis separates defensible conclusions from fragile ones.

~6 hrs · Advanced Statistics + Trauma Registry + Toolkit

Readings

Lecture Slides

Using a real or teaching dataset, document the pattern of missingness, apply two imputation strategies, compare results, and write a sensitivity analysis paragraph suitable for a methods section.

Adv4 · Emerging Real-World Evidence — Synthetic Data, Digital Twins, and AI-Enabled Evidence

Your institution is considering deploying a digital twin for postoperative monitoring. A vendor claims their synthetic data product can replace your registry cohort. What are the right questions to ask?

~4 hrs · Real-World Evidence + Toolkit

Readings

Evaluate one published or vendor-described digital twin or synthetic data application in healthcare. Answer: (1) What was the data source? (2) What fidelity metrics were reported? (3) What validity claims are made that the evidence does not support?

Adv5 · Data Standards, Interoperability, and the Architecture of Research Data

Your fellowship hospital uses Epic. The coordinating center uses a different EHR. The national registry uses its own schema. A multicenter study requires combining all three. What breaks, and who decides?

~5 hrs · OMOP + Toolkit

Readings

Lecture Slides

Map one clinical concept from your specialty through three coding vocabularies (ICD-10, SNOMED CT, CPT). Identify where definitions diverge and what would be lost in a multi-site study that naively combined records coded under each system.

What This Curriculum Does Not Do

This curriculum does not teach coding, does not require statistical software, and does not prepare learners to conduct original statistical analyses. That is intentional.

The goal is a clinician who can:

Read a methods section critically
Ask the right questions before trusting a model or registry report
Recognize when a statistical claim exceeds what the design supports
Evaluate AI tools before they affect patient care
Contribute meaningfully to a research team without delegating all methodological judgment

Note

Interested in consulting support for curriculum implementation, faculty development, or statistical programming for clinical research? Get in touch.

All readings in this curriculum are freely available through the Data InDeed blog series. No subscription or software is required.

Recommended Reading Paths

Shortest coherent path through all three tiers

Primarily interested in clinical AI evaluation

Entering a research fellowship

Building a registry-based research project

Video-first: all lecture series in sequence
