OMOP Foundations: Longitudinal Assumptions, Governance & Value-Level Metadata

OMOP & Interoperability — Lecture 1 of 2

Jonathan D. Stallings, PhD, MS

Data InDeed | dataindeed.org

2026-01-01

OMOP was built for chronic disease. Trauma is the opposite of chronic disease. Understanding this gap is the beginning of doing OMOP right.

What You’ll Learn Today

Post 01 OMOP & Longitudinal Data

  • What OMOP does well
  • How trauma breaks its assumptions
  • Time compression problem
  • Multi-context care episodes

Post 02 Interoperability as Governance

  • The comfortable myth
  • Why “OMOP-compliant” ≠ interoperable
  • Value-level drift
  • Governance questions that must be answered

Post 03 Value-Level Metadata

  • The illusion after concept mapping
  • Where trauma meaning actually lives
  • Source values and provenance
  • Value-level dictionaries

Part 1

OMOP & Longitudinal Data

Built for chronic disease — strained by trauma

What OMOP Was Built to Do

OMOP CDM strengths:

  • Standardized vocabulary across institutions (SNOMED, ICD, LOINC, RxNorm)
  • Longitudinal patient record — encounters, conditions, drugs over years
  • Federated analytics: the same SQL query runs at every site without sharing patient-level data
  • Large research networks (OHDSI: >1 billion patient records globally)
  • Validated analytical packages (ATLAS, cohort definitions, PheWAS)

The paradigm: A diabetic patient has dozens of visits over years. OMOP tracks the full trajectory — diagnoses, medications, labs — linking by person_id across time.

Where OMOP excels:

  • Pharmacoepidemiology (drug exposure → outcome over months/years)
  • Chronic disease cohort studies
  • Cross-site prevalence estimation
  • EHR-based population health analytics
  • Real-world evidence for FDA submissions

These use cases match OMOP’s visit-centric, longitudinal design perfectly.

How Trauma Breaks the OMOP Assumptions

A combat trauma episode may generate 8 structured encounters across 4 care echelons in 72 hours. A diabetes trajectory has 48 encounters over 10 years. OMOP’s visit-centric model handles the second natively — and strains on the first.

The Four Ways Trauma Breaks OMOP

1. Time compression

Injury, hemorrhage, surgery, ICU, rehab can occur within 24–72 hours. OMOP visit timestamps don’t capture the within-visit temporal order that determines trauma outcome.

2. Multi-context care

Role 1 → Role 2 → Role 3 → Role 4 is not a single patient journey in OMOP — it may be four separate visit_occurrence records at different institutions, with no shared concept of “the episode.”

3. Dynamic severity

The Injury Severity Score (ISS) is calculated at a point in time and may be revised. OMOP stores conditions as facts, not as provisional estimates that update. A Damage Control Surgery revision changes the clinical picture, and OMOP has no native mechanism for recording that change.

4. Structural missingness

In OMOP, a missing lab means only that no value was observed. In trauma, a missing lactate often means it was not drawn because the patient was too unstable. The first absence carries no information; the second is itself a severity signal. OMOP treats them identically.

The conclusion: OMOP is a translation layer, not a trauma model. You must build the trauma semantics on top — not try to fit trauma into the existing CDM tables.
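As a minimal sketch of what "building trauma semantics on top" could look like for structural missingness (item 4), the snippet below attaches an explicit reason to an absent lab value. The `MissingReason` categories, the `LabObservation` class, and the field names are assumptions made for illustration; they are not part of the OMOP CDM.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class MissingReason(Enum):
    """Hypothetical reasons a lab value can be absent (not an OMOP vocabulary)."""
    NOT_ORDERED = "not_ordered"                # no clinical indication to draw
    NOT_DRAWN_UNSTABLE = "not_drawn_unstable"  # patient too unstable to draw
    UNKNOWN = "unknown"                        # nothing recorded about why


@dataclass
class LabObservation:
    person_id: int
    lab_name: str
    value: Optional[float]                     # None when no result exists
    missing_reason: Optional[MissingReason] = None


def is_informative_missing(obs: LabObservation) -> bool:
    """True when the absence of a value is itself a clinical signal."""
    return (
        obs.value is None
        and obs.missing_reason is MissingReason.NOT_DRAWN_UNSTABLE
    )


# Two patients with no lactate result, for different reasons.
unstable = LabObservation(1, "lactate", None, MissingReason.NOT_DRAWN_UNSTABLE)
unrecorded = LabObservation(2, "lactate", None, MissingReason.UNKNOWN)
```

With this layer in place, an analysis can treat `unstable` as evidence of severity while handling `unrecorded` as ordinary missing data.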

Visit-Centric Analytics Hide the Episode

Each visit is OMOP-valid. None of them is linked as part of the same trauma episode. A query for “penetrating abdominal trauma mortality” may miss the patient who died at Role 4 — because that visit has no ICD code linking back to the original injury.
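One way to recover the episode is to group a patient's visits whenever the gap between consecutive visits is short. The sketch below assumes a 24-hour gap threshold and a simplified `Visit` tuple; both are illustrative choices, and a real implementation would need a governed definition of the threshold.

```python
from datetime import datetime, timedelta
from typing import NamedTuple


class Visit(NamedTuple):
    visit_occurrence_id: int
    person_id: int
    start: datetime
    end: datetime


def link_episodes(visits, max_gap=timedelta(hours=24)):
    """Group each person's visits into episodes.

    A visit joins the person's current episode when it starts no more than
    max_gap after the latest end time seen so far in that episode.
    Returns {visit_occurrence_id: episode_id}.
    """
    episode_of = {}
    next_episode = 1
    last_end = {}   # person_id -> latest end time in the current episode
    current = {}    # person_id -> current episode id
    for v in sorted(visits, key=lambda v: (v.person_id, v.start)):
        prev_end = last_end.get(v.person_id)
        if prev_end is None or v.start - prev_end > max_gap:
            # First visit for this person, or the gap is too long:
            # open a new episode.
            current[v.person_id] = next_episode
            next_episode += 1
            last_end[v.person_id] = v.end
        else:
            last_end[v.person_id] = max(prev_end, v.end)
        episode_of[v.visit_occurrence_id] = current[v.person_id]
    return episode_of


# Four visits across care echelons in under 72 hours, plus a later clinic visit.
visits = [
    Visit(1, 1, datetime(2024, 1, 1, 8), datetime(2024, 1, 1, 9)),
    Visit(2, 1, datetime(2024, 1, 1, 10), datetime(2024, 1, 1, 16)),
    Visit(3, 1, datetime(2024, 1, 2, 2), datetime(2024, 1, 3, 2)),
    Visit(4, 1, datetime(2024, 1, 3, 20), datetime(2024, 1, 20, 12)),
    Visit(5, 1, datetime(2025, 1, 15, 9), datetime(2025, 1, 15, 10)),
]
episodes = link_episodes(visits)
```

Here visits 1 through 4 share one episode identifier and visit 5 gets its own, so a mortality query can follow the injury to the final echelon of care.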

Part 2

Interoperability Is a Governance Problem

“OMOP-compliant” is not the same as “interoperable”

The Comfortable Myth

The myth: Schema + Vocabulary = Interoperability

The reality: Schema + Vocabulary + Value definitions + Governance + Change control + Shared semantics = Interoperability

Two trauma registries can both be OMOP-compliant and produce different answers to the same query because:

  • “Early” tourniquet means < 30 min at Site A, < 60 min at Site B
  • “Penetrating” includes blast at Site A, excludes it at Site B
  • “ISS” is calculated at admission at Site A, at discharge at Site B
  • “Mortality” is 30-day at Site A, in-hospital at Site B

None of these differences are visible at the schema level. All of them invalidate pooled analysis.
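The effect is easy to demonstrate. In the sketch below, the same six patients are scored against the two "early" tourniquet definitions listed above. The minutes-to-tourniquet values are invented purely for illustration, and the site names are placeholders.

```python
# Minutes from injury to tourniquet application for the same six patients.
# These numbers are made up for illustration only.
minutes_to_tourniquet = [12, 25, 31, 45, 58, 75]

# Site-level definitions of "early", in minutes, as described in the list above.
SITE_EARLY_THRESHOLD_MIN = {"site_a": 30, "site_b": 60}


def early_rate(minutes, threshold_min):
    """Fraction of patients whose tourniquet counts as 'early' at this site."""
    early = [m for m in minutes if m < threshold_min]
    return len(early) / len(minutes)


rates = {
    site: early_rate(minutes_to_tourniquet, threshold)
    for site, threshold in SITE_EARLY_THRESHOLD_MIN.items()
}
```

Identical patients yield an early-application rate of 2 in 6 under the Site A definition and 5 in 6 under the Site B definition, even though both datasets would pass the same schema checks.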

Vocabulary Alignment Is the Easy Part

OMOP reliably solves the bottom two layers, schema and vocabulary. The top three, where interoperability actually breaks, require governance infrastructure that OMOP cannot provide by itself.

Governance Questions Interoperability Depends On

Value definition governance:

  • Who defines “prehospital tourniquet application”?
  • What time window qualifies as “early”?
  • How is ISS calculated when mechanism is ambiguous?
  • When is a transferred patient’s injury time recorded?

Change control governance:

  • When a clinical definition changes, how are historical records handled?
  • Who approves changes to code lists?
  • What is the versioning convention for concept sets?

Drift governance:

  • How are new injury mechanisms captured (emerging threats)?
  • What happens when ICD codes are updated?
  • Who audits inter-site definitional drift annually?

The DoDTR multi-site problem:

DoDTR, JTTS, TCCC records, theater EHRs, and VA/DoD health records may all be “OMOP-compliant.”

They will not interoperate correctly for trauma research without:

  1. A shared data dictionary (value-level)
  2. A governance body that owns definitions
  3. A versioned change-control process
  4. Explicit temporal conventions (t=0, follow-up window)
  5. A federated validation mechanism

These are organizational decisions, not technical ones.
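Once a governance body owns the definitions, part of the drift audit can be automated. The following is a rough sketch, assuming each site publishes its value-level definitions as a simple dictionary; the key names are hypothetical, and the values reuse the Site A and Site B examples from earlier in this lecture.

```python
def find_definition_drift(site_definitions):
    """Return {field: {site: value}} for every field whose definition
    is not identical across all sites (a missing field counts as drift)."""
    fields = set()
    for definition in site_definitions.values():
        fields.update(definition)

    drift = {}
    for field in sorted(fields):
        per_site = {
            site: definition.get(field)
            for site, definition in site_definitions.items()
        }
        # Compare by repr so unhashable values such as lists are handled.
        if len(set(map(repr, per_site.values()))) > 1:
            drift[field] = per_site
    return drift


# Hypothetical key names; values follow the site examples in this lecture.
site_definitions = {
    "site_a": {
        "early_tourniquet_max_min": 30,
        "penetrating_includes_blast": True,
        "iss_timing": "admission",
        "mortality_window": "30-day",
        "iss_valid_range": [1, 75],
    },
    "site_b": {
        "early_tourniquet_max_min": 60,
        "penetrating_includes_blast": False,
        "iss_timing": "discharge",
        "mortality_window": "in-hospital",
        "iss_valid_range": [1, 75],
    },
}

drift = find_definition_drift(site_definitions)
```

The audit reports four drifting fields and leaves `iss_valid_range` alone because both sites agree on it. The code only detects the disagreement; deciding which definition wins remains an organizational decision.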

Part 3

Value-Level Metadata

Where trauma meaning actually lives

The Illusion of Completion After Concept Mapping

A trauma analyst maps MECHANISM_OF_INJURY → SNOMED:417746004 (trauma).

This is not interoperability. It is vocabulary alignment.

What’s still unresolved:

Question                                             | OMOP schema answers | Value-level dict answers
What values are valid for this field?                | ❌ No               | ✅ Yes
Is ‘blast’ included in ‘penetrating’?                | ❌ No               | ✅ Yes
What is the source system format?                    | ❌ No               | ✅ Yes
When was this value recorded relative to injury?     | ❌ No               | ✅ Yes
Is this value provisional or final?                  | ❌ No               | ✅ Yes
What clinical protocol defines this threshold?       | ❌ No               | ✅ Yes

What a Value-Level Dictionary Actually Contains

# Example: value-level metadata for ISS in DoDTR
field: injury_severity_score
omop_concept_id: 4310832
source_table: INCIDENT_RECORD
source_column: ISS_SCORE

valid_range: [1, 75]
missing_codes: [99, 999, -1]
provisional_flag: true           # may be updated at discharge

temporal_reference: admission    # calculated at: admission | discharge | both
calculation_protocol: "AIS 2005 mapped to ISS by trained abstractors"
abstractor_training: "JTTS abstractor certification required"

site_variations:
  ROLE_4_STATESIDE:
    recalculation: true          # ISS revised with CT confirmation
    recalc_timing: "within 72h of admission"
  ROLE_2_DEPLOYED:
    recalculation: false         # no CT; admission ISS is final

version: "2.1"
effective_date: "2023-01-01"
approved_by: "JTS Trauma Registry Governance Board"

This is the artifact that makes two OMOP-compliant sites actually comparable.
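A dictionary entry like this becomes useful when code can apply it. The sketch below mirrors a few fields of the ISS entry as a plain Python dictionary (to avoid depending on a YAML parser) and classifies raw source values against it. The `classify_value` function is a hypothetical helper, not part of any OMOP tooling.

```python
# A subset of the ISS entry above, written as a Python dict.
ISS_ENTRY = {
    "field": "injury_severity_score",
    "valid_range": (1, 75),
    "missing_codes": {99, 999, -1},
    "provisional_flag": True,
}


def classify_value(raw, entry):
    """Classify one raw source value as 'missing', 'valid', or 'invalid'."""
    if raw is None or raw in entry["missing_codes"]:
        return "missing"
    low, high = entry["valid_range"]
    if low <= raw <= high:
        return "valid"
    return "invalid"


# Raw values as they might arrive from a source column such as ISS_SCORE.
raw_scores = [25, 99, 0, -1, None, 75]
classified = [classify_value(score, ISS_ENTRY) for score in raw_scores]
```

Without the dictionary, a raw value of 99 would pass a numeric type check and quietly inflate average severity; with it, 99 is recognized as a missing code.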

Source Values: The Provenance Chain

The OMOP source value fields:

Each OMOP clinical event table carries a source value column (for example, condition_source_value) that holds the original text or code before mapping.

SELECT
  c.concept_name,            -- standard concept, e.g. "Penetrating wound"
  co.condition_source_value  -- original source code, e.g. "GSW_ABD_PENET"
FROM condition_occurrence AS co
JOIN concept AS c
  ON co.condition_concept_id = c.concept_id
WHERE co.person_id = 12345;  -- example patient

Why this matters:

If two sites map different source values to the same SNOMED concept, a pooled query returns “the same thing” — but the underlying clinical events may be different.

The source value is the only way to trace back to the original clinical context.
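A simple check for this situation is to compare, per concept, the set of source values each site maps to it. The sketch below assumes each site can export (site, condition_concept_id, condition_source_value) triples. The concept IDs 1001 and 1002 and the source codes other than "GSW_ABD_PENET" are placeholders invented for illustration.

```python
from collections import defaultdict


def source_values_by_concept(rows):
    """rows: iterable of (site, condition_concept_id, condition_source_value).
    Returns {concept_id: {site: set of source values}}."""
    grouped = defaultdict(lambda: defaultdict(set))
    for site, concept_id, source_value in rows:
        grouped[concept_id][site].add(source_value)
    return {concept: dict(sites) for concept, sites in grouped.items()}


def concepts_with_divergent_sources(rows):
    """Concept IDs whose set of source values differs between sites."""
    divergent = []
    for concept_id, per_site in source_values_by_concept(rows).items():
        distinct_sets = {frozenset(values) for values in per_site.values()}
        if len(distinct_sets) > 1:
            divergent.append(concept_id)
    return sorted(divergent)


# Placeholder concept IDs and mostly hypothetical source codes.
rows = [
    ("site_a", 1001, "GSW_ABD_PENET"),
    ("site_b", 1001, "BLAST_ABD_FRAG"),
    ("site_a", 1002, "FALL_HEIGHT"),
    ("site_b", 1002, "FALL_HEIGHT"),
]
divergent = concepts_with_divergent_sources(rows)
```

Concept 1001 is flagged because the two sites reach it from different source codes, which is a prompt for human review rather than proof of a mapping error.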

Three-tier provenance chain:

Tier 1 (source system value): "GSW_ABD_PENET"

Tier 2 (OMOP concept mapping): SNOMED:417746004

Tier 3 (value-level metadata): “Confirmed penetrating abdominal; blast excluded; ISS ≥ 18 required”

All three tiers must be documented for a trauma registry OMOP implementation to support federated research. Most implementations stop at Tier 2.
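To make the "stops at Tier 2" problem measurable, a registry could track how far along the chain each mapped field is documented. The sketch below is one hypothetical representation; the `ProvenanceRecord` class and its field names are assumptions, not an OMOP structure.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ProvenanceRecord:
    source_value: Optional[str]     # Tier 1: value in the source system
    concept_code: Optional[str]     # Tier 2: mapped standard concept
    value_metadata: Optional[str]   # Tier 3: value-level definition


def highest_complete_tier(record: ProvenanceRecord) -> int:
    """Number of consecutive tiers documented, counting from Tier 1.
    The chain stops at the first undocumented tier."""
    tier = 0
    for value in (record.source_value, record.concept_code, record.value_metadata):
        if not value:
            break
        tier += 1
    return tier


full = ProvenanceRecord(
    "GSW_ABD_PENET",
    "SNOMED:417746004",
    "Confirmed penetrating abdominal; blast excluded; ISS >= 18 required",
)
typical = ProvenanceRecord("GSW_ABD_PENET", "SNOMED:417746004", None)
```

Here `highest_complete_tier(full)` is 3 and `highest_complete_tier(typical)` is 2, so counting records below 3 gives a rough measure of how much of a registry is not yet ready for federated research.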

Lecture 1 — Key Takeaways

OMOP & Longitudinal Data

  • OMOP built for chronic disease: visit-centric, longitudinal, stable conditions
  • Trauma: time-compressed, multi-context, dynamic severity, structural missingness
  • One trauma episode = 4+ disconnected OMOP visits with no episode linkage
  • OMOP is a translation layer — not a trauma model

Interoperability as Governance

  • “OMOP-compliant” guarantees schema and vocabulary — not interoperability
  • Value definitions, temporal conventions, and drift governance are not solved by the CDM
  • Two OMOP-compliant registries can produce different answers to the same query
  • Interoperability requires organizational decisions, not just technical ones

Value-Level Metadata

  • Concept mapping (Tier 2) is necessary but not sufficient
  • Value-level dictionaries capture: valid ranges, missing codes, provisional flags, temporal reference, site variations, version
  • Source values (Tier 1) are the only provenance chain to original clinical context
  • Without Tier 3 (value-level metadata), federated research is unreliable

The meta-lesson: OMOP provides the grammar. Governance provides the vocabulary. Value-level metadata provides the meaning. All three are required — and only the first is automatic.

Coming Up: Lecture 2

Making OMOP Work: Operational Systems & The Translation Layer

Posts 04 & 05:

  • Trauma-ready OMOP — provisional vs. final data, episode logic, latency, versioning
  • OMOP as translation layer — civilian vs. military semantic gaps, distributed analytics, what OMOP can and cannot do