Ontologies for Longitudinal Health Records
This page surveys the ontologies, terminologies, and standards used in longitudinal health records and Personal Health Knowledge Graphs (PHKGs). It maps the technology stack used by AIDAVA and related projects for health data interoperability, FAIRification, and semantic integration.
Core Clinical Terminologies
SNOMED CT
Systematized Nomenclature of Medicine — Clinical Terms
- Purpose: Comprehensive clinical terminology covering diseases, findings, procedures, body structures, organisms, substances, etc.
- Scale: 350,000+ concepts, 1M+ relationships
- Governance: SNOMED International (non-profit)
- Use in PHKG: Primary concept representation for clinical data. AIDAVA uses SNOMED CT as the backbone ontology for its reference knowledge graph — each PHKG instance maps clinical observations to SNOMED concepts.
- Key feature: Compositional — can express complex clinical concepts by combining simpler ones (post-coordination)
- Mapping: Maps to ICD-10, LOINC, Read Codes, and national terminologies
- Source: https://www.snomed.org
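Post-coordination can be illustrated with SNOMED CT compositional grammar, in which a focus concept is refined by attribute = value pairs. The sketch below only formats such an expression as a string. The `Concept` and `Expression` helpers are hypothetical, and the concept identifiers follow a commonly cited example and should be checked against a current SNOMED CT release before use.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Concept:
    """A SNOMED CT concept reference: identifier plus human-readable term."""
    sctid: str
    term: str

    def render(self) -> str:
        return f"{self.sctid} |{self.term}|"


@dataclass
class Expression:
    """A focus concept refined by attribute = value pairs (post-coordination)."""
    focus: Concept
    refinements: list[tuple[Concept, Concept]] = field(default_factory=list)

    def render(self) -> str:
        if not self.refinements:
            return self.focus.render()
        pairs = ", ".join(
            f"{attr.render()} = {value.render()}" for attr, value in self.refinements
        )
        return f"{self.focus.render()} : {pairs}"


# Illustrative identifiers (assumed; verify them in a SNOMED CT browser).
oophorectomy = Concept("83152002", "Oophorectomy")
procedure_device = Concept("405815000", "Procedure device")
laser_device = Concept("122456005", "Laser device")

expr = Expression(oophorectomy, [(procedure_device, laser_device)])
print(expr.render())
```

A real implementation would also validate each refinement against the SNOMED CT concept model, which this sketch does not attempt.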
LOINC
Logical Observation Identifiers Names and Codes
- Purpose: Universal standard for identifying medical laboratory observations, clinical measurements, and survey instruments
- Scale: 100,000+ codes
- Governance: Regenstrief Institute
- Use in PHKG: Identifies what was measured (lab test, vital sign, clinical observation). SNOMED describes the concept; LOINC identifies the measurement.
- Key feature: Every code is defined by six parts (axes): component, property, time, system, scale, and method (the method part is optional)
- Example: LOINC 2345-7 = "Glucose [Mass/volume] in Serum or Plasma"
- Source: https://loinc.org
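The six-part structure can be sketched as a small record type. The example below models LOINC 2345-7 using the abbreviated part values from its formal name; the `LoincCode` class and its field names are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LoincCode:
    """One LOINC code broken into its six parts (method may be absent)."""
    code: str
    component: str   # what is measured
    prop: str        # kind of quantity, e.g. mass concentration
    time: str        # point in time vs. interval
    system: str      # specimen or system observed
    scale: str       # quantitative, ordinal, nominal, ...
    method: Optional[str] = None

    def formal_name(self) -> str:
        """Join the parts with colons, omitting an unspecified method."""
        parts = [self.component, self.prop, self.time, self.system, self.scale]
        if self.method:
            parts.append(self.method)
        return ":".join(parts)


glucose = LoincCode(
    code="2345-7",
    component="Glucose",
    prop="MCnc",        # mass concentration
    time="Pt",          # point in time
    system="Ser/Plas",  # serum or plasma
    scale="Qn",         # quantitative
)
print(glucose.formal_name())
```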
ICD-10 / ICD-11
International Classification of Diseases
- Purpose: Standard diagnostic classification for epidemiology, health management, and clinical purposes
- Governance: WHO
- Use in PHKG: Disease classification and mortality coding. Less granular than SNOMED CT but widely mandated for billing and statistical reporting.
- Key difference from SNOMED: ICD is a classification (mutually exclusive categories in a single hierarchy, built for counting and reporting); SNOMED CT is a terminology (multiple parents and rich relationships for clinical reasoning)
- Mapping: SNOMED International maintains a SNOMED CT to ICD-10 map
- Source: https://icd.who.int
RxNorm
Normalized Names for Clinical Drugs
- Purpose: Standardized nomenclature for clinical drugs in the US, increasingly used globally
- Governance: NLM (US National Library of Medicine)
- Use in PHKG: Medication representation in longitudinal records — linking prescriptions, dispensing, and administration
- Key feature: Provides ingredient, dose form, and strength as separate concepts
- Source: https://www.nlm.nih.gov/research/umls/rxnorm
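Keeping ingredient, strength, and dose form as separate fields is what lets a longitudinal record link medication events that differ only in dose. The sketch below is a simplified illustration under that assumption; the `ClinicalDrug` class is hypothetical, and RxNorm identifiers (RxCUIs) are deliberately omitted rather than guessed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClinicalDrug:
    """A drug described by three separately coded parts."""
    ingredient: str
    strength: str
    dose_form: str

    def normalized_name(self) -> str:
        """Compose a name in the order ingredient, strength, dose form."""
        return f"{self.ingredient} {self.strength} {self.dose_form}"


def same_ingredient(a: ClinicalDrug, b: ClinicalDrug) -> bool:
    """True when two drugs share an ingredient, whatever the dose."""
    return a.ingredient.lower() == b.ingredient.lower()


prescribed = ClinicalDrug("metformin hydrochloride", "500 MG", "Oral Tablet")
dispensed = ClinicalDrug("metformin hydrochloride", "850 MG", "Oral Tablet")

print(prescribed.normalized_name())
print(same_ingredient(prescribed, dispensed))
```

In a real pipeline the comparison would use the ingredient concept identifier instead of a string match.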
Health Data Models & Standards
HL7 FHIR
Fast Healthcare Interoperability Resources
- Purpose: Standard for exchanging healthcare data electronically
- Current version: R5 is the latest published release; R4 (Release 4) remains widely implemented
- Governance: HL7 International
- Use in PHKG: Defines the resource types (Patient, Observation, Condition, MedicationRequest, etc.) that structure health data exchange. AIDAVA maps its PHKG nodes to FHIR resource profiles.
- Key feature: RESTful API, JSON/XML, modular resources
- FHIR Shorthand (FSH): Authoring language for FHIR Implementation Guides and profiles
- Source: https://hl7.org/fhir
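To make the resource model concrete, the following sketch builds a minimal FHIR Observation as a plain Python dictionary and serializes it to JSON. It reuses the LOINC glucose code from the example above; the patient identifier, value, and timestamp are invented, and the result has not been validated against any particular FHIR profile.

```python
import json


def glucose_observation(patient_id: str, value_mg_dl: float, when: str) -> dict:
    """Build a minimal FHIR Observation for a serum/plasma glucose result."""
    return {
        "resourceType": "Observation",
        "status": "final",
        "code": {
            "coding": [
                {
                    "system": "http://loinc.org",
                    "code": "2345-7",
                    "display": "Glucose [Mass/volume] in Serum or Plasma",
                }
            ]
        },
        "subject": {"reference": f"Patient/{patient_id}"},
        "effectiveDateTime": when,
        "valueQuantity": {
            "value": value_mg_dl,
            "unit": "mg/dL",
            "system": "http://unitsofmeasure.org",  # UCUM units
            "code": "mg/dL",
        },
    }


# Hypothetical patient id and measurement, for illustration only.
obs = glucose_observation("example", 95.0, "2024-01-15T08:30:00Z")
print(json.dumps(obs, indent=2))
```

A server would typically receive this JSON via an HTTP POST to its Observation endpoint; that step is left out so the sketch stays self-contained.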
OMOP CDM
Observational Medical Outcomes Partnership Common Data Model
- Purpose: Standardized data model for observational health data — enables multi-site research
- Governance: OHDSI (Observational Health Data Sciences and Informatics)
- Use in PHKG: Common representation for longitudinal observational data across institutions. Researchers can run the same analytics across different hospital systems.
- Key tables: Person, Condition_occurrence, Drug_exposure, Measurement, Observation, Procedure_occurrence
- Mapped terminologies: SNOMED CT (conditions), RxNorm (drugs), LOINC (measurements)
- Tools: ATLAS (cohort definition), OHDSI network studies
- Source: https://ohdsi.org
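The idea of running the same query at every site can be sketched with an in-memory SQLite database. The tables below carry only a small subset of the OMOP CDM columns, the data are invented, and the concept identifier is an assumed placeholder that should be looked up in the OHDSI vocabulary before use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Heavily simplified subset of two OMOP CDM tables (assumption: the real
# tables have many more columns and constraints).
conn.executescript(
    """
    CREATE TABLE person (
        person_id INTEGER PRIMARY KEY,
        year_of_birth INTEGER
    );
    CREATE TABLE measurement (
        measurement_id INTEGER PRIMARY KEY,
        person_id INTEGER REFERENCES person(person_id),
        measurement_concept_id INTEGER,
        measurement_date TEXT,
        value_as_number REAL,
        measurement_source_value TEXT
    );
    """
)

# Assumed placeholder for the standard concept behind LOINC 2345-7.
GLUCOSE_CONCEPT_ID = 3004501

conn.execute("INSERT INTO person VALUES (1, 1970)")
conn.executemany(
    "INSERT INTO measurement VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, 1, GLUCOSE_CONCEPT_ID, "2023-01-10", 92.0, "2345-7"),
        (2, 1, GLUCOSE_CONCEPT_ID, "2024-01-12", 110.0, "2345-7"),
    ],
)

# The same SQL could run unchanged at any site that uses the shared model.
rows = conn.execute(
    """
    SELECT measurement_date, value_as_number
    FROM measurement
    WHERE person_id = ? AND measurement_concept_id = ?
    ORDER BY measurement_date
    """,
    (1, GLUCOSE_CONCEPT_ID),
).fetchall()
print(rows)
```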
openEHR
Open Electronic Health Record
- Purpose: Open standard for EHR architecture — archetype-based clinical data modeling
- Governance: openEHR Foundation
- Use in PHKG: The Clinical Knowledge Manager (CKM) provides archetypes (reusable clinical data models) that can structure persisted clinical data.
- Key feature: Two-level modeling: reference model (technical) + archetypes (clinical)
- Different from FHIR: openEHR focuses on how to store data (persistence); FHIR focuses on how to exchange it
- Source: https://www.openehr.org
Phenopackets
Phenotype Data Exchange Format
- Purpose: Standard format for representing phenotypic data linked to genomic data
- Governance: GA4GH
- Use in PHKG: Structured phenotype representation for rare disease, linking patient phenotypes (HPO terms) to genomic variants
- Source: https://phenopackets.org
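A Phenopacket is usually exchanged as JSON. The sketch below assembles a minimal packet with one HPO-coded phenotypic feature, following the field names of the version 2 schema as understood here. The identifiers, dates, curator name, and ontology version are placeholders, and the output has not been run through a schema validator.

```python
import json


def make_phenopacket(packet_id: str, subject_id: str,
                     hpo_id: str, hpo_label: str, onset_age: str) -> dict:
    """Build a minimal phenopacket with a single phenotypic feature."""
    return {
        "id": packet_id,
        "subject": {"id": subject_id},
        "phenotypicFeatures": [
            {
                "type": {"id": hpo_id, "label": hpo_label},
                # Age at onset as an ISO 8601 duration, e.g. "P2Y" = 2 years.
                "onset": {"age": {"iso8601duration": onset_age}},
            }
        ],
        "metaData": {
            "created": "2024-01-15T00:00:00Z",   # placeholder timestamp
            "createdBy": "example-curator",      # placeholder author
            "resources": [
                {
                    "id": "hp",
                    "name": "human phenotype ontology",
                    "url": "http://purl.obolibrary.org/obo/hp.owl",
                    "version": "PLACEHOLDER",    # fill in the release used
                    "namespacePrefix": "HP",
                    "iriPrefix": "http://purl.obolibrary.org/obo/HP_",
                }
            ],
            "phenopacketSchemaVersion": "2.0",
        },
    }


packet = make_phenopacket("packet-1", "patient-1", "HP:0001250", "Seizure", "P2Y")
print(json.dumps(packet, indent=2))
```

Genomic interpretations, which link the phenotype to specific variants, would be added as a further top-level block and are omitted here.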
Domain-Specific Ontologies
Human Phenotype Ontology (HPO)
- Purpose: Standard vocabulary for phenotypic abnormalities in human disease
- Scale: 18,000+ terms, 300,000+ annotations to diseases
- Use in PHKG: Describing patient phenotypes longitudinally — tracking symptoms and signs over time
- Source: https://hpo.jax.org
Gene Ontology (GO)
- Purpose: Standard representation of gene function across species
- Domains: Molecular function, biological process, cellular component
- Use in PHKG: Linking genomic data to functional annotations in longitudinal genomics records
- Source: http://geneontology.org
Orphanet Nomenclature
- Purpose: Standard terminology for rare diseases
- Scale: 6,000+ rare diseases
- Use in PHKG: Rare disease identification in longitudinal records, linking to Orphacodes for cross-border data exchange
- Source: https://www.orpha.net
GA4GH Standards
Global Alliance for Genomics and Health
- Purpose: Framework for responsible genomic data sharing
- Key standards:
  - Beacon API: Query whether a dataset contains a particular genomic variant
  - VCF: Variant Call Format for genomic variants
  - Phenopackets: Phenotype data linked to genomics (see above)
  - Passports and DUO: GA4GH Passports convey a researcher's identity and access permissions; the Data Use Ontology (DUO) encodes the permitted uses of a dataset
- Use in PHKG: Genomic data representation and sharing in longitudinal health records
- Source: https://www.ga4gh.org
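The Beacon idea of answering only yes or no to "does this dataset contain this variant?" can be shown with a toy, offline example. The sketch parses a few VCF data lines into a set and checks membership. It is not the Beacon API itself, the function names are hypothetical, and the variants are synthetic.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str


def parse_vcf(text: str) -> set[Variant]:
    """Read the first five VCF columns, one Variant per alternate allele."""
    variants = set()
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines, meta-information, and the header
        chrom, pos, _id, ref, alts = line.split("\t")[:5]
        for alt in alts.split(","):
            variants.add(Variant(chrom, int(pos), ref, alt))
    return variants


# Synthetic VCF content; coordinates and alleles are made up.
VCF_TEXT = "\n".join(
    [
        "##fileformat=VCFv4.3",
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
        "1\t12345\t.\tA\tG\t50\tPASS\t.",
        "2\t67890\t.\tC\tT,G\t99\tPASS\t.",
    ]
)

dataset = parse_vcf(VCF_TEXT)


def beacon_exists(chrom: str, pos: int, ref: str, alt: str) -> bool:
    """Beacon-style answer: is this exact variant present in the dataset?"""
    return Variant(chrom, pos, ref, alt) in dataset


print(beacon_exists("2", 67890, "C", "G"))
print(beacon_exists("1", 12345, "A", "T"))
```

Returning only a boolean, never the underlying records, is what makes this style of query useful for discovery without disclosing individual-level data.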
FAIRification & Semantic Web
FAIR Principles
Findable, Accessible, Interoperable, Reusable
- Applied to health data through:
  - Persistent identifiers (DOIs, URIs)
  - Rich metadata (Dublin Core, DCAT)
  - Standard vocabularies (all ontologies above)
  - Open protocols (REST APIs, SPARQL)
- AIDAVA connection: AIDAVA's first technology pillar is "Automation of quality enhancement and FAIRification" of collected health data
RDF / OWL / SPARQL
- RDF: Resource Description Framework — graph data model for representing knowledge
- OWL: Web Ontology Language — for defining ontologies with rich axioms
- SPARQL: Query language for RDF databases
- Use in PHKG: PHKGs are typically represented as RDF graphs, with SNOMED CT and LOINC supplying the terminology layer and FHIR-derived classes structuring the data
BioPortal
- Purpose: Repository of biomedical ontologies
- Scale: 900+ ontologies, 14M+ terms
- Governance: Stanford BMIR
- Use in PHKG: Source for ontology mappings, concept searches, and cross-ontology alignment
- Source: https://bioportal.bioontology.org
Ontology Integration Architecture (PHKG)
A typical Personal Health Knowledge Graph integrates these ontologies in layers:
- Top: Patient-specific nodes (this patient, this observation, this encounter)
- Middle: FHIR Resource profiles structuring the data (Observation, Condition, Medication)
- Bottom: Terminology codes (SNOMED CT for concepts, LOINC for measurements, RxNorm for drugs)
Cross-cutting: ICD for classification/reporting, OMOP CDM for research analytics, HPO for phenotyping, GA4GH for genomics.
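The three layers can be sketched as subject-predicate-object triples with a small pattern matcher standing in for a SPARQL query. This is a pure-Python toy and not an RDF library. The `ex:` names and predicates are hypothetical, and the SNOMED CT identifier is illustrative and should be verified.

```python
from typing import Optional

Triple = tuple[str, str, str]
Pattern = tuple[Optional[str], Optional[str], Optional[str]]

triples: list[Triple] = [
    # Top layer: patient-specific nodes (hypothetical identifiers).
    ("ex:patient-1", "rdf:type", "fhir:Patient"),
    # Middle layer: FHIR resource types structure each record.
    ("ex:obs-1", "rdf:type", "fhir:Observation"),
    ("ex:obs-1", "ex:subject", "ex:patient-1"),
    ("ex:cond-1", "rdf:type", "fhir:Condition"),
    ("ex:cond-1", "ex:subject", "ex:patient-1"),
    # Bottom layer: terminology codes give each record its meaning.
    ("ex:obs-1", "ex:code", "loinc:2345-7"),
    ("ex:cond-1", "ex:code", "sct:73211009"),  # assumed: diabetes mellitus
]


def match(pattern: Pattern, data: list[Triple]) -> list[Triple]:
    """Return triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in data
        if all(p is None or p == v for p, v in zip(pattern, t))
    ]


def codes_for(patient: str) -> list[str]:
    """Walk from a patient through its resources down to terminology codes."""
    resources = [s for s, _, _ in match((None, "ex:subject", patient), triples)]
    return sorted(
        code
        for resource in resources
        for _, _, code in match((resource, "ex:code", None), triples)
    )


print(codes_for("ex:patient-1"))
```

In a production PHKG the same traversal would be a SPARQL query over an RDF store, with full IRIs in place of the shortened names used here.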
Key Research Papers
- "An ontology-based rare disease common data model harmonising international registries, FHIR, and Phenopackets" — Nature (2025)
- "CONNECTED: leveraging digital twins and personal knowledge graphs in healthcare digitalization" — Frontiers (2025)
- "FAIRification of health-related data using semantic web technologies in the Swiss Personalized Health Network" — Nature (2024)
- "A multimodal vision knowledge graph of cardiovascular disease" — Nature (2025)
- "Genomics on FHIR — a feasibility study to support a National Strategy for Genomic Medicine" — Nature (2024)
- "TIMER: temporal instruction modeling and evaluation for longitudinal clinical records" — npj Digital Medicine (2025)