AIDAVA
AI-powered Data Curation & Publishing Virtual Assistant. EU Horizon Europe research project automating curation and publishing of personal health data using AI.
Overview
AIDAVA prototypes and tests an AI-powered virtual assistant that maximizes automation of the curation and publishing of unstructured and structured heterogeneous health data. The assistant comprises a backend library of AI-based data curation tools and a frontend built on human-AI interaction modules.[1]
Key Facts
| Detail | Value |
|---|---|
| Grant ID | 101057062 |
| Funding Call | HORIZON-HLTH-2021-TOOL-06 |
| Total Cost | €7,720,620 |
| EU Contribution | €7,720,620 (100% funded) |
| Start Date | September 1, 2022 |
| End Date | August 31, 2026 |
| Duration | 4 years |
| Partners | 13 from 9 countries |
| Type | Horizon Europe Research and Innovation Action (RIA) |
| Website | aidava.eu |
Vision
"Curate once, reuse many times" — supporting patients, clinical care providers and clinical researchers from the same curated data.[2]
The project addresses a core problem: integrated, high-quality personal health data represents a potential wealth of knowledge for healthcare systems, but there is no reliable conduit for making this data interoperable, AI-ready and reuse-ready at scale.[3]
Technology Pillars
- Automation of quality enhancement and FAIRification (making data Findable, Accessible, Interoperable and Reusable) of collected health data, in compliance with EU data privacy requirements
- Knowledge graphs with ontology-based standards as universal representation, to increase interoperability and portability — each Personal Health Knowledge Graph (PHKG) is an instance of a common reference knowledge graph based on ontologies derived from SNOMED, HL7 FHIR resource profiles, LOINC, and other domain-specific terminologies[4]
- Deep learning for information extraction from narrative content (NLP in three languages)
- AI-generated explanations during the process to increase users' confidence (explainability)
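To make the knowledge graph pillar more concrete, the following is a minimal sketch of a personal graph stored as subject-predicate-object triples and checked against a tiny reference vocabulary. It is an illustration only, not AIDAVA's actual data model: the `ref:` classes and properties, the identifiers `patient:1`, `cond:1` and `obs:1`, and the helper functions are all hypothetical, and the two codes (SNOMED CT 38341003 for hypertensive disorder, LOINC 8480-6 for systolic blood pressure) are included purely as examples.

```python
from __future__ import annotations

from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Hypothetical stand-in for the common reference knowledge graph:
# the classes and properties that every personal graph may use.
REFERENCE_CLASSES = {"ref:Patient", "ref:Condition", "ref:Observation"}
REFERENCE_PROPERTIES = {
    "rdf:type",
    "ref:hasCondition",
    "ref:hasObservation",
    "ref:codeSystem",
    "ref:code",
}


def build_phkg() -> list[Triple]:
    """Build a toy personal graph for one patient (illustrative data only)."""
    return [
        Triple("patient:1", "rdf:type", "ref:Patient"),
        Triple("patient:1", "ref:hasCondition", "cond:1"),
        Triple("cond:1", "rdf:type", "ref:Condition"),
        Triple("cond:1", "ref:codeSystem", "snomed"),
        Triple("cond:1", "ref:code", "38341003"),
        Triple("patient:1", "ref:hasObservation", "obs:1"),
        Triple("obs:1", "rdf:type", "ref:Observation"),
        Triple("obs:1", "ref:codeSystem", "loinc"),
        Triple("obs:1", "ref:code", "8480-6"),
    ]


def validate(graph: list[Triple]) -> list[str]:
    """Return a list of problems; an empty list means the graph only
    uses classes and properties defined in the reference vocabulary."""
    problems = []
    for t in graph:
        if t.predicate not in REFERENCE_PROPERTIES:
            problems.append(f"unknown property: {t.predicate}")
        elif t.predicate == "rdf:type" and t.obj not in REFERENCE_CLASSES:
            problems.append(f"unknown class: {t.obj}")
    return problems


def codes_by_system(graph: list[Triple]) -> dict[str, list[str]]:
    """Group the terminology codes in the graph by their code system."""
    system_of = {t.subject: t.obj for t in graph if t.predicate == "ref:codeSystem"}
    result: dict[str, list[str]] = {}
    for t in graph:
        if t.predicate == "ref:code":
            result.setdefault(system_of[t.subject], []).append(t.obj)
    return result


if __name__ == "__main__":
    graph = build_phkg()
    print(validate(graph))
    print(codes_by_system(graph))
```

Because every personal graph is constrained to the same reference vocabulary, a query such as `codes_by_system` can run unchanged across patients and source systems, which is the interoperability benefit this pillar aims for. A production system would more plausibly use RDF tooling and formal ontologies than plain tuples.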
Use Cases
- Breast cancer patient registries — structured registry data curation
- Longitudinal health records for cardiovascular patients — integrating heterogeneous data sources over time
Both use cases are tested in three languages, with hospitals and emerging personal data intermediaries.[5]
Solution Architecture
- Data cleaning machine — orchestrating multiple AI-based tools to automate curation
- Personal Health Knowledge Graph (PHKG) — universal semantic representation of all personal health data
- Conversational AI assistant — engages patients, with explainability capabilities
- Metadata capture — on data sources to support automation within a formalised Data Transfer Specification
- Tools orchestrated include: OCR, syntactic transformation, semantic transformation, entity deduplication, NLP, feature extraction from imaging[6]
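The orchestration idea behind the components above can be sketched as a small pipeline in which metadata captured about a source selects which curation tools run, and in what order. This is a hedged illustration under stated assumptions, not the project's implementation: the `Document` class, the `PLANS` table, the media type labels, and the stub steps are all hypothetical, and each step only records its name instead of doing real OCR, NLP or transformation work.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Document:
    source_id: str
    media_type: str  # metadata captured about the source (hypothetical labels)
    content: str
    log: list[str] = field(default_factory=list)  # names of the steps applied


Step = Callable[[Document], Document]


def make_stub(name: str) -> Step:
    """Create a placeholder step that only records that it ran."""
    def step(doc: Document) -> Document:
        doc.log.append(name)
        return doc
    return step


ocr = make_stub("ocr")
syntactic_transformation = make_stub("syntactic_transformation")
semantic_transformation = make_stub("semantic_transformation")
nlp = make_stub("nlp")
entity_deduplication = make_stub("entity_deduplication")

# Hypothetical mapping from source metadata to an ordered list of tools.
PLANS: dict[str, list[Step]] = {
    "image/scan": [ocr, nlp, semantic_transformation, entity_deduplication],
    "text/plain": [nlp, semantic_transformation, entity_deduplication],
    "text/csv": [syntactic_transformation, semantic_transformation, entity_deduplication],
}


def curate(doc: Document) -> Document:
    """Run the plan selected by the document's source metadata."""
    steps = PLANS.get(doc.media_type)
    if steps is None:
        raise ValueError(f"no curation plan for media type {doc.media_type!r}")
    for step in steps:
        doc = step(doc)
    return doc


if __name__ == "__main__":
    scanned = curate(Document("letter-001", "image/scan", "raw pixels"))
    print(scanned.log)
```

Keeping the plan in a data table driven by source metadata, instead of hard-coding it, means a new source type can be supported by adding an entry. The `log` field also gives a simple trace of which tools touched a record, which a real system could surface as part of its explanations.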
Impact
- Decrease workload of clinical data stewards through increased automation
- Improve effectiveness of clinical care through high-quality data
- Support clinical research with reusable, interoperable data
- Long-term: democratise participation in data curation by citizens/patients
- Support delivery of the European Health Data Space (EHDS)[7]
Partners
| Organization | Country | Role |
|---|---|---|
| Maastricht University | Netherlands | Coordinator |
| KU Leuven | Belgium | Research partner |
| i-HD | Belgium | Health data standards & quality |
| Egnosis | Romania | Health data intermediary |
| Ontotext | Bulgaria | Knowledge graph technology |
| Averbis | Germany | NLP / text mining |
| Medical University of Graz | Austria | Clinical partner |
| North Estonia Regional Hospital | Estonia | Clinical partner (use case) |
| European Cancer Patient Coalition | Belgium | Patient advocacy |
| European Heart Network | Belgium | Patient advocacy (cardiovascular) |
| B!LOBA | Belgium | Data management |
| DFP Research | Spain | Research partner |
| EURICE | Germany | Project management |