WP3 – Information Extraction and Data Access


The main goal of this work package is to use NLP techniques to support domain experts and IT experts in building comprehensive, standards-based information models of the sources, including information that is currently available only in free-text reports. The extraction process will be gradual and guided by the core dataset identified for the relevant domain: a search with a clear goal in mind, not a random search in an infinite space. The idea is to progressively understand the data available in the free-text reports, to find, classify and annotate the relevant concepts and, where possible, to identify the relations between them.

WP2 will provide tools to perform syntactic and semantic analysis of the textual data held in the EHR and clinical trial systems of the EURECA environment.

More broadly, this work package also aims to supplement the integration of clinical trial management systems with EHRs and with external information sources, in order to support patient enrolment in clinical trials and patient safety by facilitating clinical research. In concrete terms, it will develop uniform interfaces to clinical trial repositories, clinical trial data and other sources of relevant information, such as computerised clinical guidelines.


3.1 - Initial prototype for concept extraction out of EHR free text

3.2 - Initial prototype for relation identification between concepts

3.3 - Service for uniform access to clinical trial data and other external sources

3.4 - Recommendations for extended minimal set of data representing clinical trials

3.5 - Refined IE prototypes based on evaluation with the users

3.6 - Data model for clinical trial data repository


Deliverable D3.1 is available at:

http://share.ecancer.org/EURECA/Deliverables/D3 1