The SIFR project proposes to investigate the scientific and technical challenges in building ontology-based services to leverage biomedical ontologies and terminologies in indexing, mining and retrieval of French biomedical data.
Obj1: Design, development and deployment of the French Annotator.
Obj2: Obtain new research results to exploit and enhance ontology-based indexing services.
Obj3: Valorization of indexing services.
The volume of data in biomedicine is constantly increasing. Despite the wide adoption of English in science, a significant quantity of these data is in French. Usually, the content of a resource is indexed to enable keyword queries. However, keyword-based indexing has well-known limits: it does not handle synonyms or polysemy, and it lacks domain knowledge. Biomedical data integration and semantic interoperability are necessary to enable new scientific discoveries that could be made by merging different available data (i.e., translational research). A key aspect in addressing semantic interoperability for life sciences is the use of terminologies and ontologies as a common denominator to structure biomedical data and make them interoperable. In particular, the community has turned toward ontologies to design semantic indexes of data that leverage medical knowledge for better information mining and retrieval. However, while various tools exist for English, far fewer ontologies are available in French, and there is a serious lack of related tools and services to exploit them. This lack is at odds with the huge amount of biomedical data produced in French, especially in the clinical world (e.g., electronic health records).
The Semantic Indexing of French Biomedical Data Resources (SIFR) project proposes to investigate the scientific and technical challenges in building ontology-based services to leverage biomedical ontologies and terminologies in indexing, mining and retrieval of French biomedical data. We will build an ontology-based indexing workflow (i.e., the French Annotator) similar to what exists for English resources but dedicated to and specialized for French. Within the project, we work on several research questions in semantic indexing, text mining, terminology extraction, ontology enrichment, disambiguation, multilingualism in ontologies and semantic annotation, in order to offer the community services and applications capable of leveraging biomedical ontologies in their data workflows. We will follow the translational bioinformatics and Semantic Web visions to discover new knowledge by recombining already existing knowledge. Our main goal is to enable straightforward use of ontologies, freeing health researchers from knowledge engineering issues so that they can concentrate on the biological and medical challenges.
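To make the idea of ontology-based indexing concrete, the following is a minimal sketch of dictionary-based concept annotation. It is not the French Annotator itself. The terminology, the concept identifiers (`EX:0001`, `EX:0002`) and the function names are hypothetical, chosen only for illustration. It shows how mapping both preferred labels and synonyms to one concept identifier lets two differently worded French sentences be indexed under the same concept, which plain keyword indexing would miss.

```python
import re
import unicodedata

# Hypothetical toy terminology: identifiers and labels are illustrative only
# and are not taken from any real ontology.
TOY_TERMINOLOGY = {
    "EX:0001": ["infarctus du myocarde", "crise cardiaque"],
    "EX:0002": ["hypertension artérielle", "HTA"],
}


def normalize(text):
    """Lowercase and strip accents so 'artérielle' also matches 'arterielle'."""
    decomposed = unicodedata.normalize("NFD", text.lower())
    return "".join(c for c in decomposed if unicodedata.category(c) != "Mn")


def build_index(terminology):
    """Map every normalized label or synonym to its concept identifier."""
    return {
        normalize(label): concept_id
        for concept_id, labels in terminology.items()
        for label in labels
    }


def annotate(text, index):
    """Return sorted (start, end, concept_id) tuples for whole-word matches.

    Offsets refer to the normalized text, not to the raw input.
    """
    normalized = normalize(text)
    annotations = []
    for label, concept_id in index.items():
        # Reject matches glued to other word characters on either side.
        pattern = r"(?<!\w)" + re.escape(label) + r"(?!\w)"
        for match in re.finditer(pattern, normalized):
            annotations.append((match.start(), match.end(), concept_id))
    return sorted(annotations)


if __name__ == "__main__":
    index = build_index(TOY_TERMINOLOGY)
    for sentence in (
        "Patient hospitalisé pour une crise cardiaque.",
        "Antécédent d'infarctus du myocarde et HTA.",
    ):
        print(sentence, annotate(sentence, index))
```

This sketch handles synonyms only. It does not address polysemy or disambiguation (an acronym such as "HTA" may have several meanings), nor does it use the ontology hierarchy, all of which are among the research questions listed above.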
SIFR enables the emergence of new research domains and applications at LIRMM and materializes an important international collaboration with Stanford BMIR. SIFR will offer the French biomedical community (e.g., clinicians, health professionals, researchers) highly valuable ontology-based indexing services that will enhance their data production and consumption workflows. However, the results of the project are not limited to French and are also being tested in other domains such as agro-ecology and plant genomics. The project will put France in a key position to lead future European projects related to multilingual data management and semantic annotation and indexing in biomedicine and other domains.
[Master Research Intern - 6 months] Multilingualism in an ontology repository: the case of BioPortal [PDF]
[Master Development Intern - 6 months] Viewpoints Web App: development of a 3-tier web application for shared access to a community "brain" implemented as a dynamic graph of viewpoints [PDF]
[Postdoc - 12 months] Experimenting NCBO technologies for the Plant community [PDF]
The SIFR project is mainly funded by the French National Research Agency (ANR) within the Young Researcher (JCJC) program 2012.
The project is also supported by the CNRS, University Montpellier 2 and the Computational Biology Institute of Montpellier project.
Clement Jonquet (LIRMM) - firstname.lastname@example.org