ZENITH Team: Scientific Data Management



Scientific Data Management

The three main challenges of scientific data management can be summarized as follows: (1) scale (large data, large applications); (2) complexity (uncertain data, multi-scale, with many dimensions); (3) heterogeneity (in particular, the semantic heterogeneity of data). They are also those of data science, whose goal is to make sense of data by combining data management, machine learning, statistics and other disciplines.
Zenith’s overall goal is to address these challenges by offering innovative solutions with significant benefits in terms of scalability, functionality, ease of use and performance. To produce generic results, we express these solutions as architectures, models and algorithms that can be implemented as components or services in clusters or the cloud.
We design and validate our solutions by working closely with our scientific application partners such as INRAE and CIRAD in France, or MACC in Brazil. To further validate our solutions and extend the reach of our results, we also encourage industrial collaborations, even in non-scientific applications, provided they present similar challenges.

Esther Pacitti, Professor, UM
Florent Masseglia, Senior Researcher, INRIA
Alexis Joly, Senior Researcher, INRIA
Emmanuel Gothie, Research Engineer, INRIA
Reza Akbarinia, Researcher, INRIA
Cathy Desseaux, Assistant Engineer, INRIA
Patrick Valduriez, Senior Researcher, INRIA
Jean-Christophe Lombardo, Research Engineer, INRIA
Antoine Affouard, Study Engineer, INRIA

Associates & Students
Matteo Contini, IFREMER
Tanguy Lefort, UM
Kawtar Zaher, INA (Institut National de l’Audiovisuel, the French National Audiovisual Institute)
Joaquim Estopinan, INRIA
Cesar Leblanc, INRIA
Camille Garcin, UM
Ananthu Aniraj, INRIA

Regular Co-workers
Benoit Lange, Fixed-term Engineer-Technician, INRIA
Thomas Paillot, Fixed-term Engineer-Technician, INRIA
Pallavin Jain, Fixed-term Researcher, CIHEAM-IAMM
Maxime Ryckewaert, Fixed-term Researcher, INRIA
Konstantinos Panousis, Fixed-term Researcher, INRIA
Pierre Leroy, Fixed-term Engineer-Technician, INRIA
François Munoz, Long-term Visitor (Inria Chair), INRIA
Claire Marine Parodi, Fixed-term Engineer-Technician, INRIA
Hugo Gresse, Fixed-term Engineer-Technician, INRIA
Théo Larcher, Fixed-term Engineer-Technician, INRIA
Benjamin Bourel, Fixed-term Engineer-Technician, CNRS
Christophe Botella, Fixed-term Researcher, INRIA
Maxime Fromholtz, Fixed-term Engineer-Technician, INRIA
Melanie Perraud, Fixed-term Engineer-Technician, INRIA
Raphael De Freitas Saldanha, Fixed-term Researcher, INRIA

Our approach is to capitalise on the principles of distributed and parallel data management. In particular, we exploit: high-level languages as a basis for data independence and automatic optimisation; data semantics to improve information retrieval and automate data integration; declarative languages (algebra, calculus) to manipulate data and workflows; and highly distributed and parallel environments such as P2P, cluster and cloud. To reflect our approach, we organise our research programme into five complementary themes:

  • Data integration, including polystores;
  • Query processing, including indexing and privacy;
  • Management of scientific workflows;
  • Data analysis, including data mining and statistics; and
  • Machine learning for high-dimensional data processing and retrieval.