
TEXTE Team: Exploration et exploitation de données textuelles

Richard MOOT


Textual Data Exploration and Management

The TEXTE team (Textual Data Exploration and Management; French: Exploration et exploitation de données TEXTuelles) develops models and tools for natural language processing and designs the resources they require. Applications include generating lexical corpora, producing text summaries by compression, and automatic translation. The research focuses on the automatic analysis of the syntax and lexical semantics of languages, using predominantly symbolic and logical methods, and on the development and acquisition of resources (lexical networks, grammars) for natural language processing.

Richard Moot, Chargé de recherche, CNRS
Violaine Prince, Professeur des universités, UM
Mathieu Lafourcade, Maître de conférences, UM
Christian Retoré, Professeur des universités, UM

Associates & Students
Hani Guenoune, EMVISTA
Maxime Chapuis, ORQUAL
Camille Gosset, Berger-Levrault

Regular Co-workers
Davide Catta, CDD Enseignant-Chercheur, UPVM

The TEXTE team develops methods, tools and resources for the automatic processing of natural language, especially written language. This work focuses in particular on syntax and semantics, both logical and lexical. We tend to use symbolic methods, most often logical ones, hence our attachment to the Artificial Intelligence division. Although they are all related, the following activities can be distinguished within TEXTE:

  • Construction and acquisition of resources for automatic language processing (lexicon, grammar).
  • Automatic analysis of the syntax and semantics of natural language.

This work requires fundamental research, often unified by logic:

  • Constraint logic programming for model-driven syntax.
  • Syntactic and semantic analysis in type theory.
  • Inference rules in a lexical network.
  • Knowledge representation.
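
To make the idea of inference rules in a lexical network concrete, here is a minimal sketch in Python. It is not the team's implementation: the triple representation, the relation names `is_a` and `has_part`, the two rules and the toy data are all assumptions chosen for illustration.

```python
# Hypothetical toy lexical network: a set of (source, relation, target) triples.
Triple = tuple[str, str, str]


def infer(edges: set[Triple]) -> set[Triple]:
    """Apply two illustrative rules until no new triple is derived.

    Rule 1 (assumed): is_a is transitive.
    Rule 2 (assumed): has_part is inherited along is_a.
    """
    known = set(edges)
    while True:
        new = set()
        for (a, r1, b) in known:
            if r1 != "is_a":
                continue
            for (b2, r2, c) in known:
                # a is_a b, and b is_a/has_part c  =>  a is_a/has_part c
                if b2 == b and r2 in ("is_a", "has_part"):
                    new.add((a, r2, c))
        new -= known
        if not new:
            return known
        known |= new


# Made-up example data, not taken from any real resource.
network = {
    ("cat", "is_a", "feline"),
    ("feline", "is_a", "animal"),
    ("feline", "has_part", "claw"),
}
closure = infer(network)
derived = sorted(closure - network)
print(derived)
```

On this toy network the rules derive two new relations: that a cat is an animal and that a cat has claws. A real lexical network would typically also need weights and exceptions on relations, which this sketch deliberately leaves out.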

Other methods are also used: collaborative serious games, distributed graph algorithms (ant algorithms), linear algebra (word vectors), and statistics (noise suppression, grammar labelling).