TEXTE Team: Exploration et exploitation de données textuelles

Richard MOOT
Head

The TEXTE team (Textual Data Exploration and Management; French: Exploration et exploitation de données TEXTuelles) develops models and tools for natural language processing and designs the resources they require. Its work includes generating lexical corpora, producing text summaries by compression, and automatic translation. The research focuses on the automatic analysis of the syntax and lexical semantics of languages, mainly with symbolic and logical methods, and on the development and acquisition of resources (lexical networks, grammars) for natural language processing.

Staff
Richard Moot, Research Scientist (Chargé de recherche), CNRS
Mathieu Lafourcade, Associate Professor (Maître de conférences), UM
Christian Retoré, Full Professor (Professeur des universités), UM

Associates & Students
Jérémie Roux, UM
Maximos Skandalis, CNRS
Nicolas Boffo, Ministère de l’intérieur
Camille Gosset, Berger-Levrault

Regular Co-workers
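Imen Ben Sassi, Fixed-term Researcher (CDD Chercheur), UM
Nadine Jacquet, Fixed-term Engineer-Technician (CDD Ingénieur-Technicien), CNRS
Violaine Prince, Long-term Visitor, Emeritus (Invité longue durée Eméritat), UM
Hani Guenoune, Fixed-term Engineer-Technician (CDD Ingénieur-Technicien), UM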

The TEXTE team develops methods, tools and resources for the automatic processing of natural language, especially written language. This work focuses in particular on syntax and semantics, both logical and lexical. We tend to use symbolic methods, most often logical ones, hence our attachment to the Artificial Intelligence division. Although they are all related, the following activities can be distinguished within TEXTE:

  • Construction and acquisition of resources for automatic language processing (lexicon, grammar).
  • Automatic analysis of the syntax and semantics of natural language.

This work requires fundamental research, often unified by logic:

  • Constraint logic programming for model-driven syntax.
  • Syntactic and semantic analysis in type theory.
  • Inference rules in a lexical network.
  • Knowledge representation.

Other methods are also used: collaborative serious games, distributed graph algorithms (ant-based), linear algebra (word vectors), and statistics (noise suppression, grammar labelling).