Textual Data Exploration and Management
The TEXTE team (Textual Data Exploration and Management; French: Exploration et exploitation de données TEXTuelles) develops models and tools for natural language processing and builds the resources they require. Its work includes generating lexical corpora, producing text summaries by compression, and automatic translation. The research focuses on the automatic analysis of the syntax and lexical semantics of languages, using predominantly symbolic and logical methods, and on the development and acquisition of resources (lexical networks, grammars) for natural language processing.
The TEXTE team develops methods, tools and resources for the automatic processing of natural language, especially written language. The work focuses in particular on syntax and on semantics, both logical and lexical. The team tends to use symbolic methods, most often logical ones, which is why it is attached to the Artificial Intelligence division. Although they are all related, the following activities can be distinguished within TEXTE:
- Construction and acquisition of resources for automatic language processing (lexicons, grammars).
- Automatic analysis of the syntax and semantics of natural language.
This work relies on fundamental research, much of it unified by logic:
- Constraint logic programming for model-driven syntax.
- Syntactic and semantic analysis in type theory.
- Inference rules in a lexical network.
- Knowledge representation.
Other methods are also used: collaborative serious games, distributed graph algorithms (ant-based algorithms), linear algebra (word vectors), and statistics (noise suppression, grammatical labelling).