PhD proposal WITHOUT GRANT

Automated analysis of the logical structure of online written debates

PhD proposal without grant: only students who bring their own funding can be appointed to such proposals, e.g. foreign students holding a grant from their home country, or normaliens.

http://www.lirmm.fr/~retore/thesisproposals.html

Advisors: Souhila Kaci, Christian Retoré

Prerequisite: a solid background in logic, including advanced notions.

This PhD proposal aims to design a system for the automated analysis of contributions to public online written debates held on the Dialoguea platform by Jean Sallantin, in order to exhibit the logical structure of a debate and thus make it possible to visualise, browse and query the debate: which arguments support or oppose a given argument? What is the nature of an argument? Does an argument follow from other arguments? In particular, the PhD should address, in this restricted setting, the problem known as textual entailment (text entailment): is a given sentence a consequence of a set of sentences? This will be especially useful for checking whether a participant's reformulation of the argument they wish to react to (a mandatory initial step of each contribution on the debate platform) corresponds to the initial argument.
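The kind of queries listed above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Debate` class, the relation labels `supports`, `opposes` and `follows_from`, and the sample arguments are hypothetical and do not reflect the Dialoguea data model, which is not described here.

```python
from dataclasses import dataclass, field


@dataclass
class Debate:
    """Toy debate store: argument texts plus labelled links between arguments."""
    texts: dict = field(default_factory=dict)
    links: list = field(default_factory=list)  # (source, label, target) triples

    def add_argument(self, arg_id, text):
        self.texts[arg_id] = text

    def relate(self, source, label, target):
        """Record that `source` stands in relation `label` to `target`."""
        self.links.append((source, label, target))

    def sources(self, label, target):
        """Arguments linked to `target` by `label`, e.g. its supporters."""
        return [s for (s, l, t) in self.links if l == label and t == target]

    def premises(self, arg_id):
        """Arguments that `arg_id` follows from, directly or transitively."""
        seen, stack = [], [arg_id]
        while stack:
            current = stack.pop()
            for (s, l, t) in self.links:
                if s == current and l == "follows_from" and t not in seen:
                    seen.append(t)
                    stack.append(t)
        return seen


# Hypothetical sample debate (invented for this sketch).
debate = Debate()
debate.add_argument("a1", "The city should build more cycle lanes.")
debate.add_argument("a2", "Cycle lanes reduce accidents.")
debate.add_argument("a3", "Cycle lanes take space away from buses.")
debate.add_argument("a4", "Fewer accidents lower public health costs.")
debate.add_argument("a5", "Lower health costs free up budget for public transport.")
debate.relate("a2", "supports", "a1")
debate.relate("a3", "opposes", "a1")
debate.relate("a4", "follows_from", "a2")
debate.relate("a5", "follows_from", "a4")

print(debate.sources("supports", "a1"))  # arguments supporting a1
print(debate.sources("opposes", "a1"))   # arguments opposing a1
print(debate.premises("a5"))             # what a5 ultimately rests on
```

In this sketch the links are entered by hand; in the proposed system they would be the output of the automated analysis, and the set of relation labels is itself one of the open questions.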

One aspect of the subject will consist in adapting and integrating, into a purely logical system, the logical formulae that are produced from sentences and (short) texts by a natural language analysis system, such as Grail, the large-scale syntactic and semantic parser for French developed by Richard Moot, which is based on categorial grammars and Montague semantics [LCG]. In order to take lexical meaning into account, one should use a more refined semantic lexicon, at least for the terminology used in the debate, in the style of the Montagovian Generative Lexicon [MGL].
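To give an idea of what "producing a logical formula from a sentence" means in Montague-style semantics, here is a toy Python sketch. It is not Grail and does not use Grail's interface: the lexicon, the helper names (`noun`, `intransitive_verb`, `every`, `some`, `sentence`) and the string encoding of formulae are all assumptions made for illustration, and the syntactic structure is fixed by hand instead of being found by a categorial parser.

```python
# Each word denotes a function (a lambda term); formulae are built as plain strings.

def noun(pred):
    # A noun denotes a property: x |-> pred(x)
    return lambda x: f"{pred}({x})"


def intransitive_verb(pred):
    # An intransitive verb also denotes a property of individuals.
    return lambda x: f"{pred}({x})"


def every(restrictor):
    # "every N" takes a property (the verb phrase) and yields a formula.
    return lambda scope: f"forall x. ({restrictor('x')} -> {scope('x')})"


def some(restrictor):
    return lambda scope: f"exists x. ({restrictor('x')} & {scope('x')})"


# Hypothetical toy lexicon (invented for this sketch).
LEXICON = {
    "every": every,
    "some": some,
    "participant": noun("participant"),
    "argument": noun("argument"),
    "agrees": intransitive_verb("agrees"),
}


def sentence(det, n, vp):
    """Compose determiner + noun + verb phrase by function application: (det n) vp."""
    return LEXICON[det](LEXICON[n])(LEXICON[vp])


print(sentence("every", "participant", "agrees"))
print(sentence("some", "participant", "agrees"))
```

The first call yields `forall x. (participant(x) -> agrees(x))`. A refined lexicon in the style of [MGL] would attach richer types and lexical information to each entry; this sketch deliberately leaves that out.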

The other aspect of the subject, tightly connected to the first one, is to exhibit the logical structure of the debate, including a proper treatment of preferences [PREF], which intervene at several levels of the debate. At the level of a single participant, an utterance may have several readings, and the analysis should determine which one its author prefers (the author being assumed to be reasonably coherent), in order to understand how the utterance contributes to the author's reasoning, what the author's knowledge and prior assumptions are, and how the author ranks them. But preferences also intervene in the global structure of the debate: which logical relation connects an utterance and a reaction to this utterance? Here as well, there are several possibilities, among which the automated analysis will have to establish preferences. At this level, one should be guided not only by the coherence of each participant, but also by the logical structure of the whole debate. Determining the relevant logical relations in a debate with several (typically 30) participants is part of the PhD, but the discourse relations of Segmented Discourse Representation Theory [SDRT] are likely to be a good starting point.
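The idea of choosing among readings by appealing to the author's coherence can be sketched as follows. This is a deliberately naive illustration and not the preference formalism of [PREF]: readings and prior commitments are encoded as sets of propositional literals (with `~` for negation), candidate readings are tried in a fixed default order, and the first one that does not contradict the author's earlier commitments is kept. All names and the sample data are hypothetical.

```python
def negate(literal):
    """Return the complementary literal: p <-> ~p."""
    return literal[1:] if literal.startswith("~") else "~" + literal


def consistent(literals):
    """A set of literals is consistent if it never contains both p and ~p."""
    return all(negate(lit) not in literals for lit in literals)


def preferred_reading(readings, commitments):
    """Return the name of the first reading (in the given default order)
    that is consistent with the author's commitments, or None if none is."""
    for name, literals in readings:
        if consistent(set(literals) | set(commitments)):
            return name
    return None


# Hypothetical utterance: "We should keep the old bridge."
readings = [
    ("keep_open_to_cars", {"bridge_kept", "cars_allowed"}),
    ("keep_as_footbridge", {"bridge_kept", "~cars_allowed"}),
]
# Earlier in the debate, the same author argued for a car-free centre.
commitments = {"~cars_allowed"}

print(preferred_reading(readings, commitments))
```

Here the default reading is discarded because it clashes with what the author said earlier, so `keep_as_footbridge` is selected. A realistic treatment would need full logical formulae, genuine entailment checking, and graded preferences in place of a fixed order.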

[SDRT] Nicholas Asher and Alex Lascarides. Logics of Conversation. Cambridge University Press, 2003.

[PREF] Souhila Kaci. Working with Preferences: Less Is More. Springer, 2011.

[LCG] Richard Moot and Christian Retoré. The Logic of Categorial Grammars: A Deductive Account of Natural Language Syntax and Semantics. LNCS 6850, Springer, 2012.

[MGL] Christian Retoré. The Montagovian Generative Lexicon ΛTy_n: a Type Theoretical Framework for Natural Language Semantics. In R. Matthes and A. Schubert (eds.), TYPES 2013 Postproceedings, LIPIcs, 2014. http://dx.doi.org/10.4230/LIPIcs.TYPES.2013.202