Loïc Vial - Joint Neural Models of Word Sense Disambiguation and Machine Translation

Organized by: 
Loïc Vial


Jury members:

  • Benjamin Lecouteux, Associate Professor, Université Grenoble Alpes, thesis supervisor
  • Didier Schwab, Associate Professor, Université Grenoble Alpes, examiner and thesis co-supervisor
  • Mathieu Lafourcade, Associate Professor (HDR), Université de Montpellier, reviewer
  • Pierre Zweigenbaum, Senior Researcher, CNRS Île-de-France Gif-sur-Yvette, reviewer
  • Frédéric Béchet, Professor, Université Aix-Marseille, examiner
  • Laurent Besacier, Professor, Université Grenoble Alpes, examiner


Word Sense Disambiguation (WSD) and Machine Translation (MT) are two of the central and oldest tasks of Natural Language Processing (NLP). Although they share a common origin, WSD having initially been conceived as a fundamental problem to be solved for MT, the two tasks have since evolved largely independently of each other. On the one hand, MT has been able to dispense with the explicit disambiguation of terms thanks to statistical and neural models trained on large parallel corpora. On the other hand, WSD, which faces limitations such as the lack of unified resources and a restricted scope of applications, remains a major challenge on the way to a better understanding of language in general.
Today, as neural networks and word embeddings become increasingly important in NLP research, recent neural architectures and new pre-trained language models offer not only new possibilities for developing more efficient WSD and MT systems, but also an opportunity to bring the two tasks together through joint neural models, which facilitate the study of their interactions.
In this thesis, our contributions first focus on improving WSD systems by unifying the resources necessary for their implementation, constructing new neural architectures, and developing original approaches to improve the coverage and performance of these systems. Then, we develop and compare different approaches for integrating our state-of-the-art WSD systems and language models into MT systems in order to improve their overall performance. Finally, we present a new architecture that makes it possible to train a joint model for both WSD and MT, based on our best neural systems.