

Monday, November 27, 2023

Multimodal Transformers for Emotion Recognition

Mental health and emotional well-being have a significant influence on physical health and are especially important for healthy aging. Continued progress in sensors and microelectronics has produced a number of new technologies that can be deployed in homes to monitor health and well-being. Combined with recent advances in machine learning, these technologies can support services that enhance the physical and emotional well-being of individuals and promote healthy aging. In this context, an automatic emotion recognition system can help ensure the emotional well-being of frail people. It is therefore desirable to develop technology that can draw information about human emotions from multiple sensor modalities and that can be trained without large labeled datasets.

This thesis addresses the problem of emotion recognition using the different types of signals that a smart environment may provide, such as visual, audio, and physiological signals. To do this, we develop several models based on the Transformer architecture, which has useful characteristics such as its capacity to model long-range dependencies and its ability to discern the relevant parts of the input. We first propose a model that recognizes emotions from individual physiological signals, together with a self-supervised pre-training technique that uses unlabeled physiological signals, and we show that this pre-training improves the model's performance. This approach is then extended to work with multiple physiological signals. To further take advantage of the different modalities that a smart environment may provide, we also propose a model that takes multimodal inputs such as video, audio, and physiological signals, and that addresses the fact that in real-world scenarios a modality may be missing.
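As an illustration of the attention mechanism that Transformers rely on, the following minimal sketch computes scaled dot-product attention for a single query and masks out the tokens of an unavailable modality. It is not taken from the thesis: masking is only one common way to handle missing inputs in attention-based models, and the function name `masked_attention` and the toy embeddings are hypothetical.

```python
import math


def masked_attention(query, keys, values, available):
    """Scaled dot-product attention for one query vector.

    query: list of floats; keys and values: lists of float vectors;
    available: list of bools, False for tokens of a missing modality.
    Returns the attended output vector and the attention weights.
    """
    if not any(available):
        raise ValueError("at least one token must be available")
    scale = math.sqrt(len(query))
    # Unavailable tokens get a score of -inf, so softmax gives them weight 0.
    scores = [
        sum(q * k for q, k in zip(query, key)) / scale if ok else -math.inf
        for key, ok in zip(keys, available)
    ]
    top = max(scores)  # finite, because at least one token is available
    exps = [math.exp(s - top) for s in scores]  # math.exp(-inf) is 0.0
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(values[0])
    output = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return output, weights


# Hypothetical toy embeddings, one token per modality:
# index 0 = video, index 1 = audio, index 2 = physiological.
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
query = [1.0, 0.0]

# In this sample the audio modality (index 1) is missing.
output, weights = masked_attention(query, keys, values, [True, False, True])
```

Because the masked token receives zero weight, the output is a weighted average of the remaining modalities only, and the same code path can serve samples with and without the missing signal.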

The methods developed in this thesis are evaluated on several datasets, and the results demonstrate the effectiveness of our approaches. These results open new avenues for exploring Transformer-based models that process information from environmental sensors, and they can contribute to better care for the mental health of frail people.

Biography of Juan Vazquez-Rodriguez
Juan Vazquez-Rodriguez is a doctoral candidate at Orange Innovation and the Laboratoire d'Informatique de Grenoble (LIG). Prior to his doctoral studies, he obtained a bachelor's degree at ESPE University in Ecuador, a master's degree in Computer Engineering at Purdue University in the United States, and a Master of Science in Informatics from the MoSIG master's program in Grenoble, France. His doctoral project is part of a collaboration between Orange 3 Massifs (Orange Labs) and the MIAI AI Institute Chair on Collaborative Intelligent Systems.

Date and Location

Monday, November 27, 2023 at 14:00
Orange 3 Massifs, Salle Agora, 22 Chem. du Vieux Chêne, Meylan

Jury Composition

Professor, Sorbonne University (Reviewer)
Professor, University of Augsburg and Imperial College London (Reviewer)
Professor, Carnegie Mellon University (Examiner)

Professor, Institut Polytechnique de Grenoble (Examiner)
Professor, Institut Polytechnique de Grenoble (Thesis supervisor)
Researcher, Orange, Grenoble (Thesis co-supervisor)
Julien CUMIN
Researcher, Orange, Grenoble (Thesis co-supervisor)

Published on November 23, 2023

Updated on November 24, 2023