Monday, November 25th, 2024
Optimal regrets in Markov decision processes
Abstract:
In this manuscript, we investigate regret minimization in Markov decision processes under the average gain criterion. In both the model-independent (a.k.a. minimax) and model-dependent settings, we provide new lower bounds on the expected regret, together with algorithms that achieve them. These algorithms are therefore optimal and solve, at first order, the long-standing problem of regret minimization in (communicating) Markov decision processes. Beyond regret minimization, we study the trajectorial behavior of classical algorithms from a novel local viewpoint, through new learning metrics that quantify how algorithms choose actions locally rather than globally. We also provide techniques to correct the behavior of existing algorithms with respect to these metrics.
Date and place
Monday, November 25th, 2024, at 2 pm, Bâtiment IMAG, Room 2 (ground floor), and on Zoom
Jury members
Aurélien Garivier
Professor, ENS de Lyon (Reviewer)
Ronald Ortner
Associate Professor, Montanuniversität Leoben (Reviewer)
Eric Gaussier
Full Professor, Université Grenoble Alpes (Examiner)
Pierre Gaillard
Research Scientist, Centre INRIA de l'Université Grenoble Alpes (Examiner)
Tor Lattimore
Research Scientist, DeepMind London (Examiner)
Bruno Gaujal
Research Director, Centre INRIA de l'Université Grenoble Alpes (Thesis supervisor)