Thursday, January 27, 2022
The challenges of robot task learning
This talk will start with an overview of the technical challenges of automatically acquiring robotic manipulation competencies. First, we will discuss the challenges of the optimal control and reinforcement learning frameworks in these settings. On the one hand, optimal control typically struggles to account for uncertainty over the environment dynamics; on the other hand, reinforcement learning requires potentially hazardous exploration. Furthermore, in both cases, explicit knowledge of the task and the environment is required. Demonstrations are a promising medium for coping with the difficulty end-users face in specifying a task. However, a usable system needs to generalize from a limited number of demonstrations in a given environment.
As an alternative, imitation learning has been considered a promising approach for enabling a robot to acquire competencies. Nonetheless, this paradigm may require a significant number of samples to become effective. Recently, one-shot imitation learning has enabled robots to accomplish manipulation tasks from a limited set of demonstrations. This approach has shown encouraging results for handling variations in the initial conditions of a given task without requiring task-specific engineering. However, it remains ineffective at generalizing to task variations involving different reward or transition functions.
In the second part of the talk, we propose to improve the ability of demonstration-based learning to generalize to unseen tasks that differ significantly from the training tasks. To this end, we propose an approach that combines optimization-based and metric-based meta-learning to achieve task transfer in these challenging settings. First, we introduce transformer-based sequence-to-sequence policy networks trained from limited sets of demonstrations, a form of metric-based meta-learning. Second, we propose to meta-train our model on a set of training demonstrations by leveraging optimization-based meta-learning. This allows us to efficiently fine-tune our model for a new task. Finally, we evaluate our approach on the recently proposed Meta-World framework, which comprises a large set of robotic manipulation tasks organized into various categories. We show significant improvements over previous one-shot imitation approaches in various task-transfer settings.
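The optimization-based half of this combination, meta-training an initialization so that a few gradient steps on a new task's demonstrations suffice, can be illustrated with a minimal sketch. The code below is a toy, first-order (Reptile-style) meta-learning loop on linear regression tasks standing in for demonstration-to-action mappings; all function names and the toy task family are illustrative assumptions, not the implementation presented in the talk.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_task():
    # A toy "task": targets y = a*x + b, standing in for a task's
    # demonstration state -> action mapping (illustrative assumption).
    a, b = rng.uniform(-1, 1, size=2)
    return a, b

def task_batch(task, n=20):
    # Sample n "demonstrations" from the task.
    a, b = task
    x = rng.uniform(-1, 1, size=(n, 1))
    return x, a * x + b

def predict(params, x):
    w, c = params
    return x * w + c

def loss_grad(params, x, y):
    # Gradient of the mean squared error w.r.t. (w, c).
    err = predict(params, x) - y
    return np.array([np.mean(2 * err * x), np.mean(2 * err)])

def inner_adapt(params, x, y, lr=0.1, steps=5):
    # Fine-tuning on a handful of task demonstrations.
    p = params.copy()
    for _ in range(steps):
        p -= lr * loss_grad(p, x, y)
    return p

def meta_train(meta_steps=2000, meta_lr=0.05):
    # Reptile-style first-order meta-update: move the shared
    # initialization toward the task-adapted parameters.
    params = np.zeros(2)
    for _ in range(meta_steps):
        x, y = task_batch(make_task())
        adapted = inner_adapt(params, x, y)
        params += meta_lr * (adapted - params)
    return params

meta_params = meta_train()

# After meta-training, a few gradient steps on a new task's
# demonstrations yield low error on held-out samples from that task.
new_task = make_task()
x_demo, y_demo = task_batch(new_task, n=10)
adapted = inner_adapt(meta_params, x_demo, y_demo, lr=0.3, steps=20)
x_test, y_test = task_batch(new_task, n=100)
test_err = float(np.mean((predict(adapted, x_test) - y_test) ** 2))
```

In the talk's setting the inner loop would fine-tune the transformer policy on a new task's demonstrations rather than a two-parameter regressor, but the meta-training structure is the same.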
Updated on 27 January 2022