Reinforcement Learning

Training-time privileged information (PI) can enable language models to succeed on tasks they would otherwise fail, making it a …

Emiliano Penaloza, Dheeraj Vattikonda, Nicolas Gontier, Alexandre Lacoste, Laurent Charlin, Massimo Caccia

International Conference on Machine Learning (ICML), 2026.

I present diffusion models as part of a family of machine learning techniques that withhold information from a model’s input and train …

Pierre-André Noël

ICLR Blogposts, 2026.

Rafael Pardinas, Ehsan Kamalloo, Alexandre Piche, Dzmitry Bahdanau

Transactions on Machine Learning Research (TMLR), 2026.

In order to be deployed safely, Large Language Models (LLMs) must be capable of dynamically adapting their behavior based on their …

Alexandre Piche, Aristides Milios, Dzmitry Bahdanau, Christopher Pal

Transactions on Machine Learning Research (TMLR), 2025.

Learning generalist agents, able to solve multitudes of tasks in different domains, is a long-standing problem. Reinforcement learning …

Pietro Mazzaglia, Tim Verbelen, Bart Dhoedt, Aaron Courville, Sai Rajeswar Mudumba

Neural Information Processing Systems (NeurIPS), 2024.

The ability to predict outcomes of interactions between embodied agents and objects is paramount in the robotic setting. While …

Stefano Ferraro, Pietro Mazzaglia, Tim Verbelen, Sai Rajeswar Mudumba

Workshop at the Neural Information Processing Systems (NeurIPS), 2024.

In order to safely deploy Large Language Models (LLMs), they must be capable of dynamically adapting their behavior based on their …

Alexandre Piche, Aristides Milios, Dzmitry Bahdanau, Christopher Pal

Workshop at the International Conference on Learning Representations (ICLR), 2024.

Target networks are at the core of recent success in Reinforcement Learning. They stabilize the training by using old parameters to …

Alexandre Piche, Valentin Thomas, Joseph Marino, Gian Maria Marconi, Mohammad Emtiyaz Khan, Christopher Pal

Transactions on Machine Learning Research (TMLR), 2023.

In the presence of confounding, naively using off-the-shelf offline reinforcement learning (RL) algorithms leads to sub-optimal …

Maxime Gasse, Damien Grasset, Pierre-Yves Oudeyer, Guillaume Gaudron

Transactions on Machine Learning Research (TMLR), 2023.

Controlling artificial agents from visual sensory data is an arduous task. Reinforcement learning (RL) algorithms can succeed but …

Sai Rajeswar Mudumba, Pietro Mazzaglia, Tim Verbelen, Alexandre Piche, Bart Dhoedt, Aaron Courville, Alexandre Lacoste

International Conference on Machine Learning (ICML), 2023.