ServiceNow AI Research
Reinforcement Learning
Destruction is a General Strategy to Learn Generation; Diffusion's Strength is to Take it Seriously; Exploration is the Future
I present diffusion models as part of a family of machine learning techniques that withhold information from a model’s input and train …
Pierre-André Noël
ICLR Blogposts 2026, 2026.
PipelineRL: Faster On-policy Reinforcement Learning for Long Sequence Generation
Rafael Pardinas, Ehsan Kamalloo, Alexandre Piche, Dzmitry Bahdanau
Transactions on Machine Learning Research (TMLR), 2026.
LLMs can learn self-restraint through iterative self-reflection
In order to be deployed safely, Large Language Models (LLMs) must be capable of dynamically adapting their behavior based on their …
Alexandre Piche, Aristides Milios, Dzmitry Bahdanau, Christopher Pal
Transactions on Machine Learning Research (TMLR), 2025.
Multimodal foundation world models for generalist embodied agents
Learning generalist agents, able to solve multitudes of tasks in different domains, is a long-standing problem. Reinforcement learning …
Pietro Mazzaglia, Tim Verbelen, Bart Dhoedt, Aaron Courville, Sai Rajeswar Mudumba
Neural Information Processing Systems (NeurIPS), 2024.
Representing Positional Information in Generative World Models for Object Manipulation
The ability to predict outcomes of interactions between embodied agents and objects is paramount in the robotic setting. While …
Stefano Ferraro, Pietro Mazzaglia, Tim Verbelen, Sai Rajeswar Mudumba
Workshop at the Conference on Neural Information Processing Systems (NeurIPS), 2024.
Self-evaluation and self-prompting to improve the reliability of LLMs
In order to safely deploy Large Language Models (LLMs), they must be capable of dynamically adapting their behavior based on their …
Alexandre Piche, Aristides Milios, Dzmitry Bahdanau, Christopher Pal
Workshop at the International Conference on Learning Representations (ICLR), 2024.
Bridging the Gap Between Target Networks and Functional Regularization
Target networks are at the core of recent success in Reinforcement Learning. They stabilize the training by using old parameters to …
Alexandre Piche, Valentin Thomas, Joseph Marino, Gian Maria Marconi, Mohammad Emtiyaz Khan, Christopher Pal
Transactions on Machine Learning Research (TMLR), 2023.
Using Confounded Data in Latent Model-Based Reinforcement Learning
In the presence of confounding, naively using off-the-shelf offline reinforcement learning (RL) algorithms leads to sub-optimal …
Maxime Gasse, Damien Grasset, Pierre-Yves Oudeyer, Guillaume Gaudron
Transactions on Machine Learning Research (TMLR), 2023.
Mastering the Unsupervised Reinforcement Learning Benchmark from Pixels
Controlling artificial agents from visual sensory data is an arduous task. Reinforcement learning (RL) algorithms can succeed but …
Sai Rajeswar Mudumba, Pietro Mazzaglia, Tim Verbelen, Alexandre Piche, Bart Dhoedt, Aaron Courville, Alexandre Lacoste
International Conference on Machine Learning (ICML), 2023.
Choreographer: Learning and Adapting Skills in Imagination
Unsupervised skill learning aims to learn a rich repertoire of behaviors without external supervision, providing artificial agents with …
Pietro Mazzaglia, Tim Verbelen, Bart Dhoedt, Alexandre Lacoste, Sai Rajeswar Mudumba
International Conference on Learning Representations (ICLR), 2023.