ServiceNow recherche

Reinforcement Learning

Combining Reinforcement Learning and Constraint Programming for Combinatorial Optimization
Combinatorial optimization has found applications in numerous fields, from aerospace to transportation planning and economics. The goal …
Promoting Coordination through Policy Regularization in Multi-Agent Deep Reinforcement Learning
In multi-agent reinforcement learning, discovering successful collective behaviors is challenging as it requires exploring a joint …
Finding and Visualizing Weaknesses of Deep Reinforcement Learning Agents
As deep reinforcement learning driven by visual perception becomes more widely used there is a growing need to better understand and …
Reinforced Active Learning for Image Segmentation
Learning-based approaches for semantic segmentation have two inherent challenges. First, acquiring pixel-wise labels is expensive and …
Learning Reward Machines for Partially Observable Reinforcement Learning
Reward Machines (RMs) provide a structured, automata-based representation of a reward function that enables a Reinforcement Learning …
Real-Time Reinforcement Learning
Markov Decision Processes (MDPs), the mathematical framework underlying most algorithms in Reinforcement Learning (RL), are often used …
Searching for Markovian Subproblems to Address Partially Observable Reinforcement Learning
In partially observable environments, an agent’s policy should often be a function of the history of its interaction with the …
BabyAI: A Platform to Study the Sample Efficiency of Grounded Language Learning
Allowing humans to interactively train artificial agents to understand language instructions is desirable for both practical and …
Improving Optimization Bounds using Machine Learning: Decision Diagrams meet Deep Reinforcement Learning
Finding tight bounds on the optimal solution is a critical element of practical solution methods for discrete optimization problems. In …
Reinforced Imitation in Heterogeneous Action Space
Imitation learning is an effective alternative approach to learn a policy when the reward function is sparse. In this paper, we …