ServiceNow AI Research

PipelineRL: Faster On-policy Reinforcement Learning for Long Sequence Generation

Publication
Transactions on Machine Learning Research (TMLR)
Rafael Pardinas
Applied Research Scientist

Applied Research Scientist at Frontier AI Research, located in London (remote), UK.

Ehsan Kamalloo
Research Scientist

Research Scientist at Frontier AI Research, located in Toronto (remote), Canada.