ServiceNow Research

How to Train Your LLM Web Agent: A Statistical Diagnosis (Oral)

Abstract

Large language model (LLM) agents for web interfaces have advanced rapidly, yet open-source systems still lag behind proprietary agents. Bridging this gap is key to enabling customizable, efficient, and privacy-preserving agents. Two challenges hinder progress. First, reinforcement learning (RL) and LLM agent training suffer from reproducibility issues, with results often depending on sensitive factors such as random seeds and decoding parameters. Second, prior work focuses on single-step tasks, overlooking the complexities of web-based, multi-step decision-making.

We address these gaps by providing a statistically driven study of training LLM agents for web tasks. Our two-stage pipeline combines imitation learning from a Llama 3.3 70B teacher with on-policy fine-tuning via Group Relative Policy Optimization (GRPO) on a Llama 3.1 8B student. Through 240 configuration sweeps and rigorous bootstrapping, we chart the first compute allocation curve for open-source LLM web agents. Our findings show that dedicating one-third of compute to teacher traces and the rest to RL improves MiniWoB++ success by 6 points and closes 60% of the gap to GPT-4o on WorkArena, while cutting GPU costs by 45%. We introduce a principled hyperparameter sensitivity analysis, offering actionable guidelines for robust and cost-effective agent training.
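To make the bootstrapping idea concrete, here is a minimal sketch of a percentile bootstrap confidence interval for a task success rate. It is not the study's actual procedure: the function name `bootstrap_ci`, the resample count, and the episode outcomes are all hypothetical and chosen only for illustration.

```python
import random
from statistics import mean


def bootstrap_ci(outcomes, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean of binary episode outcomes.

    Hypothetical helper for illustration; not taken from the paper.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(outcomes)
    # Resample episodes with replacement and record each resample's mean.
    means = sorted(
        mean(rng.choices(outcomes, k=n)) for _ in range(n_resamples)
    )
    lo = means[int((alpha / 2) * n_resamples)]
    hi = means[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return mean(outcomes), lo, hi


# Made-up data: 30 successful and 20 failed episodes.
outcomes = [1] * 30 + [0] * 20
point, lo, hi = bootstrap_ci(outcomes)
print(f"success rate = {point:.2f}, 95% CI = [{lo:.2f}, {hi:.2f}]")
```

Reporting an interval rather than a single number is one way to tell a genuine difference between two training configurations apart from variation across seeds.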

Publication
Workshop at the International Conference on Machine Learning (ICML)
Dheeraj Vattikonda
Visiting Researcher

Visiting Researcher at AI Frontier Research located at Montreal, QC, Canada.

Emiliano Penaloza
Visiting Researcher

Visiting Researcher at AI Frontier Research located at Montreal, QC, Canada.

Hadi Nekoei
Visiting Researcher

Visiting Researcher at AI Frontier Research located at Montreal, QC, Canada.

Nicolas Gontier
Research Scientist

Research Scientist at AI Frontier Research located at Montreal, QC, Canada.

Miguel Muñoz-Mármol
AI Developer

AI Developer at AI Research Deployment located at Toronto, ON, Canada.

Stefania Raimondo
Research Manager

Research Manager at AI Research Deployment located at Toronto, ON, Canada.

Alexandre Drouin
Head of AI Frontier Research

Head of AI Frontier Research at AI Frontier Research located at Montreal, QC, Canada.

Alexandre Piche
Research Scientist

Research Scientist at AI Frontier Research located at Montreal, QC, Canada.

Alexandre Lacoste
Research Lead

Research Lead at AI Frontier Research located at Montreal, QC, Canada.

Massimo Caccia
Research Scientist

Research Scientist at AI Frontier Research located at Montreal, QC, Canada.