ServiceNow Research

Fine-Tuning Web Agents: It Works, But It's Trickier Than You Think

Abstract

Recent advancements in large language models (LLMs) have sparked interest in developing autonomous web agents capable of performing digital tasks through web interfaces in a human-like manner. However, even the strongest closed-source models often struggle to achieve robust results on several benchmarks, and a notable performance gap separates them from their open-source counterparts. This study investigates the potential of fine-tuning to enhance the performance of a smaller, lower-performing but cost-efficient LLM by leveraging successful traces from stronger LLMs, referred to as experts. We outline a comprehensive pipeline for data collection, filtering, and supervised fine-tuning, and explore various behavior cloning parameters. Our experiments provide key insights into the challenges of fine-tuning LLMs into web agents on benchmarks like MiniWoB and WorkArena. Notably, we find that the fine-tuned agents' ability to predict expert trajectories does not consistently lead to improved downstream task performance. This raises issues such as off-policy bias and the loss of reasoning abilities during fine-tuning. We discuss potential solutions to these challenges and make both the codebase and a dataset of 140M tokens open-source for the community to build upon.

Publication
NOW AI Conference (NOWAI)
Massimo Caccia
Research Scientist

Research Scientist at AI Frontier Research, located in Montreal, QC, Canada.

Léo Boisvert
Visiting Researcher

Visiting Researcher at AI Frontier Research, located in Montreal, QC, Canada.

Alexandre Piche
Research Scientist

Research Scientist at AI Frontier Research, located in Montreal, QC, Canada.

Nicolas Chapados
VP of Research

VP of Research at AI Research Management, located in Montreal, QC, Canada.

Alexandre Drouin
Head of AI Frontier Research

Head of AI Frontier Research, located in Montreal, QC, Canada.

Alexandre Lacoste
Research Lead

Research Lead at AI Frontier Research, located in Montreal, QC, Canada.