ServiceNow AI Research

AgentLab Controller: Level Up Your Web Agent with Step-Through Debugging

Abstract

Recent progress in building computer-using agents has enabled large language models to navigate browser environments and solve complex tasks. However, debugging these agents remains a significant challenge, especially in identifying and addressing model failures. In this work, we present AgentLab Controller, an interactive framework for debugging and improving web agents. AgentLab Controller provides a visual interface that allows users to step forward and backward through agent executions, inspect action sequences, and dynamically edit prompts or insert hints to repair or refine agent behavior. Our system captures detailed execution traces and facilitates error analysis across task attempts. We demonstrate the utility of AgentLab Controller in a case study on WorkArena tasks, where targeted prompt edits and curated hints significantly improve agent success rates. Our results highlight how human-in-the-loop debugging and hint mining can be systematically integrated into the development and evaluation of web agents.

Publication
NOW AI
Orlando Marquez
Applied Research Scientist

Applied Research Scientist at Azimuth, based in Montreal, QC, Canada.

Alexandre Drouin
Head of Frontier AI Research

Head of Frontier AI Research at Frontier AI Research, based in Montreal, QC, Canada.

Alexandre Lacoste
Research Lead

Research Lead at Frontier AI Research, based in Montreal, QC, Canada.