ServiceNow Research

The Unsolved Challenges of LLMs in Open-Ended Web Tasks: A Case Study

Abstract

In this work, we investigate the challenges of developing goal-driven AI agents that perform open-ended tasks in a web environment using zero-shot learning. Our primary focus is on harnessing the capabilities of large language models (LLMs) for web navigation through HTML-based user interfaces (UIs). We evaluate the MiniWoB benchmark and show that it is a suitable yet challenging platform for assessing an agent's ability to comprehend and solve tasks without prior human demonstrations. Our main contribution is a set of extensive experiments that compare agent design choices, such as the action space, the observation space, and the choice of LLM. The aim is to shed light on the bottlenecks and limitations of LLM-based zero-shot learning in this domain and to foster further research in this area. In our empirical analysis, we find that: (1) the code-based action space is the most effective; (2) open-source LLMs are competitive with their proprietary counterparts as agents for open-ended web tasks; and (3) an accessibility-based representation of web pages, despite some performance loss, is a cost-effective strategy, particularly as web page sizes increase.

Publication
Workshop at the Conference on Neural Information Processing Systems (NeurIPS)
Massimo Caccia
Research Scientist

Research Scientist at AI Frontier Research, located in Montreal, QC, Canada.

Issam H. Laradji
Research Manager

Research Manager at AI Frontier Research, located in Vancouver, BC, Canada.

Alexandre Drouin
Head of AI Frontier Research

Head of AI Frontier Research at AI Frontier Research, located in Montreal, QC, Canada.

David Vazquez
Director of AI Research

Director of AI Research at AI Research Management, located in Montreal, QC, Canada.

Nicolas Chapados
VP of Research

VP of Research at AI Research Management, located in Montreal, QC, Canada.

Alexandre Lacoste
Research Lead

Research Lead at AI Frontier Research, located in Montreal, QC, Canada.