Agents

BigCharts-R1: Enhanced Chart Reasoning with Visual Reinforcement Finetuning

Charts are essential to data analysis, transforming raw data into clear visual representations that support human decision-making. …

Ahmed Masry, Abhay Puri, Masoud Hashemi, Juan A. Rodriguez, Megh Thakkar, Khyati Mahajan, Vikas Yadav, Sathwik Tejaswi Madhusudhan, Alexandre Piche, Dzmitry Bahdanau, Christopher Pal, David Vazquez, Enamul Hoque Prince, Perouz Taslakian, Sai Rajeswar Mudumba, Spandana Gella

Conference on Language Modeling (COLM), 2025.

DoomArena: A Framework for Testing AI Agents Against Evolving Security Threats

We present DoomArena, a security evaluation framework for AI agents. DoomArena is designed on three principles: 1) It is a …

Léo Boisvert, Mihir Bansal, Chandra Kiran Reddy Evuru, Gabriel Huang, Abhay Puri, Avinandan Bose, Maryam Fazel, Quentin Cappart, Jason Stanley, Alexandre Lacoste, Alexandre Drouin, Krishnamurthy (Dj) Dvijotham

Conference on Language Modeling (COLM), 2025.

Fine-Tune an SLM or Prompt an LLM? The Case of Generating Low-Code Workflows

Large Language Models (LLMs) such as GPT-4o can handle a wide range of complex tasks with the right prompt. As per-token costs are …

Orlando Marquez, Patrice Béchard, Emily Chen, Maggie Baird, JingFei Chen

Knowledge Discovery and Data Mining, 2025.

AgentAda: Skill-Adaptive Data Analytics for Tailored Insight Discovery

We introduce AgentAda, the first LLM-powered analytics agent that can learn and use new analytics skills to extract more specialized …

Amirhossein Abaskohi, Amrutha Ramesh, Shailesh Nanisetty, Chirag Goel, David Vazquez, Christopher Pal, Spandana Gella, Giuseppe Carenini, Issam H. Laradji

Workshop at the Annual Meeting of the Association for Computational Linguistics (ACL), 2025.

DoomArena: A Framework for Testing AI Agents Against Evolving Security Threats

We present DoomArena, a security evaluation framework for AI agents. DoomArena is designed on three principles: 1) It is a …

Léo Boisvert, Abhay Puri, Gabriel Huang, Mihir Bansal, Chandra Kiran Reddy Evuru, Avinandan Bose, Quentin Cappart, Maryam Fazel, Alexandre Lacoste, Alexandre Drouin, Jason Stanley, Krishnamurthy (Dj) Dvijotham

Workshop at the International Conference on Machine Learning (ICML), 2025.

Silent Sabotage: Injecting Backdoors into AI Agents Through Fine-Tuning

The rise of AI agents that can use tools, browse the web and interact with computers on behalf of a user has sparked strong interest …

Léo Boisvert, Abhay Puri, Chandra Kiran Reddy Evuru, Joshua Kazdan, Avinandan Bose, Quentin Cappart, Maryam Fazel, Sai Rajeswar Mudumba, Jason Stanley, Nicolas Chapados, Alexandre Drouin, Krishnamurthy (Dj) Dvijotham

Workshop at the International Conference on Machine Learning (ICML), 2025.

SafeArena: Evaluating the Safety of Autonomous Web Agents

LLM-based agents are becoming increasingly proficient at solving web-based tasks. With this capability comes a greater risk of misuse …

Ada Tur, Nicholas Meade, Xing Han Lu, Alejandra Zambrano, Arkil Patel, Esin Durmus, Spandana Gella, Karolina Stanczak, Siva Reddy

International Conference on Machine Learning (ICML), 2025.

UI-Vision: A Desktop-centric GUI Benchmark for Visual Perception and Interaction

Developing autonomous agents that can navigate diverse Graphical User Interfaces (GUIs) and solve complex tasks is essential for …

Shravan Nayak, Xiangru Jian, Kevin Lin, Juan A. Rodriguez, Motek Kalsi, Nicolas Chapados, Tamer Özsu, Aishwarya Agrawal, David Vazquez, Christopher Pal, Perouz Taslakian, Spandana Gella, Sai Rajeswar Mudumba

International Conference on Machine Learning (ICML), 2025.

WebMMU: A Benchmark for Multimodal Multilingual Website Understanding and Code Generation

Understanding diverse web data and automating web development presents an exciting challenge for agentic AI. While existing benchmarks …

Rabiul Awal, Mahsa Massoud, Zichao Li, Aarash Feizi, Suyuchen Wang, Christopher Pal, Aishwarya Agrawal, David Vazquez, Siva Reddy, Juan A. Rodriguez, Perouz Taslakian, Sai Rajeswar Mudumba

Workshop at the Computer Vision and Pattern Recognition Conference (CVPR), 2025.

StarVector: Generating Scalable Vector Graphics Code from Images and Text

Scalable Vector Graphics (SVGs) are vital for modern image rendering due to their scalability and versatility. Previous SVG generation …

Juan A. Rodriguez, Abhay Puri, Shubham Agarwal, Issam H. Laradji, Pau Rodriguez, Sai Rajeswar Mudumba, David Vazquez, Christopher Pal, Marco Pedersoli

Computer Vision and Pattern Recognition (CVPR), 2025.