Agents

Keeping up with dynamic attackers: Certifying robustness to adaptive online data poisoning

The rise of foundation models fine-tuned on human feedback from potentially untrusted users has increased the risk of adversarial data …

Avinandan Bose, Laurent Lessard, Maryam Fazel, Krishnamurthy (Dj) Dvijotham

International Conference on Artificial Intelligence and Statistics (AISTATS), 2025.

A Guide To Effectively Leveraging LLMs for Low-Resource Text Summarization: Data Augmentation and Semi-supervised Approaches

Existing approaches for low-resource text summarization primarily employ large language models (LLMs) like GPT-3 or GPT-4 at inference …

Gaurav Sahu, Olga Vechtomova, Issam H. Laradji

North American Chapter of the Association for Computational Linguistics (NAACL), 2025.

Generating a Low-code Complete Workflow via Task Decomposition and RAG

AI technologies are moving rapidly from research to production. With the popularity of Foundation Models (FMs) that generate text, …

Orlando Marquez, Patrice Béchard

Conference on AI Engineering (CAIN), 2025.

Societal Alignment Frameworks Can Improve LLM Alignment

Recent progress in large language models (LLMs) has focused on producing responses that meet human expectations and align with shared …

Karolina Stanczak, Nicholas Meade, Mehar Bhatia, Hattie Zhou, Konstantin Böttinger, Jeremy Barns, Jason Stanley, Nicolas Papernot, Nicolas Chapados, Denis Therien, Timothy P Lillicrap, Ana Marasovic, Sylvie Delacroix, Gillian K Hadfield, Siva Reddy

Workshop at the International Conference of Learning Representation (ICLR), 2025.

BigDocs: An Open and Permissively-Licensed Dataset for Training Multimodal Models on Document and Code Tasks

Multimodal AI has the potential to significantly enhance document-understanding tasks, such as processing receipts, understanding …

Juan A. Rodriguez, Xiangru Jian, Siba Smarak Panigrahi, Tianyu Zhang, Aarash Feizi, Abhay Puri, Akshay Kalkunte, Francois Savard, Ahmed Masry, Shravan Nayak, Rabiul Awal, Mahsa Massoud, Amirhossein Abaskohi, Zichao Li, Suyuchen Wang, Pierre-André Noël, Mats L. Richter, Saverio Vadacchino, Shubham Agarwal, Sanket Biswas, Sara Shanian, Ying Zhang, Sathwik Tejaswi Madhusudhan, João Monteiro, Krishnamurthy (Dj) Dvijotham, Torsten Scholak, Nicolas Chapados, Sepideh Kharaghani, Sean Hughes, Tamer Özsu, Siva Reddy, Marco Pedersoli, Yoshua Bengio, Christopher Pal, Issam H. Laradji, Spandana Gella, Perouz Taslakian, David Vazquez, Sai Rajeswar Mudumba

International Conference of Learning Representations (ICLR), 2025.

InsightBench: Evaluating Business Analytics Agents Through Multi-Step Insight Generation

Data analytics is essential for extracting valuable insights from data that can assist organizations in making effective decisions. We …

Gaurav Sahu, Abhay Puri, Juan A. Rodriguez, Amirhossein Abaskohi, Mohammad (Aaron) Chegini , Alexandre Drouin, Perouz Taslakian, Valentina Zantedeschi, Alexandre Lacoste, David Vazquez, Nicolas Chapados, Christopher Pal, Sai Rajeswar Mudumba, Issam H. Laradji

International Conference of Learning Representations (ICLR), 2025.

MMTEB: Massive Multilingual Text Embedding Benchmark

Text embeddings are typically evaluated on a narrow set of tasks, limited in terms of languages, domains, and task types. To circumvent …

Kenneth Enevoldsen, Isaac Chung, Imene Kerboua, Márton Kardos, Ashwin Mathur, David Stap, Jay Gala, Wissam Siblini, Dominik Krzeminski, Genta Indra Winata, Saba Sturua, Saiteja Utpala, Mathieu Ciancone, Marion Schaeffer, Gabriel Sequeira, Diganta Misra, Shreeya Dhakal, Jonathan Rystrøm, Roman Solomatin, Ömer Çagatan, Akash Kundu, Martin Bernstorff, Shitao Xiao, Akshita Sukhlecha, Bhavish Pahwa, Rafał Poswiata, Kranthi Kiran GV, Shawon Ashraf, Daniel Auras, Björn Plüster, Jan Philipp Harries, Loïc Magne, Isabelle Mohr, Mariya Hendriksen, Dawei Zhu, Hippolyte Gisserot-Boukhlef, Tom Aarsen, Jan Kostkan, Konrad Wojtasik, Taemin Lee, Marek Šuppa, Crystina Zhang, Roberta Rocca, Mohammed Hamdy, Andrianos Michail, John Yang, Manuel Faysse, Aleksei Vatolin, Nandan Thakur, Manan Dey, Dipam Vasani, Pranjal Chitale, Simone Tedeschi, Nguyen Tai, Artem Snegirev, Michael Günther, Mengzhou Xia, Weijia Shi, Xing Han Lu, Jordan Clive, Gayatri Krishnakumar, Anna Maksimova, Silvan Wehrli, Maria Tikhonova, Henil Pancha, Aleksandr Abramov, Malte Ostendorff, Zheng Liu, Simon Clematide, Lester James Miranda, Alena Fenogenova, Guangyu Song, Ruqiya Bin Saf, Wen-Ding Li, Alessia Borghini, Federico Cassano, Hongjin Su, Jimmy Lin, Howard Yen, Lasse Hansen, Sara Hooker, Chenghao Xiao, Vaibhav Adlakha, Orion Weller, Siva Reddy, Niklas Muennighoff

International Conference of Learning Representations (ICLR), 2025.

LitLLMs, LLMs for Literature Review: Are We There Yet?

Literature reviews are an essential component of scientific research, but they remain time-intensive and challenging to write, …

Gaurav Sahu, Abhay Puri, Shubham Agarwal, Issam H. Laradji, Krishnamurthy (Dj) Dvijotham, Jason Stanley, Laurent Charlin, Christopher Pal

Transactions on Machine Learning Research (TMLR), 2025.

StarVector: Generating Scalable Vector Graphics Code from Images and Text

Scalable Vector Graphics (SVGs) are vital for modern image rendering due to their scalability and versatility. Previous SVG generation …

Juan A. Rodriguez, Abhay Puri, Shubham Agarwal, Issam H. Laradji, Pau Rodriguez, Sai Rajeswar Mudumba, David Vazquez, Christopher Pal, Marco Pedersoli

AAAI Demos, 2025.

Do LLMs Know When to NOT Answer? Investigating Abstention Abilities of Large Language Models

Abstention Ability (AA) is a critical aspect of Large Language Model (LLM) reliability, referring to an LLM’s capability to …

Nishanth Madhusudhan, Sathwik Tejaswi Madhusudhan, Vikas Yadav, Masoud Hashemi

International Conference on Computational Linguistics (COLING), 2025.