RAG

Hierarchical Retrieval at Scale: Bridging Transparency and Efficiency

Information retrieval is a core component of many intelligent systems as it enables conditioning of outputs on new and large-scale …

Shubham Gupta, Zichao Li, Tianyi Chen, Cem Subakan, Siva Reddy, Perouz Taslakian, Valentina Zantedeschi

International Conference on Machine Learning (ICML), 2026.

Augmenting LLM Reasoning with Dynamic Notes Writing for Complex QA

Iterative RAG for multi-hop question answering faces challenges with lengthy contexts and the buildup of irrelevant information. This …

Rishabh Maheshwary, Masoud Hashemi, Khyati Mahajan, Shiva Krishna Reddy Malay, Sai Rajeswar Mudumba, Sathwik Madhusudhan, Spandana Gella, Vikas Yadav

Language Resources and Evaluation Conference, 2026.

Hierarchical Retrieval at Scale: Bridging Transparency and Efficiency

Information retrieval is a core component of many intelligent systems as it enables conditioning of outputs on new and large-scale …

Shubham Gupta, Zichao Li, Tianyi Chen, Cem Subakan, Siva Reddy, Perouz Taslakian, Valentina Zantedeschi

Workshop at the International Conference on Machine Learning (ICML), 2026.

ColMate: Contrastive Late Interaction and Masked Text for Multimodal Document Retrieval

Retrieval-augmented generation has proven practical when models require specialized knowledge or access to the latest data. However, …

Ahmed Masry, Megh Thakkar, Patrice Béchard, Sathwik Madhusudhan, Rabiul Awal, Shambhavi Mishra, Akshay Kalkunte, Enamul Hoque Prince, Spandana Gella, Torsten Scholak, Sai Rajeswar Mudumba

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2025.

Multi-task retriever fine-tuning for domain-specific and efficient RAG

Retrieval-Augmented Generation (RAG) has become ubiquitous when deploying Large Language Models (LLMs), as it can address typical …

Patrice Béchard, Orlando Marquez

Knowledge Discovery and Data Mining, 2025.

Generating a Low-code Complete Workflow via Task Decomposition and RAG

AI technologies are moving rapidly from research to production. With the popularity of Foundation Models (FMs) that generate text, …

Orlando Marquez, Patrice Béchard

Conference on AI Engineering (CAIN), 2025.

XC-Cache: Cross-Attending to Cached Context for Efficient LLM Inference

In-context learning (ICL) approaches typically leverage prompting to condition decoder-only language model generation on reference …

João Monteiro, Étienne Marcotte, Pierre-André Noël, Valentina Zantedeschi, David Vazquez, Nicolas Chapados, Christopher Pal, Perouz Taslakian

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024.

Reducing hallucination in structured outputs via Retrieval-Augmented Generation

A common and fundamental limitation of Generative AI (GenAI) is its propensity to hallucinate. While large language models (LLMs) have …

Patrice Béchard, Orlando Marquez

North American Chapter of the Association for Computational Linguistics (NAACL), 2024.