ServiceNow AI Research
RAG
Augmenting LLM Reasoning with Dynamic Notes Writing for Complex QA
Iterative RAG for multi-hop question answering faces challenges with lengthy contexts and the buildup of irrelevant information. This …
Rishabh Maheshwary, Masoud Hashemi, Khyati Mahajan, Shiva Krishna Reddy Malay, Sai Rajeswar Mudumba, Sathwik Madhusudhan, Spandana Gella, Vikas Yadav
Language Resources and Evaluation Conference, 2026.
Hierarchical Retrieval at Scale: Bridging Transparency and Efficiency
Information retrieval is a core component of many intelligent systems as it enables conditioning of outputs on new and large-scale …
Shubham Gupta, Zichao Li, Tianyi Chen, Cem Subakan, Siva Reddy, Perouz Taslakian, Valentina Zantedeschi
Workshop at the International Conference on Machine Learning (ICML), 2026.
ColMate: Contrastive Late Interaction and Masked Text for Multimodal Document Retrieval
Retrieval-augmented generation has proven practical when models require specialized knowledge or access to the latest data. However, …
Ahmed Masry, Megh Thakkar, Patrice Béchard, Sathwik Madhusudhan, Rabiul Awal, Shambhavi Mishra, Akshay Kalkunte, Enamul Hoque Prince, Spandana Gella, Torsten Scholak, Sai Rajeswar Mudumba
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2025.
Multi-task retriever fine-tuning for domain-specific and efficient RAG
Retrieval-Augmented Generation (RAG) has become ubiquitous when deploying Large Language Models (LLMs), as it can address typical …
Patrice Béchard, Orlando Marquez
Knowledge Discovery and Data Mining, 2025.
Generating a Low-code Complete Workflow via Task Decomposition and RAG
AI technologies are moving rapidly from research to production. With the popularity of Foundation Models (FMs) that generate text, …
Orlando Marquez, Patrice Béchard
Conference on AI Engineering (CAIN), 2025.
XC-Cache: Cross-Attending to Cached Context for Efficient LLM Inference
In-context learning (ICL) approaches typically leverage prompting to condition decoder-only language model generation on reference …
João Monteiro, Étienne Marcotte, Pierre-André Noël, Valentina Zantedeschi, David Vazquez, Nicolas Chapados, Christopher Pal, Perouz Taslakian
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024.
Reducing hallucination in structured outputs via Retrieval-Augmented Generation
A common and fundamental limitation of Generative AI (GenAI) is its propensity to hallucinate. While large language models (LLMs) have …
Patrice Béchard, Orlando Marquez
North American Chapter of the Association for Computational Linguistics (NAACL), 2024.