ServiceNow AI Research
Alignment
Societal Alignment Frameworks Can Improve LLM Alignment
Recent progress in large language models (LLMs) has focused on producing responses that meet human expectations and align with shared …
Karolina Stanczak, Nicholas Meade, Mehar Bhatia, Hattie Zhou, Konstantin Böttinger, Jeremy Barns, Jason Stanley, Nicolas Papernot, Nicolas Chapados, Denis Therien, Timothy P Lillicrap, Ana Marasovic, Sylvie Delacroix, Gillian K Hadfield, Siva Reddy
ACM Conference on Fairness, Accountability, and Transparency, 2026.
M2Lingual: Enhancing Multilingual, Multi-Turn Instruction Alignment in Large Language Models
Instruction finetuning (IFT) is critical for aligning Large Language Models (LLMs) to follow instructions. While many effective IFT …
Rishabh Maheshwary, Vikas Yadav, Hoang Nguyen, Khyati Mahajan, Sathwik Tejaswi Madhusudhan
North American Chapter of the Association for Computational Linguistics (NAACL), 2025.
Societal Alignment Frameworks Can Improve LLM Alignment
Recent progress in large language models (LLMs) has focused on producing responses that meet human expectations and align with shared …
Karolina Stanczak, Nicholas Meade, Mehar Bhatia, Hattie Zhou, Konstantin Böttinger, Jeremy Barns, Jason Stanley, Nicolas Papernot, Nicolas Chapados, Denis Therien, Timothy P Lillicrap, Ana Marasovic, Sylvie Delacroix, Gillian K Hadfield, Siva Reddy
Workshop at the International Conference on Learning Representations (ICLR), 2025.
Curry-DPO: Enhancing Alignment using Curriculum Learning & Ranked Preferences
Direct Preference Optimization (DPO) is an effective technique that leverages pairwise preference data (usually one chosen and rejected …
Pulkit Pattnaik, Rishabh Maheshwary, Kelechi Ogueji, Vikas Yadav, Sathwik Tejaswi Madhusudhan
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024.