Alignment

Societal Alignment Frameworks Can Improve LLM Alignment

Recent progress in large language models (LLMs) has focused on producing responses that meet human expectations and align with shared …

Karolina Stanczak, Nicholas Meade, Mehar Bhatia, Hattie Zhou, Konstantin Böttinger, Jeremy Barns, Jason Stanley, Nicolas Papernot, Nicolas Chapados, Denis Therien, Timothy P Lillicrap, Ana Marasovic, Sylvie Delacroix, Gillian K Hadfield, Siva Reddy

ACM Conference on Fairness, Accountability, and Transparency, 2026.

M2Lingual: Enhancing Multilingual, Multi-Turn Instruction Alignment in Large Language Models

Instruction finetuning (IFT) is critical for aligning Large Language Models (LLMs) to follow instructions. While many effective IFT …

Rishabh Maheshwary, Vikas Yadav, Hoang Nguyen, Khyati Mahajan, Sathwik Tejaswi Madhusudhan

North American Chapter of the Association for Computational Linguistics (NAACL), 2025.

Societal Alignment Frameworks Can Improve LLM Alignment

Recent progress in large language models (LLMs) has focused on producing responses that meet human expectations and align with shared …

Workshop at the International Conference of Learning Representation (ICLR), 2025.

Curry-DPO: Enhancing Alignment using Curriculum Learning & Ranked Preferences

Direct Preference Optimization (DPO) is an effective technique that leverages pairwise preference data (usually one chosen and rejected …

Pulkit Pattnaik, Rishabh Maheshwary, Kelechi Ogueji, Vikas Yadav, Sathwik Tejaswi Madhusudhan

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024.