Alignment

Recent progress in large language models (LLMs) has focused on producing responses that meet human expectations and align with shared …

ACM Conference on Fairness, Accountability, and Transparency, 2026.

Instruction finetuning (IFT) is critical for aligning Large Language Models (LLMs) to follow instructions. While many effective IFT …

North American Chapter of the Association for Computational Linguistics (NAACL), 2025.

Recent progress in large language models (LLMs) has focused on producing responses that meet human expectations and align with shared …

Workshop at the International Conference on Learning Representations (ICLR), 2025.

Direct Preference Optimization (DPO) is an effective technique that leverages pairwise preference data (usually one chosen and rejected …

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024.