Large Language Models

Neural sentence embedding models for dense retrieval typically rely on binary relevance labels, treating query-document pairs as …

Conference on Language Modeling (COLM), 2025.

We present significant extensions to diffusion-based sequence generation models, blurring the line with autoregressive language models. …

Nima Fathi, Torsten Scholak, Pierre-André Noël

Conference on Language Modeling (COLM), 2025.

In order to be deployed safely, Large Language Models (LLMs) must be capable of dynamically adapting their behavior based on their …

Transactions on Machine Learning Research (TMLR), 2025.

Graph databases like Neo4j are gaining popularity for handling complex, interconnected data over traditional relational databases in …

North American Chapter of the Association for Computational Linguistics (NAACL), 2025.

Instruction finetuning (IFT) is critical for aligning Large Language Models (LLMs) to follow instructions. While many effective IFT …

North American Chapter of the Association for Computational Linguistics (NAACL), 2025.

Multilingual LLMs have achieved remarkable benchmark performance, but we find they continue to underperform on non-Latin script …

North American Chapter of the Association for Computational Linguistics (NAACL), 2025.

Integrating expert knowledge, e.g. from large language models, into causal discovery algorithms can be challenging when the knowledge …

Workshop at the International Conference on Learning Representations (ICLR), 2025.

Recent progress in large language models (LLMs) has focused on producing responses that meet human expectations and align with shared …

Workshop at the International Conference on Learning Representations (ICLR), 2025.

We take significant steps toward unifying autoregressive and diffusion-based sequence generation by extending the SEDD discrete …

Nima Fathi, Torsten Scholak, Pierre-André Noël

Workshop at the International Conference on Learning Representations (ICLR), 2025.

Decoder-only Transformers often struggle with complex reasoning tasks, particularly arithmetic reasoning requiring multiple sequential …

International Conference on Learning Representations (ICLR), 2025.