ServiceNow AI Research

Large Language Models

Causal Differentiating Concepts: Interpreting LM Behavior via Causal Representation Learning
Language model activations entangle concepts that mediate their behavior, making it difficult to interpret these factors, which has …
BiXSE: Improving Dense Retrieval via Probabilistic Graded Relevance Distillation
Neural sentence embedding models for dense retrieval typically rely on binary relevance labels, treating query-document pairs as …
Unifying Autoregressive and Diffusion-Based Sequence Generation
We present significant extensions to diffusion-based sequence generation models, blurring the line with autoregressive language models. …
Auto-Cypher: Improving LLMs on Cypher generation via LLM-supervised generation-verification framework
Graph databases like Neo4j are gaining popularity over traditional relational databases for handling complex, interconnected data in …
M2Lingual: Enhancing Multilingual, Multi-Turn Instruction Alignment in Large Language Models
Instruction finetuning (IFT) is critical for aligning Large Language Models (LLMs) to follow instructions. While many effective IFT …
Prompting with Phonemes: Enhancing LLM Multilinguality for non-Latin Script Languages
Multilingual LLMs have achieved remarkable benchmark performance, but we find they continue to underperform on non-Latin script …
Learning to Defer for Causal Discovery with Imperfect Experts
Integrating expert knowledge, e.g. from large language models, into causal discovery algorithms can be challenging when the knowledge …
Societal Alignment Frameworks Can Improve LLM Alignment
Recent progress in large language models (LLMs) has focused on producing responses that meet human expectations and align with shared …
Seq-VCR: Preventing Collapse in Intermediate Transformer Representations for Enhanced Reasoning
Decoder-only Transformers often struggle with complex reasoning tasks, particularly arithmetic reasoning requiring multiple sequential …