Backpropagating from Customer Success
How do we measure the real performance of AI in enterprise—beyond just model performance? This work introduces a research project …
Learning to Defer for Causal Discovery with Imperfect Experts
Integrating expert knowledge, e.g. from large language models, into causal discovery algorithms can be challenging when the knowledge …
No, of course I can! Refusal Mechanisms Can Be Exploited Using Harmless Fine-Tuning Data
Leading language model (LM) providers like OpenAI and Google offer fine-tuning APIs that allow customers to adapt LMs for specific use …
Societal Alignment Frameworks Can Improve LLM Alignment
Recent progress in large language models (LLMs) has focused on producing responses that meet human expectations and align with shared …
The Landscape of Causal Discovery Data: Grounding Causal Discovery in Real-World Applications
Causal discovery aims to automatically uncover causal relationships from data, a capability with significant potential across many …
Unifying Autoregressive and Diffusion-Based Sequence Generation
We take significant steps toward unifying autoregressive and diffusion-based sequence generation by extending the SEDD discrete …
WebMMU: A Benchmark for Multimodal Multilingual Website Understanding and Code Generation
Understanding diverse web data and automating web development presents an exciting challenge for agentic AI. While existing benchmarks …
EarthView: A Large Scale Remote Sensing Dataset for Self-Supervision
This paper presents EarthView, a comprehensive dataset specifically designed for self-supervision on remote sensing data, intended to …
AgentMerge: Enhancing Generalization in Fine-Tuned LLM Agents
Recent advancements in large language models (LLMs) have spurred interest in developing autonomous agents capable of performing complex …