Curry-DPO: Enhancing Alignment using Curriculum Learning & Ranked Preferences
Direct Preference Optimization (DPO) is an effective technique that leverages pairwise preference data (usually one chosen and rejected …
XC-Cache: Cross-Attending to Cached Context for Efficient LLM Inference
In-context learning (ICL) approaches typically leverage prompting to condition decoder-only language model generation on reference …
Fine-Tuning Web Agents: It Works, But It's Trickier Than You Think
Recent advancements in large language models (LLMs) have sparked interest in developing autonomous web agents capable of performing …
An Ecosystem for Web Agents: WorkArena, BrowserGym, AgentLab and more
The BrowserGym ecosystem addresses the growing need for efficient evaluation and benchmarking of web agents, particularly those …
Context is Key: A Benchmark for Forecasting with Essential Textual Information
Forecasting is a critical task in decision making across various domains. While numerical data provides a foundation, it often lacks …
TACTIS-2: Better, Faster, Simpler Attentional Copulas for Multivariate Time Series
We introduce a new model for multivariate probabilistic time series prediction, designed to flexibly address a range of tasks including …
LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders
Large decoder-only language models (LLMs) are the state-of-the-art models on most of today’s NLP tasks and benchmarks. Yet, the …
A Sparsity Principle for Partially Observable Causal Representation Learning
Causal representation learning aims at identifying high-level causal variables from perceptual data. Most methods assume that all …
WorkArena: How Capable are Web Agents at Solving Common Knowledge Work Tasks?
We study the use of large language model-based agents for interacting with software via web browsers. Unlike prior work, we focus on …
PAG-LLM: Paraphrase and Aggregate with Large Language Models for Minimizing Intent Classification Errors
Large language models (LLMs) have achieved remarkable success in natural language generation, but less attention has been given to their …