WorkArena: How Capable are Web Agents at Solving Common Knowledge Work Tasks?
We study the use of large language model-based agents for interacting with software via web browsers. Unlike prior work, we focus on …
Towards Disentangled High-level Causal Explanations in Text
In this work, we propose a causal representation learning framework for learning disentangled and intervenable high-level explanations …
A Sparsity Principle for Partially Observable Causal Representation Learning
Causal representation learning (CRL) aims at identifying high-level causal variables from low-level data, e.g. images. Current methods …
Capture the Flag: Uncovering Data Insights with Large Language Models
The extraction of a small number of relevant insights from vast amounts of data is a crucial component of data-driven decision-making. …
Lag-Llama: A Foundation Model for Probabilistic Time Series Forecasting
In this work, we present Lag-Llama, a general-purpose probabilistic time series forecasting model trained on a large collection of time …
Multi-View Causal Representation Learning with Partial Observability
We present a unified framework for studying the identifiability of representations learned from simultaneously observed views, such as …
Surrogate Minimization: An Optimization Algorithm for Training Large Neural Networks with Model Parallelism
Optimizing large memory-intensive neural networks requires distributing their layers across multiple GPUs (referred to as model …
The Unsolved Challenges of LLMs in Open-Ended Web Tasks: A Case Study
In this work, we investigate the challenges associated with developing goal-driven AI agents capable of performing open-ended tasks in …
Efficient Dynamics Modeling in Interactive Environments with Koopman Theory
The accurate modeling of dynamics in interactive environments is critical for successful long-range prediction. Such a capability …
Invariant Causal Set Covering Machines
Rule-based models, such as decision trees, appeal to practitioners due to their interpretable nature. However, the learning algorithms …