SantaCoder: don't reach for the stars!
The BigCode project is an open scientific collaboration working on the responsible development of large language models for code. This …
Leveraging Human Preferences to Master Poetry
Large language models have been fine-tuned to learn poetry via supervised learning on a dataset containing relevant examples. However, …
In-Context Learning for Text Classification with Many Labels
In-context learning (ICL) using large language models for tasks with many labels is challenging due to the limited context window, …
On the Compositional Generalization Gap of In-Context Learning
Pretrained large generative language models have shown great performance on many tasks, but exhibit low compositional generalization …
Attention for Compositional Modularity
Modularity and compositionality are promising inductive biases for addressing longstanding problems in machine learning such as better …
A General Purpose Neural Architecture for Geospatial Systems
Geospatial Information Systems are used by researchers and Humanitarian Assistance and Disaster Response (HADR) practitioners to …
Breadth-First Pipeline Parallelism
We introduce Breadth-First Pipeline Parallelism, a novel training schedule which optimizes the combination of pipeline and data …
Can large language models build causal graphs?
Building causal graphs can be a laborious process. To ensure all relevant causal pathways have been captured, researchers often have to …
Choreographer: Learning and Adapting Skills in Imagination
Unsupervised skill learning aims to learn a rich repertoire of behaviors without external supervision, providing artificial agents with …
Constraining Low-level Representations to Define Effective Confidence Scores
Neural networks are known to fail with high confidence, especially for data that somehow differs from the training distribution. Such …