Open Source

BigCode
BigCode is an open scientific collaboration working on the responsible development of large language models for code.
Azimuth
Azimuth is an open-source dataset and error analysis tool for text classification, with love from ServiceNow.
SeCo: Seasonal Contrast Remote Sensing Dataset
The SeCo dataset is collected from Sentinel-2 using a principled procedure for gathering large-scale, unlabeled, and uncurated remote sensing data. It contains images of multiple Earth locations at different timestamps.
BaaL: Bayesian Active Learning
BaaL is an active learning library developed at ElementAI. This repository contains techniques and reusable components to make active learning accessible to all.
Fashion-Gen: The Generative Fashion Dataset
The Fashion-Gen dataset includes 293,008 high-definition fashion images paired with item descriptions provided by professional stylists from the fashion design domain. It provides a public benchmark for research on state-of-the-art, design-oriented text-to-image generative models. The data consists of expert-annotated e-commerce photographs shared with the AI research community by the global fashion platform SSENSE, which partnered with ServiceNow to prepare and host the data.