ServiceNow AI Research
Tags
Code Generation
The BigCode Project Governance Card
This document serves as an overview of the different mechanisms and areas of governance in the BigCode project. It aims to support …
Sean Hughes, Harm de Vries, Jennifer Robinson, Carlos Muñoz Ferrandis, Loubna Ben Allal, Leandro von Werra, Jennifer Ding, Sébastien Paquet, Yacine Jernite
arXiv, 2024.
RepoFusion: Training Code Models to Understand Your Repository
Despite the huge success of Large Language Models (LLMs) in coding assistants like GitHub Copilot, these models struggle to understand …
Disha Shrivastava, Denis Kocetkov, Harm de Vries, Dzmitry Bahdanau, Torsten Scholak
arXiv, 2023.
StarCoder: may the source be with you!
The BigCode community, an open-scientific collaboration working on the responsible development of Large Language Models for Code (Code …
Raymond Li, Loubna Ben Allal, Yangtian Zi, Denis Kocetkov, Chenghao Mou, Christopher Akiki, Jia Li, Jenny Chim, Terry Yue Zhuo, Thomas Wang, Mishig Davaadorj, João Monteiro, Oleh Shliazhko, Nicolas Gontier, Nicholas Meade, Ming-Ho Yee, Logesh Kumar Umapathi, Benjamin Lipkin, Zhiruo Wang, Rudra Murthy, Jason Stillerman, Siva Sankalp Patel, Dmitry Abulkhanov, Marco Zocca, Zhihan Zhang, Nour Fahmy, Urvashi Bhattacharyya, Swayam Singh, Sasha Luccioni, Paulo Villegas, Maxim Kunakov, Fedor Zhdanov, Manuel Romero, Tony Lee, Nadav Timor, Jennifer Ding, Claire Schlesinger, Hailey Schoelkopf, Jan Ebert, Jennifer Robinson, Carolyn Jane Anderson, Brendan Dolan-Gavitt, Danish Contractor, Siva Reddy, Daniel Fried, Dzmitry Bahdanau, Yacine Jernite, Carlos Muñoz Ferrandis, Sean Hughes, Thomas Wolf, Arjun Guha, Leandro von Werra, Harm de Vries, Joel Lamy Poirier, Alex Gu, Armel Zebaze, Jian Zhu, Manan Dey, Marc Marone, Mayank Mishra, Muhtasham Oblokulov, Olivier Dehaene, Qian Liu, Tri Dao, Wenhao Yu, Niklas Muennighoff
Transactions on Machine Learning Research (TMLR), 2023.
Multilingual Code Retrieval Without Paired Data: A New Benchmark and Experiments
We seek to overcome limitations to code retrieval quality posed by the scarcity of data containing pairs of code snippets and natural …
João Monteiro, Torsten Scholak, Virendra Mehta, David Vazquez, Christopher Pal
Workshop at the International Conference on Learning Representations (ICLR), 2023.
SantaCoder: don't reach for the stars!
The BigCode project is an open-scientific collaboration working on the responsible development of large language models for code. This …
Harm de Vries, Raymond Li, Joel Lamy Poirier, Dzmitry Bahdanau, Denis Kocetkov, Sean Hughes
Workshop at the International Conference on Learning Representations (ICLR), 2023.
The Stack: 3 TB of permissively licensed source code
Large Language Models (LLMs) play an ever-increasing role in the field of Artificial Intelligence (AI)–not only for natural …
Denis Kocetkov, Raymond Li, Loubna Ben Allal, Jia Li, Chenghao Mou, Carlos Muñoz Ferrandis, Yacine Jernite, Margaret Mitchell, Sean Hughes, Thomas Wolf, Dzmitry Bahdanau, Leandro von Werra, Harm de Vries
Transactions on Machine Learning Research (TMLR), 2022.
Towards Neural Functional Program Evaluation
This paper explores the capabilities of current transformer-based language models for program evaluation of simple functional …
Torsten Scholak, Jonathan Pilault, Joey Velez-Ginorio
Conference on Neural Information Processing Systems (NeurIPS), 2021.
Picard: Parsing Incrementally for Constrained Auto-Regressive Decoding from Language Models
Large pre-trained language models for textual data have an unconstrained output space; at each decoding step, they can produce any of …
Torsten Scholak, Nathan Schucher, Dzmitry Bahdanau
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021.
DuoRAT: Towards Simpler Text-to-SQL Models
Recent neural text-to-SQL models can effectively translate natural language questions to corresponding SQL queries on unseen databases. …
Torsten Scholak, Raymond Li, Dzmitry Bahdanau, Harm de Vries, Christopher Pal
North American Chapter of the Association for Computational Linguistics (NAACL), 2021.