ServiceNow Research
Code Generation
Multilingual Code Retrieval Without Paired Data: A New Benchmark and Experiments
We seek to overcome limitations to code retrieval quality posed by the scarcity of data containing pairs of code snippets and natural …
João Monteiro, Torsten Scholak, Virendra Mehta, David Vazquez, Christopher Pal
Workshop at the International Conference on Learning Representations (ICLR), 2023.
SantaCoder: don't reach for the stars!
The BigCode project is an open-scientific collaboration working on the responsible development of large language models for code. This …
Harm de Vries, Raymond Li, Joel Lamy Poirier, Dzmitry Bahdanau, Denis Kocetkov, Sean Hughes
Workshop at the International Conference on Learning Representations (ICLR), 2023.
The Stack: 3 TB of permissively licensed source code
Large Language Models (LLMs) play an ever-increasing role in the field of Artificial Intelligence (AI)–not only for natural …
Denis Kocetkov, Raymond Li, Loubna Ben Allal, Jia Li, Chenghao Mou, Carlos Muñoz Ferrandis, Yacine Jernite, Margaret Mitchell, Sean Hughes, Thomas Wolf, Dzmitry Bahdanau, Leandro von Werra, Harm de Vries
Transactions on Machine Learning Research (TMLR), 2022.
Towards Neural Functional Program Evaluation
This paper explores the capabilities of current transformer-based language models for program evaluation of simple functional …
Torsten Scholak, Jonathan Pilault, Joey Velez-Ginorio
Conference on Neural Information Processing Systems (NeurIPS), 2021.
PICARD: Parsing Incrementally for Constrained Auto-Regressive Decoding from Language Models
Large pre-trained language models for textual data have an unconstrained output space; at each decoding step, they can produce any of …
Torsten Scholak, Nathan Schucher, Dzmitry Bahdanau
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021.
DuoRAT: Towards Simpler Text-to-SQL Models
Recent neural text-to-SQL models can effectively translate natural language questions to corresponding SQL queries on unseen databases. …
Torsten Scholak, Raymond Li, Dzmitry Bahdanau, Harm de Vries, Christopher Pal
North American Chapter of the Association for Computational Linguistics (NAACL), 2021.