ServiceNow Research

Code Generation

Multilingual Code Retrieval Without Paired Data: A New Benchmark and Experiments
We seek to overcome limitations to code retrieval quality posed by the scarcity of data containing pairs of code snippets and natural …
SantaCoder: don't reach for the stars!
The BigCode project is an open-scientific collaboration working on the responsible development of large language models for code. This …
The Stack: 3 TB of permissively licensed source code
Large Language Models (LLMs) play an ever-increasing role in the field of Artificial Intelligence (AI), not only for natural …
Towards Neural Functional Program Evaluation
This paper explores the capabilities of current transformer-based language models for program evaluation of simple functional …
PICARD: Parsing Incrementally for Constrained Auto-Regressive Decoding from Language Models
Large pre-trained language models for textual data have an unconstrained output space; at each decoding step, they can produce any of …
DuoRAT: Towards Simpler Text-to-SQL Models
Recent neural text-to-SQL models can effectively translate natural language questions to corresponding SQL queries on unseen databases. …