Announcing BigCode for the responsible development of large language models

September 26, 2022


We’re excited to announce the BigCode project, led by ServiceNow Research and Hugging Face. In the spirit of the BigScience initiative,¹ we aim to develop state-of-the-art large language models (LLMs) for code in an open and responsible way.

Code LLMs enable the completion and synthesis of code from both existing code and natural language descriptions, and they work across a wide range of domains, tasks, and programming languages. These models can assist professional and citizen developers in building new applications.

BigCode invites AI researchers to collaborate on the following topics: 

  • A representative evaluation suite for code LLMs covering a diverse set of tasks and programming languages

  • Responsible development and governance of data sets for code LLMs

  • Faster training and inference methods for LLMs

The first goal of BigCode is to develop and release a data set large enough to train a state-of-the-art language model for code. We’ll ensure that only files from repositories with permissive licenses go into the data set.
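As a rough illustration of what a permissive-license filter could look like, here is a minimal Python sketch. It is not the project's actual pipeline: the `SourceFile` record, the `PERMISSIVE_LICENSES` allowlist, and the helper names are all hypothetical, and the list of licenses that count as permissive is an assumption made for this example.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical allowlist for illustration only; the licenses the project
# actually accepts are not specified in this announcement.
PERMISSIVE_LICENSES = {"mit", "apache-2.0", "bsd-2-clause", "bsd-3-clause"}


@dataclass
class SourceFile:
    """Hypothetical record for one file collected from a repository."""
    repo: str
    path: str
    license_id: Optional[str]  # SPDX-style identifier, or None if unknown


def is_permissive(license_id: Optional[str]) -> bool:
    """Return True only when the license is known and on the allowlist."""
    if license_id is None:
        return False
    return license_id.strip().lower() in PERMISSIVE_LICENSES


def filter_permissive(files: List[SourceFile]) -> List[SourceFile]:
    """Keep files whose repository license is on the allowlist."""
    return [f for f in files if is_permissive(f.license_id)]
```

In this sketch, files with an unknown license are excluded rather than included, which is the conservative choice when the goal is to admit only permissively licensed code.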

With that data set, we’ll train a 15-billion-parameter language model for code on ServiceNow’s in-house GPU cluster, using an adapted version of Megatron-LM to distribute training across the infrastructure.

Once the model is trained, we’ll evaluate its capabilities. While there are numerous benchmarks available for natural language processing (NLP), the landscape of benchmarks suited for code is much sparser. We’ll strive to make evaluation easier and broader so that we can learn more about the model’s capabilities.

Academic research usually stops after evaluation; this is where the work for practical applications starts. Inference speed is crucial for applications such as autocompletion. We’re interested in making architectural changes and devising tools for post-training optimization.

We’ll follow, as well as establish, responsible AI practices to train and share LLMs. We’ll uphold the principles of openness and transparency in the LLM development process. Experiments can be expensive and take a long time to run, so we’ll share the scientific plan with participants to solicit feedback before we execute it.

AI practitioners from diverse backgrounds are invited to join the BigCode project. The invitation is open to those who have a professional AI research background and can commit time to the project.

In general, we expect applicants to be affiliated with a research organization (in either academia or industry) and to work on the technical, ethical, or legal aspects of LLMs for coding applications.

Learn more about the project on the official website, and join the conversation on Twitter @BigCodeProject.

¹ The BigScience initiative is a scientific collaboration that culminated in July 2022 with the release of BLOOM, the world’s largest open multilingual language model.

© 2022 ServiceNow, Inc. All rights reserved. ServiceNow, the ServiceNow logo, Now, and other ServiceNow marks are trademarks and/or registered trademarks of ServiceNow, Inc. in the United States and/or other countries. Other company names, product names, and logos may be trademarks of the respective companies with which they are associated.
