NVIDIA GTC '22 - Leaders in AI Panel Discussion and Q&A

"When it comes time to put new AI models or breakthroughs into practice, deployment must occur at larger scale than what we consider in a research paper." Nicolas Chapados, VP, Research, NVIDIA GTC '22 - Leaders in AI Panel Discussion

Leaders in AI Panel Discussion: AI Journey from Concept to Scale

NVIDIA GTC 2022, Tuesday, March 22, 12:00 PM - 12:50 PM PDT

Nicolas Chapados (VP Research at ServiceNow) met online with Bryan Catanzaro (VP Applied Deep Learning Research at NVIDIA), Ya Xu (VP of Engineering and Head of Data at LinkedIn), and Jure Leskovec (Associate Professor and Senior Technical Advisor at Stanford University and Pinterest).

Bryan Catanzaro led the discussion with the panel and explored how each of the leaders is bridging the gap between AI research and production, their approaches to technology transfer between research and engineering, recent achievements in AI, and the future of AI.

Throughout the GTC session broadcast, there was a real-time Q&A, moderated by both Bryan Catanzaro and Nicolas Chapados. You'll need to register and watch the session replay online to get the full story, but we've highlighted a few of the questions that Nicolas answered during the Q&A for your reading pleasure. Please note that this is only a subset of all the questions, and that Bryan sometimes also answered the same question. Ya and Jure were unfortunately not available for the Q&A. This Q&A was posted with the kind permission of the team at NVIDIA.

We offer a special thank you to Bryan Catanzaro and the team at NVIDIA for hosting this session, and to Ya Xu and Jure Leskovec for the engaging discussion with Nicolas Chapados.

HIGHLIGHTS FROM THE ONLINE Q&A

[Please note that some of the text has been edited to correct typos, and the original is available online with the session recording.]

How do you take high-level problems described by business leaders and distill [them] into AI modeling solutions?

From Nicolas Chapados: Sometimes this gap between AI theory and AI practice has been called the Valley of Death, and frankly, if you want to innovate, you need to make crossing it a core competency. For us, there is no single magic bullet: it is a process. It will typically involve several teams that include AI research scientists, product managers, UX designers, software engineers, and customer success teams. For people with such diverse backgrounds, communication becomes an issue: they may use the same word to mean different things, and our managers need to be aware of these "impedance mismatches". Another source of friction lies in differing risk tolerances. At one end of the innovation continuum, researchers live in an uncertain world, with the landscape of research changing weekly. At the other end, engineering teams want solid roadmaps that they can execute on. So we need to reconcile these very different perspectives.

What are some things to look out for (or avoid) when turning an ML research idea into a prototype and then into a deployed product?

From Nicolas Chapados: Our goal is to de-risk early. For example, one of our fundamental research teams may come up with a great new idea advancing the application of text-to-code. They'll write a paper about it, but this paper will use public datasets so that it can be considered a valid scientific contribution. In parallel, we also have an "emerging technologies lab", whose goal is to take the research advances that look promising and, working with research, do proofs-of-concept (POC) and proofs-of-value (POV) in the context of ServiceNow problems and data. So taking the text-to-code example, we would ask "where within the ServiceNow products would this make sense?", collect internal data, and do a POC and POV for that use case. With the POC and POV in hand, we can work with the product managers to insert those advances in the right place on our product and platform roadmaps.

How important is the standardization of practice and technology across teams who are building ML models in your large organizations?

From Nicolas Chapados: Standardization helps to a certain extent. It works much better around application areas (e.g. NLP), but it is hard to be too strict because frameworks and methods still evolve rapidly. We've found it helpful to have standard interface layers between training and production/inference, such as ONNX.

How important are model compression techniques (like distillation) to your teams? Is their use commonplace?

From Nicolas Chapados: We study and use distillation techniques primarily with large language models and NLP use cases.
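
[Editor's illustration, not part of the session.] For readers unfamiliar with distillation, here is a minimal sketch of one common formulation: a small "student" model is trained to match the temperature-softened output distribution of a larger "teacher" model, while still being penalized on the true label. The answer above does not say which variant the ServiceNow teams use, so the function names, the temperature of 2.0, and the weighting `alpha` of 0.5 are all assumptions chosen for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities; a higher temperature gives a softer distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, true_label,
                      temperature=2.0, alpha=0.5):
    """Hypothetical helper: weighted sum of a soft-target term and a hard-label term.

    soft term: KL(teacher || student) on temperature-softened distributions,
               scaled by temperature**2 to keep gradient magnitudes comparable.
    hard term: ordinary cross-entropy of the student against the true label.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    soft = sum(pt * math.log(pt / ps)
               for pt, ps in zip(p_teacher, p_student) if pt > 0)
    hard = -math.log(softmax(student_logits)[true_label])
    return alpha * (temperature ** 2) * soft + (1 - alpha) * hard


# Usage: a student that agrees with the teacher is penalized less
# than one that ranks the classes in the opposite order.
teacher = [4.0, 1.0, 0.0]
agreeing_student = [4.0, 1.0, 0.0]
disagreeing_student = [0.0, 1.0, 4.0]
loss_agree = distillation_loss(agreeing_student, teacher, true_label=0)
loss_disagree = distillation_loss(disagreeing_student, teacher, true_label=0)
```

In practice this loss would be computed over batches inside a deep learning framework and minimized by gradient descent; the pure-Python version only shows the arithmetic.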

What are some of your experiences (and learnings) training and deploying large-scale language models like transformers?

From Nicolas Chapados: It's still early days for us right now. We're deploying BERT-scale models (trained with ServiceNow-specific data) to handle some NLP tasks. Getting the model right is one thing, but getting the deployment environment right (e.g. scaling across all of our data centers) is the harder part, and we're still iterating on it.