ServiceNow AI Research

Large Language Models

Grounding Computer Use Agents on Human Demonstrations
Building reliable computer-use agents requires grounding: accurately connecting natural language instructions to the correct on-screen …
Causal Differentiating Concepts: Interpreting LM Behavior via Causal Representation Learning
Language model activations entangle concepts that mediate their behavior, making it difficult to interpret these factors, which has …
Breaking the Bottleneck with DiffuApriel: High-Throughput Diffusion LMs with Mamba Backbone
Diffusion-based language models have recently emerged as a promising alternative to autoregressive generation, yet their reliance on …
Apriel-MTP: Multi-Token Prediction for Faster and More Efficient Language
We introduce multi-token prediction (MTP) variants of the Apriel model family, designed to generate multiple tokens per forward pass. …
Faster On-Policy Reinforcement Learning for Long Sequence Generation
Reinforcement Learning (RL) is increasingly utilized to enhance the reasoning capabilities of Large Language Models (LLMs). However, …
StarVLM ReRank: Better UI Grounding via Enhanced Visual Input and Element Position Perception
UI grounding is a fundamental task for enterprise workflow automation. This task maps natural language instructions to precise pixel …
Unifying Autoregressive and Diffusion-Based Sequence Generation
We present significant extensions to diffusion-based language models, blurring the line with autoregressive ones. We introduce …
WebMMU: A Benchmark for Multimodal Multilingual Website Understanding and Code Generation
We present WebMMU, a multilingual benchmark that evaluates three core web tasks: (1) website visual question answering, (2) code …
Using Scaling Laws for Data Source Utility Estimation in Domain-Specific Pre-Training
We introduce a framework for optimizing domain-specific dataset construction in foundation model training. Specifically, we seek a …