ServiceNow AI Research

Agents

Shifting AI Security to the Left: Design-Time Defenses to Mitigate the Risks of Prompt Injections
Prompt injections pose a critical weakness for modern Large Language Models, making it difficult for AI to distinguish between …
StarVLM ReRank: Better UI Grounding via Enhanced Visual Input and Element Position Perception
UI grounding is a fundamental task for enterprise workflow automation. This task maps natural language instructions to precise pixel …
WebMMU: A Benchmark for Multimodal Multilingual Website Understanding and Code Generation
Understanding diverse web data and automating web development presents an exciting challenge for agentic multimodal models. While …
WebMMU: A Benchmark for Multimodal Multilingual Website Understanding and Code Generation
We present WebMMU, a multilingual benchmark that evaluates three core web tasks: (1) website visual question answering, (2) code …
DoomArena: A framework for Testing AI Agents Against Evolving Security Threats
We present DoomArena, a security evaluation framework for AI agents. DoomArena is designed on three principles: 1) It is a …
Fine-Tune an SLM or Prompt an LLM? The Case of Generating Low-Code Workflows
Large Language Models (LLMs) such as GPT-4o can handle a wide range of complex tasks with the right prompt. As per-token costs are …
AgentAda: Skill-Adaptive Data Analytics for Tailored Insight Discovery
We introduce AgentAda, the first LLM-powered analytics agent that can learn and use new analytics skills to extract more specialized …