Loic1
ServiceNow Employee

ServiceNow Vision AI agents are coming!

 

What is a Vision AI agent?

 

A Vision AI Agent is a new type of AI Agent that enables customers to automate tasks that depend on visual understanding of screens or real-world environments. These agents take video inputs, “watch” user demonstrations and interpret what they see in real time. They reason over the video, convert actions into step-by-step processes, diagnose issues and execute actions across ServiceNow and third-party systems.

 

Simply record your screen or capture an issue using your device camera: Vision AI agents analyze the visual context, extract key information and carry out the necessary remediation or workflow automation automatically.

 

See Vision AI agents in action

 

Demo: Vision AI agents turn screen recordings into structured processes

In this demo, an AI Agent learns from tutorial videos or recordings to turn unstructured video input into structured KB articles or processes.

 

https://youtu.be/UfwiucNHYjE

 

 

Demo: Vision AI agents troubleshoot IT issues from video input

In this demo, an AI Agent parses a screen recording to identify visual errors, search relevant knowledge, and troubleshoot the issue.

https://youtu.be/hAYmb_-6A5k

 

How do Vision AI agents work with other ServiceNow AI capabilities?

 

Vision AI Agents are a special type of ServiceNow AI agent. They use the AI Agent Orchestrator and dedicated Tools to process visual input efficiently and share contextual data in real time.

They are built into the ServiceNow AI agent framework and coordinate with Virtual Agent, RAG search and workflow automation to streamline complex workflows. When a use case involves visual context, Vision AI Agents can be invoked automatically to analyze what the user is seeing and execute the right actions.

 

A key difference from Web or Desktop AI agents is that while Vision AI agents can “see” webpages or applications, they cannot interact with them by clicking or scrolling. Instead, they rely on workflows, scripts and APIs to execute actions.
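To make that separation concrete, here is a minimal Python sketch of the idea. It is not ServiceNow code and does not use any ServiceNow API: the names `Observation`, `Action`, `plan_actions`, `execute` and the handler names are all hypothetical, chosen only for illustration. It assumes the visual analysis has already produced a structured observation, and shows that every resulting action must be a workflow, script or API call rather than a click or a scroll.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical: the only ways this sketch lets an agent act.
# UI interactions such as "click" or "scroll" are deliberately absent.
ALLOWED_KINDS = {"workflow", "script", "api"}


@dataclass(frozen=True)
class Observation:
    """Structured finding extracted from a video (hypothetical shape)."""
    error_text: str
    application: str


@dataclass(frozen=True)
class Action:
    """A step to run through a workflow, script or API handler."""
    kind: str
    name: str
    params: dict = field(default_factory=dict)


def plan_actions(obs: Observation) -> List[Action]:
    """Toy rule set mapping what was seen to what should be done."""
    if "password expired" in obs.error_text.lower():
        return [Action("workflow", "reset_password",
                       {"application": obs.application})]
    # Fallback: record the issue through an API-style handler.
    return [Action("api", "create_incident",
                   {"short_description": obs.error_text})]


Handler = Callable[[dict], str]


def execute(actions: List[Action],
            handlers: Dict[Tuple[str, str], Handler]) -> List[str]:
    """Dispatch each action to its registered handler."""
    results = []
    for action in actions:
        if action.kind not in ALLOWED_KINDS:
            raise ValueError(f"unsupported action kind: {action.kind}")
        results.append(handlers[(action.kind, action.name)](action.params))
    return results


if __name__ == "__main__":
    seen = Observation("Password expired", "VPN client")
    demo_handlers = {
        ("workflow", "reset_password"):
            lambda p: f"reset requested for {p['application']}",
    }
    print(execute(plan_actions(seen), demo_handlers))
```

In this sketch the "seeing" and the "doing" are decoupled: the observation carries only what was extracted from the video, and an action with an unsupported kind (for example a hypothetical `"click"`) is rejected before anything runs.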

 

This is awesome! How can I get started?

 

Vision AI Agents are rolling out in preview for early customer feedback. To learn more, sign up here.

 

Safe harbor notice for forward-looking statements

 

This article may contain “forward-looking” statements that are based on our beliefs and assumptions and on information currently available to us only as of the date of publication. We cannot guarantee that we will achieve the plans, intentions, or expectations disclosed in our forward‐looking statements, and you should not place undue reliance on our forward‐looking statements. The information on new products, features, or functionality is intended to outline our general product direction and should not be relied upon in making a purchasing decision, is for informational purposes only, and shall not be incorporated into any contract, and is not a commitment, promise, or legal obligation to deliver any material, code, or functionality. The development, release, and timing of any features or functionality described for our products remains at our sole discretion. We undertake no obligation, and do not intend, to update the forward‑looking statements.

 

Comments
PaulSylo
Tera Sage

Hi @Loic1 - Very interesting. Can we try it in our environment as well?
