Extend a desktop action by manually capturing steps in AI Desktop Actions

Release version: Australia

Updated May 27, 2026


Extend the automation logic in a desktop action by manually capturing steps in AI Desktop Actions.

Before you begin

To access the AI Desktop Actions functionality, perform the following steps:

Enable AI Desktop Actions on your ServiceNow instance. For more information, see Configure AI Desktop Actions.
Download the AI Desktop Actions installer to automate repetitive tasks across applications and systems. For more information, see Download AI Desktop Actions installer.

Confirm that the following system requirements are met:

Windows 11 operating system is used.
.NET 9.0 Runtime v9.0.10 and .NET 9 Desktop Runtime v9.0.10 are installed.
No extended monitors are connected.
The same theme is applied on the systems used for recording and execution.
Remote Desktop must be enabled on your machine and your account must be granted Remote Desktop access permissions before you start using the AI Desktop Actions Execution workspace.
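The .NET requirement above can be checked from the output of the `dotnet --list-runtimes` command, which prints one installed runtime per line. The following is a minimal sketch, not part of AI Desktop Actions: the function name `missing_runtimes` is hypothetical, the sample output is illustrative, and it assumes that the listed versions must match exactly (the requirement might also be satisfied by a later patch version).

```python
import re

# Runtime names as reported by `dotnet --list-runtimes`, with the versions
# taken from the requirements list above.
# Assumption: an exact version match is required.
REQUIRED = {
    "Microsoft.NETCore.App": "9.0.10",         # .NET Runtime
    "Microsoft.WindowsDesktop.App": "9.0.10",  # .NET Desktop Runtime
}

# Illustrative sample of the command output (install paths are examples only).
SAMPLE_OUTPUT = r"""
Microsoft.NETCore.App 9.0.10 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
Microsoft.WindowsDesktop.App 9.0.10 [C:\Program Files\dotnet\shared\Microsoft.WindowsDesktop.App]
"""


def missing_runtimes(list_runtimes_output: str) -> list[str]:
    """Return the required runtimes that do not appear in the output."""
    installed = set()
    for line in list_runtimes_output.splitlines():
        # Each line has the form: "<name> <version> [<install path>]"
        match = re.match(r"^(\S+)\s+(\S+)\s+\[", line.strip())
        if match:
            installed.add((match.group(1), match.group(2)))
    return [
        f"{name} {version}"
        for name, version in REQUIRED.items()
        if (name, version) not in installed
    ]


print(missing_runtimes(SAMPLE_OUTPUT))
```

To check a real machine, pass the text printed by `dotnet --list-runtimes` to the function instead of the sample. An empty list means that both runtimes were found.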

Familiarize yourself with the Design workspace and Action recorder. For more information, see AI Desktop Actions Design workspace and Action recorder in AI Desktop Actions.

Role required: sn_aia.admin

About this task

Note:

To create desktop actions with AI-assisted anchor validation and automatic screen context generation, use Record with AI instead.

You can simulate a user interaction in an automation by manually capturing screens and defining steps.

Using the controls in the window, you can capture an area of an application window. You can then set one or more anchors, and define steps that represent the user interactions in that window.

Anchor: Anchors help specify the target area for the interaction by defining a static area from which steps can be defined at a relative distance.
Step: Steps are the sequential interactions that you define, such as click, select, type, and scroll.

Note:

If your automation requires manual inputs, such as entering an OTP or CAPTCHA, you must provide instructions to the AI Agent to wait for the user input during execution. Otherwise, the automation can't proceed.

Procedure

From your Windows system, launch the AI Desktop Actions application.
On the login page, in the Add ServiceNow URL field, enter the ServiceNow instance URL.
For example, https://<instance name>.service-now.com.
Select Proceed.
Log in to your ServiceNow account by entering your user name and password.
You must have the sn_aia.admin role.
Optional: On the onboarding journey modal, complete the onboarding and select Get started.
If you launch AI Desktop Actions for the first time, the onboarding journey widget appears. You can select Don't show me again to hide the widget the next time you launch AI Desktop Actions, or select Skip intro to skip the onboarding.
On the AI Desktop Actions home page, select an existing desktop action.
The Design workspace is displayed.
Capture screens.
1. In the Design tab, select the Capture Options icon.
2. Select Manual capture.
  The AI Desktop Actions window is minimized and the Capture panel is launched.
3. Open the applications that you want to automate steps for.
4. Capture the area of the external application’s window by selecting the Select icon on the Capture panel or by pressing Ctrl + Shift + C.
  Note:
  If the Ctrl + Shift + C shortcut conflicts with another application on your machine, such as Zoom, use the Select icon to start the manual screen capture.
  For example, you can capture the area surrounding a button or a text field in a web browser. The cursor changes to a selection icon.
5. Drag the icon to select the required screen area.
  When you release the icon, the selected area is captured as a screenshot in the Design workspace.
  If you aren't satisfied with the captured screen, you can recapture the screen area by selecting the Capture image icon.
Insert anchors.
1. Insert an anchor on the captured screen by selecting the Add anchor icon.
  Note:
  If two applications in the frame have similar logos or visual elements, verify that the anchor position is unique to the target application to avoid incorrect element identification during automation.
  An anchor is a reference point on the screen that helps the automation identify and interact with nearby UI elements. During execution, the system uses computer vision to locate the anchor and then identifies the UI elements at a relative distance from the anchor. Anchors improve the stability and accuracy of steps when the target element’s location may shift or when the UI layout varies across sessions.
  Note:
  Don't use dynamically changing UI elements as anchors. If an element changes its color, text, or state after an action (for example, after a click), select a different anchor that remains static on the screen.
2. Move the anchor to a part of the captured image that won’t change.
  For example, move the anchor to a title or field label.
  If the area under the anchor doesn’t exactly match the corresponding area of the captured image, the anchor isn't recognized, and the steps aren't performed as intended. Choose a static area of the image for setting your anchor.
  You can add multiple anchors on each screen. Multiple anchors let you define the spatial relationship between an anchor and its target with greater accuracy when targeting different locations in the image.
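The relationship between an anchor and a step target can be illustrated with a small sketch. This is a simplified model, not the product's implementation: AI Desktop Actions locates anchors with computer vision, whereas this example uses an exact character-grid match as a stand-in. The names `Point`, `find_anchor`, and `resolve_target` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Point:
    x: int
    y: int


def find_anchor(screen: list[str], anchor: list[str]) -> Optional[Point]:
    """Return the top-left position of the first exact match, or None."""
    rows, cols = len(screen), len(screen[0])
    a_rows, a_cols = len(anchor), len(anchor[0])
    for y in range(rows - a_rows + 1):
        for x in range(cols - a_cols + 1):
            if all(screen[y + dy][x:x + a_cols] == anchor[dy]
                   for dy in range(a_rows)):
                return Point(x, y)
    return None


def resolve_target(screen: list[str], anchor: list[str],
                   offset: Point) -> Optional[Point]:
    """Locate the anchor, then apply the recorded offset to find the target."""
    location = find_anchor(screen, anchor)
    if location is None:
        return None  # Anchor not recognized, so the step can't be placed.
    return Point(location.x + offset.x, location.y + offset.y)


# The anchor "AB/CD" stands for a static label; "X" stands for the target.
anchor = ["AB", "CD"]
offset = Point(4, 1)  # Target position relative to the anchor at capture time.

# At execution time the layout has shifted, but the offset still holds.
shifted = [
    "........",
    "........",
    "...AB...",
    "...CD..X",
]
# The same layout after the anchor area changed (for example, after a click).
changed = [
    "........",
    "........",
    "...Ab...",
    "...CD..X",
]

print(resolve_target(shifted, anchor, offset))
print(resolve_target(changed, anchor, offset))
```

In the shifted layout, the target is still found because it is resolved relative to the anchor. In the changed layout, the anchor no longer matches and no target is returned, which is why a static area makes a better anchor.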

Configure the steps.

From the Anchor control menu, select the Add step icon.

From the contextual menu, select the type of step to perform.

Screen capture of an app with an anchor added, displaying various types of input and output steps.

Table 1. Description of the actions
Goal	Step	Type	Example
Enter text in a field	Set Text	Input	Enter text data, such as a user name, an address, or a survey response, wherever text entry is accepted. Note: If you set a static value for this field, the automation uses it during execution and doesn’t prompt you for input from the Now Assist panel.
Simulate a mouse click	Click	Input	Click a button, open a menu, or perform any step typically performed by a mouse click.
Simulate an alternative mouse action (for example, right-click, drag, scroll, or paste)	Mouse Click	Input	Perform various mouse device actions, such as right-click and select an object or scroll on a web page.
Simulate a key press or a key function	Send Keys	Input	Perform keyboard shortcuts, such as copying text by pressing `Ctrl + C` on fields and elements. Note: If you set a static value for this field, the automation uses it during execution and doesn’t prompt you for input from the Now Assist panel.
Capture text from a window or web page	Get Text	Output	Receive text from the source area.
Capture a table	Get Table	Output	Receive a table from the source area when the text is in table format. Note: For the step to capture table data successfully, the data must already be in table form. The step can’t convert ordinary text to table data.
Read text from an image	OCR Read Text	Output	Recognize text from images and return it in the standard text format.

You can add multiple steps to represent the interactions in your automation.
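The step types in Table 1 can be modeled as a small lookup that separates input steps from output steps. The following is an illustrative sketch only. The names `Direction`, `STEP_TYPES`, `Step`, and `output_steps` are hypothetical and don't represent an AI Desktop Actions API; the step names and their Input or Output classification are taken from the table.

```python
from dataclasses import dataclass
from enum import Enum


class Direction(Enum):
    INPUT = "Input"
    OUTPUT = "Output"


# Step types and their classification, as listed in Table 1.
STEP_TYPES = {
    "Set Text": Direction.INPUT,
    "Click": Direction.INPUT,
    "Mouse Click": Direction.INPUT,
    "Send Keys": Direction.INPUT,
    "Get Text": Direction.OUTPUT,
    "Get Table": Direction.OUTPUT,
    "OCR Read Text": Direction.OUTPUT,
}


@dataclass(frozen=True)
class Step:
    name: str       # Name of the step (hypothetical examples below).
    step_type: str  # One of the keys in STEP_TYPES.

    def __post_init__(self) -> None:
        if self.step_type not in STEP_TYPES:
            raise ValueError(f"Unknown step type: {self.step_type}")


def output_steps(steps: list[Step]) -> list[str]:
    """Return the names of the steps that produce output, in order."""
    return [s.name for s in steps
            if STEP_TYPES[s.step_type] is Direction.OUTPUT]


# A hypothetical sequence: enter a value, submit it, and read the result.
sequence = [
    Step("EnterOrderId", "Set Text"),
    Step("SelectSearch", "Click"),
    Step("ReadStatus", "Get Text"),
]

print(output_steps(sequence))
```

Keeping the classification in one place makes it easy to see which steps in a sequence supply data to the application and which steps return data from it.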

Configure the properties for added screens, anchors, and steps in the Properties panel.
For more information, see Screen, anchor, and step properties in AI Desktop Actions.
Optional: Modify the auto-generated names for all added screens, anchors, and steps.
Follow these naming guidelines:
- Name fields must not be empty.
- Name fields must contain only alphanumeric characters. Spaces and special characters are not permitted.
- Each name must be unique at its respective parent level.
  - Each screen must have a unique name at the desktop-action level.
  - Each anchor must have a unique name at the screen level.
  - Each step must have a unique name at the anchor level.
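The naming guidelines above can be expressed as a short check. This is a minimal sketch under stated assumptions, not the validation that AI Desktop Actions performs: the function name `name_problems` is hypothetical, "alphanumeric" is assumed to mean the ASCII letters A to Z, a to z, and the digits 0 to 9, and uniqueness is assumed to be case-sensitive.

```python
import re

# Assumption: "alphanumeric" means ASCII letters and digits only.
_ALPHANUMERIC = re.compile(r"[A-Za-z0-9]+")


def name_problems(name: str, sibling_names: set[str]) -> list[str]:
    """Return the guideline violations for a name at one parent level.

    sibling_names holds the names already used under the same parent:
    screens in the desktop action, anchors on the screen, or steps on
    the anchor.
    """
    problems = []
    if not name:
        problems.append("name must not be empty")
    elif not _ALPHANUMERIC.fullmatch(name):
        problems.append("name must contain only alphanumeric characters")
    # Assumption: names are compared case-sensitively.
    if name and name in sibling_names:
        problems.append("name must be unique at its parent level")
    return problems


existing_screens = {"HomeScreen"}
print(name_problems("LoginScreen", existing_screens))
print(name_problems("Login Screen", existing_screens))
print(name_problems("HomeScreen", existing_screens))
```

An empty list means that the name satisfies all three guidelines. Because uniqueness applies per parent level, the same anchor name can be reused on a different screen.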

What to do next

Configure the details of your desktop action. For more information, see Add details to desktop actions in AI Desktop Actions.
Test and activate the desktop action so that it can be added as a tool to AI agents. For more information, see Test and activate a desktop action in AI Desktop Actions.
Add the desktop action as a tool to AI agents in AI Agent Studio. For more information, see Add a defined desktop action tool to an AI agent for desktop and web-based task.