Manually capture steps

Automate repetitive tasks by manually capturing steps in Agentic Desktop

Release version: Australia

Updated: March 12, 2026

Time required: 12 minutes

Create desktop actions by manually capturing steps to automate repetitive tasks in Agentic Desktop. The steps that you define on one or more application screens are saved as a reusable desktop action of type UI block.

Before you begin

To access the Agentic Desktop functionality, perform the following steps:

Enable Agentic Desktop on your ServiceNow instance. For more information, see Configure Agentic Desktop.
Download the Agentic Desktop installer to automate repetitive tasks across applications and systems. For more information, see Download Agentic Desktop installer.

Confirm that the following system requirements are met:

Windows 11 operating system is used.
.NET 9.0 Runtime v9.0.10 and .NET 9 Desktop Runtime v9.0.10 are installed.
No extended monitors are connected.
The theme matches between the systems used for recording and execution.
Remote Desktop must be enabled on your machine and your account must be granted Remote Desktop access permissions before you start using the Agentic Desktop Execution workspace.
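
The .NET requirement in the list above can be checked from a script. The following is a minimal sketch and is not part of Agentic Desktop: it parses the text printed by the `dotnet --list-runtimes` command and reports which of the two runtimes is missing. The function names, the mapping of the documented runtimes to `Microsoft.NETCore.App` and `Microsoft.WindowsDesktop.App`, and the exact-version comparison are assumptions made for illustration.

```python
import shutil
import subprocess

# Runtime names as printed by `dotnet --list-runtimes`.
# Mapping the documented ".NET 9.0 Runtime" and ".NET 9 Desktop Runtime"
# to these two names, and requiring exactly 9.0.10, are assumptions.
REQUIRED_RUNTIMES = {
    "Microsoft.NETCore.App": "9.0.10",
    "Microsoft.WindowsDesktop.App": "9.0.10",
}


def parse_runtimes(listing: str) -> dict[str, set[str]]:
    """Parse `dotnet --list-runtimes` output into {runtime name: {versions}}."""
    found: dict[str, set[str]] = {}
    for line in listing.splitlines():
        parts = line.split()
        # Expected line shape: "<name> <version> [<install path>]"
        if len(parts) >= 2:
            found.setdefault(parts[0], set()).add(parts[1])
    return found


def missing_runtimes(listing: str) -> list[str]:
    """Return the required runtimes that do not appear in the listing."""
    found = parse_runtimes(listing)
    return [
        f"{name} {version}"
        for name, version in REQUIRED_RUNTIMES.items()
        if version not in found.get(name, set())
    ]


if __name__ == "__main__":
    if shutil.which("dotnet") is None:
        print("dotnet was not found on PATH")
    else:
        result = subprocess.run(
            ["dotnet", "--list-runtimes"],
            capture_output=True, text=True, check=False,
        )
        missing = missing_runtimes(result.stdout)
        print("All required runtimes found" if not missing else f"Missing: {missing}")
```

The script checks only the .NET entries. The operating system, monitor, theme, and Remote Desktop requirements still need to be confirmed manually.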

Familiarize yourself with the Design workspace and Action recorder. For more information, see Agentic Desktop Design workspace and .

Role required: sn_aia.admin

About this task

You can simulate a user interaction in an automation by manually capturing screens and defining steps.

Using the controls in the window, you can capture an area of an application window. You can then set one or more anchors, and define steps that represent the user interactions in that window.

Anchor: Anchors help specify the target area for the interaction by defining a static area from which steps can be defined at a relative distance.
Step: Steps are your sequential interactions of type click, select, type, scroll, and more.

Note:

If your automation requires manual inputs, such as entering an OTP or CAPTCHA, you must provide instructions to the AI Agent to wait for the user input during execution. Otherwise, the automation can't proceed.

Procedure

From your Windows system, launch the Agentic Desktop application.
On the login page, in the Add ServiceNow URL field, enter the ServiceNow instance URL.
For example, https://<instance name>.service-now.com.
Select Proceed.
Log in to your ServiceNow account by entering your user name and password.
Your account must have the sn_aia.admin role.
On the Agentic Desktop home page, select Create desktop action.
In the New desktop action dialog box, select Manual capture steps.
Enter a name and description for the desktop action.
Select Start capturing.
The Design workspace is displayed.
Capture screens.
1. In the Design tab, select the Capture Options icon.
2. Select Manual capture screens.
  The Agentic Desktop window is minimized and the Capture panel is launched.
3. Open the applications that you want to automate steps for.
4. Capture the area of the external application's window by selecting the Select icon on the Capture panel or by pressing Ctrl + Shift + C.
  Note:
  If the Ctrl + Shift + C shortcut conflicts with another application on your machine, such as Zoom, use the Select icon to start manual screen capture.
  For example, you can capture the area surrounding a button or a text field in a web browser. The cursor changes to indicate capture mode.
5. Drag the cursor to select the required screen area.
  When you release the mouse button, the selected area is captured as a screenshot in the Design workspace.
  If you are not satisfied with the captured screen, you can recapture the screen area by selecting the Capture image icon.
Insert anchors.
1. Insert an anchor on the captured screen by selecting the Add anchor icon.
  An anchor is a reference point on the screen that helps the automation identify and interact with nearby UI elements. During execution, the system uses computer vision to locate the anchor and then identifies the UI elements at a relative distance from the anchor. Anchors improve the stability and accuracy of steps when the target element's location may shift or when the UI layout varies across sessions.
  Note:
  Do not use dynamically changing UI elements as anchors. If an element changes its color, text, or state after an action (for example, after a click), select a different anchor that remains static on the screen.
2. Move the anchor to a part of the captured image that won’t change.
  For example, move the anchor to a title or field label.
  If the area under the anchor doesn’t exactly match the corresponding area of the captured image, the anchor isn't recognized, and the steps are not performed as intended. Choose a static area of the image for setting your anchor.
  You can add multiple anchors on each screen. Multiple anchors let you define the spatial relationship between anchor and target with greater accuracy when targeting different locations in the image.
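
The relationship between an anchor and a step target can be pictured as a fixed offset. The sketch below is illustrative only and is not how Agentic Desktop is implemented: it assumes the anchor has already been located on screen (the computer-vision matching itself is not shown), and the `Point`, `Step`, `offset_from_anchor`, and `resolve_target` names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:
    x: int
    y: int


@dataclass(frozen=True)
class Step:
    name: str
    # Offset of the step's target from the anchor, measured at capture time.
    dx: int
    dy: int


def offset_from_anchor(anchor: Point, target: Point, name: str) -> Step:
    """Record a step as a distance relative to the anchor position."""
    return Step(name, target.x - anchor.x, target.y - anchor.y)


def resolve_target(anchor_found_at: Point, step: Step) -> Point:
    """Apply the recorded offset to wherever the anchor is found at run time."""
    return Point(anchor_found_at.x + step.dx, anchor_found_at.y + step.dy)


# At capture time: a field label (the anchor) at (100, 50), its input at (260, 58).
step = offset_from_anchor(Point(100, 50), Point(260, 58), "EnterUserName")

# At execution time the window has moved, so the anchor is found at (140, 90).
print(resolve_target(Point(140, 90), step))  # Point(x=300, y=98)
```

Because the target is derived from the anchor, the step still lands on the input field after the window moves, which is why the anchor itself must sit on a static part of the image.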

Configure the steps.

From the Anchor control menu, select the Add step icon.

From the contextual menu, select the type of step to perform.

Screen capture of an app with an anchor added, displaying various types of input and output steps.

Table 1. Description of the actions
Goal	Step	Type	Example
Enter text in a field	Set Text	Input	Enter text data such as a user name, an address, or a survey response, wherever text entry is accepted. Note: If you set a static value for this field, the automation uses it during execution and doesn't prompt you for input from the Now Assist panel.
Simulate a mouse click	Click	Input	Click a button, open a menu, or perform any step typically performed by a mouse click.
Simulate an alternative mouse action (for example, right-click, drag, scroll, or paste)	Mouse Click	Input	Perform other mouse actions, such as right-clicking to select an object or scrolling on a web page.
Simulate a key press or a key function	Send Keys	Input	Perform keyboard shortcuts, such as copying text by entering `Ctrl + C` on fields and elements. Note: If you set a static value for this field, the automation uses it during execution and doesn't prompt you for input from the Now Assist panel.
Capture text from a window or web page	Get Text	Output	Receive text from the source area.
Capture a table	Get Table	Output	Receive a table from the source area when the text is in table format. Note: For the step to capture table data successfully, the data must already be in table form. The step can't convert ordinary text to table data.
Read text from an image	OCR Read Text	Output	Recognize text in images and return it in standard text format.
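
When planning an automation, it can help to see which steps supply data to the application and which return data from it. The following sketch simply restates the Step and Type columns of the table above as a lookup; the `StepKind`, `STEP_TYPES`, and `output_steps` names are hypothetical and do not correspond to any Agentic Desktop API.

```python
from enum import Enum


class StepKind(Enum):
    INPUT = "Input"
    OUTPUT = "Output"


# Step names and types copied from the table above.
STEP_TYPES = {
    "Set Text": StepKind.INPUT,
    "Click": StepKind.INPUT,
    "Mouse Click": StepKind.INPUT,
    "Send Keys": StepKind.INPUT,
    "Get Text": StepKind.OUTPUT,
    "Get Table": StepKind.OUTPUT,
    "OCR Read Text": StepKind.OUTPUT,
}


def output_steps(sequence: list[str]) -> list[str]:
    """Return the steps in a planned sequence that produce data, in order."""
    unknown = [s for s in sequence if s not in STEP_TYPES]
    if unknown:
        raise ValueError(f"Unknown step types: {unknown}")
    return [s for s in sequence if STEP_TYPES[s] is StepKind.OUTPUT]


# A planned sequence: fill a search field, submit, then read the result table.
print(output_steps(["Set Text", "Click", "Get Table"]))  # ['Get Table']
```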

You can add multiple steps to represent the interactions in your automation.

Configure the properties for added screens, anchors, and steps.
For more information, see Screen, anchor, and step properties in Agentic Desktop.
Provide names for all added screens, anchors, and steps.
When you create these elements, you can edit the auto-generated name, provided that you follow these naming guidelines:
- Name fields must not be empty.
- Name fields must contain only alphanumeric characters. Spaces and special characters are not permitted.
- Each name must be unique at its respective parent level.
  - Each screen must have a unique name at the desktop-action level.
  - Each anchor must have a unique name at the screen level.
  - Each step must have a unique name at the anchor level.
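
The naming guidelines above can be expressed as a small check that runs over the names sharing one parent (for example, all anchors on one screen). This is a minimal sketch, not the validation that Agentic Desktop performs: the `validate_names` function is hypothetical, and treating "alphanumeric" as ASCII letters and digits and uniqueness as case-sensitive are assumptions.

```python
import re

# Assumption: "alphanumeric" means ASCII letters and digits only.
NAME_PATTERN = re.compile(r"[A-Za-z0-9]+")


def validate_names(names: list[str]) -> list[str]:
    """Check the names under one parent and return a list of problems."""
    problems = []
    seen = set()
    for name in names:
        if name == "":
            problems.append("empty name")
        elif NAME_PATTERN.fullmatch(name) is None:
            problems.append(f"{name!r}: only alphanumeric characters are allowed")
        elif name in seen:
            # Assumption: uniqueness is compared case-sensitively.
            problems.append(f"{name!r}: duplicate at this level")
        seen.add(name)
    return problems


# Screen names within one desktop action.
for problem in validate_names(["LoginScreen", "Login Screen", "", "LoginScreen"]):
    print(problem)
```

Call the function once per parent level: once for the screens in a desktop action, once for the anchors on each screen, and once for the steps under each anchor.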

What to do next

Configure the details of your desktop action. For more information, see Add details to desktop actions in Agentic Desktop.
Test and activate the desktop action so that it can be added as a tool to AI agents. For more information, see Test and activate a desktop action in Agentic Desktop.
Add the desktop action as a tool to AI agents in AI Agent Studio. For more information, see Add a desktop action to an AI agent.