Queued Application Operations
Summary of Queued Application Operations
Starting in the Tokyo release, ServiceNow CICD APIs that require an update instance-wide lock (mutex) to perform application operations are queued instead of being rejected when the lock is occupied. This queuing mechanism ensures smoother handling of concurrent operations like app installs, plugin activations, rollbacks, and apply changes, improving reliability and reducing operation failures due to lock conflicts.
How Queued Operations Work
- CICD services create application operation messages using NowMQ (Now Message Queue) with a common subject "sys.applifecycle.operation".
- Each message contains a JSON object with details such as the Execution Tracker sys_id (progress ID), operation type (e.g., app_install, plugin_activation), and relevant parameters like plugin or app ID.
- An Execution Tracker record is created for each CICD request, initially marked as Pending, showing operation details and queue position.
- Queued messages are processed sequentially or in parallel (if enabled) by scheduled jobs polling the queue.
- Operations pending in the queue do not appear in recent history, but completed operations are tracked for 24 hours.
Managing the Application Operation Queue
- Manual installs from the UI are not queued but may be delayed if many CICD requests hold the update lock.
- Administrators can pause, resume, or cancel queued operations through the Application Operation Queue UI (System Diagnostics → Application Operation Queue).
- Pausing the queue releases the update mutex after current jobs finish, allowing manual operations to proceed.
- Cancelling a queued operation is possible only if the Execution Tracker state is Pending, not Running.
Upgrade Window Behavior
- Two hours before a scheduled upgrade (customizable), the queue processing pauses automatically ("Upgrade Paused" state), though new CICD requests continue to queue.
- After upgrade completion, queued messages resume processing automatically.
Impact and Supported CICD APIs
- The queuing mechanism does not change existing CICD API request/response contracts.
- It prevents errors due to update lock conflicts seen in pre-Tokyo releases by queuing requests for sequential or parallel processing.
- Supported CICD APIs include various versions of app installs, rollbacks, apply changes, batch installs, imports, and plugin activations/rollbacks.
Parallelization of Operations
- Starting with Tokyo, app installs and plugin activations can run in parallel by default to improve efficiency.
- The update mutex is held during queue processing to prevent UI operations that require the same lock.
- Parallel processing can be disabled via a system property to revert to sequential execution.
- The maximum number of parallel jobs defaults to 2 and can be adjusted, but increasing parallelism may impact instance performance and memory usage.
Resource Locking and Limits on Parallelization
- Operations that share the same scope, cause schema changes, or contain fix scripts cannot run in parallel.
- The queue processor evaluates each job’s eligibility for parallel execution based on resource locks stored in the sys_padlock table.
- Jobs that cannot obtain required locks are deferred with a cool-down period and remain visible in the queue.
- The cool-down duration and maximum failure attempts for resource acquisition can be configured via system properties.
- Failures such as inability to download application packages are logged and can cause jobs to be removed from the queue after repeated errors.
CICD API requests that must obtain the instance-wide update lock (mutex) to perform the requested operation are queued instead of being rejected when the lock is held by another operation.
Beginning in Tokyo, CICD API requests that must obtain the instance-wide update lock (mutex) are queued instead of being rejected when the lock is held by another operation. When a CICD request is received, the corresponding CICD service constructs an application operation NowMQ (Now Message Queue) message and inserts it into the queue using NowMQ APIs. A scheduled job then polls the queued messages and handles them one by one, or in parallel if parallel processing is enabled and the operation satisfies the necessary criteria.
Application Operation NowMQ Message
The application operation NowMQ messages share a common subject, “sys.applifecycle.operation”. The message body contains a JSON object that includes the sys_id of the Execution Tracker (also referred to as the progress id returned in the CICD API response) and the operation type, which can be one of the following: app_install, plugin_activation, batch_install, rollback, import_app, and apply_changes. It also contains operation-specific information, such as the plugin id for a plugin activation or the app id or scope for an application install.
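As an illustration, the message body described above might be assembled as follows. This is a minimal sketch: the key names (`tracker_sys_id`, `operation_type`, `plugin_id`) and the sample values are hypothetical, since the text specifies what information the body carries but not its exact keys.

```python
import json

# Subject shared by all application operation messages (from the text above).
SUBJECT = "sys.applifecycle.operation"

# Operation types listed in the text above.
OPERATION_TYPES = {
    "app_install", "plugin_activation", "batch_install",
    "rollback", "import_app", "apply_changes",
}

def build_message_body(tracker_sys_id, operation_type, **params):
    """Return a JSON string for a hypothetical application operation message."""
    if operation_type not in OPERATION_TYPES:
        raise ValueError(f"unknown operation type: {operation_type}")
    # Key names are assumptions made for illustration only.
    body = {"tracker_sys_id": tracker_sys_id, "operation_type": operation_type}
    body.update(params)  # e.g. plugin id, app id, or scope
    return json.dumps(body, sort_keys=True)

# Hypothetical plugin activation request.
body = build_message_body(
    "0123456789abcdef0123456789abcdef",
    "plugin_activation",
    plugin_id="com.example.plugin",
)
```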
Execution Tracker for Application Operation
When the Application Operation NowMQ message is constructed and inserted, the Execution Tracker record for the corresponding CICD request is created and its sys_id is added into the body of the NowMQ message. The Execution Tracker is in Pending state initially. The Execution Tracker’s “Details” column contains the information about the type of operation, and the important input parameters for the CICD request. Its “Message” column contains the information about the queue position. When the queue is paused, the message is prefixed with “[App Operation Queue is paused]”.
Sample Application Operation Execution Trackers when the queue is running.
Sample Application Operation Execution Trackers when the queue is paused.
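The “Message” column behavior described in this section can be sketched as below. The prefix string comes from the text; the wording of the queue-position text and the function name `tracker_message` are assumptions made for illustration.

```python
# Prefix applied while the queue is paused (from the text above).
PAUSED_PREFIX = "[App Operation Queue is paused]"

def tracker_message(queue_position, queue_paused):
    """Build a hypothetical "Message" value for a pending Execution Tracker."""
    # The exact wording of the position text is an assumption.
    text = f"Queue position: {queue_position}"
    return f"{PAUSED_PREFIX} {text}" if queue_paused else text

running_message = tracker_message(3, queue_paused=False)
paused_message = tracker_message(3, queue_paused=True)
```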
Manage Application Operation Queue
Manually installing products from All Applications is queued, but manually installing an individual application from the UI (for example, from All Applications or Application Manager) isn’t queued.
Sometimes the instance receives, queues, and handles so many CICD requests that a manual install from the UI is starved of the instance-wide update lock (mutex). When this happens, the admin can temporarily pause the Application Operation Queue.
The admin can manage the Application Operation Queue through the System Diagnostics->Application Operation Queue UI page. On the “Operation Queue Status” panel, the admin can pause or resume the queue. The admin can also cancel a pending execution tracker; the App Operation Queue Health Monitor job then eventually removes the corresponding queued message from NowMQ.
Application Operation Queue UI page
Sample Application Operation Queue UI page. The admin can click the button in “Operation Queue Status” to pause or resume the queue.
Clicking an item in the “Application Operations Execution Trackers” list opens the Execution Tracker form. If the queued message is still pending in the queue, updating the execution tracker state to “Cancelled” and saving the change cancels the corresponding queued CICD request. Note: if the state of the execution tracker is “Running”, the CICD request can’t be cancelled.
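The cancellation rule above amounts to a simple state check, sketched here. The dictionary representation of a tracker and the function names are hypothetical; only the state names come from the text.

```python
def can_cancel(tracker_state):
    """A queued request can be cancelled only while its tracker is Pending."""
    return tracker_state == "Pending"

def cancel(tracker):
    """Return a copy of the tracker marked Cancelled, or raise if not allowed."""
    if not can_cancel(tracker["state"]):
        raise ValueError(f"cannot cancel a tracker in state {tracker['state']!r}")
    return {**tracker, "state": "Cancelled"}

# Hypothetical tracker record for illustration.
pending = {"sys_id": "abc123", "state": "Pending"}
cancelled = cancel(pending)
```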
Application Operation Queue and Upgrade Window
By default, the Application Operation Queue stops processing queued messages 2 hours before the scheduled upgrade. This window can be customized through the system property “com.glide.update_operation.queue_upgrade_window”.
The Application Operation Queue status changes to “Upgrade Paused”. During this upgrade window, new CICD requests continue to be queued.
When the upgrade completes, the Application Operation Queue resumes processing the queued messages automatically.
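The upgrade-window behavior can be modeled as a time comparison. The sketch below assumes the window is expressed as a duration before the scheduled upgrade; the function name `queue_status` and the “Running” status label are illustrative, and the unit in which the system property stores the window is not stated in the text.

```python
from datetime import datetime, timedelta

# Default window from the text: 2 hours before the scheduled upgrade.
DEFAULT_UPGRADE_WINDOW = timedelta(hours=2)

def queue_status(now, scheduled_upgrade, upgrade_complete=False,
                 window=DEFAULT_UPGRADE_WINDOW):
    """Return the queue status implied by the upgrade window."""
    if not upgrade_complete and now >= scheduled_upgrade - window:
        return "Upgrade Paused"
    # "Running" is an assumed label for normal processing.
    return "Running"

upgrade_at = datetime(2024, 1, 1, 12, 0)
before_window = queue_status(datetime(2024, 1, 1, 9, 59), upgrade_at)
inside_window = queue_status(datetime(2024, 1, 1, 10, 0), upgrade_at)
after_upgrade = queue_status(datetime(2024, 1, 1, 13, 0), upgrade_at,
                             upgrade_complete=True)
```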
Impact to CICD Pipeline
The existing request/response contracts for the CICD APIs aren’t changed. The operation failures caused by instance-wide update lock (mutex) conflicts that were observed in pre-Tokyo releases no longer occur. Requests are queued and served one by one or in parallel, depending on the job type.
CICD APIs supporting queueing
- api/sn_cicd/app_repo/install
- api/sn_cicd/v1/app_repo/install
- api/sn_cicd/v2/app_repo/install
- api/sn_cicd/app_repo/rollback
- api/sn_cicd/v1/app_repo/rollback
- api/sn_cicd/v2/app_repo/rollback
- api/sn_cicd/sc/apply_changes
- api/sn_cicd/v1/sc/apply_changes
- api/sn_cicd/v2/sc/apply_changes
- api/sn_cicd/app/batch/install
- api/sn_cicd/sc/import
- api/sn_cicd/plugin/{plugin_id}/activate
- api/sn_cicd/plugin/{plugin_id}/rollback
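Because a request may now sit in Pending state for some time before it runs, a pipeline that calls these APIs would typically poll the progress id until the operation reaches a final state. The sketch below shows one way to do that. `fetch_progress` is a hypothetical callable that wraps whatever progress lookup your pipeline uses, and the status codes ("0" Pending, "1" Running, "2" Successful, "3" Failed, "4" Canceled) are an assumption to verify against the CICD progress API documentation for your release.

```python
import time

# Assumed terminal status codes; verify against your instance's documentation.
TERMINAL = {"2": "Successful", "3": "Failed", "4": "Canceled"}

def wait_for_operation(fetch_progress, progress_id, poll_seconds=5.0,
                       max_polls=120, sleep=time.sleep):
    """Poll until the operation leaves the Pending/Running states."""
    for _ in range(max_polls):
        status = fetch_progress(progress_id)["status"]
        if status in TERMINAL:
            return TERMINAL[status]
        sleep(poll_seconds)  # still queued or running; wait and retry
    raise TimeoutError(f"operation {progress_id} did not finish in time")

# Usage with canned responses standing in for real API calls.
responses = iter([{"status": "0"}, {"status": "0"},
                  {"status": "1"}, {"status": "2"}])
result = wait_for_operation(lambda pid: next(responses), "abc123",
                            sleep=lambda seconds: None)
```

Injecting `fetch_progress` and `sleep` keeps the loop testable without network access; a generous `max_polls` matters more than before, since time spent waiting in the queue counts toward the total.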
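Parallelize Application Installs and Plugin Activations
Application installs and plugin activations requested through the following CICD APIs can be processed in parallel: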
- api/sn_cicd/app_repo/install
- api/sn_cicd/v1/app_repo/install
- api/sn_cicd/v2/app_repo/install
- api/sn_cicd/plugin/{plugin_id}/activate
All queue processing takes an instance-wide lock (mutex) and holds it until any queued operation is completed. This lock is called UpdateMutex, and its status can be viewed in the sys_mutex table. During this time, operations that take the same lock (app installs, plugin activations, source control operations) can’t be performed through the UI. The queue can still be paused from the Application Operation Queue page, which releases the lock after the currently running jobs have completed.
Parallelization is enabled by default. It can be turned off with the property com.glide.update_operation.parallel_operation_enabled, in which case all operations run sequentially from the queue, as in previous releases.
Limits on Parallelization
The queue processor determines if an enqueued job can run. If a job can run, it is scheduled to be picked up by an instance node at first availability. If not, it is returned to the queue, and the processor evaluates the next job in the queue for processing.
There is a limit on the maximum number of jobs that can be executed in parallel, which defaults to 2. This limit can be changed with the property glide.update.app_operation_queue.parallel.max, but keep in mind that there is an upper limit on the threads available to perform installs, and that increased parallelization takes additional memory, which can slow the instance for active users.
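The scheduling behavior described in this section can be sketched as a single pass over the queue. This is an illustrative model, not the platform's implementation: `schedule_pass` and `can_run` are hypothetical names, and leaving a deferred job in its original queue position is an assumption, since the text says only that the job is returned to the queue.

```python
# Default maximum number of parallel jobs (from the text above).
MAX_PARALLEL = 2

def schedule_pass(queue, running, can_run, max_parallel=MAX_PARALLEL):
    """Schedule runnable jobs from the queue, up to the parallel limit.

    Jobs that cannot run stay in the queue and the next job is evaluated.
    """
    scheduled = []
    for job in list(queue):  # iterate over a copy so we can remove items
        if len(running) + len(scheduled) >= max_parallel:
            break
        if can_run(job, running + scheduled):
            queue.remove(job)
            scheduled.append(job)
    return scheduled

# Usage: job "a" is not runnable, so "b" and "c" are scheduled instead.
queue = ["a", "b", "c"]
scheduled = schedule_pass(queue, running=[], can_run=lambda job, active: job != "a")
```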
Obtaining Resource Locks
The following operations can’t run in parallel:
- Any two operations that share the same scope including customizations.
- Any two operations that cause schema changes.
- Any two operations containing fix scripts.
Queue processing determines whether these criteria are met for enqueued operations and defers jobs that can’t be processed. The Progress ID of the job is updated to reflect whether the operation is waiting for the appropriate resource locks. An executing operation maintains a list of its resources (scopes, schema, fix scripts present) in the sys_padlock table, and the insertion and release of these locks can be observed in this table by Progress ID. If a queued job is deferred because it can’t obtain the necessary resource locks, it is put on a cool-down to allow other messages to be processed. The cool-down period can be modified with the property com.glide.update_operation.job_cancel_timeout_minutes. The job remains in the queue and is visible on the Application Operation Queue page.
The maximum number of failure attempts for a job can be configured with the com.glide.update_operation.max_failure_count property.
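The three resource-lock criteria in this section can be expressed as a pairwise conflict check. The sketch below is a simplified model under stated assumptions: the `Operation` fields and function names are hypothetical and do not reflect the actual sys_padlock schema.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Operation:
    """Hypothetical view of the resources an operation needs to lock."""
    progress_id: str
    scope: str
    schema_changes: bool = False
    fix_scripts: bool = False

def conflicts(a, b):
    """Two operations conflict if they share a scope, both change schema,
    or both contain fix scripts."""
    return (a.scope == b.scope
            or (a.schema_changes and b.schema_changes)
            or (a.fix_scripts and b.fix_scripts))

def can_run_in_parallel(candidate, running):
    """A candidate may start only if it conflicts with no running operation."""
    return not any(conflicts(candidate, op) for op in running)

# Usage: one running install that changes schema in scope x_app_one.
running = [Operation("p1", "x_app_one", schema_changes=True)]
independent = Operation("p2", "x_app_two", fix_scripts=True)
same_scope = Operation("p3", "x_app_one")
also_schema = Operation("p4", "x_app_three", schema_changes=True)
```

In this model, `independent` could start immediately, while `same_scope` and `also_schema` would be deferred and placed on cool-down until the running operation releases its locks.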