Alerts in Instance Observer

Release version: Yokohama

Updated July 31, 2025

Summary of Alerts in Instance Observer

ServiceNow Instance Observer offers a robust alert system that continuously monitors the health, performance, and user experience of your platform. These alerts are organized into categories to simplify monitoring and enable prompt, targeted responses to potential issues.

Key Features

Transactions: Detects anomalies such as drops or surges in transaction volume, degraded response times at system and node levels, and database-related latency issues like slow queries that affect responsiveness.
Node Health: Monitors critical node infrastructure metrics including CPU usage, memory consumption, and garbage collection delays to prevent performance bottlenecks or failures.
Database Performance and Health: Tracks CPU load on primary and shard database hosts, replication lag, row lock frequencies, and abnormal growth patterns in databases and tables to ensure data reliability and query efficiency.
Email Processing: Alerts on delays or failures in both outbound and inbound email handling, supporting timely communication workflows.
Scheduler and Job Execution: Identifies issues such as scheduler blocks, long-running jobs, and unusual thread activity to maintain smooth job lifecycle execution.
Session and User Activity: Provides insights into user login patterns at both instance and node levels to monitor user engagement and detect potential anomalies.
Event Queue and Semaphore Management: Facilitates debugging by monitoring semaphore wait times, queue depths, and backlog in critical event queues, including mission-critical and external communication channels.
Asynchronous Messaging Bus (AMB): Observes real-time internal messaging behavior by tracking outgoing message queue sizes and utilization rates.
Historical or List Data Volume: Flags excessive record counts in history or list tables that may impact system performance.
Application Host Health: Alerts on CPU overload at the application layer to help maintain application stability.
AI/ML or Intelligent Alerting: Utilizes AI-driven analysis to detect anomalies and patterns, providing proactive performance insights.

Key Outcomes

By leveraging Instance Observer alerts, ServiceNow customers can proactively identify and address performance degradations, infrastructure bottlenecks, and user activity anomalies. This comprehensive monitoring enables faster troubleshooting, improved platform reliability, and enhanced user experience, ensuring operational continuity and optimal service delivery.

ServiceNow Instance Observer provides a comprehensive set of alerts for monitoring platform health, performance, and user experience. The alerts are grouped into the categories below so that each one is easy to find and act on.

Transactions

Monitors application transactions for anomalies, spikes, and performance degradation. Alerts in this category include:

Transaction Decrease: Detects a drop in total transaction volume
Transaction Decrease Node: Identifies transaction volume drop per node
Transaction Increase: Flags unexpected transaction surges
Transaction Increase Node: Highlights node-level transaction spikes
Response Time: Triggers when system-wide response time increases
Response Time Node: Flags nodes with degraded response times
Database Response Time: Monitors database-level latency impacting transactions
Slow Queries Per Second: Identifies the volume of slow database queries affecting responsiveness
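
To make the decrease and increase alerts concrete, here is a minimal Python sketch that classifies a transaction count against a recent baseline. It does not describe how Instance Observer evaluates these alerts. The function name `classify_volume`, the 50% tolerance, and the sample counts are assumptions made purely for illustration.

```python
from statistics import mean


def classify_volume(history, current, tolerance=0.5):
    """Compare the current transaction count with the mean of recent counts.

    `tolerance` is a hypothetical fraction of the baseline; it is not a
    documented Instance Observer threshold.
    """
    if not history:
        raise ValueError("history must contain at least one sample")
    baseline = mean(history)
    if current < baseline * (1 - tolerance):
        return "decrease"
    if current > baseline * (1 + tolerance):
        return "increase"
    return "normal"


# Hypothetical per-interval transaction counts (baseline mean is 1000).
recent_counts = [1000, 1100, 900, 1000]
print(classify_volume(recent_counts, 400))   # decrease
print(classify_volume(recent_counts, 1700))  # increase
print(classify_volume(recent_counts, 1050))  # normal
```

The same comparison could be run per node instead of for the whole instance, which mirrors the split between the instance-level and node-level alerts listed above.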

Node health (CPU, memory, or garbage collection)

Tracks node infrastructure health to avoid bottlenecks or failures:

Node CPU time: High CPU usage alert for a node
Node memory: Monitors memory consumption patterns
Node garbage collection time: Tracks JVM GC delays
Load balancer container CPU utilization: Flags CPU overload on LB containers
Load balancer container memory utilization: Detects memory exhaustion on LB containers

Database performance and health

Covers critical database indicators to verify query health and data reliability:

Database host health CPU: High CPU on primary DB host
Shards host health CPU: Resource issues on shard hosts
Read replica host health (CPU): Read-replica CPU anomalies
Standby replication lag: Lag in standby DB replication
InnoDB row lock: Frequency of row lock waits
Primary database growth: Flags abnormal growth in primary DB
Database table growth: Specific table-level growth indicators

Inbound and outbound email

Helps ensure timely delivery and ingestion of email-based communications:

Outbound email: Delays or failures in outbound email processing
Inbound email: Issues in ingesting incoming emails

Scheduler and job execution

Helps detect issues in the job execution life cycle:

Scheduler stuck: Scheduler not progressing or blocked
Long-running jobs: Jobs exceeding typical run time
Specific long-running jobs: Custom job monitoring
Thread running: Threads running unusually long or in high volume
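
The idea of a job "exceeding typical run time" can be sketched as a comparison against that job's own past runs. The Python example below is an illustration under stated assumptions, not Instance Observer's logic: the function `find_long_running`, the factor of 3, and the job names and durations are all hypothetical.

```python
from statistics import median


def find_long_running(running, history, factor=3.0):
    """Return the names of running jobs whose elapsed time exceeds
    `factor` times the median of their past run times.

    `factor` is a hypothetical multiplier chosen for this sketch.
    Jobs with no recorded history are skipped because there is no
    baseline to compare against.
    """
    flagged = []
    for name, elapsed in running.items():
        past = history.get(name)
        if not past:
            continue
        if elapsed > factor * median(past):
            flagged.append(name)
    return sorted(flagged)


# Hypothetical run times in seconds.
history = {"cleanup": [10, 12, 11], "report": [60, 62, 58]}
running = {"cleanup": 50, "report": 70, "new_job": 500}
print(find_long_running(running, history))  # ['cleanup']
```

Using the median rather than the mean keeps a single unusually slow past run from inflating the baseline.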

Session and user activity

Tracks user login behavior across the instance and its nodes:

User session logged in – Instance: Login activity across the instance
User session logged in – Node: Per-node session metrics

Event queue and semaphore management

Critical for debugging platform event handling and job execution throttling:

Default semaphore mean: Semaphore wait time trends
Default semaphore QDepth: Depth of queued semaphore requests
Integrated semaphore: Monitors integrated semaphore contention
Event queue check: Tracks backlog in event queues
Specific queue for events: Custom event queue monitoring
High priority event queue: Monitors mission-critical event queues
ECC queue: External communication channel backlog alerts

Asynchronous Messaging Bus (AMB)

Provides observability into the internal messaging bus that supports real-time application behavior:

AMB send queue depth: Size of outgoing message queue
AMB send in use: Utilization of AMB sending capacity

Historical or list data volume

Monitors growth of historical or list data that can impact performance:

History list length: Flags excessive record count in history tables

Application host health

Monitors health at the application layer:

Application host health CPU: Application-tier CPU overload alerts

AI/ML or intelligent alerting

Includes alerts generated via AI/ML-based behavior analysis:

Auriga Intelligent: AI-driven anomaly or pattern detection alerts
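
When triaging, it can help to have the categories above in a form that scripts can query. The following Python sketch records the alert names exactly as listed on this page and looks up the category for a given alert. The dictionary and the `category_of` helper are illustrative only; they are not part of any ServiceNow API.

```python
# Alert names copied from the category lists on this page.
ALERT_CATEGORIES = {
    "Transactions": [
        "Transaction Decrease", "Transaction Decrease Node",
        "Transaction Increase", "Transaction Increase Node",
        "Response Time", "Response Time Node",
        "Database Response Time", "Slow Queries Per Second",
    ],
    "Node health (CPU, memory, or garbage collection)": [
        "Node CPU time", "Node memory", "Node garbage collection time",
        "Load balancer container CPU utilization",
        "Load balancer container memory utilization",
    ],
    "Database performance and health": [
        "Database host health CPU", "Shards host health CPU",
        "Read replica host health (CPU)", "Standby replication lag",
        "InnoDB row lock", "Primary database growth",
        "Database table growth",
    ],
    "Inbound and outbound email": ["Outbound email", "Inbound email"],
    "Scheduler and job execution": [
        "Scheduler stuck", "Long-running jobs",
        "Specific long-running jobs", "Thread running",
    ],
    "Session and user activity": [
        "User session logged in – Instance",
        "User session logged in – Node",
    ],
    "Event queue and semaphore management": [
        "Default semaphore mean", "Default semaphore QDepth",
        "Integrated semaphore", "Event queue check",
        "Specific queue for events", "High priority event queue",
        "ECC queue",
    ],
    "Asynchronous Messaging Bus (AMB)": [
        "AMB send queue depth", "AMB send in use",
    ],
    "Historical or list data volume": ["History list length"],
    "Application host health": ["Application host health CPU"],
    "AI/ML or intelligent alerting": ["Auriga Intelligent"],
}


def category_of(alert_name):
    """Return the category for an alert name (case-insensitive), or None."""
    wanted = alert_name.strip().lower()
    for category, alerts in ALERT_CATEGORIES.items():
        if any(alert.lower() == wanted for alert in alerts):
            return category
    return None


print(category_of("ECC queue"))  # Event queue and semaphore management
print(sum(len(alerts) for alerts in ALERT_CATEGORIES.values()))  # 40
```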