Exploring ITOM AIOps
Summary of Exploring ITOM AIOps
ITOM AIOps in ServiceNow empowers IT operations teams, site reliability engineers, and DevOps professionals to proactively monitor, analyze, and optimize IT infrastructure and service health using AI-driven monitoring, analytics, and automation. This approach helps prevent service outages, reduce mean time to resolution (MTTR), and improve overall infrastructure performance.
Key Features
- Service Operations Workspace: A unified interface consolidating alerts, incidents, and operational data from multiple AIOps applications. It provides correlated events, remediation suggestions, and service impact insights to enable rapid issue identification and response.
- Agent Client Collector: Real-time monitoring of service availability and infrastructure performance, using dynamic thresholds and anomaly detection.
- Synthetic Monitoring: Simulates user transactions to detect performance bottlenecks and provide early warnings before real users are affected.
- Health Log Analytics: Collects and analyzes log data to identify patterns and anomalies, generating actionable alerts for emerging issues.
- Event Management: Centralizes alert ingestion, reduces noise through event correlation, maps alerts to CMDB configuration items, and prioritizes incidents based on business impact.
- Express List: A view within the Service Operations Workspace that enables efficient monitoring, alert resolution, impact evaluation, and incident reporting.
- Service Observability: Integrates telemetry with CMDB data to reveal hidden dependencies and provide insights into how infrastructure components affect services.
- Service Reliability Management: Supports structured incident response workflows, automated escalations, and collaboration tools.
- SLO Management: Enables definition, monitoring, and automated alerting for service level objectives and indicators to ensure proactive service quality management.
Roles and Responsibilities
- Admin: Configures Event Management properties and rules.
- Operator: Manages alerts including acknowledgment and closure.
- User: Performs basic alert lifecycle operations such as viewing and acknowledgment.
Benefits
- Predictive Issue Detection: Early identification of potential outages through log analytics and synthetic monitoring.
- Automated Alert Response: Streamlined alert handling and remediation through automation in the Service Operations Workspace.
- Performance Optimization: Continuous monitoring and improvement of IT infrastructure health.
- Service Health and Investigation: Enhanced visibility and root cause analysis via service observability.
- Service Level Management: Proactive management of service quality using SLO tracking and alerting.
Complementary Products
- Discovery: Automatically identifies IT assets and configurations to maintain an accurate, comprehensive CMDB.
- Service Mapping: Provides detailed application service relationships, improving visibility and impact analysis.
- Service Portfolio Management, Software Asset Management, Hardware Asset Management: Deliver lifecycle data and asset management to optimize IT portfolio management.
- Incident Management: Utilizes Event Management data to rapidly create and manage incidents.
- Customer Service Management (CSM): Leverages service impact data for improved customer support and faster issue resolution.
Implementation Considerations
Before implementing ITOM AIOps, ensure a well-populated and accurate Configuration Management Database (CMDB) using Discovery. Proper CMDB data is essential for event correlation, service impact calculation, and context-rich alerting. Configure the MID Web Server extension to enable metric data and event ingestion required by Event Management, Agent Client Collector, and Health Log Analytics. Utilize Setup Hub to guide the configuration of Event Management and related AIOps capabilities within your ServiceNow instance.
Overview of ITOM AIOps applications and capabilities that enable proactive IT operations management through AI-powered monitoring, analytics, and automation.
ITOM AIOps overview
ITOM AIOps enables IT operations teams, site reliability engineers, and DevOps professionals to proactively monitor, analyze, and optimize the health and performance of their IT infrastructure and services.
ServiceNow® ITOM AIOps combines advanced analytics, machine learning, and automation to help teams avoid service outages, reduce mean time to resolution (MTTR), and optimize infrastructure performance.
Service Operations Workspace
Network Operations Center (NOC) operators and IT operations teams use Service Operations Workspace as their primary interface for managing IT operations powered by ITOM AIOps. The workspace consolidates alerts, incidents, and operational data from multiple AIOps applications into unified dashboards and workflows.
In their daily work, operators use the workspace to monitor service health across the entire IT infrastructure, investigate alerts with full context from multiple data sources, and coordinate response activities. The workspace presents correlated events, suggested remediation actions, and service impact information in a single interface, enabling operators to quickly understand and respond to issues.
The workspace integrates data from all AIOps products to provide operators with comprehensive situational awareness and streamlined incident management capabilities.
ITOM AIOps workflow
Each AIOps application focuses on a specific aspect of IT operations while contributing to a unified AIOps platform.
- Agent Client Collector
- Monitors service availability and infrastructure performance in real-time. Combined with Metric Intelligence, it establishes dynamic thresholds and detects anomalies that may indicate potential service outages before they occur.
- Synthetic monitoring
- Monitors critical services by simulating user transactions on API endpoints, identifying performance bottlenecks and helping maintain optimal user experiences. It provides early warning of service degradation before real users are affected.
- Health Log Analytics
- Collects log data in real time and uses machine learning to identify patterns and detect anomalies. It typically ingests logs through the MID Server, learns normal operating patterns, and, through Event Management, raises actionable alerts when it detects significant deviations that might indicate emerging issues.
- Event Management
- Serves as the central nervous system for your IT operations. It receives alerts from monitoring tools and correlates related events to reduce noise. The system maps alerts to configuration items in the Configuration Management Database (CMDB) and calculates business impact using service dependencies from Service Mapping. This correlation transforms hundreds of individual alerts into prioritized, context-rich incidents that focus teams on the most critical issues.
- Express List
- Is the Service Operations Workspace (SOW) view into Event Management. It lets operators efficiently monitor systems and services, resolve alerts, evaluate alert impact, track issues, and report incidents.
- Service Observability
- Combines external observability telemetry with Configuration Management Database (CMDB) data to reveal dependencies that may not be obvious from individual alerts. Charts show how infrastructure components may be affecting a single service.
- Service Reliability Management
- Enables teams to respond to incidents with structured workflows, automated escalation, and collaborative tools that streamline incident response.
- SLO Management
- Helps organizations define, monitor, and report on service level objectives and service level indicators. It provides automated SLO tracking and alerting when services approach or breach defined thresholds, enabling proactive service quality management.
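The SLO tracking described in the last item can be illustrated with a small error-budget calculation. This is a minimal sketch under stated assumptions, not how SLO Management is implemented: the function name `evaluate_slo`, the `SloStatus` type, and the `warn_at` cutoff are all hypothetical, and the SLI is assumed to be a simple ratio of good events to total events.

```python
from dataclasses import dataclass


@dataclass
class SloStatus:
    sli: float               # observed fraction of good events
    budget_remaining: float  # fraction of the error budget still unspent
    state: str               # "ok", "approaching", or "breached"


def evaluate_slo(good: int, total: int, target: float, warn_at: float = 0.25) -> SloStatus:
    """Compare a ratio-style SLI against its SLO target.

    warn_at is a hypothetical knob: report "approaching" once less than
    this fraction of the error budget remains.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0.0 < target < 1.0:
        raise ValueError("target must be strictly between 0 and 1")

    sli = good / total
    allowed_bad = (1.0 - target) * total  # error budget, expressed in events
    actual_bad = total - good
    remaining = 1.0 - actual_bad / allowed_bad

    if remaining < 0:
        state = "breached"      # more bad events than the budget allows
    elif remaining < warn_at:
        state = "approaching"   # budget nearly spent
    else:
        state = "ok"
    return SloStatus(sli, remaining, state)
```

For example, `evaluate_slo(99_910, 100_000, 0.999)` describes a service with a 99.9% target that has used 90 of its 100 allowed failures, so it is reported as approaching its threshold.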
ITOM AIOps users
| Role title [name] | Description |
|---|---|
| Admin [evt_mgmt_admin] | Configures and sets up Event Management properties and rules. |
| Operator [evt_mgmt_operator] | Manages alerts, including closing and acknowledging them. |
| User [evt_mgmt_user] | Manages the lifecycle of alerts, including performing basic operations such as viewing and acknowledging them. |
ITOM AIOps benefits
| Benefit | Feature | Users |
|---|---|---|
| Predictive issue detection | | Admin, operator |
| Automated alert response | Alert automation in Service Operations Workspace for ITOM | Admin |
| Performance optimization | Exploring Agent Client Collector | Admin, operator |
| Service health and investigation | Exploring Service Observability | Admin, operator |
| Service level management | | Admin, operator |
Products that add value to ITOM AIOps
- Discovery
- Discovery automatically identifies and collects information about IT assets, configurations, and relationships, giving organizations a comprehensive inventory for managing and monitoring their IT infrastructure.
- Service Mapping
- Service Mapping offers detailed information about application instance services within the [cmdb_ci_service_discovered] table. This data helps establish connections between infrastructure and application configuration items (CIs) stored in the [cmdb_ci_appl] table, improving visibility into IT environments and supporting efficient management and monitoring.
- Service Portfolio Management
- Service Portfolio Management (SPM) offers the associated product model, while Software Asset Management (SAM) and Hardware Asset Management (HAM) provide life-cycle data for Technology Portfolio Management (TPM). Together, they enable comprehensive management of IT assets, ensuring effective utilization, compliance, and optimization throughout their life cycles.
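To check what Service Mapping has written to the [cmdb_ci_service_discovered] table mentioned above, one option is a read through the ServiceNow REST Table API. The sketch below only builds the request URL. The helper name `table_query_url`, the instance host name, and the example filter and field list are assumptions for illustration; confirm which fields and values exist on your instance before relying on them.

```python
from urllib.parse import urlencode


def table_query_url(instance: str, table: str, query: str, fields: list, limit: int = 10) -> str:
    """Build a Table API GET URL for the given table.

    instance is a host name such as "example.service-now.com" (hypothetical).
    query uses encoded-query syntax; fields limits the columns returned.
    """
    params = urlencode({
        "sysparm_query": query,
        "sysparm_fields": ",".join(fields),
        "sysparm_limit": limit,
    })
    return f"https://{instance}/api/now/table/{table}?{params}"


# Hypothetical example: list a few discovered application services.
url = table_query_url(
    "example.service-now.com",
    "cmdb_ci_service_discovered",
    "operational_status=1",   # assumed filter; adjust to your data
    ["name", "sys_id"],
    limit=5,
)
```

Sending the request needs credentials with read access to the table, which this sketch leaves out.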
Products that benefit from ITOM AIOps
- Incident Management
- Incident Management tools leverage downstream information from Event Management to create incidents swiftly, ensuring timely resolution of issues.
- Customer Service Management (CSM)
- Customer Service Management (CSM) systems benefit from ITOM AIOps by utilizing application service impact data to identify affected users promptly, enhancing customer support efficiency and satisfaction.
What to know before you begin
Before implementing ITOM AIOps, verify that your instance has the necessary prerequisites and that you understand the configuration requirements for each application.
A well-populated Configuration Management Database (CMDB) is crucial to get the most out of AIOps. ITOM AIOps relies on accurate configuration item data to map events to infrastructure components, calculate service impact, and provide context for alert correlation. Use Discovery to populate your CMDB with current infrastructure data before activating AIOps applications.
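The following toy model illustrates why CMDB coverage matters for the steps named above. It is a sketch under stated assumptions, not the platform's actual binding or impact logic: the lookup tables `CI_BY_NODE` and `SERVICES_BY_CI`, the event shape, and both function names are hypothetical. Events whose node has no matching configuration item stay unbound, so they cannot contribute to grouping or service impact.

```python
from collections import defaultdict

# Hypothetical CMDB extract: host name -> configuration item identifier.
CI_BY_NODE = {"web01": "ci_web01", "db01": "ci_db01"}

# Hypothetical dependency data: CI -> services that depend on it.
SERVICES_BY_CI = {
    "ci_web01": ["Online Store"],
    "ci_db01": ["Online Store", "Billing"],
}


def bind_and_group(events):
    """Bind each event to a CI by node name and group events per CI.

    Returns (groups, unbound): groups maps a CI id to its events, and
    unbound lists events whose node has no matching CI.
    """
    groups = defaultdict(list)
    unbound = []
    for event in events:
        ci = CI_BY_NODE.get(event["node"])
        if ci is None:
            unbound.append(event)
        else:
            groups[ci].append(event)
    return dict(groups), unbound


def impacted_services(groups):
    """Return the sorted names of services that depend on any affected CI."""
    services = set()
    for ci in groups:
        services.update(SERVICES_BY_CI.get(ci, []))
    return sorted(services)
```

In this model, two events from `db01` collapse into one group that affects both the Online Store and Billing services, while an event from a host missing from `CI_BY_NODE` carries no service context at all.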
Configure the MID Web Server extension to enable ITOM AIOps features. The MID Web Server is an extension that enables external clients to push metric data and events to the MID Server. It is required for Event Management and for many deployments of Agent Client Collector and Health Log Analytics.
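As an illustration of an external client pushing an event, the sketch below builds, but does not send, an HTTP POST aimed at a MID Web Server. Several details are assumptions to verify against the documentation for your release: the endpoint path `/api/mid/em/jsonv2`, the `records` wrapper, the event field names, and the severity value. The host name, port, and the helper `build_event_request` are hypothetical, and authentication is left out.

```python
import json
import urllib.request


def build_event_request(mid_base_url: str, event: dict) -> urllib.request.Request:
    """Build (but do not send) a POST carrying one event for a MID Web Server."""
    body = json.dumps({"records": [event]}).encode("utf-8")
    return urllib.request.Request(
        url=mid_base_url.rstrip("/") + "/api/mid/em/jsonv2",  # assumed path
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical event; field names are assumptions, not confirmed by this page.
event = {
    "source": "custom-monitor",
    "node": "web01",
    "type": "High CPU",
    "resource": "cpu0",
    "metric_name": "cpu_utilization",
    "severity": "2",
    "description": "CPU above 95% for 5 minutes",
}

request = build_event_request("http://mid.example.com:8097", event)
# urllib.request.urlopen(request) would transmit it once credentials are added.
```

Keeping request construction separate from sending makes the payload easy to inspect before any traffic reaches the MID Server.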
Setup Hub provides a sequence of tasks that help you configure Event Management on your ServiceNow instance. For more information about using the Setup Hub, see Configure Event Management using Setup Hub.