Exploring Metric Intelligence

  • Release version: Yokohama
  • Updated January 30, 2025
    Summarized using AI
    This content was generated using new OpenAI-powered functionality. Results are provided on an as is basis and are not guaranteed to be accurate or complete.

    Summary of Exploring Metric Intelligence

    Metric Intelligence in ServiceNow leverages historical metric data and AI-based anomaly detection to identify unusual behavior in configuration items (CIs) that might not be captured by traditional event monitoring. This capability helps prevent potential service outages by providing real-time anomaly alerts and insights into system health and performance beyond standard event data.


    Key Features

    • Data Collection and Normalization: Performance data is gathered from servers and infrastructure via agents, third-party connectors, and custom connectors, then normalized for consistent analysis.
    • Model Creation and Anomaly Detection: A new machine learning model is generated every day from historical data (most models use the past 14 days) to establish a baseline of normal behavior. Real-time anomaly detection identifies deviations from this baseline and assigns each one an anomaly score.
    • Role-Based Access:
      • Event Management Users: Can view alerts and underlying metrics.
      • Event Management Administrators: Have full configuration capabilities over metric definitions and connectors.
      • Operators: Can view all metric definitions and connector settings.
    • Integration with Service Operations Workspace: Anomalies are displayed immediately to facilitate prompt incident response.
    • Noise Reduction: The system promotes only the most meaningful anomalies to reduce alert noise.
    • Metric Explorer: Provides visualization of raw metric data to improve understanding and resolution of alerts and incidents.

    How It Works

    The Metric Intelligence pipeline starts with data collection via agents and connectors. Data is then normalized, grouped, and sent through the ServiceNow instance to the Clotho Time Series Database (TSDB). A Trainer/Learner job creates daily models of normal behavior, which are cached on the MID Server. The MID Server performs real-time anomaly detection by comparing incoming data against these models. Anomalies detected are scored and surfaced as alerts.
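
The normalization and grouping stages described above can be pictured with a small Python sketch. It is only an illustration under stated assumptions: the `MetricSample` shape, the two raw record layouts, and the `normalize` and `batch` helpers are hypothetical names invented here, not part of any ServiceNow API.

```python
from dataclasses import dataclass
from itertools import islice
from typing import Iterable, Iterator


@dataclass(frozen=True)
class MetricSample:
    """Common shape every source is normalized into (hypothetical)."""
    ci: str          # configuration item the sample belongs to
    metric: str      # metric name, e.g. "cpu_percent"
    timestamp: int   # seconds since the Unix epoch
    value: float


def normalize(raw: dict) -> MetricSample:
    """Map a raw record onto MetricSample.

    Both input layouts are invented for illustration only.
    """
    if "host" in raw:  # assumed agent-style record
        return MetricSample(raw["host"], raw["name"], int(raw["ts"]), float(raw["val"]))
    # assumed connector-style record with millisecond timestamps
    return MetricSample(
        raw["resource"], raw["metric"], int(raw["time_ms"]) // 1000, float(raw["value"])
    )


def batch(samples: Iterable[MetricSample], size: int) -> Iterator[list[MetricSample]]:
    """Group normalized samples into lists of at most `size` items."""
    it = iter(samples)
    while chunk := list(islice(it, size)):
        yield chunk


raw_records = [
    {"host": "web01", "name": "cpu_percent", "ts": 1700000000, "val": "41.5"},
    {"resource": "db01", "metric": "cpu_percent", "time_ms": 1700000060000, "value": 73},
    {"host": "web02", "name": "cpu_percent", "ts": 1700000120, "val": 12},
]
batches = list(batch((normalize(r) for r in raw_records), size=2))
```

Normalizing before batching means every later stage sees one record shape, whichever agent or connector produced the data.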

    Benefits for ServiceNow Customers

    • Proactive Monitoring: Enables early detection of potential service issues that event data alone might miss.
    • Improved Alert Quality: AI-driven anomaly detection reduces false positives and highlights critical issues.
    • Faster Incident Resolution: Access to detailed metric visualizations aids in troubleshooting and corrective actions.
    • Role-Specific Functionality: Provides appropriate access and capabilities for administrators, operators, and event management users.

    Next Steps

    To fully leverage Metric Intelligence, customers should explore configuration and optimization guides to tailor anomaly detection and metric management to their environment. Reviewing reference materials will further support successful implementation and use.

    Learn more about using Metric Intelligence to analyze metric data and identify anomalies.

    Metric Intelligence overview

    Metric Intelligence helps identify and avoid potential service outages. Based on historical metric data, Metric Intelligence flags anomalous behavior of configuration items (CIs) that events might not capture.

    Metric Intelligence users

    Table 1. Users
    User                                                Description
    Event Management user [evt_mgmt_user]               Can view alerts and their underlying metrics.
    Event Management administrator [evt_mgmt_admin]     Can configure all metric definitions and connector settings.
    Operator [evt_mgmt_operator]                        Can view all metric definitions and connector settings.

    Metric Intelligence workflow

    The following illustration describes the layout and data flow within the Metric Intelligence application.

    Figure 1. Metric Intelligence Pipeline
    Infographic outlining Metric Intelligence workflow
    1. Data collection: Agents, third-party connectors, and custom connectors (REST) gather performance data from servers and infrastructure components. Agents pass their data to the MID Server over the WebSocket; third-party and custom connectors pass theirs through the Connector.
    2. Data normalization: The Normalizer formats the raw data so that the metric base can read it.
    3. Data grouping: The Batcher groups the data and sends it to the REST API on the instance (Glide).
    4. Data transfer to the Clotho TSDB: The REST API processes the data and sends it to the Clotho TSDB.
    5. Model creation: The Trainer/Learner job runs and creates a model from the received data. For example, the job may learn that the threshold for normal CPU usage is 60%. A new model is created every day from that day's data together with past data (most models use data from the past 14 days).
    6. Model data transfer to the Time Series Model Cache DB: The model data is sent via the instance (Glide) to the Time Series Model Cache DB on the MID Server. The model cache stores the bounds of the 'normal' model.
    7. Anomaly detection: The MID Server detects data that falls outside the bounds of normal and assigns it an anomaly score. Anomalies are stored on the instance and displayed in the Service Operations Workspace. Detection runs in real time, so customers are made aware of anomalies immediately.
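
To make the model creation and anomaly detection steps more concrete, here is a minimal Python sketch. The source does not describe the actual learning algorithm, so this example assumes a simple stand-in: "normal" bounds are the mean plus or minus three standard deviations of the historical values, and the anomaly score is how far a value lies outside those bounds relative to the band width. The names `Bounds`, `train_model`, and `anomaly_score` are hypothetical.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass(frozen=True)
class Bounds:
    """Lower and upper limits of 'normal', as held in a model cache."""
    lower: float
    upper: float


def train_model(history: list[float], k: float = 3.0) -> Bounds:
    """Derive bounds from history (assumption: mean +/- k * std deviation)."""
    mu = mean(history)
    sigma = pstdev(history)
    return Bounds(mu - k * sigma, mu + k * sigma)


def anomaly_score(value: float, bounds: Bounds) -> float:
    """Return 0.0 inside the bounds, otherwise the overshoot as a
    fraction of the band width (an invented scoring scale)."""
    if bounds.lower <= value <= bounds.upper:
        return 0.0
    overshoot = value - bounds.upper if value > bounds.upper else bounds.lower - value
    width = bounds.upper - bounds.lower
    return overshoot / width if width > 0 else float("inf")


# Fourteen illustrative daily CPU readings (percent), one per day of history.
history = [50, 52, 48, 51, 49, 50, 53, 47, 50, 50, 52, 48, 51, 49]
model = train_model(history)

normal_score = anomaly_score(51, model)   # inside the learned band
spike_score = anomaly_score(60, model)    # above the learned band
```

Retraining once a day on a sliding window, as the steps describe, would amount to calling `train_model` again with the most recent readings and replacing the cached `Bounds`.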

    Metric Intelligence benefits

    Benefit: Monitor your system’s health, performance, and availability through automated collection of events and metrics, leveraging automated configurations.
    Feature: Agent Client Collector Monitoring
    Users: NOC Operator, Event Management administrator

    Benefit: Reduce noise by promoting only the most meaningful anomalies.
    Feature: View anomaly alerts; Create metric rules
    Users: Event Management administrator

    Benefit: Detect anomalies with AI-based anomaly detection, either with unsupervised machine-learning abnormal pattern detection (no user intervention), or by setting deterministic alert rules (manually setting a static threshold).
    Feature: How Health Log Analytics generates alerts
    Users: Event Management administrator

    Benefit: Improve resolution time on open alerts and incidents with raw metric data visualization.
    Feature: Metric Explorer
    Users: NOC Operator, Event Management administrator
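
The deterministic option mentioned in the benefits, a manually set static threshold, can be sketched in a few lines of Python. This is an illustration only: `ThresholdRule`, its fields, and the sample values are hypothetical and do not reflect how metric rules are actually defined in the product.

```python
import operator
from dataclasses import dataclass

# Supported comparisons for the hypothetical rule below.
_OPS = {">": operator.gt, ">=": operator.ge, "<": operator.lt, "<=": operator.le}


@dataclass(frozen=True)
class ThresholdRule:
    """A static, manually configured alert rule (hypothetical shape)."""
    metric: str
    comparison: str
    threshold: float

    def breached(self, metric: str, value: float) -> bool:
        """True when the sample is for this rule's metric and crosses the threshold."""
        return metric == self.metric and _OPS[self.comparison](value, self.threshold)


rule = ThresholdRule("cpu_percent", ">", 90.0)
readings = [35.0, 91.2, 88.0, 97.5]
alerts = [v for v in readings if rule.breached("cpu_percent", v)]
```

Unlike a learned baseline, a rule like this never adapts to the data: the threshold stays at whatever value an administrator chose.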