Exploring Metric Intelligence

Metric Intelligence overview

Metric Intelligence helps you identify and avoid potential service outages. Using historical metric data, it flags anomalous behavior of configuration items (CIs) that events might not capture.

Metric Intelligence users

Table 1. Users
User	Description
Event Management user [evt_mgmt_user]	Can view alerts and their underlying metrics.
Event Management administrator [evt_mgmt_admin]	Can configure all metric definitions and connector settings.
Operator [evt_mgmt_operator]	Can view all metric definitions and connector settings.

Metric Intelligence workflow

The following steps describe the layout and data flow within the Metric Intelligence application.

Data collection: Agents, third-party connectors, and custom connectors (REST) gather performance data from servers and infrastructure components. Data collected by agents is passed to the MID Server via the WebSocket, and data gathered by third-party and custom connectors is passed to the MID Server via the Connector.
Data normalization: The Normalizer formats the raw data so that the metric base can read it.
Data grouping: Data is grouped by the Batcher and sent to the REST API on the instance (Glide).
Data transfer to the Clotho TSDB: The REST API processes the data and sends it to the Clotho TSDB.
Model creation: The Trainer/Learner job runs and creates a model based on the received data. For example, the job may learn that the threshold for normal CPU usage is 60%. A new model is created every day from that day's data combined with past data (most models use data from the past 14 days).
Model data transfer to the Time Series Model Cache DB: The model data is sent to the Time Series Model Cache DB on the MID Server via the instance (Glide). The model cache stores the bounds of the 'normal' model.
Anomaly detection: The MID Server detects data that falls outside the bounds of normal and assigns it an anomaly score. Anomalies are stored on the instance and displayed in the Service Operations Workspace. Anomaly detection is performed in real time, so customers are made aware of anomalies immediately.
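
The model creation and anomaly detection steps above can be illustrated with a minimal Python sketch. It is not the algorithm Metric Intelligence uses, which this page does not describe. As a stand-in, it assumes the bounds of normal are the historical mean plus or minus k standard deviations, and that the anomaly score is the distance outside those bounds relative to the width of the band. The function names learn_bounds and anomaly_score, the parameter k, and the sample values are all hypothetical.

```python
from statistics import mean, stdev


def learn_bounds(history, k=3.0):
    """Learn (lower, upper) bounds of 'normal' from historical samples.

    Assumption: bounds are mean +/- k sample standard deviations.
    This is an illustrative stand-in, not the product's model.
    """
    mu = mean(history)
    sigma = stdev(history)
    return mu - k * sigma, mu + k * sigma


def anomaly_score(value, bounds):
    """Return 0.0 inside the bounds; otherwise the overshoot divided by the band width."""
    lower, upper = bounds
    if lower <= value <= upper:
        return 0.0
    overshoot = lower - value if value < lower else value - upper
    width = upper - lower
    # A zero-width band means any deviation is maximally anomalous.
    return overshoot / width if width > 0 else float("inf")


# Hypothetical CPU usage samples (percent) standing in for stored history.
cpu_history = [50, 52, 48, 51, 49, 50, 53, 47, 50, 50]
bounds = learn_bounds(cpu_history)

# Score new samples as they arrive, mirroring the real-time detection step.
for sample in (51, 90):
    print(sample, anomaly_score(sample, bounds))
```

In this sketch, learn_bounds plays the role of the daily Trainer/Learner job, the returned tuple corresponds to what the model cache stores, and anomaly_score corresponds to the check performed on incoming data.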

Metric Intelligence benefits


Benefit	Feature	Users
Monitor your system’s health, performance, and availability through automated collection of events and metrics, leveraging automated configurations.	Agent Client Collector Monitoring	NOC Operator, Event Management administrator
Reduce noise by promoting only the most meaningful anomalies.	View anomaly alerts, Create metric rules	Event Management administrator
Detect anomalies with AI-based, unsupervised machine-learning pattern detection (no user intervention), or with deterministic alert rules (a manually set static threshold).	How Health Log Analytics generates alerts	Event Management administrator
Improve resolution time on open alerts and incidents with raw metric data visualization.	Metric Explorer	NOC Operator, Event Management administrator

What to explore next

To learn more about configuring and using Metric Intelligence, see: