Anomalies detection
Summary of Anomalies Detection
Instance Observer in ServiceNow proactively detects anomalies—unusual or outlier data points—in performance metrics on production instances, helping you monitor system behavior effectively. These anomalies are identified for both cyclical and non-cyclical metrics within the Impact Total package, enabling you to spot potential performance issues based on deviations from historical patterns. Although not all anomalies signify problems, they serve as notifications for you to assess their criticality and configure alerts accordingly.
Key Features
- Cyclical and Non-Cyclical Metrics: Cyclical metrics repeat regularly, while non-cyclical metrics happen irregularly. Five key cyclical metrics tracked include:
- Transaction Count: Total UI transactions of type UI_TYPE.
- Server Response Time: Average execution time for UI_TYPE transactions.
- SQL Response Time: Mean database response time at the application layer.
- Semaphore Mean: Average concurrent user transactions over one minute.
- Node Memory Max: Maximum memory usage per node in MB.
- Visual Indicators: Anomalies are marked in red on performance charts, with the x-axis showing time and y-axis showing metric values. Charts display mean values and upper/lower boundaries based on historical data for context.
- Job Anomaly Detection: Tracks scheduled jobs hourly, highlighting anomalies in job execution times and concurrent job counts. You can drill down to detailed hourly job data and seven-day execution patterns to identify root causes for performance changes.
- Scheduled Job Criteria: A job qualifies as scheduled if it runs daily or weekly consistently over the past four weeks.
- Calculation Methods: Average transaction counts and job durations are calculated over four-week historical windows to establish baselines and detect anomalies using statistical thresholds.
- Anomaly Identification Algorithm: Uses the Z-score statistical model (a univariate method) to detect outliers by comparing current values against the mean plus five times the standard deviation.
Anomaly Response and Alerting
Detected anomalies are informational outliers rather than definitive issues, so you should analyze their criticality before configuring alerts. Instance Observer supports alert configuration based on these detected anomalies to help you respond proactively to potential performance concerns.
Instance Observer proactively detects anomalies in cyclical and non-cyclical metrics and shows them on the performance charts for the Impact Total package on production instances. Anomalies are metric outliers identified from historical patterns. Not every anomaly represents an issue, but each one notifies you so that you can decide its criticality and configure alerts accordingly.
Cyclical metrics occur as a complete set of events that repeat themselves regularly in the same order or in a regularly repeated period. Non-cyclical metrics are metrics that repeat themselves irregularly or in random, less predictable repeated periods. An anomaly, also known as an outlier, is a data point that is unusual, rare, or doesn't conform to the expected patterns or distribution of the data.
- Transaction Count: The instance-wide sum of all UI transactions of an internal type known as UI_TYPE.
- Server Response Time: The mean execution time for UI_TYPE transactions.
- SQL Response Time: The reported mean of database response time measured at the application layer that starts when a query is sent to the database and finishes when the response has been received.
- Semaphore Mean: The average number of end-user transactions being processed concurrently over a one-minute period.
- Node Memory Max: The maximum in-use memory, in MB, per node at a given data point in history. This value generally ranges between 1000 MB and 2048 MB.
- The x-axis represents time, and the y-axis represents the metric values for the selected date range. For example, the line chart displays the transaction count values over time. Anomalies are marked in red on the chart at the data points where they occur. The placement of each red mark depends on the criteria or algorithm used to detect anomalies.
- The range shows the upper and lower boundary limits for the metric, assuming a normal distribution based on its historical data pattern.
- The mean line represents the four-week average value of the metric, against which the deviation at a given point in time can be compared.
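The chart elements described above can be approximated with a minimal sketch. It assumes the range is the historical mean plus or minus a multiple of the standard deviation, and it uses the population standard deviation; the documentation does not state how the boundaries are computed, so both choices are assumptions. The names `chart_band` and `flag_anomalies` are hypothetical and are not Instance Observer APIs.

```python
from statistics import mean, pstdev


def chart_band(history, k=5.0):
    """Return (mean, lower, upper) for a list of historical metric values.

    Assumption: the band is mean +/- k * population standard deviation.
    """
    mu = mean(history)
    sigma = pstdev(history)
    return mu, mu - k * sigma, mu + k * sigma


def flag_anomalies(history, current, k=5.0):
    """Return the indices of points in `current` that fall outside the band."""
    _, lower, upper = chart_band(history, k)
    return [i for i, value in enumerate(current) if value < lower or value > upper]
```

For example, `flag_anomalies(four_week_values, todays_values)` would return the positions that a chart could mark in red, where both argument names stand for lists of metric values you supply.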
Job anomaly detection
Job anomaly charts track the number of scheduled jobs running concurrently for each hour of the day, overlaid with the average transaction count for each hour. Any bar in the chart that contains one or more anomalous jobs is highlighted in red. Select the detail link of the bar chart to view the job-level details.
Job Details shows the hourly scheduled jobs along with the corresponding average transaction counts for that instance.
Drill down from the hourly scheduled job count into an individual recurring job for any hour of the day, and further into the execution pattern of the same job over the last seven days. This can help you perform end-to-end root cause analysis for jobs that usually take a consistent amount of time to complete but suddenly show a significant increase or decrease in execution time, indicating a possible performance issue.
- Scheduled job criteria
- For a job to be considered a scheduled job, it must satisfy at least one of the following criteria:
- The job runs at least once for each day of the week.
- The job has run at least once for every week in the past four weeks.
- Average transaction count calculation
- For every hour in a given day, the transaction counts from the same day of the week and the same hour over the past four weeks are averaged. For example, the transaction counts from the fourth hour of the past four Mondays are summed and then averaged to produce the final value.
- Job anomaly identification
- For every hour in which the job ran over the past four weeks, the mean duration and its standard deviation are calculated. The anomaly threshold is the mean value plus 5 multiplied by the standard deviation value.
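The three calculations above can be sketched as follows. This is an illustration under stated assumptions, not the product's implementation: all function names are hypothetical, both scheduled-job criteria are evaluated over the past four weeks, the population standard deviation is used, and only durations above the threshold are flagged.

```python
from datetime import datetime, timedelta
from statistics import mean, pstdev


def is_scheduled_job(run_times, now):
    """True if the job meets at least one of the two scheduled-job criteria.

    Assumption: both criteria are evaluated over the past four weeks.
    """
    window_start = now - timedelta(weeks=4)
    recent = [t for t in run_times if window_start < t <= now]
    # Criterion 1: at least one run on each day of the week.
    weekdays = {t.weekday() for t in recent}
    # Criterion 2: at least one run in each of the past four weeks.
    weeks = {(now - t).days // 7 for t in recent}
    return len(weekdays) == 7 or weeks == {0, 1, 2, 3}


def hourly_average(samples, weekday, hour):
    """Average the counts recorded for one weekday and hour.

    `samples` is a list of (timestamp, transaction_count) pairs that is
    assumed to cover the past four weeks. Monday is weekday 0.
    """
    counts = [c for t, c in samples if t.weekday() == weekday and t.hour == hour]
    return mean(counts) if counts else None


def duration_threshold(durations, k=5.0):
    """Anomaly threshold: mean duration plus k standard deviations."""
    return mean(durations) + k * pstdev(durations)


def is_anomalous(duration, durations, k=5.0):
    """True if a run's duration exceeds the threshold from past durations."""
    return duration > duration_threshold(durations, k)
```

With past durations of 8, 12, 8, and 12 minutes, the mean is 10 and the standard deviation is 2, so the threshold is 20 minutes: a 21-minute run would be flagged and a 19-minute run would not.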
Anomaly response
Not every anomaly represents an issue; anomalies are outliers detected based on historical patterns. Analyze the criticality of each detected anomaly, then configure alerts accordingly. See Configure anomaly alerts for more information on alerts.