Event Management configuration preferences

Washington DC IT Operations Management


Release version: Washingtondc

Updated February 1, 2024

Summary of Event Management Configuration Preferences

This guide outlines the configuration preferences and best practices for Event Management within ServiceNow's IT Operations Management. It aims to help customers optimize performance and manage events effectively.

Key Features

Self-health Monitoring: Enabled by default to track Event Management features and prevent performance issues.
Business Rules Guidance: Avoid creating business rules for event tables; focus on alert tables and ensure high efficiency to prevent performance degradation.
Scaling Recommendations: Monitor average event processing times before scaling. If processing time exceeds a few milliseconds, investigate the causes.
Multi-node Processing: For large-scale environments, enable multi-node event processing and configure job counts based on deployment size and expected event rates.
Alert Lifecycle Management: Understand how alerts are opened, closed, and managed, including the concept of flapping alerts.

Key Outcomes

By following these configuration preferences, customers can enhance the performance of Event Management.
Effective management of alerts and events helps prevent system overloads and ensures timely responses to incidents.
Implementing these practices leads to more efficient troubleshooting and service continuity, ultimately improving operational health.

Preferred settings of properties and general configuration.

Use the Known Error Portal and the Community to find more information about issues.

General preferences

Self-health

By default, the self-health monitoring feature is enabled. Use this feature to monitor and track Event Management features.

Note:

CIs used in the self-health service are created in the CMDB.

Use the following settings to help prevent performance degradation.


Topic	Details
Business rules	Avoid writing business rules for the event [em_event] table, because they do not run with the current default REST URL that is used for event injection. Business rules that are written for the alert [em_alert] table must be highly efficient, or they can degrade performance. Instead of writing a business rule, consider whether a job is more appropriate. An inefficient business rule can cause both incident creation for an alert and the alert impact calculation to fail. Avoid writing asynchronous business rules for the alert table. Business rules must not change the Category field in the event [em_event] table.
Scaling up	Check the average event processing time before scaling up event throughput when first starting with Event Management. Do this check after an initial flow of events and after all rules are in place. If processing takes more than a few milliseconds per event on average, determine the cause of the slowdown before continuing to scale. Processing duration can be checked in the Performance Statistics [sa_performance_statistics] table, filtered by the Type field with the Event Processing value.
Configure for large-scale environments	Set the Enable multi node event processing (evt_mgmt.event_processor_enable_multi_node) property to Yes. Enable multi-node in production environments and set values based on the size of the deployment and expected event rate. Set the Number of scheduled jobs processing events (evt_mgmt.event_processor_job_count) property to `4`. If you are sending events from a custom source, verify that events have Message Key or Source, Node, Type, and Resource data.
Latency issues for receiving events	Check the following settings: Verify that the Bucket field in the event [em_event] table is set to a value that is greater than zero (0). Navigate to System Scheduler > Scheduled Jobs and search for - process events. Check that all - process events jobs exist according to the Number of scheduled jobs processing events (evt_mgmt.event_processor_job_count) property configuration. Verify that the State of each job is Running or Ready. If the state is Queued or Error, set the job state to Ready.
Archive events	Avoid changing the default retention time for events. To log events for a longer time, create an archive table and a job that copies new events to it. Do this by scheduling a job to regularly back up events [em_event] to a custom table. Do not extend table rotation by adding more days.
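
The scaling check and the custom-source field check from the table above can be sketched in a few lines of Python. This is a minimal illustration only, not ServiceNow code: the function names, the dictionary keys, and the 5 ms threshold are assumptions (the text says only "a few milliseconds"), and the field check reads the requirement as "either Message Key, or all of Source, Node, Type, and Resource". The durations are assumed to have been exported from the Performance Statistics table beforehand.

```python
from statistics import mean

# Hypothetical threshold: the documentation says only "a few milliseconds".
MAX_AVG_MS_PER_EVENT = 5.0

# Fields a custom-source event needs when it carries no message key.
REQUIRED_FIELDS = ("source", "node", "type", "resource")


def average_processing_ms(durations_ms):
    """Return the mean per-event processing time in milliseconds."""
    if not durations_ms:
        raise ValueError("no processing-time samples provided")
    return mean(durations_ms)


def safe_to_scale(durations_ms, threshold_ms=MAX_AVG_MS_PER_EVENT):
    """True when the average processing time is within the threshold."""
    return average_processing_ms(durations_ms) <= threshold_ms


def has_identifying_data(event):
    """Check a custom-source event (a plain dict, keys are assumptions).

    The event passes if it has a non-empty message key, or if Source,
    Node, Type, and Resource are all present and non-empty.
    """
    if event.get("message_key"):
        return True
    return all(event.get(field) for field in REQUIRED_FIELDS)
```

If `safe_to_scale` returns False, the guidance above applies: find the cause of the slowdown before increasing event throughput.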

Alert settings

Alert lifecycle: General alert functionality:

An alert is opened whenever an event is not ignored or its threshold is exceeded by an event rule, and de-duplication does not identify the event as belonging to an existing alert.
An alert is closed when a closing event is sent on the same message key, or the alert is closed manually.
An alert is reopened if an opening event that has the same message key is sent within the timeframe defined in properties (default is one hour).
If an alert is opened and closed at a high rate, as defined in properties, it becomes flapping. When this opening and closing rate stops, the alert goes out of flapping state.
If an incident is opened from an alert, that alert remains open as long as the incident remains open. By default, when either the incident or the alert is closed the other is closed as well. This behavior can be configured using properties.
Do not close an alert when creating a corresponding incident.
Do not delete an open alert. Close an alert first and then delete it.
Use Acknowledge to denote that the alert is known, and can temporarily be ignored.
Do not use Acknowledge to mark an alert as needing attention.
The evt_mgmt.alert_auto_close_interval property automatically closes alerts after the specified period. Do not specify 0, as this value disables the feature and may lead to performance degradation.
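
The open, close, and reopen rules above can be sketched as a small decision function. This is a simplified illustration in Python, not the platform's implementation: the `Alert` class, the function names, and the returned action strings are hypothetical. Two behaviors are assumptions not stated in the text: an opening event that arrives after the reopen window creates a new alert, and a closing event with no matching open alert does nothing. Flapping and incident linkage are left out.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Default reopen window stated in the text above (one hour).
REOPEN_WINDOW = timedelta(hours=1)


@dataclass
class Alert:
    """Hypothetical, simplified alert record (not the em_alert schema)."""
    message_key: str
    is_open: bool
    closed_at: Optional[datetime] = None


def action_for_opening_event(existing: Optional[Alert], now: datetime,
                             reopen_window: timedelta = REOPEN_WINDOW) -> str:
    """Decide what an opening event does to the alert with the same message key."""
    if existing is None:
        # De-duplication found no alert for this message key.
        return "open_new_alert"
    if existing.is_open:
        # The event belongs to an alert that is already open.
        return "update_existing_alert"
    if existing.closed_at is not None and now - existing.closed_at <= reopen_window:
        # Closed recently enough to be reopened.
        return "reopen_alert"
    # Assumption: outside the reopen window, a fresh alert is created.
    return "open_new_alert"


def action_for_closing_event(existing: Optional[Alert]) -> str:
    """Decide what a closing event does to the alert with the same message key."""
    if existing is not None and existing.is_open:
        return "close_alert"
    # Assumption: nothing to close.
    return "no_action"
```

For example, an opening event that arrives 30 minutes after the alert was closed yields `reopen_alert`, while one that arrives two hours later yields `open_new_alert` under the assumption stated above.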

Alert management rules

Scheduled jobs apply alert management rules to new and updated alerts every 11 seconds. If an alert management rule does not immediately start, allow 10–15 seconds before you start troubleshooting.
Use the Order field to control which alert management rule runs if two alert management rules have similar conditions set.
Specify a filter that determines which alerts the rule applies to.
Respond to alerts. For example, by using subflows and workflows, create incidents for primary alerts with critical severity, or open a search engine in a browser to search for data according to the description field of the alert.

For more information, see Alert management rules for resolving alerts.

Business rules

Business rules created on alert tables should not take more than a few milliseconds to run. Instead of using a business rule, consider whether the same functionality can be achieved using a job.
Do not use business rules to associate an alert with a CI. Use event rules to do binding instead of using business rules.