Event Management configuration preferences
Summary of Event Management Configuration Preferences
This guide provides essential configuration preferences and best practices for ServiceNow Event Management, enabling customers to optimize event processing, alert handling, and system performance. It covers general settings, scaling, event integration, event and alert rule configurations, alert lifecycle management, and planning for large-scale deployments.
General Preferences
- Self-health Monitoring: Disabled by default; can be enabled in Event Management Properties to track system health using CIs created in the CMDB.
- Business Rules: Avoid writing inefficient or async business rules on event tables to prevent performance degradation. Prefer jobs over business rules for complex processing.
- Scaling: Monitor average event processing times before scaling throughput. Use performance statistics to identify bottlenecks.
- Large-Scale Configuration: Enable multi-node event processing and configure scheduled jobs based on deployment size and event rates.
- Latency Troubleshooting: Verify event bucket values and scheduled job states to ensure timely event processing.
- Event Archiving: Do not extend default retention periods; instead, create custom archive tables and scheduled jobs to back up events.
Event Integration
- SNMP Traps: Use monitoring tools to send SNMP traps and upload MIBs before defining event rules for accurate event parsing.
- Web Service API: Recommended for integration to reduce event rule complexity. Use dedicated credentials per event source.
- CloudWatch: Use dedicated credentials for integration.
- Email: Use only for low-volume event sources if other methods are unavailable.
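As a rough illustration of the web service option, the sketch below builds (but does not send) a single-event POST request using a credential dedicated to one event source. The instance URL, user name, password, and event values are hypothetical, and the endpoint path and field names are assumptions that should be checked against your instance's inbound event API documentation.

```python
import base64
import json
import urllib.request


def build_event_request(instance_url, user, password, event):
    """Build (but do not send) a POST request carrying one prepared event."""
    # Assumed payload shape: a "records" array holding one event object.
    body = json.dumps({"records": [event]}).encode("utf-8")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        # Assumed endpoint path; verify it for your instance and release.
        url=f"{instance_url}/api/global/em/jsonv2",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


# Hypothetical event: the source sends data that is already prepared,
# so fewer event rules are needed to transform it on the instance.
event = {
    "source": "OEM",
    "node": "db01.example.com",
    "type": "Tablespace usage",
    "resource": "USERS",
    "metric_name": "tablespace_pct_used",
    "severity": "3",
    "description": "Tablespace USERS is 91% full",
    "additional_info": json.dumps({"threshold": "90"}),
}

# One dedicated (hypothetical) integration account per event source.
request = build_event_request(
    "https://example.service-now.com", "svc_em_oem", "change-me", event
)
```

Keeping one credential per event source makes it easier to trace, throttle, or revoke a single integration without affecting the others.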
Event Rules
- Write broad event rules first, followed by more specific ones with lower order values.
- Event processing does not modify original events; troubleshooting can be done via processing notes or UI actions.
- Ensure exact matching of event fields, especially for SNMP traps requiring MIB uploads.
- Establish consistent naming conventions for event rules, e.g., <customer acronym>.<Event Source>.<Description>.
- Populate key fields such as Source, Node, Type, Resource, and Metric Name to enable effective de-duplication and alert correlation.
- Include additional event data only in the Additional information field; do not add custom columns to event tables.
De-Duplication and CI Binding
- Use the messagekey field to identify duplicate events; default key combines Source, Node, Type, Resource, and Metric Name.
- Ensure event sources populate these fields to improve processing distribution and throughput.
- Bind alerts to configuration items (CIs) via the Node field or ciidentifiers JSON to associate alerts accurately with hosts or devices.
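The de-duplication idea above can be sketched as follows. This is a minimal illustration, not the platform's implementation: the field names (`message_key`, `metric_name`, and so on) and the underscore separator are assumptions chosen for the example.

```python
# Fields that the default key combines, per the list above.
KEY_FIELDS = ("source", "node", "type", "resource", "metric_name")


def default_message_key(event):
    """Return the event's own key if set, else a key built from KEY_FIELDS."""
    if event.get("message_key"):
        return event["message_key"]
    # The separator is an assumption for illustration only.
    return "_".join(str(event.get(field, "")) for field in KEY_FIELDS)


first = {"source": "OEM", "node": "db01", "type": "Disk", "resource": "C:",
         "metric_name": "pct_used", "description": "Disk C: is 91% full"}
repeat = dict(first, description="Disk C: is 93% full")
other = dict(first, resource="D:")

# Events that differ only in free text share a key and are treated as duplicates.
same_alert = default_message_key(first) == default_message_key(repeat)
# A different resource yields a different key, so it would open its own alert.
new_alert = default_message_key(first) != default_message_key(other)
```

If a source leaves several of these fields empty, many unrelated events collapse onto the same key, which is why populating them matters.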
Alert Settings and Lifecycle
- Alerts open when events exceed thresholds and are not duplicates; close upon receiving closing events or manual closure.
- Alerts can be reopened if the same message key event occurs within a configurable timeframe.
- Alert flapping is detected based on rapid open-close rates; alerts exit flapping state once this stabilizes.
- Alerts linked to incidents remain open while the incident is open; properties control whether closing one closes the other.
- Use "Acknowledge" to mark alerts as known but not requiring immediate action.
- Do not create alerts directly in the Closed, OK, or Open states. For resolved issues, use Clear instead of OK.
- Automatic alert closure is configurable but should not be disabled to avoid performance issues.
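The open, update, and reopen decisions above can be pictured with a small sketch. The one-hour default for the reopen window comes from this guide; the dictionary layout, state names, and the choice to create a new alert once the window has passed are assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Default reopen window stated in this guide; configurable through properties.
REOPEN_WINDOW = timedelta(hours=1)


def action_for_opening_event(existing_alert, event_time, window=REOPEN_WINDOW):
    """Decide what an opening event does to the alert with the same message key."""
    if existing_alert is None:
        return "create"   # no alert with this message key yet
    if existing_alert["state"] != "Closed":
        return "update"   # duplicate of an alert that is still open
    if event_time - existing_alert["closed_at"] <= window:
        return "reopen"   # closed recently enough to reopen
    return "create"       # assumption: outside the window, a new alert is opened


closed = {"state": "Closed", "closed_at": datetime(2024, 1, 1, 12, 0)}
within = action_for_opening_event(closed, datetime(2024, 1, 1, 12, 30))
outside = action_for_opening_event(closed, datetime(2024, 1, 1, 13, 30))
```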
Alert Management Rules
- Define automated alert responses such as incident creation, subflow execution, or notifications.
- Use alert filters to target specific alerts and the Order field to prioritize rule execution.
- Scheduled jobs apply these rules approximately every 11 seconds; allow a 10-15 second delay before troubleshooting.
- Prefer subflows for customizable alert handling workflows.
- Keep alert-related business rules lightweight; use jobs instead where possible. Avoid binding alerts to CIs via business rules; use event rules instead.
Planning and Operational Best Practices
- Organize event source configurations in parallel efforts for efficiency.
- Validate event formats and test in non-production environments using duplicate element managers or dual event sending.
- Group services logically using Service Groups to simplify dashboards and service health visualization.
Metric Intelligence Collector Logs and Performance
- Collector logs are located in the MID Server directory and include PowerShell metric logs and input/output files.
- Enable debug mode on MID Server for detailed performance monitoring.
- Performance data is accessible in the Performance Statistics table filtered by Metric Collector.
Preferred settings of properties and general configuration.
Use the Known Error Portal and the Community to find further information about issues.
General preferences
- Self-health
- By default, the self-health monitoring feature is not enabled. To enable it, select Yes for the Enable Event Management self-health monitoring (evt_mgmt.self_health_active) property in the Event Management properties. Use this feature to monitor and track many Event Management features.
Note: CIs used in the self-health service are created in the CMDB.
Event integration
- SNMP traps
- Use a monitoring tool to send SNMP traps, rather than sending them directly from devices.
- To avoid having to rewrite event rules, upload MIBs prior to defining the event rules.
- Web service API
- Using a web service API for integration can reduce the number of event rules needed. Because the source sends prepared data in the event, the instance does not have to transform it.
- Use dedicated credentials for integration. Optionally, designate credentials specific to each event source.
- CloudWatch
- Use dedicated credentials for integrating CloudWatch with ServiceNow.
- Email
- Use email only if the source has a low volume and other options, such as running a script or forwarding an SNMP trap, are not available.
- Event rules
- Configuration settings when creating event rules:
- Write Event Rules to apply to the broadest number of events possible. More specific rules can then be created as necessary and should use a lower Order value.
- If a more general rule can achieve the same outcome, avoid writing Event Rules that apply only to a certain subset of events.
- When Event Rules are applied to events, no changes are made to the original event. All processing occurs in memory, so use the Processing Notes field and/or use the Check Process of Event UI action link to troubleshoot.
- If you change a rule or transform that has existing mapping rules, review and retest it with actual or simulated events.
- Ensure that the From field value exactly matches a string in the JSON in the additional_info field of an event. This matching happens when a rule has been configured based on information in a MIB file. If the MIB file is not uploaded, the JSON for the SNMP trap shows varbinds (variable binding) with dotted names, instead of the translated name in the MIB. The event field mapping rule then fails to be applied.
- Establish a consistent naming convention. A common convention is: <customer acronym>.<Event Source>.<Description>. For example, ACME.OEM.Normalize
- If two Event Rules have similar conditions set, use the Order field to control which Event Rule runs.
- Use Event Rules to associate an alert with a CI.
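The exact-match requirement for the From field can be illustrated with a short sketch. It assumes a simplified mapping step that copies one value out of the additional_info JSON; the function name, event layout, and sample trap values are hypothetical. The dotted name used here is the standard IF-MIB identifier for ifIndex.

```python
import json


def apply_field_mapping(event, from_field, to_field):
    """Copy additional_info[from_field] into to_field on a copy of the event.

    Returns None when from_field is not an exact key in additional_info,
    which mirrors a mapping rule that fails to apply.
    """
    info = json.loads(event.get("additional_info") or "{}")
    if from_field not in info:
        return None
    mapped = dict(event)  # the original event is left unchanged
    mapped[to_field] = info[from_field]
    return mapped


# With the MIB uploaded, the varbind appears under its translated name.
translated = {"source": "SNMP", "additional_info": json.dumps({"ifIndex": "3"})}
# Without the MIB, the same varbind appears under its dotted name.
untranslated = {"source": "SNMP",
                "additional_info": json.dumps({"1.3.6.1.2.1.2.2.1.1": "3"})}

ok = apply_field_mapping(translated, "ifIndex", "resource")
failed = apply_field_mapping(untranslated, "ifIndex", "resource")
```

In this sketch `ok` carries the mapped resource value, while `failed` is `None` because the rule was written against the translated name. Uploading MIBs before writing rules avoids having to rewrite them later.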
Alert settings
- Alert lifecycle
- General alert functionality:
- An alert is opened when an event is not ignored by an event rule, or exceeds a threshold set in an event rule, and de-duplication does not identify the event as belonging to an existing alert.
- An alert is closed when a closing event is sent on the same message key, or the alert is closed manually.
- An alert is reopened if an opening event that has the same message key is sent within the timeframe defined in properties (default is one hour).
- If an alert is opened and closed at a high rate, as defined in properties, it becomes flapping. When this opening and closing rate stops, the alert goes out of flapping state.
- If an incident is opened from an alert, that alert remains open as long as the incident remains open. By default, when either the incident or the alert is closed the other is closed as well. This behavior can be configured using properties.
- Do not close an alert when creating a corresponding incident.
- Do not delete an open alert. Close an alert first and then delete it.
- Use Acknowledge to denote that the alert is known, and can temporarily be ignored.
- Do not use Acknowledge to mark an alert as needing attention.
- Do not create alerts in any of these states:
- Closed
- OK
- Open
- The evt_mgmt.alert_auto_close_interval property automatically closes alerts after the specified period. Do not specify 0, as this value disables the feature and may lead to performance degradation.
- Do not create alerts in OK state. In some monitoring systems, OK denotes that an issue has been resolved, while in other monitoring systems OK denotes events that are not of operational significance. For the former case, use a Mapping Rule to set Clear instead of OK. For the latter case, use an Ignore rule, unless the events are of specific value.
- Alert management rules
- Use alert management rules to define automated responses to alerts, such as opening incidents, running subflows, and launching applications or URLs. For more information, see Alert management rules for resolving alerts.
- Use alert filters to specify which alerts the rule applies to. For more information, see Create an alert management rule.
- Use the Order field to control which alert management rule runs first when rules have similar conditions. For more information, see Create an alert management rule.
- Use subflows to customize alert handling. For example, you can resolve alerts or notify teams. For more information, see Create a custom subflow for alerts.
- A scheduled job applies alert management rules to new or updated alerts every 11 seconds. If an alert management rule does not start immediately, allow 10–15 seconds before you start troubleshooting. For more information, see Alert management rules for resolving alerts.
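The interplay of alert filters and the Order field can be pictured with the sketch below. It is a simplified model, not the platform's rule engine: the rule dictionaries, the filter functions, and the assumption that a lower Order value runs first are all illustrative.

```python
def matching_rules(alert, rules):
    """Return the active rules whose filter matches the alert, sorted by order."""
    hits = [rule for rule in rules if rule["active"] and rule["filter"](alert)]
    # Assumption for this sketch: the lowest Order value runs first.
    return sorted(hits, key=lambda rule: rule["order"])


# Hypothetical rules following the <customer acronym>.<Event Source>.<Description> convention.
rules = [
    {"name": "ACME.OEM.CreateIncident", "order": 200, "active": True,
     "filter": lambda a: a["source"] == "OEM"},
    {"name": "ACME.OEM.CriticalNotify", "order": 100, "active": True,
     "filter": lambda a: a["source"] == "OEM" and a["severity"] == 1},
    {"name": "ACME.SNMP.Suppress", "order": 50, "active": True,
     "filter": lambda a: a["source"] == "SNMP"},
]

alert = {"source": "OEM", "severity": 1}
run_order = [rule["name"] for rule in matching_rules(alert, rules)]
```

Both OEM rules match this alert, so the Order value decides which one is evaluated first; the SNMP rule is filtered out entirely.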
Business rules
- Business rules created on alert tables should not take more than a few milliseconds. In place of using a business rule, consider if the same functionality can be achieved using a job.
- Do not use business rules to associate an alert with a CI. Use event rules to do binding instead of using business rules.
Planning
- Organize event source configuration of filters, modules, and so on, into multiple parallel efforts, rather than in serial.
- Validate processed event formats to ensure that the parsed data matches the expected results.
- Test production events in a non-production environment. Integrate with non-production element managers and ServiceNow instances. If non-production element managers are not available, send events from element managers to both production and non-production environments.
Services and dashboard
- Use Service Groups to group services into logical groups to reduce the number of services displayed on the Service Health dashboard.
- Import manually built service maps.
Metric Intelligence collector logs and files
Metric Intelligence collector logs and files are located under the path $(MID_SERVER_DIR)/agent. Use these logs and files for troubleshooting and monitoring purposes.
| Log or file | Path |
|---|---|
| PowerShell metric collector log file | Logs/retrieve_metrics{connector instance ID}.log |
| PowerShell output file | work/metrics/metrics_output_{connector instance ID}.txt |
| PowerShell input file | work/metrics/parameters_{connector instance ID}.txt |
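When troubleshooting several connector instances, it can help to generate these paths rather than type them. The sketch below assembles the three paths from the table for a given MID Server directory and connector instance ID; the directory and ID values are hypothetical, and the file name patterns are copied from the table as written.

```python
from pathlib import PurePosixPath


def collector_paths(mid_server_dir, connector_id):
    """Return the collector log, output, and input file paths for one connector."""
    agent = PurePosixPath(mid_server_dir) / "agent"
    return {
        "log": agent / "Logs" / f"retrieve_metrics{connector_id}.log",
        "output": agent / "work" / "metrics" / f"metrics_output_{connector_id}.txt",
        "input": agent / "work" / "metrics" / f"parameters_{connector_id}.txt",
    }


# Hypothetical installation directory and connector instance ID.
paths = collector_paths("/opt/servicenow/mid", "abc123")
log_path = str(paths["log"])
```

`PurePosixPath` is used only so the example produces the same strings on any platform; on a Windows MID Server host, `pathlib.PureWindowsPath` would be the closer fit.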
Metric Intelligence performance can be checked in the MID Server log file when the mid.log.level MID Server parameter is in debug mode.
Metric Intelligence performance numbers are available in the Performance Statistics [sa_performance_statistics] table. To view the performance numbers, filter the Performance Statistics list for Metric Collector.