Alert Correlation behavior and possible solutions (Rule Based)
2 hours ago
With rule-based correlation, closing the Primary alert also closes its Secondary alerts. This works in most cases, but in some cases it causes serious concerns.
We receive alerts from Application Monitoring sources for customer-facing Systems.
A System has various components, such as multiple instances and a central service, so when a System goes through a degradation or outage we may receive multiple alerts. Today we group them via rule-based correlation into Primary and Secondary. The problem is that if one instance out of many comes back up, and the alert for that instance was the Primary alert, the Primary alert is closed along with its Secondary alerts, even though the System as a whole might still be down. The team may then lose insight into the System outage/degradation. The Customer therefore has the following ask.
- If the Primary alert is closed, do not close the Secondary alerts. Instead, promote the next Secondary alert (based on the alert's time of arrival) to Primary and keep the association with the INC as it is. (We cannot implement this with tag-based clustering, because tag-based clustering only allows configuring tags and does not allow scripting.)
- Moreover, the tag-based clustering solution has a platform issue: we observe delays with this approach, i.e. the clustering itself takes time, and incident creation happens before it completes.
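To make the first ask concrete, here is a minimal sketch of the promotion behavior we are after, written in plain Python. It is only a model of the logic and does not use any ServiceNow API; the names `Alert`, `AlertGroup`, and `close_alert` are hypothetical, and the arrival time is assumed to be a simple number such as epoch seconds.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Alert:
    number: str
    arrived_at: int           # arrival time (assumed: epoch seconds)
    state: str = "Open"       # "Open" or "Closed"
    role: str = "Secondary"   # "Primary" or "Secondary"


@dataclass
class AlertGroup:
    incident: Optional[str]   # the INC tied to the group; never changed here
    alerts: List[Alert] = field(default_factory=list)

    def primary(self) -> Optional[Alert]:
        # The alert currently holding the Primary role, if any.
        return next((a for a in self.alerts if a.role == "Primary"), None)


def close_alert(group: AlertGroup, number: str) -> Optional[Alert]:
    """Close one alert. If it was the Primary, promote the earliest-arrived
    open Secondary instead of closing the whole group.

    Returns the promoted alert, or None if no promotion happened.
    """
    alert = next(a for a in group.alerts if a.number == number)
    alert.state = "Closed"

    if alert.role != "Primary":
        return None  # closing a Secondary never affects the others

    open_secondaries = [
        a for a in group.alerts if a.state == "Open" and a.role == "Secondary"
    ]
    if not open_secondaries:
        return None  # nothing left open; the group is fully closed

    # Hand the Primary role to the oldest open Secondary.
    alert.role = "Secondary"
    promoted = min(open_secondaries, key=lambda a: a.arrived_at)
    promoted.role = "Primary"
    return promoted


# Hypothetical example: three instance alerts tied to one incident.
group = AlertGroup(
    incident="INC0010001",
    alerts=[
        Alert("Alert0001", arrived_at=100, role="Primary"),
        Alert("Alert0002", arrived_at=300),
        Alert("Alert0003", arrived_at=200),
    ],
)
close_alert(group, "Alert0001")
```

In this model the incident reference lives on the group rather than on the Primary alert, which is why promotion leaves the INC association untouched. Whether the platform can be made to behave this way is exactly what we are asking.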
Considering the scale at which ServiceNow ITOM Event Management is implemented, the out-of-the-box central settings are proving to be a hindrance.
Can someone please advise us on the above?

