
on ‎05-30-2020 10:38 AM
Introduction:
In this Event Management series we are looking at the different aspects of event management shown below. Previous articles covered event rules and CI binding; if you missed them, please go through them first for a proper understanding of correlation, because CI binding plays a vital role in it.
- Event Management : Event Rule Components and their Usage
- Event Management : Part 1 of Event Rules for CI Binding - Hardware CI's
- Event Management : Part 2 of Event Rules for CI Binding - Application/Database CI's
In this article we will look at the next aspect of event management, "Alert Correlation", highlighted in sky blue above. Before going in depth on this topic, I would like to thank a few people whose efforts and blogs I drew on, and give them credit. See the articles below for more information:
- Journey through Event Management - Alert Correlation by @aleck.lin
- Alert Correlation: Advanced Processing Example by @vNickNOW
I am going to show some use cases for each correlation type and explain how each one works, so that we can see how the golden circle works in alert correlation.
What is alert correlation?
This is the process of grouping alerts logically and classifying them as primary and secondary alerts. Alert correlation also lets us sort these alerts into different groups. For example, if there is an alert from SCOM on CI "ABC" and another alert from Splunk on the same CI "ABC", an automated alert group is created based on the CI.
How do alert correlation and grouping happen?
There are multiple ways in which a new or reopened alert can be correlated with an existing alert. ServiceNow performs alert grouping and correlation in RAMC order, which we will see in the use cases below. Before the use cases, here is what RAMC stands for. Aleck explains this really well in his article.
R - Rule Based (We configure alert correlation rules as required to decide the primary and secondary alerts using filters, scripts and relationships)
A - Automated (The automated out-of-the-box (OOB) alert correlation mechanism, which works based on CI or Node name)
M - Manual (Self-explanatory: an engineer correlates alerts manually by putting them into groups)
C - CMDB (Based on your CMDB CI relationships)
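To make the RAMC order concrete, here is a rough sketch that walks the four mechanisms in priority order and stops at the first one that matches. It is a toy model and not ServiceNow code: the `Alert` fields and the `pick_group_type` function are hypothetical names invented for this illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Alert:
    number: str
    matches_rule: bool = False        # hypothetical: a correlation rule matched
    matches_automated: bool = False   # hypothetical: automated aggregation matched
    parent: Optional[str] = None      # set by an operator for manual grouping
    related_ci_alert: bool = False    # hypothetical: a related CI also has an alert


# Mechanisms in RAMC order: (group letter, predicate).
RAMC: list[tuple[str, Callable[[Alert], bool]]] = [
    ("R", lambda a: a.matches_rule),
    ("A", lambda a: a.matches_automated),
    ("M", lambda a: a.parent is not None),
    ("C", lambda a: a.related_ci_alert),
]


def pick_group_type(alert: Alert) -> Optional[str]:
    """Return the letter of the first mechanism that matches, or None."""
    for letter, matches in RAMC:
        if matches(alert):
            return letter  # first match wins; later mechanisms are not evaluated
    return None
```

With this model, an alert that matches both a correlation rule and a CMDB relationship ends up in an "R" group, because the rule-based check comes first.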
Let's look at these one by one with a few real-world examples.
1. Rule Based Grouping:
As stated above, this has to be configured by a developer according to your own requirements. It can be done under the Alert Correlation section in the left navigation.
Use Case:
I have a unique case involving Splunk search heads: multiple events arrive from different search heads for the same node, which produces multiple alerts and therefore increases noise. We then have to go and close each alert individually, which is overhead. Let's see how to correlate alerts of this kind.
Before Correlation Rule Creation:
Below you will see that, before the correlation rule was created, the alerts were independent even though the CI/Node is the same. In practice the support team has to acknowledge and resolve each alert separately, which creates more noise and overhead.
After Correlation Rule Creation:
We will create an alert correlation rule as shown below. The primary alert is the alert coming from search head one, with Splunk as the instance. The secondary alerts are the alerts from the other search heads for the same CI and Node, as selected in the relationship type, with an interval of 60 minutes. This means that only alerts created within the last hour are correlated.
(Alert Correlation Rule)
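The logic of this rule can be sketched in a few lines of Python. This is only an illustration of the matching idea under my own assumptions, not how ServiceNow implements it: the `Alert` class, the `correlate` function, the node and CI names, and every alert number and source other than Alert0010144 and Splunk-sh_01 are made up for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=60)    # the 60 minute interval on the rule
PRIMARY_SOURCE = "Splunk-sh_01"   # alerts from search head one become primary


@dataclass
class Alert:
    number: str
    source: str
    node: str
    ci: str
    created: datetime


def correlate(alerts: list[Alert]) -> dict[str, list[str]]:
    """Map each primary alert number to the secondary alert numbers under it."""
    groups: dict[str, list[str]] = {}
    for primary in (a for a in alerts if a.source == PRIMARY_SOURCE):
        groups[primary.number] = [
            a.number
            for a in alerts
            if a.source != PRIMARY_SOURCE        # other search heads only
            and a.ci == primary.ci               # same CI
            and a.node == primary.node           # same Node
            and abs(a.created - primary.created) <= WINDOW
        ]
    return groups


# Hypothetical sample data: three search heads reporting on the same node.
t0 = datetime(2020, 5, 30, 10, 0)
sample = [
    Alert("Alert0010144", "Splunk-sh_01", "node-a", "ci-a", t0),
    Alert("Alert0010145", "Splunk-sh_02", "node-a", "ci-a", t0 + timedelta(minutes=5)),
    Alert("Alert0010146", "Splunk-sh_03", "node-a", "ci-a", t0 + timedelta(minutes=90)),
]
result = correlate(sample)
```

Here `result` groups Alert0010145 under Alert0010144, while Alert0010146 stays ungrouped because it was created 90 minutes later, outside the 60 minute window.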
In this section we will see how the events were created from different sources, with their respective nodes and alerts.
(Events created with Source and Source Instance)
Once the alerts are created and CI binding is done, alert correlation is triggered. Rule-based correlation is processed first, in the order specified on the correlation rules. If a rule matches, it is applied and no other rule or grouping mechanism is evaluated, just as with assignment rules or event rules. Below you can see that Alert0010144 is the primary alert: its source is Splunk-sh_01, its group is RULE BASED, and its role in the group is Primary. Whatever happens to this alert, such as a change of state or feedback on the group, is cascaded to the secondary alerts below it. I have also highlighted the secondary alerts in the screenshot below, where the group field clearly shows that they are the secondary alerts of the group.
(Rule Based Alert Correlation)
The screenshot below shows that whenever a rule is applied, the group column shows "R". Similarly, we have "A", "M" and "C" for the other group types.
2. Automated Grouping:
This grouping and correlation mechanism has second priority and runs only when the property "Enable alert aggregation (sa_analytics.aggregation_enabled)" is true. RCA and alert aggregation automatically group alerts based on the CI or Node field on the alerts; if the CI is empty, the Node field is used to aggregate them. One important point to note is that it creates a virtual alert as the primary alert and adds all the other alerts to the group as secondary alerts. This type is a special case of Service Analytics, which incorporates machine learning to group alerts.
Use Case:
Group alerts coming from different sources for same Node/CI created in last 1 hr.
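The CI-or-Node fallback and the virtual primary alert can be pictured with a simplified sketch like the one below. Please treat it as a mental model only: the real feature is driven by Service Analytics, and this code ignores both the machine learning part and the time window. The `Alert` class, `aggregation_key` and `build_groups` are hypothetical names for this example.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Alert:
    number: str
    source: str
    node: str
    ci: Optional[str] = None


def aggregation_key(alert: Alert) -> str:
    """Use the CI when it is set; otherwise fall back to the Node."""
    return alert.ci if alert.ci else alert.node


def build_groups(alerts: list[Alert]) -> list[dict]:
    """Create one virtual primary alert per key that has two or more alerts."""
    buckets: dict[str, list[Alert]] = defaultdict(list)
    for alert in alerts:
        buckets[aggregation_key(alert)].append(alert)

    groups = []
    for key, members in buckets.items():
        if len(members) < 2:
            continue  # a single alert is left ungrouped
        groups.append({
            "primary": {"source": "Group Alert", "key": key},  # virtual alert
            "secondaries": [a.number for a in members],
        })
    return groups
```

Every real alert ends up as a secondary, and the primary is a virtual alert whose source is "Group Alert", which matches what the alert list shows for automated groups.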
Event, Alert and Grouped Alert:
(Event)
Below you can see that we have three secondary alerts and one virtual alert whose source is Group Alert; the virtual alert is created automatically. This is alert intelligence at work, and you can see it in the list in Agent Workspace as shown below. Only the primary alert appears there, not the secondary alerts.
(Alert and Grouped Alert)
3. Manual Grouping:
As the name suggests, this is where an operator adds alerts to a group manually by assigning a parent to an alert, that is, by putting one alert into the Parent field of other alerts, as below. Alert0010056 is made secondary by setting Alert0010049 as its parent. Once you do this, the alerts are automatically placed in a manual group, marked with "M" as shown below. ServiceNow also makes a note of this change, and the next time it groups alerts it can reuse the pattern that was used here.
(Alert form)
(Group Alert)
4. CMDB Grouping:
This type of grouping uses the CI relationships in the suggested CI relationships table. A few system properties must be true to allow CMDB alert grouping, so please go through this link to check whether they are enabled: https://docs.servicenow.com/bundle/orlando-it-operations-management/page/product/event-management/co.... One very important point: this grouping is applied only if neither rule-based nor automated grouping has been applied to the alert. The service map makes it easy to see why the alerts are grouped.
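One way to picture relationship-based grouping is as a graph problem: CIs are nodes, CI relationships are edges, and alerts whose CIs are connected land in the same group. The sketch below assumes plain undirected connectivity, which is my simplification; the article does not describe how ServiceNow actually traverses relationships. The function names and the CI names used with them are hypothetical.

```python
from collections import defaultdict, deque


def connected_components(relationships: list[tuple[str, str]]) -> dict[str, int]:
    """Label every CI with the id of the connected component it belongs to."""
    neighbours: dict[str, set[str]] = defaultdict(set)
    for parent, child in relationships:
        neighbours[parent].add(child)
        neighbours[child].add(parent)

    component: dict[str, int] = {}
    next_id = 0
    for start in neighbours:
        if start in component:
            continue
        # Breadth-first walk over everything reachable from this CI.
        queue = deque([start])
        component[start] = next_id
        while queue:
            ci = queue.popleft()
            for other in neighbours[ci]:
                if other not in component:
                    component[other] = next_id
                    queue.append(other)
        next_id += 1
    return component


def group_by_relationship(alert_cis: dict[str, str],
                          relationships: list[tuple[str, str]]) -> list[list[str]]:
    """Group alert numbers whose CIs are connected; single alerts stay ungrouped."""
    component = connected_components(relationships)
    buckets: dict[int, list[str]] = defaultdict(list)
    for number, ci in alert_cis.items():
        if ci in component:
            buckets[component[ci]].append(number)
    return [sorted(numbers) for numbers in buckets.values() if len(numbers) > 1]
```

For example, alerts on an application service, its web server and its database would fall into one group, while an alert on a CI from an unrelated service would stay on its own.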
Use case:
Create alerts for different CIs on the same application service. They should be grouped automatically into a CMDB group based on CI relationships. We will create alerts on the highlighted CIs below:
Event, Alert and Grouped Alert:
The screenshots below show how the events are created with different alerts: Alert0010152, Alert0010153, Alert0010154 and Alert0010156. Once the alerts are created, the aggregation engine runs and tries to group them under one virtual alert using the grouping mechanism. Alert0010155 is treated as the parent virtual alert for all these alerts, as shown below.
(Events)
(Virtual Grouped Alert)
(Grouped Alert on operator workspace)
Why is it required?
After looking at the examples above, there are a few obvious reasons why these correlations are required. In short:
- Reduces noise.
- Helps operators resolve alerts in bulk.
- Reduces overhead.
- Helps provide feedback in bulk for grouped alerts.
Concluding Notes:
We saw how to group alerts and create alert correlation rules, how they are used and when they are applied. A few important takeaways:
- At any given time, an alert should be part of only one group.
- A correlation rule is applied only when a new alert is created or when the alert state changes from Closed or Flapping to Open or Reopened.
- Closing the primary alert is cascaded to the secondary alerts.
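Two of these takeaways, the one-group rule and the cascading close, can be expressed as a small sketch. This is a toy model to show the behaviour, not ServiceNow code; the `Alert` class, `add_secondary` and `close_alert` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)
class Alert:
    number: str
    state: str = "Open"
    parent: Optional["Alert"] = None
    secondaries: list["Alert"] = field(default_factory=list)


def add_secondary(primary: Alert, alert: Alert) -> None:
    """Attach an alert to a primary; an alert may belong to only one group."""
    if alert.parent is not None:
        raise ValueError(f"{alert.number} is already grouped under {alert.parent.number}")
    alert.parent = primary
    primary.secondaries.append(alert)


def close_alert(alert: Alert) -> None:
    """Close an alert and cascade the closure to its secondary alerts."""
    alert.state = "Closed"
    for secondary in alert.secondaries:
        secondary.state = "Closed"
```

Closing the primary therefore closes the whole group in one step, which is where the bulk handling benefit comes from.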
YouTube Video : https://youtu.be/P1OB48PZxLw
Please comment and suggest anything that could be improved.
Please don't forget to mark this article as helpful, bookmark it, and subscribe to my YouTube channel.
Thanks and Regards,
Ashutosh Munot
Nice and helpful overview Ashutosh!

Thank you so much ROB!!!!!
Hi Ashutosh,
Hope you are doing fine. In the "Automated Grouping" scenario you explained above, you said the use case is to "Group alerts coming from different sources for same Node/CI created in last 1 hr". Where can we change the grouping parameter from 1 hour? Can we change it to 10 minutes or 2 hours?
For example, I need automated alert groups to be created for alerts from the same Node/CI in the last 2 hours. How can we achieve this?

That is just for demo purposes and explanation. You cannot set this parameter to 1 hour, 2 hours or a number of minutes. For automated alert groups, AI and ML algorithms work behind the scenes: they look for patterns in the alert data, and the ML model groups the alerts accordingly.
=======
To create automated alert groups, aggregation algorithms rely on historical alerts with the same alert identifier (CI and metric identifier) and which occurred multiple times in the same time frame.
=======
In an event rule you can control the frequency and the number of minutes after which an event is converted into an alert, but not the timing of automated alert grouping.
Hope this helps.
Hi Ashutosh - great article. What's the chance of an update with the latest automated (feature identifier and patterns) grouping examples?

Can you explain a bit more about what you want?
Thanks,
Ashutosh
Yes - automated alert grouping uses "To create automated alert groups, aggregation algorithms rely on historical alerts with the same alert identifier (CI and metric identifier) and which occurred multiple times in the same time frame" per the documentation. An explanation of this would be great, as well as adding the new text based (NLP) grouping.

Will get back with some more information on this.
Hello Ashutosh,
Thank you for the series of amazing articles. I would like to understand how the Patterns are working in the case of Automated as well as Manual Grouping. How does it process the historical data ? I found below scheduled jobs which query the historical data, Please help me understand how this actually works.

Hi,
Will come back with a proper answer.
Hi Ashutosh,
First of all, thanks for creating an excellent and easy to understand article.
I have a query regarding CMDB grouping: I am trying to implement the same use case, but it is not working as expected.
Just pasting my properties here, please guide me if I am missing anything to get expected behavior.
Hi Ashutosh,
If we disable alert grouping, will it affect incident creation? Or, for automated alert groups, if we extend the time difference between 2 alerts to more than 10 min (sa_analytics.agg.query_dynamic_window), will the primary alert then not trigger an incident?
Hi Ashutosh,
The client wants only CMDB alert grouping. We tried adjusting the correlation properties, but we are still getting automated alert grouping as well. Please advise which properties need to be updated if we want only CMDB alert grouping.
Regards
Yokesh
How can we set the priority for the groups? If an alert qualifies for several groups, how can we give priority to a particular group?
@Ashutosh Munot1 It's a 3-year-old article, but it's so well written that it's still very relevant and useful. Thank you for your contribution!
Informative article.
@Ashutosh Munot1 Can you please advise: if we have a requirement for "Automated alert grouping" for a specific LOB, is it possible to achieve that?
@Ashutosh Munot1 Do you know how to manage and auto-close a group alert?
1. The group alert has a different message key (not the same as the primary or secondary).
2. How can we close the primary alert automatically so that the group alert and secondary alerts are closed?
Is there any feature to ungroup based on certain fields? The requirement is that, based on certain fields (for instance, metric name), the alerts have to be ungrouped so that the incidents can be assigned to different assignment groups. Please advise.
Hi @Ashutosh Munot1 ,
Your examples show multiple incidents. Can we avoid creating multiple incidents? Only the primary alert should have an incident, not the secondary alerts. I need your help with this.
Thanks
Hi @SK5555 ,
You can add a filter in the Alert Rule as "Group is not secondary" for incident creation. This way, incidents will be created only for primary alerts.
Thank you
Great @Sravani36 ,
What is the difference between alert tag clustering and alert correlation? Which one is better for reducing incident noise? Please help with this.
Can we avoid multiple incident creation by using tag clustering?
Hello team,
@Ashutosh Munot1 @Sravani36
If we use tag clustering, it will generate one Virtual Alert, and all related alerts will be grouped under it as secondary alerts. The issue is that when the first alert comes in, an incident is created before the Virtual Alert is generated. Later, when other alerts matching the tag are clustered under the Virtual Alert, those alerts may already have incidents. What I want is only one incident, and it should be created on the Virtual Alert.
Hi @SK5555 ,
I haven’t really worked with alert clustering yet. From what I understand, it needs tag-based discovery, and the tag should also be present in the alert field. In our case, we mainly rely on correlation rules and alert grouping.
The way it works is: when the first alert (primary alert) comes in with a node and CI, any new alert with the same node and CI within the defined time window is grouped as a secondary alert based on the correlation rule. Based on our Alert Management rule, incidents are created only for the primary alerts, not the secondary ones. For correlation, you can use any field, such as the metric name or any other attribute you consider unique for grouping.
I just added a sample alert rule for reference.
Thank you