Tuesday
Hello All,
We are experiencing an issue with Incidents created through the LogicMonitor integration. These incidents are generated for collector failovers, and the failover details are included in the Incident description.
When the failover is restored, LogicMonitor sends a notification, which is added as a work note to the Incident (e.g., indicating that the failover is back online for the collector). However, the Incident does not close automatically based on this update.
Currently, users are manually closing these incidents, and when the same collector goes down again, the Incident is reopening as expected. If I implement an auto-close mechanism, it should not interfere with this expected reopen behavior, even if the Incident is in a closed state.
Is there a recommended approach to configure these Incidents to auto-close when a specific work note is added by the LogicMonitor integration (for example, when the work note states that the failover has been resolved and the collector is back online)?
Any guidance or best practices would be greatly appreciated. Thank you!
yesterday
Please see below for the LogicMonitor documentation that explains how collectors are monitored and how alerts are generated and cleared. I am not sure why your LM team says alerts will not be auto-cleared, since that should be the case. You can ask them to check with the LogicMonitor support team to clarify this if needed.
There are multiple scenarios when it comes to collector failover, failback and the related alerting configuration. I have worked extensively on Monitoring & Event Management tools and their integration with ITSM systems for different vendors. Collectors are remote agents that collect data from the underlying infrastructure components at a scheduled interval [5 minutes, 15 minutes, etc.], and the collected data is compared against the thresholds defined for the monitored attributes. When a threshold is breached, an alert is generated in the system. That alert can be integrated with an ITSM system to create incidents automatically, which are then acted upon to restore the services and monitored components.
In a Production environment, collectors are configured for High Availability, and sometimes in an HA + DR setup. When the Primary collector goes down, there are typically 2 alerts: one for the Primary collector going down and another for data collection being impacted. Based on the failover configuration, the Secondary collector automatically picks up the data collection jobs and data collection resumes. At that point, the data collection impacted alert should be auto-cleared, while the Primary collector down alert remains open. When the Primary collector is back online, its alert is cleared and, depending on your failback configuration, data collection continues from either the Primary or the Secondary collector. At a high level, this mechanism is the same across vendors of Monitoring and Event/Alert Management tools.
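To make the sequence above concrete, here is a toy model of which alerts are open at each stage. It is only an illustration of the description in this post: the alert labels, the `openAlerts` function and the `failoverTimeline` array are made-up names, not anything taken from LogicMonitor.

```javascript
// Toy model of the HA scenario described above; all names are illustrative.
function openAlerts(state) {
    var alerts = [];
    if (!state.primaryUp) alerts.push('primary collector down');
    if (!state.collecting) alerts.push('data collection impacted');
    return alerts;
}

var failoverTimeline = [
    { event: 'primary fails',        primaryUp: false, collecting: false },
    { event: 'secondary takes over', primaryUp: false, collecting: true },
    { event: 'primary restored',     primaryUp: true,  collecting: true }
];

// Print the open alerts after each event in the timeline.
failoverTimeline.forEach(function (step) {
    var alerts = openAlerts(step).join(', ') || 'no open alerts';
    console.log(step.event + ': ' + alerts);
});
```

In this model the incident raised for the primary collector alert is the one that should only be resolved at the last step, once the clear notification arrives.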
The LogicMonitor - ServiceNow integration uses an import set and a transform map.
https://store.servicenow.com/store/app/acdbabea1b246a50a85b16db234bcb15
If the LogicMonitor Product team confirms this is the expected behavior and your team has configured the collector failover alerts correctly, you can try the approach below:
Identify the Transform Event Script that carries out incident updates. Add a condition that checks for the alert category (or another filter that uniquely identifies collector failover alerts) and for a work note that contains "collector is back online" [these fields will be part of the import set table], then update the incident state as per your requirements. This way you do not introduce an additional script or Flow Designer Action outside your integration, and the change is handled as part of the existing configuration.
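As a rough sketch of that condition, something like the following could be added to the transform script. Please treat the names as assumptions: `u_alert_category` and `u_work_notes` are hypothetical import set columns, the category value and recovery text are placeholders, and the `close_code` value has to match your own choice list. State 6 is Resolved in the default incident state model. The logic is written as plain functions so it can be tried outside the instance; inside a transform script the platform supplies `source`, `target` and `action`.

```javascript
// Hypothetical import set column names and values; check the staging table
// used by your LogicMonitor transform map for the real ones.
var FAILOVER_CATEGORY = 'collector failover';   // assumed category value
var RECOVERY_TEXT = 'collector is back online'; // replace with the actual text
var STATE_RESOLVED = 6;                         // Resolved in the default state model

// True when the incoming row is a recovery update for a failover alert.
function isFailoverRecovery(source) {
    var category = String(source.u_alert_category || '').toLowerCase();
    var note = String(source.u_work_notes || '').toLowerCase();
    return category === FAILOVER_CATEGORY && note.indexOf(RECOVERY_TEXT) !== -1;
}

// Logic for an onBefore transform script: source is the import set row,
// target is the incident, action is 'insert' or 'update'.
function resolveOnRecovery(source, target, action) {
    if (action !== 'update' || !isFailoverRecovery(source)) {
        return false; // leave the incident untouched
    }
    target.state = STATE_RESOLVED;
    target.close_code = 'Solved (Permanently)'; // assumed value, match your choice list
    target.close_notes = 'Auto-resolved: LogicMonitor reported the collector back online.';
    return true;
}
```

Setting the state to Resolved rather than Closed is my assumption here; adjust it to whatever works with the reopen behavior you already have.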
I hope you appreciate the efforts to provide you with detailed information. As per community guidelines, you can accept more than one answer as accepted solution. If my responses helped to guide you or answer your query, please mark it helpful & accept the solution.
Thanks,
Bhuvan
4 hours ago
If you do not want to customize the transform event script, you can create either a Business Rule or a Flow. Since this is a simple requirement, I would recommend a Business Rule.
Create an after BR on incident with filter conditions specific to the Collector Failover alert from LM [category or any field that can uniquely identify the incident created for failover], plus work notes changes and caller is LogicMonitor. In the script, read the latest work note for the record from the journal entry and, if it contains 'Collector is back online' [change it to the actual expected text], update the state of the incident.
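A minimal sketch of such a Business Rule script is below. The caller display name, the recovery phrase and the `close_code` value are assumptions to replace with your own, and `autoResolveFailover` is just a name I picked. `changes()`, `getJournalEntry(1)` and `getDisplayValue()` are the standard GlideElement calls, and state 6 is Resolved in the default incident state model.

```javascript
var RECOVERY_PHRASE = 'collector is back online'; // replace with the actual expected text
var LM_CALLER = 'LogicMonitor';                   // assumed display name of the integration caller
var RESOLVED = 6;                                 // Resolved in the default state model

// Pure check so the matching can be tested outside the platform.
function noteSignalsRecovery(noteText) {
    return String(noteText || '').toLowerCase().indexOf(RECOVERY_PHRASE) !== -1;
}

// In the Business Rule script field this body would sit inside the usual
// (function executeRule(current, previous) { ... })(current, previous); wrapper.
function autoResolveFailover(current) {
    if (!current.work_notes.changes()) return false;
    if (current.caller_id.getDisplayValue() !== LM_CALLER) return false;
    if (parseInt(current.getValue('state'), 10) >= RESOLVED) return false; // already resolved or closed

    var latest = current.work_notes.getJournalEntry(1); // 1 = most recent entry only
    if (!noteSignalsRecovery(latest)) return false;

    current.setValue('state', RESOLVED);
    current.setValue('close_code', 'Solved (Permanently)'); // assumed value, match your choice list
    current.setValue('close_notes', 'Auto-resolved: LogicMonitor reported the collector back online.');
    current.update(); // an after rule has to save explicitly; a before rule would not need this call
    return true;
}
```

Most of these checks can also go in the filter conditions of the BR instead of the script, which keeps the script down to the work note comparison.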
I would also recommend that you submit a Support case with LogicMonitor and ServiceNow and request that they create a known problem and enhance the integration on either the LM or the ServiceNow end. The BR is more of a workaround; incident closure for Collector Failover should be handled automatically. If that is not available now, they can add it in a future app update.
Thanks,
Bhuvan
Tuesday - last edited Tuesday
@lvenna - I would do this via Flow Designer and follow the steps below:
1. Trigger: watches for work note updates. Table: Incident; When: Record Updated; Condition: Work Notes changes.
2. Check Source: makes sure the update is from Logic Monitor. If: work_notes contains "Logic Monitor" OR updated_by contains "logicmonitor".
3. Check Status: only acts on open incidents. If: state ≠ Resolved.
4. Look for Keywords: finds restoration messages. If: work_notes contains any of: "collector back online", "failover restored", "connectivity restored", "alert cleared".
5. Auto-Close: updates the incident. Set: state = Resolved; Add: resolution note; Log: the action.
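Flow Designer is configured in the UI rather than in code, but the decision made in steps 2 to 5 can be written out as a plain function, which may help when checking the keyword list against real work notes. This is only a sketch: `decideAutoClose` and the shape of the `incident` object (`latestWorkNote`, `updatedBy`, `state`) are hypothetical, not Flow Designer names.

```javascript
var RESTORE_KEYWORDS = [
    'collector back online',
    'failover restored',
    'connectivity restored',
    'alert cleared'
];

// Mirrors steps 2 to 5 of the flow on a plain object (hypothetical shape).
function decideAutoClose(incident) {
    var note = String(incident.latestWorkNote || '').toLowerCase();
    var updatedBy = String(incident.updatedBy || '').toLowerCase();

    // Step 2: the update must come from the monitoring integration.
    var fromLM = note.indexOf('logic monitor') !== -1 || updatedBy.indexOf('logicmonitor') !== -1;
    if (!fromLM) return { close: false, reason: 'not from Logic Monitor' };

    // Step 3: skip incidents that are already resolved.
    if (incident.state === 'Resolved') return { close: false, reason: 'already resolved' };

    // Step 4: look for any restoration keyword in the note.
    for (var i = 0; i < RESTORE_KEYWORDS.length; i++) {
        if (note.indexOf(RESTORE_KEYWORDS[i]) !== -1) {
            // Step 5: the flow would set state = Resolved and add a resolution note.
            return { close: true, reason: 'matched "' + RESTORE_KEYWORDS[i] + '"' };
        }
    }
    return { close: false, reason: 'no restoration keyword' };
}
```

Matching is case-insensitive here, which is a choice on my side; the exact phrases should be taken from the notes your integration actually writes.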
If you are wondering why I suggest a flow, please refer to the links below:
https://www.servicenow.com/community/developer-blog/flow-designer-vs-business-rules/ba-p/3100188
https://developer.servicenow.com/dev.do#!/guides/zurich/now-platform/pro-dev-guide/pd-build-logic
https://www.servicenow.com/community/itsm-forum/flow-designer-over-business-rules/m-p/508225