Auto-Closing Incidents Triggered by LogicMonitor Integration

lvenna
Tera Contributor

Hello All,

 

We are experiencing an issue with Incidents created through the LogicMonitor integration. These incidents are generated for collector failovers, and the failover details are included in the Incident description.

When the failover is restored, LogicMonitor sends a notification, which is added as a work note to the Incident (e.g., indicating that the failover is back online for the collector). However, the Incident does not close automatically based on this update.

Currently, users are manually closing these incidents, and when the same collector goes down again, the Incident reopens as expected. If I implement an auto-close mechanism, it should not interfere with this expected reopen behavior, even if the Incident is in a closed state.

Is there a recommended approach to configure these Incidents to auto-close when a specific work note is added by the LogicMonitor integration (for example, when the work note states that the failover has been resolved and the collector is back online)?

Any guidance or best practices would be greatly appreciated. Thank you!

 

2 ACCEPTED SOLUTIONS

@lvenna 

 

Please see below for the LogicMonitor documentation that explains how collectors are monitored and how alerts are generated and cleared. I am not sure why your LM team is saying alerts will not be auto-cleared, since they should be. You can ask them to check with the LogicMonitor support team to clarify this if needed.

[Image: Bhuvan_0-1757518676518.png]

There are multiple scenarios when it comes to collector failover, failback and the related alerting configuration. I have worked extensively on Monitoring & Event Management tools and their integration with ITSM systems for different vendors. Collectors are remote agents that collect data from the underlying infrastructure components at a scheduled interval [5 minutes, 15 minutes, etc.], and the collected data is compared against the thresholds defined for the monitored attributes. When a threshold is breached, an alert is generated in the system. The alert can be integrated with an ITSM system to create incidents automatically, which are then acted upon to restore the services and monitored components.

 

In a Production environment, collectors are configured in High Availability and sometimes in an HA + DR setup. When the Primary collector goes down, the Secondary collector automatically picks up the data collection jobs. In this example there will typically be 2 alerts: one for the Primary collector going down and another for data collection being impacted. Based on the failover configuration, the Secondary collector starts picking up the data collection jobs and data collection resumes. At that point, the data collection impacted alert should be auto-cleared, while the Primary collector down alert remains open. When the Primary collector is back online, its alert is cleared and, depending on your failback configuration, data collection continues from either the Primary or the Secondary collector. At a high level, this is the same mechanism for all Monitoring and Event/Alert Management vendors.

 

LogicMonitor - ServiceNow integration uses import set and transform map for the integration. 

 

https://store.servicenow.com/store/app/acdbabea1b246a50a85b16db234bcb15

 

[Image: Bhuvan_1-1757519808865.png]

If the LogicMonitor Product team confirms this is the expected behavior and your team has configured collector failover alerts correctly, you can try the approach below.

 

Identify the Transform Event Script that carries out incident updates and add a condition that checks two things: the alert category (or another filter that uniquely identifies collector failover alerts) and whether the work note contains "collector is back online". Both fields will be part of the import set table. When both match, update the incident state as per your requirements. This way you do not introduce an additional script or Flow Designer Action outside your integration, and the change is handled as part of the existing configuration.
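
As a rough illustration of the condition described above, here is a minimal sketch of the check such a transform script could run. The import set column names (u_alert_type, u_work_notes), the filter value, the restore phrase and the close code are all assumptions made for illustration and are not taken from the LogicMonitor app. State 6 is the out-of-box Resolved value on Incident.

```javascript
// Sketch only: all column names and match strings below are assumptions.
var FAILOVER_ALERT_TYPE = 'collector failover';   // hypothetical filter value
var RESTORE_PHRASE = 'collector is back online';  // replace with the real text LM sends
var STATE_RESOLVED = 6;                           // out-of-box Incident "Resolved"

// Convert a field value (GlideElement or plain value) to lower-case text.
function toLowerText(value) {
    return (value === null || value === undefined) ? '' : String(value).toLowerCase();
}

// True when the import row is a failover alert whose note reports recovery.
function isFailoverRestore(source) {
    return toLowerText(source.u_alert_type).indexOf(FAILOVER_ALERT_TYPE) !== -1 &&
           toLowerText(source.u_work_notes).indexOf(RESTORE_PHRASE) !== -1;
}

// Sets resolution fields on the target incident; returns true if it changed anything.
function resolveOnRestore(source, target) {
    var state = parseInt(toLowerText(target.state), 10);
    if (state >= STATE_RESOLVED || !isFailoverRestore(source)) {
        return false; // already Resolved (6), Closed (7) or Canceled (8), or not a restore message
    }
    target.state = STATE_RESOLVED;
    target.close_code = 'Solved (Permanently)'; // assumed; use a code valid in your instance
    target.close_notes = 'Auto-resolved: LogicMonitor reported the collector is back online.';
    return true;
}

// Inside the existing transform script the call would look roughly like:
// (function runTransformScript(source, map, log, target) {
//     resolveOnRestore(source, target);
// })(source, map, log, target);
```

The sketch sets Resolved rather than Closed on the assumption that this is less likely to interfere with the existing reopen behavior. Please verify that in a sub-production instance before relying on it.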

 

I hope you appreciate the efforts to provide you with detailed information. As per community guidelines, you can accept more than one answer as accepted solution. If my responses helped to guide you or answer your query, please mark it helpful & accept the solution.

 

Thanks,

Bhuvan


@lvenna 

 

If you do not want to customize the transform event script, you can create either a Business Rule or a Flow. Since this is a simple requirement, I would recommend a Business Rule.

 

Create an after BR on Incident with filter conditions specific to the Collector Failover alert from LM: a category or any other field that uniquely identifies the incident created for failover, work notes changes, and caller is LogicMonitor. In the script, read the latest work note for the record from the journal entries and, if it contains 'Collector is back online' [replace it with the actual expected text], update the state of the incident.
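
The script part of such a Business Rule could look something like the sketch below. It assumes the filter conditions (failover category, work notes changes, caller) are set in the rule's condition builder, so the script only checks the latest work note. The restore phrase and close code are placeholders, and state 6 is the out-of-box Resolved value. getJournalEntry(1) is the GlideElement call that returns the most recent journal entry.

```javascript
// Sketch only: the phrase and close code are assumptions.
var RESTORE_TEXT = 'collector is back online'; // replace with the actual text from LM
var RESOLVED = 6;                              // out-of-box Incident "Resolved"

// Returns true when the most recent work note reports that the collector recovered.
function latestNoteReportsRestore(current) {
    // getJournalEntry(1) returns the most recent entry, header line included.
    var latest = String(current.work_notes.getJournalEntry(1) || '');
    return latest.toLowerCase().indexOf(RESTORE_TEXT) !== -1;
}

// Sets resolution fields on the record; the caller decides how to save it.
function resolveIfRestored(current) {
    var state = parseInt(String(current.state), 10);
    if (state >= RESOLVED || !latestNoteReportsRestore(current)) {
        return false; // already Resolved/Closed/Canceled, or not a restore note
    }
    current.state = RESOLVED;
    current.close_code = 'Solved (Permanently)'; // assumed; use a code valid in your instance
    current.close_notes = 'Auto-resolved: LogicMonitor reported the collector is back online.';
    return true;
}

// Business Rule wiring (after rule, so the journal entry already exists):
// (function executeRule(current, previous) {
//     if (resolveIfRestored(current)) {
//         current.update(); // an after rule has to save explicitly; test for recursion
//     }
// })(current, previous);
```

Because an after rule needs its own update() call, it is worth confirming in a sub-production instance that the rule does not re-trigger itself and that the incident still reopens when the collector goes down again.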

 

I would recommend submitting a support case with LogicMonitor and ServiceNow, asking them to create a known problem and enhance the integration on either the LM or the ServiceNow end. The BR is more of a workaround; incident closure for Collector Failover should be handled automatically. If it is not available now, they can add it in a future app update.

 


 

Thanks,

Bhuvan


21 REPLIES

@lvenna 

 

Glad to know it helped!

 

Thanks,

Bhuvan

Chavan AP
Kilo Sage

@lvenna - I would do this via Flow Designer, following the steps below:

 

1. Trigger: watches for work note updates
Table: Incident
When: Record Updated
Condition: Work Notes changes

2. Check Source: makes sure it's from Logic Monitor
If: work_notes contains "Logic Monitor"
OR updated_by contains "logicmonitor"

3. Check Status: only acts on open incidents
If: state ≠ Resolved

4. Look for Keywords: finds restoration messages
If: work_notes contains any of:
• collector back online
• failover restored
• connectivity restored
• alert cleared

5. Auto-Close: updates the incident
Set: state = Resolved
Add: resolution note
Log: the action
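
For anyone who wants to try the decision in steps 2 to 4 outside the platform first, it can be sketched as a small function. This is not Flow Designer configuration; the shape of the update object (workNote, updatedBy, state) is hypothetical, and state 6 is assumed to be the out-of-box Resolved value.

```javascript
// Keywords from step 4; matching is case-insensitive.
var RESTORE_KEYWORDS = [
    'collector back online',
    'failover restored',
    'connectivity restored',
    'alert cleared'
];
var STATE_RESOLVED_VALUE = 6; // assumed out-of-box Incident "Resolved"

// Case-insensitive "contains" that tolerates missing values.
function containsText(haystack, needle) {
    return String(haystack || '').toLowerCase().indexOf(needle) !== -1;
}

// update is a hypothetical plain object: { workNote, updatedBy, state }.
function shouldAutoResolve(update) {
    // Step 2: the update comes from the LogicMonitor integration.
    var fromLogicMonitor = containsText(update.workNote, 'logic monitor') ||
                           containsText(update.updatedBy, 'logicmonitor');
    // Step 3: the incident is not already Resolved.
    var notResolved = Number(update.state) !== STATE_RESOLVED_VALUE;
    // Step 4: the work note contains at least one restoration keyword.
    var hasKeyword = RESTORE_KEYWORDS.some(function (keyword) {
        return containsText(update.workNote, keyword);
    });
    return fromLogicMonitor && notResolved && hasKeyword;
}
```

In the flow itself each of these checks would be an If condition, and step 5 would be an Update Record action on the incident.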

 

And if you are wondering why Flow, please refer to the links below:

https://www.servicenow.com/community/developer-blog/flow-designer-vs-business-rules/ba-p/3100188

https://developer.servicenow.com/dev.do#!/guides/zurich/now-platform/pro-dev-guide/pd-build-logic

https://www.servicenow.com/community/itsm-forum/flow-designer-over-business-rules/m-p/508225


Glad I could help! If this solved your issue, please mark it as Helpful and Accept as Solution so others can benefit too.

Chavan A.P. | Technical Architect | Certified Professional