Trigger notification only if MID server is down for a certain period of time?

twoodruff
Kilo Explorer

Hello, friends!

I'm having a devil of a time trying to figure out how to do the following:

Trigger a (single) notification, ONLY if a (any) MID server is in DOWN state for 6 hours or more.

I figure I can create a notification that triggers when the server state changes and is down, but where to go from there? I've been searching the wiki, but couldn't find anything on how to tell the system to do something like...

Wait for 6 hours...

Then trigger the notification...

Except if the state changes to UP during those 6 hours...

Abort (do not send notification).

Has anyone got any ideas?

Thanks so much for any help you can provide. This has really bent my mind today. 😛

1 ACCEPTED SOLUTION

bernyalvarado
Mega Sage

Hi, there are actually multiple ways of doing this:



There's a script include called MonitorMIDServer. It has a function named hasRecentActivity that you could leverage to check whether a MID server has had no activity in the last 6 hours.




You also have the option to monitor the events "mid_server.down" and "mid_server.up" and, based on the timing of those, trigger the respective notification.
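As a rough illustration of the event-based option, here is a sketch in plain JavaScript (no ServiceNow APIs) of the "wait 6 hours, abort if it comes back up" logic. The `events` array and its `name`, `agent` and `timeMs` properties are assumptions made up for this example; on an instance you would have to gather the equivalent data from wherever the down/up events are recorded.

```javascript
// Sketch only: plain JavaScript, no ServiceNow APIs.
// `events` is a hypothetical, time-ordered list of { name, agent, timeMs }
// records standing in for logged mid_server.down / mid_server.up events.
function agentsDownSince(events, nowMs, thresholdMs) {
    var downSince = {}; // agent name -> time the current outage started

    events.forEach(function (ev) {
        if (ev.name === 'mid_server.down') {
            // Only remember the first "down" of the current outage.
            if (!(ev.agent in downSince)) {
                downSince[ev.agent] = ev.timeMs;
            }
        } else if (ev.name === 'mid_server.up') {
            // Recovery cancels the pending alert for this agent.
            delete downSince[ev.agent];
        }
    });

    // Keep only the agents whose outage has lasted at least the threshold.
    return Object.keys(downSince).filter(function (agent) {
        return nowMs - downSince[agent] >= thresholdMs;
    });
}
```

Called periodically with the current time and a threshold of 6 hours in milliseconds, it returns the names of the agents that went down at least 6 hours ago and have not come back up since.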




Here is the script include function hasRecentActivity, which you could use as a baseline for your code:


    /*
     * Check if the given agent name has written to the ecc_queue more recently than the given time.
     * This protects against marking an overloaded (but functioning) mid server as 'down' inappropriately.
     */
    hasRecentActivity: function(agentName, time) {
        var gr = new GlideRecord('ecc_queue');
        gr.addQuery('sys_created_on', '>', time);
        gr.addQuery('sys_created_on', '<=', gs.minutesAgo(0));
        gr.addQuery('agent', 'mid.server.' + agentName);
        gr.addQuery('queue', 'input');
        gr.query();
        if (gr.hasNext())
            return true;

        return false;
    },



Thanks,


Berny



9 REPLIES

Tim Woodruff
Mega Guru

Hey Berny,



Thanks again for your great answer. I feel like such a dunce, but I can't seem to figure this one out.


It seems to me that the solution would be to wait for the server to go down, then initiate some kind of a timer and check if the condition is still true (the server is still down). If it is, send the notification.


However, outside of a workflow, I just can't seem to figure out how to start a 'timer' or 'wait' or 'sleep' activity. Did you have something else in mind?



Or maybe I could just go off the "last_refreshed" field. That sounds easy enough. What do you think?
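For what it's worth, the "last_refreshed" idea boils down to a simple age check. Below is a sketch in plain JavaScript; `isStale` is a made-up helper name, and it assumes the field value has already been converted to epoch milliseconds and that the field really does stop updating while the server is down (worth verifying on your own instance).

```javascript
// Sketch only: plain JavaScript, no ServiceNow APIs.
// lastRefreshedMs: hypothetical epoch-millisecond value of "last_refreshed".
function isStale(lastRefreshedMs, nowMs, thresholdHours) {
    var thresholdMs = thresholdHours * 60 * 60 * 1000;
    // True once the last refresh is at least `thresholdHours` old.
    return nowMs - lastRefreshedMs >= thresholdMs;
}
```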


Hi Timothy, you're welcome!



My first thought is that you could create a scheduled job that runs once every 15 minutes (or 30 minutes, or something like that). This scheduled job calls the script include function hasRecentActivity shared above. You will need to pass as parameters the agentName of your MID server and the time (6 hours ago). If the function returns false, you can trigger an event/notification (http://wiki.servicenow.com/index.php?title=Events_and_Email_Notification#gsc.tab=0) to alert, say, a MID Server Alert Group, which you could create to manage who needs to be notified when a MID server goes down.



If you have multiple MID servers, you can check each one's recent activity in the same scheduled job.
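The core of such a scheduled job could look roughly like the sketch below. It is plain JavaScript so it can be read and tried outside an instance: `findSilentMidServers` is a hypothetical helper, the activity check is passed in as a function, and the cutoff is in epoch milliseconds. On a real instance you would presumably call the script include's hasRecentActivity with a date-time value such as the one returned by gs.hoursAgo(6), and queue an event with gs.eventQueue for each name returned.

```javascript
// Sketch only: plain JavaScript, no ServiceNow APIs.
// agentNames: list of MID server names to check.
// hasRecentActivity(agentName, cutoffMs): injected stand-in for the
// script include function; returns true if the agent was active after cutoffMs.
function findSilentMidServers(agentNames, nowMs, thresholdHours, hasRecentActivity) {
    var cutoffMs = nowMs - thresholdHours * 60 * 60 * 1000;

    // Keep only the agents with no activity since the cutoff.
    return agentNames.filter(function (name) {
        return !hasRecentActivity(name, cutoffMs);
    });
}
```

Passing the check in as a parameter keeps the looping logic separate from the ecc_queue query, so the same job body works for one MID server or many.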



I hope this is helpful! Cheers Timothy!



Thanks,


Berny


Hello,



Thanks again!


Quick question though --


Wouldn't this cause the notification to be sent every 15 minutes (or however long the interval on the scheduled job is)?


I suppose I could say "if the last updated datetime is more than 6 hours but less than 6 hours 15 minutes...", but that seems a little... hacky, doesn't it?



I don't suppose you know of a better way? 🙂


Hi Timothy, yes, that's right. It will continue notifying every 15 minutes. You could also query the sys_email table to see whether a "MID Server is down" email for that MID server was already sent in the last X minutes. Still, I would recommend continuing to send emails at some interval while a MID server is down. The interval depends on the criticality of your business and the functions that depend on the MID servers.
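The "was an email already sent in the last X minutes" check can be reduced to a small throttle function. This is only a sketch in plain JavaScript: `shouldSendAlert` is a made-up name, and `lastAlertMs` is assumed to be the epoch-millisecond time of the most recent alert for that MID server (for example, derived from a sys_email lookup), or null if none has been sent yet.

```javascript
// Sketch only: plain JavaScript, no ServiceNow APIs.
// lastAlertMs: hypothetical time of the previous alert, or null if none was sent.
// repeatEveryMinutes: how often a reminder is allowed while the server stays down.
function shouldSendAlert(lastAlertMs, nowMs, repeatEveryMinutes) {
    if (lastAlertMs === null) {
        return true; // first alert for this outage
    }
    // Otherwise only send once the repeat interval has fully elapsed.
    return nowMs - lastAlertMs >= repeatEveryMinutes * 60 * 1000;
}
```

With a very large repeat interval this behaves like a single notification per outage; with a smaller one (say 240 minutes) the job can still run every 15 minutes while reminders go out only every 4 hours.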



Thanks,


Berny


San3
Kilo Contributor

Hi All,

Could anyone help with the following? For a MID server:

i) If a Discovery run exceeds 4 hours, an alert should be generated and sent to the respective team.

ii) If no Discovery has run for a particular location in 4 hours, an alert should be generated.

Thanks in advance.