Monitoring tool Integration and rules

abhijats
Tera Expert

Hi,

I have a query related to Monitoring tool Integration with SNOW:

1> There is a scenario: if a CI or network device goes down, an alarm or alert should be generated in the monitoring tool, and on the basis of that alarm a ticket should be opened in SNOW. When that alarm is closed, the related incident ticket should be closed as well.
Is this possible? I have read the Nimsoft Nimbus documentation but couldn't find this information.

2> Also, how can this integration be made to filter only major alarms and create tickets with critical priority?

Specifically, which monitoring tool fulfills these two requirements?

13 REPLIES

Hi Jay,



Do you mind sharing details of this implementation?


Bhavesh


Hi Jay.Ford, would you be able to share the document on the Nagios integration you mentioned here? It would be very helpful for us as well.


AJ,



This was an old solution for us. Since writing the original reply we switched our monitoring solution to Nimsoft, but we are now switching Nagios back on for some things. The original setup we had with Nagios used a Python script and SOAP installed on the Nagios Linux server. The Python script looks up values in Service-Now to prevent duplicates based on the Nagios alert ID. We added two fields in our instance to hold the Nagios alert ID: one holds the alert ID of a new alert, and the other holds the previous ID, because the ID changes when the alert is over, so the Nagios guys built logic into the Python script to catch that. I don't know Python or anything about Nagios personally; I only handle the Service-Now side. From what I understand, you just call this script with the options whenever an alert is triggered in Nagios.
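
To illustrate the duplicate-prevention idea described above, here is a minimal sketch of just the decision step, separate from the full script. It assumes the alert ID of a recovery notification is "0" (as the script's own comments state), and it uses a plain dictionary, `open_incidents`, as a hypothetical stand-in for the Service-Now lookup. The function name `choose_action` is also made up for this example.

```python
def choose_action(alert_id, last_alert_id, state, last_state, open_incidents):
    """Decide what to do for one Nagios notification.

    open_incidents is a hypothetical stand-in for the Service-Now lookup:
    it maps a stored alert ID to the sys_id of the open incident.
    Returns a tuple of (action, sys_id).
    """
    # A recovery notification carries alert ID "0", so match on the
    # previous alert ID instead.
    lookup_id = last_alert_id if alert_id == "0" else alert_id
    sys_id = open_incidents.get(lookup_id)

    if sys_id is None:
        # No open incident for this alert: a new ticket is needed.
        return ("create", None)
    if state == last_state:
        # Same state as before: nothing new to record.
        return ("skip", sys_id)
    # State changed: add a note to the existing ticket.
    return ("update", sys_id)


if __name__ == "__main__":
    tickets = {"1001": "abc123"}
    print(choose_action("1001", "0", "CRITICAL", "OK", tickets))
    print(choose_action("0", "1001", "OK", "CRITICAL", tickets))
```

Keeping the decision in a small function like this makes it possible to check the create/update/skip behaviour without touching a live instance.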



Python script below.




################################################################################
# Import Modules
################################################################################
import sys, smtplib, socket
from optparse import OptionParser
from SOAPpy import SOAPProxy






################################################################################
# Option Parser
################################################################################
parser = OptionParser(version = "0.2.1")

parser.add_option('-n','--notify',
      dest='notify_info',
      default='',
      metavar='NOTIFY',
      help=('The type of incident notification being sent'))

parser.add_option('-H', '--hostname',
      dest='hostname_info',
      default='',
      metavar='HOST',
      help=('The hostname involved in the incident'))

parser.add_option('-a', '--hostaddr',
      dest='hostaddr_info',
      default='',
      metavar='ADDR',
      help=('IP address of the host involved in the incident'))

parser.add_option('-S', '--servicename',
      dest='servicename_info',
      default='',
      metavar='SERVICE',
      help=('Name of the service involved in the incident'))

parser.add_option('-d', '--desc',
      dest='desc_info',
      default='',
      metavar='SVCDESC',
      help=('The description of the service involved in the incident'))

parser.add_option('-o', '--output',
      dest='output_info',
      default='',
      metavar='OUTPUT',
      help=('The output of the incident'))

parser.add_option('-T', '--state',
      dest='state_info',
      default='',
      metavar='STATE',
      help=('The state of the incident'))

parser.add_option('-L', '--laststate',
      dest='laststate_info',
      default='',
      metavar='LASTSTATE',
      help=('The last state of the incident'))

parser.add_option('-t', '--datetime',
      dest='datetime_info',
      default='',
      metavar='DATETIME',
      help=('The date/time in long format of the event'))

parser.add_option('-g', '--assigngrp',
      dest='assigngrp_info',
      default='',
      metavar='ASSIGNGROUP',
      help=('Which group to assign the incident to'))

parser.add_option('-i', '--alertid',
      dest='alertid_info',
      default='',
      metavar='ALERTID',
      help=('The globally unique problem ID assigned to the incident'))

parser.add_option('-l', '--lastalertid',
      dest='lastalertid_info',
      default='',
      metavar='LASTALERTID',
      help=('The previous globally unique problem ID assigned to the incident'))

parser.add_option('-x', '--opsys',
      dest='opsys_info',
      default='',
      metavar='OS',
      help=('The OS variable listed in the _xiwizard macro'))

parser.add_option('-C', '--cluster',
      dest='cluster_info',
      default='',
      metavar='CLUSTER',
      help=('The cluster variable in the _cluster macro'))

parser.add_option('--dev',
      dest='dev_info',
      action="store_true",
      default=False,
      metavar='ISDEV',
      help=('Declare the use of the Dev ServiceNow server'))

(opts, arg) = parser.parse_args()






################################################################################
# Global Variables
################################################################################
# Determine if the dev ServiceNow environment should be used instead of prod.
if opts.dev_info is False:
  # This section of code is for the production ServiceNow environment
  # The instance variable is the subdomain name of the server
  instance='prodinstancename'

  # The username variable is the user id used to auth to the instance
  username='user'

  # The password variable is used in conjunction with the username variable
  # to authenticate to the instance.
  password='password'

  # The callerId variable is ServiceNow's unique ID of the user in the db
  callerId='sys_id_of_user'
else:
  # This section of code is for the development ServiceNow environment
  # The instance variable is the subdomain name of the server
  instance='devinstancename'

  # The username variable is the user id used to auth to the instance
  username='user'

  # The password variable is used in conjunction with the username variable
  # to authenticate to the instance.
  password='password'

  # The callerId variable is ServiceNow's unique ID of the user in the db
  callerId='sys_id_of_user'

# The assignGrp variable is the default group which tickets are assigned to. By
# default, the assignGrp variable is set to the Service Desk, unless the "-g"
# option parser flag has a value assigned, in which case, that value is used
# instead.
if opts.assigngrp_info == '':
  assignGrp = 'Service Desk'
else:
  assignGrp = opts.assigngrp_info

# The contactType variable is set to "nagios" to inform ServiceNow that the
# ticket is created by the monitoring system. This is to allow ServiceNow to
# create business flows which integrate the ticket with the CMDB database.
contactType='nagios'

# The Business Service variable should, once the CMDB is completed, be populated
# on the ServiceNow side by using the CI info of the host. Temporarily, all
# tickets created until the CMDB is functional will have a "Business Service"
# value of "Monitoring".
###businessService='Monitoring'

# The incident_values dictionary variable stores all of the values which will be
# used to create/update/search the tickets within ServiceNow.
incident_values = {
  #'u_business_service': businessService,
  'caller_id': callerId,
  'assignment_group': assignGrp,
  'contact_type': contactType,
  'u_alert_id': opts.alertid_info,
  'u_last_alert_id': opts.lastalertid_info,
  'cmdb_ci': opts.hostname_info,
  'short_description': opts.notify_info+':'+opts.hostname_info+':'+opts.state_info,
  'description': '***** Nagios *****\n\nService Type: '+opts.opsys_info+'\nNotification Type: '+opts.notify_info+'\nHost: '+opts.hostname_info+'\nState: '+opts.state_info+'\nAddress: '+opts.hostaddr_info+'\nInfo: '+opts.output_info+'\n\nDate/Time: '+opts.datetime_info+'\n'}

# The following four variables are used in the sendMail function.
# Get the local hostname so that the "From" address will be accurate.
host_name=socket.gethostname()

# Declare the address which email will come from.
from_user="Critical Notification@"+host_name

# The users to send the email notification to. This must be a python list!
notify_user=["james.conner@us.dunnhumby.com"]

# The name of the mail server which the function will directly connect to.
mail_server="mail.dunnhumby.com"






################################################################################
# Classes
################################################################################

################################################################################
# Functions
################################################################################
# The createIncident function is called by the searchIncident function if a
# pre-existing ticket does not exist. This function does not take additional
# variables beyond the global ones.
def createIncident():
  # This section of code creates the SOAP object named "server"
  proxy = 'https://%s:%s@%s.service-now.com/incident.do?SOAP' % (username, password, instance)
  namespace = 'http://www.glidesoft.com/'
  server = SOAPProxy(proxy, namespace)

  # The create_response variable is the returned object which is created
  # by calling the "server" SOAP object with an "insert" command. The
  # insert command is setting various attributes of the SOAP "server"
  # object to the values of the "incident_values" dictionary keys. Once
  # this is completed, a ticket has been created in ServiceNow.
  #
  # The u_business_service argument is left out here because that key is
  # commented out of the incident_values dictionary; passing it would
  # raise a KeyError.
  create_response = server.insert(
    caller_id=incident_values['caller_id'],
    assignment_group=incident_values['assignment_group'],
    short_description=incident_values['short_description'],
    description=incident_values['description'],
    contact_type=incident_values['contact_type'],
    u_alert_id=incident_values['u_alert_id'],
    u_last_alert_id=incident_values['u_last_alert_id'],
    cmdb_ci=incident_values['cmdb_ci'])






# The updateIncident function is called by the searchIncident function if a
# pre-existing ticket exists. The 'sysID' variable is passed to this function
# by the searchIncident function because ServiceNow needs to know which ticket
# needs to be updated.
def updateIncident(sysID):
  # This section of code creates the SOAP object named "server"
  proxy = 'https://%s:%s@%s.service-now.com/incident.do?SOAP' % (username, password, instance)
  namespace = 'http://www.glidesoft.com/'
  server = SOAPProxy(proxy, namespace)

  # The update_response variable is the returned object which is provided
  # by calling the "server" SOAP object with an update command. The update
  # command is using the sysID local variable (passed by the
  # searchIncident function) as the primary key to update the ServiceNow
  # database with the rest of the information in the update command. At
  # this point in time, we are only updating the "work_notes" field with
  # the contents of the incident_values dictionary's short_description
  # key. It is important to note that the sysID variable is *NOT* the
  # ticket name, but rather the ticket's unique ID within the ServiceNow
  # database.
  update_response = server.update(sys_id=sysID, work_notes=incident_values['short_description'])






# The checkCritical function will take the incident_values['cmdb_ci'] value
# and check it against the servicenow database to determine if the CI has been
# flagged as needing a callout when a ticket is created for it.
def checkCritical():
  # This section of code creates the SOAP object named "server"
  proxy = 'https://%s:%s@%s.service-now.com/cmdb_ci_computer.do?SOAP' % (username, password, instance)
  namespace = 'http://www.glidesoft.com/'
  server = SOAPProxy(proxy, namespace)

  # The critical_lookup variable makes a call against the cmdb_ci_computer
  # table, and looks to determine if the server in the cmdb_ci variable
  # has a callout value of 1, which is equal to "true" in the servicenow
  # web interface.
  critical_lookup = server.getRecords(name=incident_values['cmdb_ci'], u_callout=1)

  # If the callout flag is true for the cmdb_ci, then the critical_lookup
  # variable is populated with the cmdb_ci's record. If the callout flag
  # was false, then the variable is completely empty.
  if hasattr(critical_lookup,'sys_id'):
    # Checking for the "sys_id" attribute was successful, so the
    # server is flagged as critical. All critical cmdb_ci need to
    # send an email to the Unity Call Manager in order to phone the
    # Service Desk.
    sendMail()
  else:
    # The "sys_id" attribute doesn't exist, so the server is not
    # flagged as critical. Do nothing in this case.
    pass






# The sendMail function sends an email.
def sendMail():
  # The first step is to create the message variable. The message is
  # specifically formatted so that the email server receiving the smtplib
  # socket connection can parse it for the From, To, Subject and Body
  # portions of the email.
  message= """\
From: %s
To: %s
Subject: %s
X-Priority: 1
X-MSMail-Priority: High

Critical failure with server %s.
""" % (from_user, ", ".join(notify_user), incident_values['cmdb_ci'], incident_values['cmdb_ci'])

  # Create the SMTP socket against the mail server on port 25.
  mailServer=smtplib.SMTP(mail_server,25)
  # Send the email payload.
  mailServer.sendmail(from_user,notify_user,message)
  # Close the SMTP socket.
  mailServer.quit()






# This searchIncident function is the primary executor of this program. It is
# responsible for determining if an existing ticket for an alert exists or not,
# and then making the decision to either create a new ticket, or update an
# existing one as needed.
def searchIncident():
  # This section of code creates the SOAP object named "server"
  proxy = 'https://%s:%s@%s.service-now.com/incident.do?SOAP' % (username, password, instance)
  namespace = 'http://www.glidesoft.com/'
  server = SOAPProxy(proxy, namespace)

  # In nagios, a unique alert number is generated for each new problem.
  # When the alert is terminated (ie, the problem is fixed), the recovery
  # message's alert field is transitioned to "0".
  #
  # If an alert number in nagios is "0", then we need to use the last
  # alert number (which is another parameter passed by nagios) to
  # determine which ticket needs to be updated with the "Recovery" msg.
  if incident_values['u_alert_id'] == '0':
    # Since the alert number is "0", search for tickets using the
    # last alert number. Ideally, there should only be
    # one match, since the alert number is unique.
    search_response = server.getRecords(active='true', u_alert_id=incident_values['u_last_alert_id'])
  else:
    # The alert number isn't "0", so use the current alert number
    # to perform the search of the ServiceNow database for matching
    # tickets. There should be two possible outcomes for the
    # search_response variable at this point:
    #   1: There was no pre-existing ticket, so the variable object
    #      is empty. This means the 'sys_id' attribute is empty.
    #   2: There was one record found, so the variable object should
    #      have a 'sys_id' attribute.
    search_response = server.getRecords(active='true', u_alert_id=incident_values['u_alert_id'])

  # After the search_response variable is set with an object returned from
  # the SOAP connection with ServiceNow, perform the test to see whether
  # or not the search_response's object contains a sys_id attribute. If
  # there is a sys_id attribute, it means a ticket for the alert already
  # exists. If there isn't a sys_id attribute, then a ticket needs to be
  # created.
  if hasattr(search_response,'sys_id'):
    # If the ticket has a sys_id, check to see if the ticket needs
    # updating. Only update the ticket if the STATE macro from
    # nagios has changed. To check this, compare the STATE to the
    # LASTSTATE variables.
    if opts.state_info == opts.laststate_info:
      # Since the states match, exit from the program without
      # updating the ticket in ServiceNow
      sys.exit(0)
    else:
      # The statuses are different, so update the ticket
      updateIncident(search_response['sys_id'])
  else:
    # Verify if the server has a callout flag in the cmdb.
    checkCritical()
    # Create a ticket, since the sys_id attribute wasn't found.
    # NOTE: the call below is commented out in the script as posted, so
    # no ticket is created until it is re-enabled.
    #createIncident()






################################################################################
# Program Execution
################################################################################

# The searchIncident function is the primary function for the entire program.
# No variables need to be passed with this function, since all of the required
# variables are global to the program.
searchIncident()
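
The script above only writes a work note when the alert state changes; it does not close the ticket when the alarm clears, which is what the original question asked about. The following is a rough sketch of how the update fields could be built so that a recovery also resolves the incident. Everything here is an assumption rather than part of the posted script: the helper name `build_update_fields` is hypothetical, the `incident_state` and `close_notes` field names and the value "6" for Resolved should be checked against your own instance, and the recovery test relies on the alert ID being "0" as described in the script's comments.

```python
# Assumption: "6" is the incident state value for "Resolved" on the
# target instance. Verify this against your own choice list.
RESOLVED_STATE = "6"


def build_update_fields(alert_id, short_description):
    """Return the fields to send when updating an existing incident.

    Hypothetical helper, not part of the posted script.
    """
    # Always record the state change as a work note, as the script does.
    fields = {"work_notes": short_description}

    # A recovery notification carries alert ID "0": resolve the ticket too.
    if alert_id == "0":
        fields["incident_state"] = RESOLVED_STATE
        fields["close_notes"] = "Resolved automatically: monitoring alert cleared."
    return fields


if __name__ == "__main__":
    print(build_update_fields("1001", "PROBLEM:web01:CRITICAL"))
    print(build_update_fields("0", "RECOVERY:web01:OK"))
```

In the posted script, a dictionary like this could probably be passed to the update call as keyword arguments, for example `server.update(sys_id=sysID, **fields)`, but that has not been tested against a live instance.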


Thanks, Jay, for sharing this with us here. It was helpful indeed.


Hi Jay



We are looking into how to connect UIM to ServiceNow so that we can use ServiceNow for event management. How do you get the UIM events into ServiceNow, and do you have any experience with ServiceNow for event management?



All the best


Per