ServiceNow Discovery – Common Pitfalls and How to Avoid Them (Part 1 of 3)

Charles Keown · 03-02-2021

When it comes to your ServiceNow Configuration Management Database (CMDB), accuracy is everything. An optimized CMDB supports the entire ServiceNow ecosystem, facilitates enhanced data analysis, and speeds incident resolution.

The following article highlights some of the discovery pitfalls that may negatively affect the accuracy of the ServiceNow CMDB. Avoiding these challenges can help create a more reliable representation of your ServiceNow environment.

This is the first of three articles on this topic, to be posted over the coming weeks.

Feedback and comments are welcome. "We're all in this together..."

--------------------------------------------

Pitfall #1

Unknown or Missing Subnets/Networks

Background

Without a full understanding of your environment, your ServiceNow Discovery may have gaps. ServiceNow Discovery is agentless: it scans the IP ranges you give it and reports which endpoints are active, so it can only find devices on networks it has been told to target. As a result, it is vital to have a complete inventory of all networks across your firm.

Corrective Steps

Obtain a full listing by working with your network team.
Populate your subnets in the IP Networks table in ServiceNow.
ServiceNow offers integrations with systems such as Infoblox to bring your network data into the CMDB.
If you don’t have a good grasp of your network, you can set up a unique discovery schedule that targets Networks.
- Set up proper SNMP credentials so that the routers/switches can be queried and discovered correctly.
- Create a discovery schedule that targets Networks.
- Add IP ranges for discovery. This can include subnets and/or large IP range guesses.
- Once the schedule completes, it returns a listing of the subnets known/seen by the target router.
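
Once you have a subnet listing, it helps to check it against addresses you already know are in use (for example, from DHCP or firewall logs). The following is a minimal sketch of that check, not a ServiceNow feature: the function name find_uncovered and the sample subnets and addresses are assumptions made up for illustration, and in practice the subnet list would come from an export of your IP Networks table.

```python
import ipaddress


def find_uncovered(known_subnets, observed_ips):
    """Return the observed IPs that fall outside every known subnet."""
    networks = [ipaddress.ip_network(s) for s in known_subnets]
    uncovered = []
    for ip_text in observed_ips:
        ip = ipaddress.ip_address(ip_text)
        # An address is covered if at least one documented subnet contains it.
        if not any(ip in net for net in networks):
            uncovered.append(ip_text)
    return uncovered


# Hypothetical sample data: subnets you have documented so far ...
known = ["10.1.10.0/24", "10.20.0.0/16"]
# ... and addresses seen in some other source (DHCP leases, firewall logs, etc.).
seen = ["10.1.10.5", "10.20.44.7", "10.30.10.20", "192.168.5.9"]

print(find_uncovered(known, seen))
# → ['10.30.10.20', '192.168.5.9']
```

Any address that shows up in the result points to a network that is missing from your listing and therefore invisible to Discovery.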

Pitfall #2

Linux servers are discovered both as SNMP devices and as servers, creating confusion between SNMP and SSH results. This may cause a Linux server to flip-flop between classes.

Background
Some Linux servers may be configured to allow both SSH and SNMP interrogation, which can cause the Configuration Item (CI) class in ServiceNow to change periodically.

Corrective Steps
• Avoid enabling SNMP access to Linux servers.

How to Stop/Disable SNMP Service

Stop SNMP Service:

Use SSH to access your server with root login
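Enter command # service snmpd stop (the # symbol is the root prompt; enter only the command after it). On systemd-based distributions, the equivalent is # systemctl stop snmpd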

Disable SNMP Service from Running at Operating System Startup:

Use SSH to access your server with root login
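Enter command # chkconfig snmpd off (the # symbol is the root prompt; enter only the command after it). On systemd-based distributions, the equivalent is # systemctl disable snmpd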

Pitfall #3

SNMP devices are incorrectly classified as routers or switches in ServiceNow. Wireless components are a good example.

Background
When ServiceNow Discovery determines that an active endpoint is an SNMP-enabled device, it will try to classify the device. The two primary classifications are routers and switches. However, other network classifications could be targeted: wireless controllers, access points, UPSs, VoIP phones, etc.

If the SNMP Object Identifier (OID) is not populated in ServiceNow, the system may perform a “best guess” device classification, causing data confusion and inaccuracies.

Corrective Steps
To avoid network device confusion, perform the following steps:

Add the appropriate entries to the SNMP OID Classification table.
For added details specific to a particular class/model, create both a new SNMP classification and discovery pattern.
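
To decide which entries to add first, it can help to compare the sysObjectID values your devices actually return against what is already in the classification table. The sketch below is only an illustration of that comparison and does not represent how ServiceNow classifies devices: the function name missing_oids and all OID values are made-up samples, and both input lists are assumed to come from your own exports.

```python
from collections import Counter


def missing_oids(observed, classified):
    """Return (oid, device_count) pairs for observed OIDs with no classification entry.

    Results are sorted by device count, highest first, so the most common
    unclassified device types appear at the top.
    """
    # Normalize by trimming whitespace and any leading dot (".1.3.6" vs "1.3.6").
    known = {oid.strip().lstrip(".") for oid in classified}
    counts = Counter(oid.strip().lstrip(".") for oid in observed)
    gaps = [(oid, n) for oid, n in counts.items() if oid not in known]
    return sorted(gaps, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical sample data: one sysObjectID per device that responded to SNMP.
observed = [
    ".1.3.6.1.4.1.9.1.1208",
    "1.3.6.1.4.1.9.1.1208",
    "1.3.6.1.4.1.318.1.3.27",
    "1.3.6.1.4.1.318.1.3.27",
    "1.3.6.1.4.1.14179.1.1.4.3",
]
# Hypothetical export of OIDs already present in the classification table.
classified = ["1.3.6.1.4.1.9.1.1208"]

for oid, count in missing_oids(observed, classified):
    print(oid, count)
```

Each line printed is an OID seen on the network that has no matching entry yet, along with how many devices reported it.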

Pitfall #4

Different discovery efforts overlap.

Background
When configuring an automated, scheduled discovery in ServiceNow, it is important to organize what you discover and when. Overlapping schedules can degrade performance on your MID Servers and add unneeded overhead and processing to your instance.

Corrective Steps

Organize and group your target networks by location, if possible.
Avoid targeting very large subnets (e.g., /16); target smaller ranges such as /24 instead.
Review performance history on your MID Server(s) to ensure proper balance and processing. Use the MID Server Dashboard for performance details.
It’s always good practice to test your discovery approach in your DEV instance for analysis before shifting to PROD.
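
Before loading ranges into your schedules, you can check them for overlaps and break large blocks into smaller ones offline. The following is a rough sketch under stated assumptions: the schedule names, the CIDR values, and the helper names find_overlaps and split_into_24s are all hypothetical, and the input is assumed to be a simple mapping you build from your own schedule definitions.

```python
import ipaddress
from itertools import combinations


def find_overlaps(schedules):
    """Return (schedule_a, schedule_b, range_a, range_b) for each overlapping pair."""
    overlaps = []
    for (name_a, nets_a), (name_b, nets_b) in combinations(schedules.items(), 2):
        for a in nets_a:
            for b in nets_b:
                net_a, net_b = ipaddress.ip_network(a), ipaddress.ip_network(b)
                # Only compare like with like (IPv4 vs IPv4, IPv6 vs IPv6).
                if net_a.version == net_b.version and net_a.overlaps(net_b):
                    overlaps.append((name_a, name_b, a, b))
    return overlaps


def split_into_24s(cidr):
    """Split a large IPv4 block into /24 ranges; blocks of /24 or smaller are returned unchanged."""
    net = ipaddress.ip_network(cidr)
    if net.prefixlen >= 24:
        return [str(net)]
    return [str(subnet) for subnet in net.subnets(new_prefix=24)]


# Hypothetical schedules, keyed by schedule name.
schedules = {
    "NYC": ["10.20.0.0/16"],
    "NYC-Servers": ["10.20.10.0/24"],
    "London": ["10.30.10.0/24"],
}

print(find_overlaps(schedules))
# → [('NYC', 'NYC-Servers', '10.20.0.0/16', '10.20.10.0/24')]

chunks = split_into_24s("10.20.0.0/16")
print(len(chunks), chunks[0], chunks[-1])
# → 256 10.20.0.0/24 10.20.255.0/24
```

In this sample, the "NYC" schedule already covers everything in "NYC-Servers", so the same endpoints would be scanned twice unless one of the two ranges is adjusted.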

doug_schulze · 03-02-2021

Charles, thanks for this post. I'm a huge proponent of us all contributing our shared knowledge in the ITOM/Discovery space, so this is a welcome read.

If I could complement what you mention above...

#1 Here is a link to configuring the Networks based discovery and associated properties. However, I wouldn't put in a subnet, because most likely you are only going to find an L2 distribution switch that wouldn't know anything about routing or L3 networks. I would suggest identifying your core or peer routers as your targets, the more the better. This way you get the best and most specific information about their OSPF peers, so that we can collect as far and as wide as we can. Remember this is wholly dependent on SNMP being able to query these routers, so if we're told about a peer in a different location that won't accept your MID's queries because of network/ACL restrictions, we're not going to have that data to compile.

#2 Linux systems that have snmp enabled.

First, it's important to know that we do not classify compute with SNMP. We have no Linux classifiers nor any SNMP SysOIDs for Unix/Linux servers, so even if they respond to SNMP we wouldn't find a match and would move on with our day. Secondly, I wouldn't recommend to our friends that they disable SNMP on their compute systems. Most likely it is playing an important role in other applications.

#3 SNMP Mis-Classifications.

Want to just say that there is no "guessing" when it comes to our classification 🙂 . You are correct that we use the SNMP SysOIDs to match a potential classification. However, if you look at the Shazzam Sensor, we evaluate specifically on a device's capabilities (does it route, print, switch, or power), and only on the result of that evaluation do we classify accordingly. Now that's not to say that assessment couldn't be wrong. I've seen us (back in the day) classifying a Solaris server as a printer because the SNMP return showed it had printing capabilities based on our measure. So the assessment can be incorrect, but it's never a guess.

And if there's any value to be had in creating your own SNMP Classifier from scratch including loading new MIBs for custom queries, I did a video around just that.

Overall thanks again for sharing this great information and looking forward to the other installments.

Charles Keown · 03-09-2021

@doug.schulze,

Greatly appreciate you taking the time to review this article.

#1 Missing Networks - One of the challenges I've experienced in the past is gaining a complete understanding of all the networks that make up a company's environment. Usually this holistic knowledge is kept in the mind of a single "hero" and/or siloed across regional or global departments.

In the past I've enhanced the IP Networks table to include additional attributes like gateway, subnet mask, dmz (t/f) and so forth. I needed a place to centrally house and document all networks across the firm.

#2 SNMP Linux - Apologies for the misinformation. From past experiences, I knew Linux server teams that didn't realize SNMP was enabled. I've also seen SNMP OID 1.3.6.1.2.1.25.3.1.3 used incorrectly (shouldn't be used at all) and that probably led to confusion.

#3 SNMP Classifications - "guessing" - ha. I stand corrected! I'll have you know that your SNMP Classifier video was my first introduction to you and your in-depth knowledge. 🙂 In fact, I've used that video as a springboard for building numerous new classifications -> IP phones, video conference units, security appliances, etc.

There is definitely a lot of power there once the correct MIB and SNMP queries are known.

Again, thanks for the kind feedback and response.

jimit · 11-11-2022

Thanks Charles,

These common pitfalls are awesome!

I have a question about discovering switches (Cisco Nexus) using IP network ranges, which I'm struggling with at the moment.

When nodes such as switches and routers are onboarded, they are added to an NMS (such as SolarWinds, NNM, etc.) and managed using their management IP address or node name.

How can we avoid pitfalls when discovering switches that have multiple VR groups spanning multiple IP network ranges?

If we discover using only the management IP address, then we get the results we want. However, if the switch has multiple VR groups with several IP network ranges, and our discovery uses those IP network ranges, then we find (in the audit history of the CI) that the IP address flips through one of the IP addresses in each VR group.

For example:

switch application ports ip addresses: 10.20.10.20-10.20.10.30, and in 'default' (1) VR group
switch management port ip address is 10.1.10.1, and in its own VR group named 'management' (2)
switch distribution ports ip addresses: 10.30.10.20-10.30.10.40, and in 'dist' (3) group

The audit will show something like this created/updated by discovery...

name, old ip address, new ip address
switch1, <empty>,10.1.10.1
switch1, 10.1.10.1, 10.20.10.20
switch1, 10.20.10.20, 10.30.10.20
switch1, 10.30.10.20, 10.30.10.21
switch1, 10.30.10.21, 10.30.10.22

et cetera, ...

If we have all the above IP ranges covered in a discovery, we find the CI IP address flips through one IP address in each VR group.

Note: I am considering using the option glide.discovery.exclude_ip_sync_classes to exclude cmdb_ci_ip_switch. I have not tested it yet, but something tells me this may still not work as I would expect, depending on which IP address in that range returns a result first...

In this case, should we add only the management IP address of the switch and exclude all the IP addresses in each of the other groups? Should the Network team provide the mgmt IP to discover, plus exclusion ranges for the IP addresses in each of their VR groups?