Having trouble with SNMP timeouts during Discovery on devices that are otherwise being discovered mostly successfully

Erik36 · ‎04-19-2018

Hi All,

I think I've read just about all I can find regarding these errors & am a bit stuck/confused. I have a handful of Cisco devices across our datacenters that seem to be exhibiting similar "failures" during discovery.

Based on what our network admins have provided, our SNMP community credential seems to be good & is working for ~90% of all devices being discovered. However, in a few cases, we receive several entries near the end of each device's discovery that say "SNMP probe timed out. Target is either unreachable or there are no valid credentials for it."

We've checked MID server resources and thread counts, and increased both the "mid.snmp.session.timeout" & "mid.snmp.request.timeout" values significantly for science. We've also tried using an SNMP-only behavior.

I'm not sure where else to look for details that would explain what about these devices is actually failing. I have a meeting with one of our network admins tomorrow to see if they can monitor a discovery of one of the devices in question from within the device itself, to see if it logs anything useful.
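One way to narrow this down outside of Discovery would be to walk a few SNMP subtrees from the MID server host and time each one. The following is a minimal sketch, not anything Discovery itself runs. It assumes the Net-SNMP `snmpwalk` command-line tool is installed and that the device answers SNMPv2c; the function names, the choice of subtrees, and the default timeout and retry values are hypothetical.

```python
import subprocess
import time

# Standard MIB-II subtrees (RFC 1213). Which ones to test is an assumption.
SUBTREES = {
    "system": "1.3.6.1.2.1.1",
    "interfaces": "1.3.6.1.2.1.2",
    "ipRouteTable": "1.3.6.1.2.1.4.21",
}


def build_snmpwalk_cmd(host, community, oid, timeout_s=5, retries=1):
    """Build a Net-SNMP snmpwalk command line for an SNMPv2c walk."""
    return [
        "snmpwalk", "-v2c", "-c", community,
        "-t", str(timeout_s),   # per-request timeout in seconds
        "-r", str(retries),     # retries per request
        "-On",                  # print OIDs numerically
        host, oid,
    ]


def time_walk(host, community, oid, timeout_s=5, retries=1, runner=subprocess.run):
    """Walk one subtree and report duration, row count, and whether it timed out."""
    cmd = build_snmpwalk_cmd(host, community, oid, timeout_s, retries)
    start = time.monotonic()
    result = runner(cmd, capture_output=True, text=True)
    elapsed = time.monotonic() - start
    return {
        "oid": oid,
        "seconds": round(elapsed, 2),
        "rows": len(result.stdout.splitlines()),
        # Net-SNMP normally reports "Timeout: No Response from <host>" on stderr.
        "timed_out": "Timeout" in result.stderr,
    }
```

Calling `time_walk("10.0.0.1", "public", oid)` for each entry in `SUBTREES` (the address and community string here are placeholders) would show whether one particular table is slow or unresponsive while the rest of the device answers normally.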

Just curious if there's something I'm completely overlooking within our SN Instance for either config or troubleshooting purposes.

I've included a couple of attachments that hopefully help a little. Please let me know if there's anything else I can provide to help you help me.

Erik36 · ‎04-26-2018

Update to the original post: after meeting with our network admin, we have a little more clarity, but I'm still not sure if there's an actual problem here. According to them, this timeout behavior is most likely due to internal routes that either route to other networks beyond the local network or are marked as non-routable, which they suspect would cause discovery to eventually time out.
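To check that theory against a device's routing table, the route entries could be sorted by whether their next hop sits inside the local network. This is only a rough sketch: the `classify_next_hops` helper, the input format of (destination, next hop) pairs, and the three labels are all hypothetical, and the route data would have to be exported from the device by some other means.

```python
import ipaddress


def classify_next_hops(routes, local_network):
    """Label each (destination, next_hop) pair relative to the local network."""
    local = ipaddress.ip_network(local_network)
    report = []
    for dest, hop in routes:
        addr = ipaddress.ip_address(hop)
        if addr.is_unspecified:
            # A next hop of 0.0.0.0 means no gateway address was recorded.
            kind = "no next hop"
        elif addr in local:
            kind = "local next hop"
        else:
            kind = "next hop outside local network"
        report.append((dest, hop, kind))
    return report
```

Entries labeled as outside the local network would be the first candidates to compare against the timestamps of the timeout messages.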

I'm now wondering, from a Discovery perspective, if there's a definitive place where I can see the actual detailed results of these timeouts, such as the switch interface it's potentially happening on. I tried checking the logs for each run but couldn't find any helpful details.

Any thoughts?

Community Alums · ‎04-27-2018

Similar situation here: random timeouts are happening, which causes the SNMP - Switch / Router pattern to fail. Any suggestions?

hannu · ‎04-28-2018

Hi mukul,

If your pattern is failing, my suggestion is to check the identification section of the pattern.

Otherwise, there is a workaround: use probes for those devices rather than the pattern, which should discover them. Just change the trigger probes on those particular OID records from the Horizontal Pattern probe to the required probes. I have run into the pattern failure myself, and the step above helped me, so give it a try.

Let me know if I can help further.

hannu · ‎04-28-2018

Can you also attach a screenshot of the new discovery in which you are seeing the SNMP timeout issue?