3 weeks ago
ServiceNow Discovery: SSH credentials work initially, then stop working later (credentials appear to be cleared/overwritten)
Hi everyone,
I’m looking for outside ideas on an intermittent Discovery issue.
Context:
- Environment includes Avaya/telephony devices and Linux servers.
- Discovery works at first, and targets are discovered successfully.
- After some time, Discovery can no longer authenticate over SSH.
- In some cases, it looks like the SSH credential mapping/reference is no longer usable (for example, empty credential reference at runtime).
- SNMP may still run, but the SSH authentication path fails.
Observed behavior:
- Re-adding the SSH key or recreating the discovery user restores discovery temporarily.
- Later, the problem comes back.
- This suggests something is changing after initial success (automation/policy/sync/rotation?).
What we already checked:
- SSH port is reachable.
- Server-side key files and permissions looked correct when we checked them manually.
- No obvious manual deletion process identified on our team side.
- We suspect an automated process may be overwriting/removing credential data over time.
Questions:
- Has anyone seen Discovery credentials (especially SSH key-based) become invalid after initial successful runs?
- What are the most common root causes in ServiceNow for this pattern?
- Which logs/tables are best to prove what changed the credential reference (audit, scheduled jobs, integrations, MID activity)?
Any pointers, known defects, or troubleshooting checklists would be appreciated.
Thanks!
3 weeks ago
Hi @MarxA ,
Thanks for the clarification; this is a key observation. Since the issue is isolated to Avaya/telephony devices, while other device groups in ServiceNow Discovery are working fine, this strongly points toward device-side behavior rather than a Discovery framework issue.
Based on similar scenarios, here are some Avaya-specific areas to validate:
1. Check if SSH Keys Are Being Overwritten
On Avaya systems, user environments are often managed by provisioning or sync tools (e.g., System Manager / LDAP). These processes can:
Recreate user profiles
Reset .ssh/authorized_keys
Remove previously added keys
After a failure, verify whether the SSH key used by Discovery is still present in authorized_keys and whether the file's modification timestamp has changed without anyone editing it.
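If you want to script that check, here is a minimal sketch (not an Avaya or ServiceNow tool). It assumes you can run Python on the target as a user who can read the file. The function name `key_status` and the idea of passing in the base64 part of the public key are my own choices for illustration.

```python
import os
import time


def key_status(authorized_keys_path, public_key_blob):
    """Report whether a key is present and when the file last changed.

    public_key_blob is the base64 field of the public key (the long
    middle token of a line such as "ssh-ed25519 AAAA... comment").
    """
    if not os.path.exists(authorized_keys_path):
        return {"exists": False, "key_present": False, "modified": None}

    with open(authorized_keys_path, "r", encoding="utf-8", errors="replace") as fh:
        # Skip blank lines and comments, split the rest into whitespace tokens.
        entries = [
            line.split()
            for line in fh
            if line.strip() and not line.lstrip().startswith("#")
        ]

    # Match the blob as a whole token so option prefixes do not matter.
    key_present = any(public_key_blob in fields for fields in entries)
    mtime = os.path.getmtime(authorized_keys_path)
    return {
        "exists": True,
        "key_present": key_present,
        "modified": time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(mtime)),
    }
```

Running it once right after re-adding the key and again after the next failure gives you two timestamps to compare. A changed `modified` value together with `key_present` being false would support the overwrite theory.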
2. Validate SSH Key Persistence & Permissions
Avaya platforms are stricter with SSH:
~/.ssh → 700
authorized_keys → 600
Incorrect permissions can silently break authentication even if the key exists.
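A quick way to check both modes in one go, as a rough sketch: the expected values are the 700/600 pair listed above, and `check_ssh_permissions` is a hypothetical helper name. It only looks at mode bits, not ownership, so an ownership change made by a provisioning job would need a separate check.

```python
import os
import stat

# Expected modes, taken from the 700 / 600 values listed above.
EXPECTED_MODES = {".ssh": 0o700, "authorized_keys": 0o600}


def check_ssh_permissions(home_dir):
    """Return a list of problems found under home_dir/.ssh (empty if none)."""
    ssh_dir = os.path.join(home_dir, ".ssh")
    targets = {
        ".ssh": ssh_dir,
        "authorized_keys": os.path.join(ssh_dir, "authorized_keys"),
    }
    problems = []
    for name, path in targets.items():
        if not os.path.exists(path):
            problems.append(f"{name}: missing")
            continue
        # Keep only the permission bits (drop the file-type bits).
        mode = stat.S_IMODE(os.stat(path).st_mode)
        if mode != EXPECTED_MODES[name]:
            problems.append(
                f"{name}: mode {mode:03o}, expected {EXPECTED_MODES[name]:03o}"
            )
    return problems
```

For example, `check_ssh_permissions("/home/sn_discovery")` (the home path is an assumption) returns an empty list when both modes match.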
3. Review Avaya Security / Hardening Policies
Check SSH configuration (sshd_config) for:
PubkeyAuthentication
AuthorizedKeysFile
Any restrictions enforcing password-only access
Some Avaya builds disable or override key-based authentication during policy enforcement.
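To compare those settings before and after a failure, something like the sketch below can pull them out of a copy of sshd_config. It is a simplified reader, not a full parser. It assumes whitespace-separated `Keyword value` lines, reads only the global section (it stops at the first `Match` block), and does not follow `Include` directives. `parse_sshd_settings` is a hypothetical name.

```python
def parse_sshd_settings(
    config_text,
    keys=("pubkeyauthentication", "authorizedkeysfile", "passwordauthentication"),
):
    """Return the first global value of each keyword of interest.

    Keywords are compared case-insensitively. The first occurrence wins,
    which mirrors how sshd treats repeated keywords.
    """
    found = {}
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        keyword = parts[0].lower()
        if keyword == "match":
            break  # conditional blocks are out of scope for this sketch
        if keyword in keys and keyword not in found:
            found[keyword] = parts[1].strip() if len(parts) > 1 else ""
    return found
```

Saving the output while Discovery works and diffing it against the output after a failure shows whether a policy run changed any of these values. Where you have root access, `sshd -T` prints the effective configuration and is the more reliable source.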
4. Check Provisioning / Sync Jobs (Most Common Cause)
This is typically the root cause in such cases.
👉 Validate if:
The user is managed via LDAP / Avaya System Manager
There are scheduled sync jobs resetting user configurations
If yes, these processes may remove or overwrite Discovery credentials after initial success.
5. Use a Dedicated Discovery User (Recommended)
Instead of shared/system accounts, create a dedicated user (e.g., sn_discovery) and:
Add SSH key manually
Exclude it from Avaya provisioning/sync
This helps ensure credential persistence.
6. Validate Shell Access
Some Avaya users are configured with restricted shells:
/sbin/nologin
Limited CLI environments
Ensure the user has a valid shell (e.g., /bin/bash) required for Discovery commands.
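A small sketch for checking the configured shell from passwd-format text (for example the output of `getent passwd`). The helper names and the list of non-interactive shells are my assumptions. An Avaya-specific restricted CLI would not be in that list, so treat a positive result as "not obviously blocked" rather than proof.

```python
# Common "no login" shells; this set is an assumption, extend it as needed.
NON_INTERACTIVE_SHELLS = {"/sbin/nologin", "/usr/sbin/nologin", "/bin/false"}


def login_shell(passwd_text, username):
    """Return the shell field for username, or None if the user is absent."""
    for line in passwd_text.splitlines():
        fields = line.split(":")
        # A passwd entry has exactly seven colon-separated fields.
        if len(fields) == 7 and fields[0] == username:
            return fields[6]
    return None


def shell_allows_commands(shell):
    """True if the shell is not one of the known non-interactive shells.

    An empty shell field counts as allowed because the system then
    falls back to /bin/sh.
    """
    return shell is not None and shell not in NON_INTERACTIVE_SHELLS
```

If the shell reported for the Discovery user differs between a working day and a failing day, a profile reset is a likely suspect.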
7. Check Account Expiry / Lock Policies
Avaya systems may enforce:
Password expiry
Account lock/disable policies
Even if initial authentication succeeds, the account may later become unusable.
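On Linux-based nodes the aging data lives in /etc/shadow, so a rough check could look like the sketch below. It assumes the standard nine-field shadow format and root access to read the entry. Whether a given Avaya build keeps its policy there is something to confirm on your side. `shadow_status` is a hypothetical name, and a leading `!` is only the conventional lock marker.

```python
import time


def shadow_status(shadow_line, today_days=None):
    """Summarise lock and expiry state from one /etc/shadow entry.

    Dates in shadow are counted in days since 1970-01-01.
    """
    if today_days is None:
        today_days = int(time.time() // 86400)

    fields = shadow_line.strip().split(":")
    fields += [""] * (9 - len(fields))  # pad short entries
    _name, password, last_change, _min, max_age, _warn, _inactive, expire = fields[:8]

    status = {
        "password_locked": password.startswith("!"),
        "password_expired": False,
        "account_expired": False,
    }
    if last_change == "0":
        # A value of 0 forces a password change at next login.
        status["password_expired"] = True
    elif last_change and max_age:
        status["password_expired"] = today_days >= int(last_change) + int(max_age)
    if expire and int(expire) > 0:
        status["account_expired"] = today_days >= int(expire)
    return status
```

Note that an expired or locked account can reject key-based logins too, depending on the PAM and sshd settings, so this is worth checking even when no password is used for Discovery.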
8. Correlate with Logs
In ServiceNow:
Check ECC Queue for SSH/authentication errors
Correlate timestamps with any Avaya-side sync or policy jobs
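For the correlation step, the useful window is between the last successful SSH authentication and the first failure, since whatever changed the account must have run in between. A minimal sketch, assuming you have exported both sets of timestamps yourself and converted them to the same time zone (the function name and the job names in the usage note are made up):

```python
from datetime import datetime


def jobs_between(last_success, first_failure, job_runs):
    """Return (time, name) pairs for jobs that ran after the last good
    authentication and no later than the first failed one, oldest first.

    job_runs is an iterable of (datetime, job_name) tuples.
    """
    return sorted(
        (when, name)
        for when, name in job_runs
        if last_success < when <= first_failure
    )
```

For example, with a last success at 02:00 and a first failure at 08:00 on the same day, only the jobs that ran inside that window come back as candidates. If the same job name shows up in that window across several failure cycles, that is fairly strong evidence.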
3 weeks ago
Hi @MarxA
Probable causes could be:
1. Private keys must be in OpenSSH format rather than PuTTY's default .ppk format.
2. The MID Server might try to use outdated cached credentials.
3. If the MID Server is upgraded, it may no longer support older algorithms. If the key uses an old algorithm, authentication will fail.
Refer to the following post for the solution:
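To rule out point 1 quickly, the first line of the private key file tells you which container format it uses. A small sketch (the function name and return labels are mine, and it only inspects the header, not the key algorithm):

```python
def private_key_format(key_text):
    """Classify a private key by its first non-empty line."""
    stripped = key_text.strip()
    first_line = stripped.splitlines()[0].strip() if stripped else ""

    if first_line.startswith("PuTTY-User-Key-File-"):
        return "putty-ppk"
    if first_line == "-----BEGIN OPENSSH PRIVATE KEY-----":
        return "openssh"
    if first_line.startswith("-----BEGIN ") and first_line.endswith("PRIVATE KEY-----"):
        return "pem"  # e.g. "-----BEGIN RSA PRIVATE KEY-----"
    return "unknown"
```

A result of `putty-ppk` means the key needs converting (PuTTYgen can export it) before it is stored in the credential record.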
3 weeks ago
Thanks, this is helpful.
We validated most of those points as well (key format, MID mapping, and account setup), but our issue appears to be more specific:
- It only affects a subset of Avaya/telephony devices and related servers.
- Other Linux/server groups using the same Discovery + MID setup remain stable.
- Initial discovery works, then later SSH auth fails until we re-add the key or recreate the service account.
So this looks less like a global PEM/PPK setup problem and more like a post-discovery drift on that Avaya subset (credential overwrite, account reset, key/policy sync, or automation on endpoint side).
If anyone has seen this specifically with Avaya environments, I’d appreciate guidance on:
- Avaya processes that may reset service accounts or authorized_keys over time
- Known key/algorithm compatibility changes after patch/upgrade on telephony nodes
