ServiceNow Discovery: SSH credentials work initially, then stop working later
3 hours ago
ServiceNow Discovery: SSH credentials work initially, then stop working later (credentials appear to be cleared/overwritten)
Hi everyone,
I’m looking for outside ideas on an intermittent Discovery issue.
Context:
- Environment includes Avaya/telephony devices and Linux servers.
- Discovery works at first, and targets are discovered successfully.
- After some time, Discovery can no longer authenticate over SSH.
- In some cases, it looks like the SSH credential mapping/reference is no longer usable (for example, empty credential reference at runtime).
- SNMP probes may still run, but the SSH authentication path fails.
Observed behavior:
- Re-adding the SSH key or recreating the discovery user restores discovery temporarily.
- Later, the problem comes back.
- This suggests something is changing after initial success (automation/policy/sync/rotation?).
What we already checked:
- SSH port is reachable.
- Manual server-side key files/permissions looked correct at check time.
- No obvious manual deletion process identified on our team side.
- We suspect an automated process may be overwriting/removing credential data over time.
Questions:
- Has anyone seen Discovery credentials (especially SSH key-based) become invalid after initial successful runs?
- What are the most common root causes in ServiceNow for this pattern?
- Which logs/tables are best to prove what changed the credential reference (audit, scheduled jobs, integrations, MID activity)?
Any pointers, known defects, or troubleshooting checklists would be appreciated.
Thanks!
3 hours ago
Hi @MarxA ,
This behavior is typically caused by credential changes after initial success (rotation, overwrite, or access issues), not Discovery itself.
Below is a structured step-by-step solution aligned with ServiceNow best practices to identify and fix the issue.
Step-by-Step Solution
Step 1: Identify the Failing Credential
Go to Discovery → Status
Open a failed Discovery run
Check:
Which SSH credential was used
Error message (authentication / permission)
Step 2: Validate Credential Record
Navigate to:
Discovery → Credentials
Open the SSH credential
Verify:
Username
Private key / password
Active = true
Check the Updated (sys_updated_on) and Updated by (sys_updated_by) fields
Step 3: Check if Credential is Being Modified
Enable auditing (if not already):
Table: discovery_credentials / ssh_private_key
Review:
History → Audit
Look for:
Unexpected updates
System or integration user changes
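If you prefer to pull the audit trail through the REST Table API instead of the UI, here is a minimal sketch that only builds the query URL. The instance name, the sys_id, and the default table name are placeholders and assumptions on my side; the audit rows only exist if auditing is enabled for the credential table, and depending on how your credential records are classed the `tablename` value may need to be a child table rather than `discovery_credentials`.

```python
from urllib.parse import urlencode


def build_audit_query_url(instance, credential_sys_id,
                          table="discovery_credentials", limit=50):
    """Return a Table API URL listing sys_audit rows for one credential record.

    `instance`, `credential_sys_id` and `table` are placeholders for your
    environment; adjust `table` if your credential lives in a child table.
    """
    # Encoded query: audit rows for this table and record, newest first.
    query = (
        f"tablename={table}"
        f"^documentkey={credential_sys_id}"
        "^ORDERBYDESCsys_created_on"
    )
    params = {
        "sysparm_query": query,
        "sysparm_fields": "sys_created_on,user,fieldname,oldvalue,newvalue",
        "sysparm_limit": str(limit),
    }
    base = f"https://{instance}.service-now.com/api/now/table/sys_audit"
    return f"{base}?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical instance name and sys_id, for illustration only.
    print(build_audit_query_url("example", "0123456789abcdef0123456789abcdef"))
```

Call the resulting URL with an account that can read sys_audit. The `user` and `sys_created_on` columns should show who or what touched the record and when.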
Step 4: Check for External Credential Rotation
Verify if credentials are managed by:
CyberArk / Vault / any external tool
Confirm:
Whether SSH keys/passwords are rotated periodically
If yes:
Update Discovery to always use latest credential
Avoid hardcoded or outdated keys
Step 5: Validate MID Server Behavior
Go to:
MID Server → Servers
Check:
Status = Up
Validated
Restart the MID Server (for testing purposes)
A restart forces the MID Server to reload its credentials, which helps rule out a stale credential cache
Step 6: Check ECC Queue
Navigate to ECC → Queue
Filter:
Topic contains SSH / Discovery
Review:
Input/output payload
Errors related to authentication
Step 7: Validate Target Server (Linux)
On target machine:
Check:
authorized_keys file
File permissions (600 for authorized_keys / 700 for ~/.ssh)
Confirm:
SSH key still exists
User is not locked/expired
Step 8: Review Scheduled Jobs / Integrations
Go to:
System Scheduler → Scheduled Jobs
Check for:
Jobs updating credentials
Import sets / sync processes
Disable temporarily (for testing)
3 hours ago
Thanks, this is very helpful.
We’ve already started checking these steps, and I want to clarify one key point: this issue is not global across our Discovery environment. It is only happening on a specific subset of Avaya/telephony devices and related servers.
Other device groups using the same Discovery framework are stable, which suggests this may be tied to Avaya-specific behavior (account/key handling, sync/provisioning, or policy on those systems) rather than a general ServiceNow Discovery problem.
If anyone has seen this specifically with Avaya devices/servers, I’d appreciate targeted guidance on:
- Avaya-side processes that may overwrite/remove SSH credentials after initial success
- Known interactions between Avaya management/sync tools and SSH key persistence
- Best way to keep Discovery credentials persistent for this device family
3 hours ago
Hi @MarxA ,
Thanks for the clarification—this is a key observation. Since the issue is isolated to Avaya/telephony devices, while other device groups in ServiceNow Discovery are working fine, this strongly points toward device-side behavior rather than a Discovery framework issue.
Based on similar scenarios, here are some Avaya-specific areas to validate:
1. Check if SSH Keys Are Being Overwritten
On Avaya systems, user environments are often managed by provisioning or sync tools (e.g., System Manager / LDAP). These processes can:
Recreate user profiles
Reset .ssh/authorized_keys
Remove previously added keys
Verify whether the SSH key used by Discovery still exists after failure and if the file timestamp changes automatically.
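One way to catch this in the act is to snapshot the file on a schedule and compare snapshots. The sketch below is only an illustration: the function names and the idea of matching on a fragment of the public key are my own assumptions, not an Avaya or ServiceNow feature.

```python
import hashlib
import os


def snapshot_authorized_keys(path, key_fragment):
    """Record a content hash, the mtime, and whether the Discovery key is present.

    `key_fragment` is any distinctive substring of the public key line.
    """
    with open(path, "rb") as fh:
        data = fh.read()
    return {
        "sha256": hashlib.sha256(data).hexdigest(),
        "mtime": os.stat(path).st_mtime,
        "key_present": key_fragment.encode() in data,
    }


def describe_change(before, after):
    """Return a list of human-readable differences between two snapshots."""
    changes = []
    if before["sha256"] != after["sha256"]:
        changes.append("content changed")
    if before["key_present"] and not after["key_present"]:
        changes.append("key removed")
    return changes
```

Run it from cron on one failing host and one working host. The time between the last clean snapshot and the first one reporting "key removed" gives you a window to match against provisioning or sync job schedules.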
2. Validate SSH Key Persistence & Permissions
Avaya platforms are stricter with SSH:
~/.ssh → 700
authorized_keys → 600
Incorrect permissions can silently break authentication even if the key exists.
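A quick way to check this on a host is a small script like the one below. It is a sketch that compares against the 700/600 values mentioned above; the function name is made up for illustration, and it does not check file ownership, which you should also verify.

```python
import os
import stat


def permission_problems(ssh_dir):
    """Return a list of deviations from 700 on ssh_dir and 600 on authorized_keys."""
    expected = {
        ssh_dir: 0o700,
        os.path.join(ssh_dir, "authorized_keys"): 0o600,
    }
    problems = []
    for path, want in expected.items():
        try:
            # S_IMODE strips the file-type bits and keeps only permission bits.
            mode = stat.S_IMODE(os.stat(path).st_mode)
        except FileNotFoundError:
            problems.append(f"{path}: missing")
            continue
        if mode != want:
            problems.append(f"{path}: mode {mode:o}, expected {want:o}")
    return problems


if __name__ == "__main__":
    # Checks the current user's ~/.ssh; run it as the Discovery user.
    for line in permission_problems(os.path.expanduser("~/.ssh")):
        print(line)
```

An empty result means both paths match. Running it right after a failure and again after re-adding the key shows whether something is resetting the modes.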
3. Review Avaya Security / Hardening Policies
Check SSH configuration (sshd_config) for:
PubkeyAuthentication
AuthorizedKeysFile
Any restrictions enforcing password-only access
Some Avaya builds disable or override key-based authentication during policy enforcement.
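To compare a working and a failing host, you can extract just these settings from each sshd_config. The parser below is a simplified sketch with assumptions: it takes the first value seen for each keyword (which is how sshd resolves duplicates), stops at the first Match block, and does not follow Include directives. Where you have root access, the output of `sshd -T` is the more reliable source for the effective configuration.

```python
def effective_settings(config_text,
                       keywords=("pubkeyauthentication",
                                 "authorizedkeysfile",
                                 "passwordauthentication")):
    """Return the first global value for each keyword of interest.

    Simplified: ignores Include files and stops at the first Match block,
    so conditional overrides are not evaluated.
    """
    settings = {}
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        key = parts[0].lower()  # sshd_config keywords are case-insensitive
        if key == "match":
            break
        if key in keywords and key not in settings and len(parts) == 2:
            settings[key] = parts[1].strip()
    return settings
```

Diffing the resulting dictionaries from the two hosts, before and after a failure, should show whether a hardening policy rewrote the file.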
4. Check Provisioning / Sync Jobs (Most Common Cause)
This is typically the root cause in such cases.
👉 Validate if:
The user is managed via LDAP / Avaya System Manager
There are scheduled sync jobs resetting user configurations
If yes, these processes may remove or overwrite Discovery credentials after initial success.
5. Use a Dedicated Discovery User (Recommended)
Instead of shared/system accounts, create a dedicated user (e.g., sn_discovery) and:
Add SSH key manually
Exclude it from Avaya provisioning/sync
This helps ensure credential persistence.
6. Validate Shell Access
Some Avaya users are configured with restricted shells:
/sbin/nologin
Limited CLI environments
Ensure the user has a valid shell (e.g., /bin/bash) required for Discovery commands.
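As a quick check, you can feed the account's passwd entry (for example the output of `getent passwd sn_discovery`) into something like the sketch below. The set of shells treated as non-interactive is my own assumption and is not exhaustive; Avaya-specific restricted shells would need to be added for your build.

```python
# Shells that refuse interactive logins; extend for your platform (assumption).
NON_INTERACTIVE_SHELLS = {"/sbin/nologin", "/usr/sbin/nologin", "/bin/false"}


def login_shell(passwd_line):
    """Return the shell field (7th) of a passwd-format line."""
    fields = passwd_line.rstrip("\n").split(":")
    if len(fields) != 7:
        raise ValueError("expected 7 colon-separated fields")
    return fields[6]


def shell_blocks_discovery(passwd_line):
    """True if the account's shell is one that refuses interactive logins."""
    return login_shell(passwd_line) in NON_INTERACTIVE_SHELLS
```

If the shell flips from /bin/bash to a restricted shell between a working and a failing run, that points to the same provisioning or policy process rather than to the key itself.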
7. Check Account Expiry / Lock Policies
Avaya systems may enforce:
Password expiry
Account lock/disable policies
Even if initial authentication succeeds, the account may later become unusable.
8. Correlate with Logs
In ServiceNow:
Check ECC Queue for SSH/authentication errors
Correlate timestamps with any Avaya-side sync or policy jobs
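For the timestamp correlation, a rough sketch like the following can help once you have exported the failure times from ServiceNow and the sync or policy job times from the Avaya side. The function name and the one-hour default window are assumptions; widen the window to at least your Discovery schedule interval, since a key removed by a sync only shows up as a failure on the next run.

```python
from datetime import datetime, timedelta


def failures_after_sync(failures, syncs, window=timedelta(hours=1)):
    """Map each failure time to the latest sync that preceded it within `window`.

    Failures with no sync inside the window are left out of the result.
    """
    matches = {}
    for failure in failures:
        preceding = [s for s in syncs if timedelta(0) <= failure - s <= window]
        if preceding:
            matches[failure] = max(preceding)
    return matches


if __name__ == "__main__":
    # Made-up timestamps, for illustration only.
    syncs = [datetime(2024, 1, 1, 2, 0), datetime(2024, 1, 2, 2, 0)]
    failures = [datetime(2024, 1, 1, 2, 30), datetime(2024, 1, 1, 12, 0)]
    for failure, sync in failures_after_sync(failures, syncs).items():
        print(f"failure at {failure} follows sync at {sync}")
```

If most failures map to a sync run and few or none fall outside the window, that is reasonable evidence to take to the Avaya administrators.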
3 hours ago
Thank you! This is very helpful. We’ll apply these checks on our Avaya/telephony subset and validate them step by step.
We’ll compare a working vs failing host, correlate with provisioning/sync timing, and test a dedicated discovery account excluded from sync.
I appreciate the guidance! I'll let you know how it works out!
