2 weeks ago
ServiceNow Discovery: SSH credentials work initially, then stop working later (credentials appear to be cleared/overwritten)
Hi everyone,
I’m looking for outside ideas on an intermittent Discovery issue.
Context:
- Environment includes Avaya/telephony devices and Linux servers.
- Discovery works at first, and targets are discovered successfully.
- After some time, Discovery can no longer authenticate over SSH.
- In some cases, it looks like the SSH credential mapping/reference is no longer usable (for example, empty credential reference at runtime).
- SNMP may still run, but the SSH authentication path fails.
Observed behavior:
- Re-adding the SSH key or recreating the discovery user restores discovery temporarily.
- Later, the problem comes back.
- This suggests something is changing after initial success (automation/policy/sync/rotation?).
What we already checked:
- SSH port is reachable.
- Server-side key files and permissions looked correct when we checked them manually.
- We have not identified any manual deletion process on our team's side.
- We suspect an automated process may be overwriting/removing credential data over time.
Questions:
- Has anyone seen Discovery credentials (especially SSH key-based) become invalid after initial successful runs?
- What are the most common root causes in ServiceNow for this pattern?
- Which logs/tables are best to prove what changed the credential reference (audit, scheduled jobs, integrations, MID activity)?
Any pointers, known defects, or troubleshooting checklists would be appreciated.
Thanks!
2 weeks ago
Hi @MarxA ,
Thanks for the clarification. This is a key observation: since the issue is isolated to Avaya/telephony devices, while other device groups in ServiceNow Discovery are working fine, it strongly points toward device-side behavior rather than a Discovery framework issue.
Based on similar scenarios, here are some Avaya-specific areas to validate:
1. Check if SSH Keys Are Being Overwritten
On Avaya systems, user environments are often managed by provisioning or sync tools (e.g., System Manager / LDAP). These processes can:
Recreate user profiles
Reset .ssh/authorized_keys
Remove previously added keys
Verify whether the SSH key used by Discovery still exists after failure and if the file timestamp changes automatically.
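A minimal sketch of that check in Python, to be run on the target host after a failure. It assumes the Discovery public key is available as a normal OpenSSH public key line; the function names and the simple whitespace split (which ignores quoted options containing spaces) are illustrative choices, not anything from ServiceNow or Avaya.

```python
import os
from datetime import datetime, timezone


def key_blob(public_key_line: str) -> str:
    """Return the base64 key blob from an OpenSSH public key line.

    Assumption: the blob is the first whitespace-separated field starting
    with "AAAA", which holds for standard OpenSSH key encodings.
    """
    for field in public_key_line.split():
        if field.startswith("AAAA"):
            return field
    raise ValueError("no key blob found in line")


def check_authorized_key(authorized_keys_path: str, public_key_line: str) -> dict:
    """Report whether the key is still present and when the file last changed."""
    wanted = key_blob(public_key_line)
    present = False
    with open(authorized_keys_path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#"):
                continue  # skip blank lines and comments
            if wanted in line.split():
                present = True
                break
    mtime = os.path.getmtime(authorized_keys_path)
    modified = datetime.fromtimestamp(mtime, tz=timezone.utc)
    return {"present": present, "modified_utc": modified.isoformat()}
```

Running it once right after re-adding the key and again after the next failure gives two modification times to compare; a changed timestamp with a missing key would suggest something rewrote the file.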
2. Validate SSH Key Persistence & Permissions
Avaya platforms are stricter with SSH:
~/.ssh → 700
authorized_keys → 600
Incorrect permissions can silently break authentication even if the key exists.
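The permission check can be scripted so it is easy to repeat on a working and a failing host. This is a sketch only: the expected modes are the ones listed above, and `permission_problems` is a hypothetical helper name.

```python
import os
import stat


def permission_problems(ssh_dir: str) -> list[str]:
    """Return a list of deviations from the 700 / 600 modes described above."""
    expected = {
        ssh_dir: 0o700,
        os.path.join(ssh_dir, "authorized_keys"): 0o600,
    }
    problems = []
    for path, want in expected.items():
        if not os.path.exists(path):
            problems.append(f"{path}: missing")
            continue
        # S_IMODE strips the file-type bits and keeps only the permission bits.
        mode = stat.S_IMODE(os.stat(path).st_mode)
        if mode != want:
            problems.append(f"{path}: mode {mode:o}, expected {want:o}")
    return problems
```

An empty list means both paths match; for example, `permission_problems(os.path.expanduser("~/.ssh"))` when run as the discovery user.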
3. Review Avaya Security / Hardening Policies
Check SSH configuration (sshd_config) for:
PubkeyAuthentication
AuthorizedKeysFile
Any restrictions enforcing password-only access
Some Avaya builds disable or override key-based authentication during policy enforcement.
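To compare these settings before and after a failure, a small parser like the one below may help. It is a simplified sketch: it assumes whitespace-separated `Keyword value` lines, keeps the first value seen for each keyword (which is how sshd treats repeated keywords), and stops at the first `Match` block instead of evaluating conditional overrides. `PasswordAuthentication` is included here as one directive that could indicate a password-only policy.

```python
def sshd_settings(
    config_text: str,
    keys=("pubkeyauthentication", "authorizedkeysfile", "passwordauthentication"),
) -> dict:
    """Extract selected global settings from sshd_config text."""
    found = {}
    for raw in config_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue
        name, value = parts[0].lower(), parts[1].strip()
        if name == "match":
            break  # conditional blocks are not evaluated in this sketch
        if name in keys and name not in found:
            found[name] = value  # first occurrence wins
    return found
```

A keyword missing from the result means the file leaves it at the OpenSSH default. Where you have root access, `sshd -T` prints the effective configuration and is the more reliable source.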
4. Check Provisioning / Sync Jobs (Most Common Cause)
This is typically the root cause in such cases.
👉 Validate if:
The user is managed via LDAP / Avaya System Manager
There are scheduled sync jobs resetting user configurations
If yes, these processes may remove or overwrite Discovery credentials after initial success.
5. Use a Dedicated Discovery User (Recommended)
Instead of shared/system accounts, create a dedicated user (e.g., sn_discovery) and:
Add SSH key manually
Exclude it from Avaya provisioning/sync
This helps ensure credential persistence.
6. Validate Shell Access
Some Avaya users are configured with restricted shells:
/sbin/nologin
Limited CLI environments
Ensure the user has a valid shell (e.g., /bin/bash) required for Discovery commands.
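A quick way to see which shell the account actually has is to read its passwd entry. The sketch below works on passwd-format text (for example the contents of /etc/passwd, or the output of `getent passwd` for centrally managed users); the set of non-interactive shells is an assumption and will not catch vendor-specific restricted CLIs.

```python
from typing import Optional

# Assumed list of shells that refuse command execution; extend as needed.
NON_INTERACTIVE = {"/sbin/nologin", "/usr/sbin/nologin", "/bin/false"}


def login_shell(passwd_text: str, username: str) -> Optional[str]:
    """Return the login shell for `username`, or None if the user is absent."""
    for line in passwd_text.splitlines():
        fields = line.split(":")
        if len(fields) == 7 and fields[0] == username:
            # An empty shell field means the system default, /bin/sh.
            return fields[6] or "/bin/sh"
    return None


def shell_allows_commands(passwd_text: str, username: str) -> bool:
    """True if the user exists and the shell is not a known non-interactive one."""
    shell = login_shell(passwd_text, username)
    return shell is not None and shell not in NON_INTERACTIVE
```

Comparing the result on a working host and a failing host shows whether a sync job changed the shell after the account was first set up.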
7. Check Account Expiry / Lock Policies
Avaya systems may enforce:
Password expiry
Account lock/disable policies
Even if initial authentication succeeds, the account may later become unusable.
8. Correlate with Logs
In ServiceNow:
Check ECC Queue for SSH/authentication errors
Correlate timestamps with any Avaya-side sync or policy jobs
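The timestamp correlation can be done by hand, but with more than a few hosts a short script is easier. The sketch below assumes you have already exported two lists of `datetime` values in the same time zone: Discovery authentication failures (for example from the ECC Queue) and Avaya-side sync or policy job runs. The one-hour window is an arbitrary starting point.

```python
from datetime import datetime, timedelta


def events_before_failures(failures, events, window=timedelta(hours=1)):
    """Map each failure time to the events that happened within `window` before it."""
    result = {}
    for failure in failures:
        result[failure] = [
            event
            for event in events
            if timedelta(0) <= failure - event <= window
        ]
    return result
```

If most failures have a sync or policy run shortly before them and successful runs do not, that supports the overwrite theory; failures with no preceding event point elsewhere, such as expiry or lock policies.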
2 weeks ago
Hi @MarxA ,
This behavior is typically caused by credential changes after initial success (rotation, overwrite, or access issues), not Discovery itself.
Below is a structured step-by-step solution aligned with ServiceNow best practices to identify and fix the issue.
Step-by-Step Solution
Step 1: Identify the Failing Credential
Go to Discovery → Status
Open a failed Discovery run
Check:
Which SSH credential was used
Error message (authentication / permission)
Step 2: Validate Credential Record
Navigate to:
Discovery → Credentials
Open the SSH credential
Verify:
Username
Private key / password
Active = true
Check Last updated and Updated by fields
Step 3: Check if Credential is Being Modified
Enable auditing (if not already):
Table:
discovery_credentials/ssh_private_key
Review:
History → Audit
Look for:
Unexpected updates
System or integration user changes
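If the audit history is long, the same information can be pulled through the REST Table API. The sketch below only builds the request URL for the `sys_audit` table; the instance host name, the field list, and the limit of 100 are assumptions for illustration, and rows will exist only if auditing was enabled for the credential table at the time of the change.

```python
from urllib.parse import urlencode


def audit_query_url(instance: str, table: str, since: str) -> str:
    """Build a Table API URL listing audit rows for `table` created on or after `since`.

    `instance` is a host name such as "example.service-now.com" (hypothetical);
    `since` is a "YYYY-MM-DD HH:MM:SS" string.
    """
    query = f"tablename={table}^sys_created_on>={since}^ORDERBYDESCsys_created_on"
    params = {
        "sysparm_query": query,
        "sysparm_fields": "documentkey,fieldname,oldvalue,newvalue,user,sys_created_on",
        "sysparm_limit": "100",
    }
    return f"https://{instance}/api/now/table/sys_audit?{urlencode(params)}"
```

The resulting URL can be requested with any HTTP client using an account that is allowed to read `sys_audit`. The `user` column then shows whether an integration or system account made the change.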
Step 4: Check for External Credential Rotation
Verify if credentials are managed by:
CyberArk / Vault / any external tool
Confirm:
Whether SSH keys/passwords are rotated periodically
If yes:
Update Discovery to always use latest credential
Avoid hardcoded or outdated keys
Step 5: Validate MID Server Behavior
Go to:
MID Server → Servers
Check:
Status = Up
Validated
Restart the MID Server (for testing purposes)
This can help rule out a stale credential cache on the MID Server
Step 6: Check ECC Queue
Navigate to System Logs → ECC Queue
Filter:
Topic contains SSH / Discovery
Review:
Input/output payload
Errors related to authentication
Step 7: Validate Target Server (Linux)
On target machine:
Check:
authorized_keys file
File permissions (600 / 700)
Confirm:
SSH key still exists
User is not locked/expired
Step 8: Review Scheduled Jobs / Integrations
Go to:
System Scheduler → Scheduled Jobs
Check for:
Jobs updating credentials
Import sets / sync processes
Disable temporarily (for testing)
2 weeks ago
Thanks, this is very helpful.
We’ve already started checking these steps, and I want to clarify one key point: this issue is not global across our Discovery environment. It is only happening on a specific subset of Avaya/telephony devices and related servers.
Other device groups using the same Discovery framework are stable, which suggests this may be tied to Avaya-specific behavior (account/key handling, sync/provisioning, or policy on those systems) rather than a general ServiceNow Discovery problem.
If anyone has seen this specifically with Avaya devices/servers, I’d appreciate targeted guidance on:
- Avaya-side processes that may overwrite/remove SSH credentials after initial success
- Known interactions between Avaya management/sync tools and SSH key persistence
- Best way to keep Discovery credentials persistent for this device family
2 weeks ago
Thank you! This is very helpful. We’ll apply these checks on our Avaya/telephony subset and validate them step by step.
We’ll compare a working vs failing host, correlate with provisioning/sync timing, and test a dedicated discovery account excluded from sync.
I appreciate the guidance! I'll let you know how it works out!
