ServiceNow Discovery: SSH credentials work initially, then stop working later

MarxA · ‎03-25-2026

ServiceNow Discovery: SSH credentials work initially, then stop working later (credentials appear to be cleared/overwritten)

Hi everyone,
I’m looking for outside ideas on an intermittent Discovery issue.

Context:

Environment includes Avaya/telephony devices and Linux servers.
Discovery works at first, and targets are discovered successfully.
After some time, Discovery can no longer authenticate over SSH.
In some cases, it looks like the SSH credential mapping/reference is no longer usable (for example, empty credential reference at runtime).
SNMP probes may still succeed, but SSH authentication fails from that point on.

Observed behavior:

Re-adding the SSH key or recreating the discovery user restores discovery temporarily.
Later, the problem comes back.
This suggests something is changing after initial success (automation/policy/sync/rotation?).

What we already checked:

SSH port is reachable.
Manual server-side key files/permissions looked correct at check time.
No obvious manual deletion process identified on our team side.
We suspect an automated process may be overwriting/removing credential data over time.

Questions:

Has anyone seen Discovery credentials (especially SSH key-based) become invalid after initial successful runs?
What are the most common root causes in ServiceNow for this pattern?
Which logs/tables are best to prove what changed the credential reference (audit, scheduled jobs, integrations, MID activity)?

Any pointers, known defects, or troubleshooting checklists would be appreciated.

Thanks!

ayushraj7012933 · ‎03-25-2026

Hi @MarxA ,

This behavior is typically caused by credential changes after initial success (rotation, overwrite, or access issues), not Discovery itself.

Below is a structured step-by-step solution aligned with ServiceNow best practices to identify and fix the issue.

Step-by-Step Solution

Step 1: Identify the Failing Credential

Go to Discovery → Status
Open a failed Discovery run
Check:
- Which SSH credential was used
- Error message (authentication / permission)

Step 2: Validate Credential Record

Navigate to:
- Discovery → Credentials
Open the SSH credential
Verify:
- Username
- Private key / password
- Active = true

Check the Updated and Updated by fields (sys_updated_on, sys_updated_by)

Step 3: Check if Credential is Being Modified

Enable auditing (if not already):
- Table: discovery_credentials and its SSH private key extension (ssh_private_key_credentials)
Review:
- History → Audit

Look for:

Unexpected updates
System or integration user changes
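
If clicking through audit history per record gets tedious, the same information can be pulled over the REST Table API. Below is a minimal sketch, not a verified procedure: it only builds the request URL for the sys_audit table. The instance name is a placeholder, the function name is hypothetical, and it assumes auditing is already enabled on the credential table. Depending on how your instance records extended tables, the tablename value may need to be the specific credential class rather than the base table.

```python
from urllib.parse import urlencode


def audit_query_url(instance, table, limit=50):
    """Build a Table API URL that lists recent sys_audit rows for one table.

    `instance` is a host name such as "example.service-now.com" (placeholder).
    """
    params = {
        # Encoded query: rows for this table, newest first.
        "sysparm_query": f"tablename={table}^ORDERBYDESCsys_created_on",
        # Columns that show what changed, who changed it, and when.
        "sysparm_fields": "documentkey,fieldname,oldvalue,newvalue,user,sys_created_on",
        "sysparm_limit": str(limit),
    }
    return f"https://{instance}/api/now/table/sys_audit?{urlencode(params)}"


if __name__ == "__main__":
    print(audit_query_url("example.service-now.com", "discovery_credentials"))
```

Send the URL with any HTTP client using an account that can read sys_audit. The documentkey column holds the sys_id of the changed credential record, so you can filter down to the one credential that Discovery reports as failing.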

Step 4: Check for External Credential Rotation

Verify if credentials are managed by:
- CyberArk / Vault / any external tool
Confirm:
- Whether SSH keys/passwords are rotated periodically

If yes:

Update Discovery to always use latest credential
Avoid hardcoded or outdated keys

Step 5: Validate MID Server Behavior

Go to:
- MID Server → Servers
Check:
- Status = Up
- Validated
Restart the MID Server (for testing only)

A restart rules out stale credentials cached on the MID Server.

Step 6: Check ECC Queue

Navigate to ECC → Queue
Filter:
- Topic contains SSH / Discovery
Review:
- Input/output payload
- Errors related to authentication

Step 7: Validate Target Server (Linux)

On target machine:

Check:
- authorized_keys file
- File permissions (600 / 700)
Confirm:
- SSH key still exists
- User is not locked/expired

Step 8: Review Scheduled Jobs / Integrations

Go to:
- System Scheduler → Scheduled Jobs
Check for:
- Jobs updating credentials
- Import sets / sync processes

Disable temporarily (for testing)

MarxA · ‎03-25-2026

Thanks, this is very helpful.

We’ve already started checking these steps, and I want to clarify one key point: this issue is not global across our Discovery environment. It is only happening on a specific subset of Avaya/telephony devices and related servers.

Other device groups using the same Discovery framework are stable, which suggests this may be tied to Avaya-specific behavior (account/key handling, sync/provisioning, or policy on those systems) rather than a general ServiceNow Discovery problem.

If anyone has seen this specifically with Avaya devices/servers, I’d appreciate targeted guidance on:

Avaya-side processes that may overwrite/remove SSH credentials after initial success
Known interactions between Avaya management/sync tools and SSH key persistence
Best way to keep Discovery credentials persistent for this device family

ayushraj7012933 · ‎03-25-2026

Hi @MarxA ,

Thanks for the clarification; this is a key observation. Since the issue is isolated to Avaya/telephony devices while other device groups in ServiceNow Discovery are working fine, it points strongly toward device-side behavior rather than a Discovery framework issue.

Based on similar scenarios, here are some Avaya-specific areas to validate:

1. Check if SSH Keys Are Being Overwritten

On Avaya systems, user environments are often managed by provisioning or sync tools (e.g., System Manager / LDAP). These processes can:

Recreate user profiles
Reset .ssh/authorized_keys
Remove previously added keys

After a failure, verify whether the public key used by Discovery is still present in authorized_keys, and whether the file's modification time has changed without anyone on your team touching it.
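
One way to catch an overwrite in the act is to snapshot authorized_keys while Discovery works and compare after it fails. This is a rough sketch that assumes Python is available on the target, or that you copy the file off the host. The function names and the key blob in the usage note are made up for illustration.

```python
import hashlib
from pathlib import Path


def snapshot(path):
    """Record modification time, size, and content hash of a file."""
    p = Path(path)
    data = p.read_bytes()
    return {
        "mtime": p.stat().st_mtime,
        "size": len(data),
        "sha256": hashlib.sha256(data).hexdigest(),
    }


def key_present(path, public_key_blob):
    """Return True if the base64 key blob is a token on a non-comment line."""
    for line in Path(path).read_text().splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        if public_key_blob in stripped.split():
            return True
    return False


def describe_change(before, after):
    """Compare two snapshots taken at different times."""
    if before["sha256"] != after["sha256"]:
        return "content changed"
    if before["mtime"] != after["mtime"]:
        return "rewritten with identical content"
    return "unchanged"
```

Take `snapshot("/home/sn_discovery/.ssh/authorized_keys")` right after re-adding the key (the path assumes a user named sn_discovery), store the result, and run it again once Discovery fails. A "content changed" result together with `key_present(...)` returning False would support the overwrite theory, and the new mtime gives you a timestamp to match against provisioning jobs.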

2. Validate SSH Key Persistence & Permissions

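On Avaya platforms, as on any host running OpenSSH with StrictModes enabled (the default), key authentication is refused when the home directory, ~/.ssh, or authorized_keys is writable by anyone other than the owner. Expected modes: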

~/.ssh → 700
authorized_keys → 600

Incorrect permissions can silently break authentication even if the key exists.
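
A small script can check the modes on every affected host instead of eyeballing ls output. This is a sketch under the assumption that the expected modes are exactly the 700/600 values listed above. OpenSSH itself mainly objects to group or world write access, so treat an exact-match failure as a prompt to look closer, not as proof.

```python
import os
import stat

# Expected modes from the checklist: the .ssh directory itself and the key file.
EXPECTED_MODES = {".": 0o700, "authorized_keys": 0o600}


def permission_problems(ssh_dir):
    """Return human-readable findings for a user's .ssh directory."""
    problems = []
    for name, expected in EXPECTED_MODES.items():
        path = os.path.normpath(os.path.join(ssh_dir, name))
        try:
            mode = stat.S_IMODE(os.stat(path).st_mode)
        except FileNotFoundError:
            problems.append(f"{path}: missing")
            continue
        if mode != expected:
            problems.append(f"{path}: mode {mode:o}, expected {expected:o}")
    return problems


if __name__ == "__main__":
    for finding in permission_problems(os.path.expanduser("~/.ssh")):
        print(finding)
```

Run it as the Discovery user on one working host and one failing host. An empty result on both means permissions are probably not the cause.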

3. Review Avaya Security / Hardening Policies

Check SSH configuration (sshd_config) for:

PubkeyAuthentication
AuthorizedKeysFile
Any restrictions enforcing password-only access

Some Avaya builds disable or override key-based authentication during policy enforcement.
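
To compare sshd_config between a working and a failing host, a simple parser is enough. The sketch below reads only the global section and stops at the first Match block. It follows the OpenSSH rule that the first value given for a keyword wins, and it ignores Include directives, so it is an approximation. Where you have root access, `sshd -T` prints the effective configuration and is more reliable.

```python
def global_sshd_options(config_text):
    """Parse the global section of an sshd_config into {keyword: value}.

    Keywords are case-insensitive in sshd_config, so they are lowercased.
    Parsing stops at the first Match block because later lines are conditional.
    """
    options = {}
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        keyword = parts[0].lower()
        if keyword == "match":
            break
        value = parts[1].strip() if len(parts) > 1 else ""
        # First occurrence wins, so never overwrite an existing entry.
        options.setdefault(keyword, value)
    return options


def pubkey_auth_enabled(options):
    """PubkeyAuthentication defaults to yes when it is not set."""
    return options.get("pubkeyauthentication", "yes").lower() == "yes"
```

Feed it the file contents from both hosts and diff the resulting dictionaries. A difference in pubkeyauthentication or authorizedkeysfile between the two would point at policy enforcement rather than key deletion.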

4. Check Provisioning / Sync Jobs (Most Common Cause)

This is typically the root cause in such cases.

👉 Validate if:

The user is managed via LDAP / Avaya System Manager
There are scheduled sync jobs resetting user configurations

If yes, these processes may remove or overwrite Discovery credentials after initial success.

5. Use a Dedicated Discovery User (Recommended)

Instead of shared/system accounts, create a dedicated user (e.g., sn_discovery) and:

Add SSH key manually
Exclude it from Avaya provisioning/sync

This helps ensure credential persistence.

6. Validate Shell Access

Some Avaya users are configured with restricted shells:

/sbin/nologin
Limited CLI environments

Ensure the user has a valid shell (e.g., /bin/bash) required for Discovery commands.
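
The login shell is the last field of the user's /etc/passwd entry, which makes this check easy to script. This is a minimal sketch: the set of "blocking" shells is my assumption of the common ones, and a vendor-specific restricted CLI would not be caught by it.

```python
# Shells that refuse to run commands; an illustrative, non-exhaustive set.
NON_INTERACTIVE_SHELLS = {
    "/sbin/nologin",
    "/usr/sbin/nologin",
    "/bin/false",
    "/usr/bin/false",
}


def login_shell(passwd_line):
    """Extract the shell from one /etc/passwd line (7 colon-separated fields)."""
    fields = passwd_line.rstrip("\n").split(":")
    if len(fields) != 7:
        raise ValueError("expected 7 colon-separated fields")
    # An empty shell field means the system default, /bin/sh.
    return fields[6] or "/bin/sh"


def shell_allows_commands(shell):
    """Return False for shells known to reject command execution."""
    return shell not in NON_INTERACTIVE_SHELLS
```

On a live host you can get the line with `getent passwd sn_discovery` (assuming that user name) and pass it to `login_shell`. If the shell flips from /bin/bash to a nologin shell between a working and a failing run, a sync job is probably resetting the account.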

7. Check Account Expiry / Lock Policies

Avaya systems may enforce:

Password expiry
Account lock/disable policies

Even if initial authentication succeeds, the account may later become unusable.
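
On Linux-based systems these policies show up in the user's /etc/shadow entry, which you can read as root. The sketch below interprets the standard shadow fields (last change, maximum age, and account expiry, all counted in days since 1970-01-01). It assumes the Avaya hosts use a standard shadow file, which I have not verified, and whether an expired password also blocks key-based logins depends on the PAM configuration.

```python
import time


def shadow_flags(shadow_line, today=None):
    """Report lock/expiry conditions visible in one /etc/shadow line.

    `today` is the current date in days since the epoch (defaults to now).
    """
    if today is None:
        today = int(time.time() // 86400)
    fields = shadow_line.rstrip("\n").split(":")
    fields += [""] * (9 - len(fields))  # tolerate short lines
    password, last_change, max_age, expire = fields[1], fields[2], fields[4], fields[7]

    flags = []
    if password.startswith("!"):
        flags.append("password locked")
    if last_change and max_age and int(last_change) + int(max_age) < today:
        flags.append("password expired")
    if expire and int(expire) <= today:
        flags.append("account expired")
    return flags
```

For example, `shadow_flags("svc:$6$hash:19990:0:99999:7::19995:", today=20000)` returns `["account expired"]`. Comparing the flags for the Discovery user on a working and a failing host shows quickly whether an account policy is involved.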

8. Correlate with Logs

In ServiceNow:

Check ECC Queue for SSH/authentication errors
Correlate timestamps with any Avaya-side sync or policy jobs
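
Once you have failure times from the ECC Queue and run times for the Avaya-side jobs, the correlation itself is simple to script. Here is a sketch assuming you have exported both as lists of datetimes. The one-hour window is an arbitrary starting point and the timestamps in the usage note are invented.

```python
from datetime import datetime, timedelta


def preceding_events(failures, jobs, window=timedelta(hours=1)):
    """Map each failure time to the latest job that ran within `window` before it.

    Failures with no job in the window map to None.
    """
    ordered_jobs = sorted(jobs)
    result = {}
    for failure in failures:
        candidates = [j for j in ordered_jobs if failure - window <= j <= failure]
        result[failure] = candidates[-1] if candidates else None
    return result


if __name__ == "__main__":
    failures = [datetime(2026, 3, 25, 10, 30), datetime(2026, 3, 25, 18, 0)]
    jobs = [datetime(2026, 3, 25, 10, 0), datetime(2026, 3, 25, 12, 0)]
    for failure, job in preceding_events(failures, jobs).items():
        print(failure, "<-", job)
```

If most first failures have a sync or policy job shortly before them, that job is the prime suspect. Make sure both sources use the same time zone before comparing.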

MarxA · ‎03-25-2026

Thank you! This is very helpful. We’ll apply these checks on our Avaya/telephony subset and validate them step by step.

We’ll compare a working vs failing host, correlate with provisioning/sync timing, and test a dedicated discovery account excluded from sync.

I appreciate the guidance! I'll let you know how it works out!