ServiceNow Discovery: SSH credentials work initially, then stop working later

MarxA
Tera Contributor

ServiceNow Discovery: SSH credentials work initially, then stop working later (credentials appear to be cleared/overwritten)

Hi everyone,
I’m looking for outside ideas on an intermittent Discovery issue.

Context:

  • Environment includes Avaya/telephony devices and Linux servers.
  • Discovery works at first, and targets are discovered successfully.
  • After some time, Discovery can no longer authenticate over SSH.
  • In some cases, the SSH credential mapping/reference appears to be unusable (for example, an empty credential reference at runtime).
  • SNMP may still run, but the SSH authentication path fails.

Observed behavior:

  • Re-adding the SSH key or recreating the discovery user restores discovery temporarily.
  • Later, the problem comes back.
  • This suggests something is changing after initial success (automation/policy/sync/rotation?).

What we already checked:

  • SSH port is reachable.
  • A manual check of the server-side key files and permissions showed nothing wrong at the time.
  • No obvious manual deletion process identified on our team side.
  • We suspect an automated process may be overwriting/removing credential data over time.

Questions:

  1. Has anyone seen Discovery credentials (especially SSH key-based) become invalid after initial successful runs?
  2. What are the most common root causes in ServiceNow for this pattern?
  3. Which logs/tables are best to prove what changed the credential reference (audit, scheduled jobs, integrations, MID activity)?

Any pointers, known defects, or troubleshooting checklists would be appreciated.

Thanks!

 

1 ACCEPTED SOLUTION

Hi @MarxA ,

Thanks for the clarification; this is a key observation. Since the issue is isolated to Avaya/telephony devices while other device groups in ServiceNow Discovery work fine, this points strongly toward device-side behavior rather than a Discovery framework issue.

Based on similar scenarios, here are some Avaya-specific areas to validate:

1. Check if SSH Keys Are Being Overwritten

On Avaya systems, user environments are often managed by provisioning or sync tools (e.g., System Manager / LDAP). These processes can:

  • Recreate user profiles

  • Reset .ssh/authorized_keys

  • Remove previously added keys

After a failure, verify whether the SSH key used by Discovery is still present in authorized_keys and whether the file's modification timestamp changed without anyone on your team touching it.
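
One way to make that check repeatable is a small script run on the target before and after a failure. This is only a minimal sketch: the function names are made up for illustration, and it assumes authorized_keys entries have no options containing tokens that look like key types. It computes the same SHA256 fingerprint style that ssh-keygen prints, so you can compare against the key you loaded into the Discovery credential.

```python
import base64
import hashlib
import os
from datetime import datetime, timezone


def key_fingerprint(pubkey_line: str) -> str:
    """Return the SHA256 fingerprint of one public-key line."""
    fields = pubkey_line.split()
    for i, field in enumerate(fields[:-1]):
        # The base64 key blob directly follows the key type token.
        if field.startswith(("ssh-", "ecdsa-", "sk-")):
            blob = base64.b64decode(fields[i + 1])
            digest = hashlib.sha256(blob).digest()
            return "SHA256:" + base64.b64encode(digest).decode().rstrip("=")
    raise ValueError("no public key found in line")


def key_present(authorized_keys_text: str, fingerprint: str) -> bool:
    """True if any entry in the file content matches the fingerprint."""
    for line in authorized_keys_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        try:
            if key_fingerprint(line) == fingerprint:
                return True
        except ValueError:
            continue  # skip malformed entries
    return False


def snapshot(path: str, fingerprint: str) -> dict:
    """Record whether the key is present and when the file last changed."""
    with open(path, encoding="utf-8") as handle:
        text = handle.read()
    mtime = datetime.fromtimestamp(os.stat(path).st_mtime, tz=timezone.utc)
    return {"present": key_present(text, fingerprint), "mtime_utc": mtime.isoformat()}
```

Calling snapshot() on the discovery user's authorized_keys right after re-adding the key, and again after the next failure, shows whether the key disappeared and roughly when the file was rewritten.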

2. Validate SSH Key Persistence & Permissions

Avaya platforms tend to be strict about SSH file permissions. Expected modes:

  • ~/.ssh: 700

  • authorized_keys: 600

Incorrect permissions can silently break authentication even if the key exists.
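
If you want to check this the same way every time, something like the following could be run as the discovery user (or as root with that user's home path). It is a rough sketch that only compares against the two modes listed above; the function name and the EXPECTED_MODES table are illustrative, not part of any Avaya or ServiceNow tooling.

```python
import os
import stat

# Modes taken from the list above; adjust if your platform documents others.
EXPECTED_MODES = {".ssh": 0o700, os.path.join(".ssh", "authorized_keys"): 0o600}


def permission_problems(home: str) -> list:
    """Return a list of human-readable problems found under the given home."""
    problems = []
    for relative, expected in EXPECTED_MODES.items():
        path = os.path.join(home, relative)
        try:
            mode = stat.S_IMODE(os.stat(path).st_mode)
        except FileNotFoundError:
            problems.append(f"{path}: missing")
            continue
        if mode != expected:
            problems.append(f"{path}: mode {mode:o}, expected {expected:o}")
    return problems
```

An empty list means both paths exist with the expected modes. Anything else is worth comparing against the time of the first failed Discovery run.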

3. Review Avaya Security / Hardening Policies

Check SSH configuration (sshd_config) for:

  • PubkeyAuthentication

  • AuthorizedKeysFile

  • Any restrictions enforcing password-only access

Some Avaya builds disable or override key-based authentication during policy enforcement.
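
To compare the configuration before and after a suspected policy run, a small parser like the one below may help. It is a simplified sketch with assumptions: it reads only the global section of the file (it stops at the first Match block), ignores Include directives, and expects whitespace between keyword and value. It relies on the documented sshd_config behavior that the first value found for a keyword is the one used. Where you have root access, the output of sshd -T is the more reliable view of the effective settings.

```python
WATCHED = ("pubkeyauthentication", "authorizedkeysfile", "passwordauthentication")


def global_sshd_settings(config_text: str) -> dict:
    """Extract the watched keywords from the global part of an sshd_config."""
    settings = {}
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        keyword = parts[0].lower()  # sshd_config keywords are case-insensitive
        if keyword == "match":
            break  # conditional blocks are out of scope for this sketch
        if keyword in WATCHED and keyword not in settings:
            # First occurrence wins, so later duplicates are ignored.
            settings[keyword] = parts[1].strip() if len(parts) > 1 else ""
    return settings
```

Saving the returned dictionary after each check makes it easy to see whether PubkeyAuthentication or AuthorizedKeysFile changed between a working and a failing run.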

4. Check Provisioning / Sync Jobs (Most Common Cause)

This is typically the root cause in such cases.

👉 Validate if:

  • The user is managed via LDAP / Avaya System Manager

  • There are scheduled sync jobs resetting user configurations

If yes, these processes may remove or overwrite Discovery credentials after initial success.

5. Use a Dedicated Discovery User (Recommended)

Instead of shared/system accounts, create a dedicated user (e.g., sn_discovery) and:

  • Add SSH key manually

  • Exclude it from Avaya provisioning/sync

This helps ensure credential persistence.

6. Validate Shell Access

Some Avaya users are configured with restricted shells:

  • /sbin/nologin

  • Limited CLI environments

Ensure the user has a valid shell (e.g., /bin/bash) required for Discovery commands.
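
A quick way to confirm the shell is to read the user's entry in passwd format (for example from /etc/passwd or the output of getent passwd). The sketch below assumes the standard seven-field layout; the set of non-interactive shells is an illustrative guess and will not cover vendor-specific restricted CLIs.

```python
# Common "no login" shells; a vendor CLI wrapper would need to be added by hand.
NON_INTERACTIVE_SHELLS = {"/sbin/nologin", "/usr/sbin/nologin", "/bin/false"}


def login_shell(passwd_text: str, username: str):
    """Return the login shell for username, or None if the user is not listed."""
    for line in passwd_text.splitlines():
        fields = line.strip().split(":")
        if len(fields) == 7 and fields[0] == username:
            # An empty shell field means the system default, /bin/sh.
            return fields[6] or "/bin/sh"
    return None


def shell_allows_commands(shell) -> bool:
    """True if the shell is set and is not a known non-interactive shell."""
    return bool(shell) and shell not in NON_INTERACTIVE_SHELLS
```

If the shell changes between a working and a failing run, that points to the account being re-provisioned rather than to a key problem.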

7. Check Account Expiry / Lock Policies

Avaya systems may enforce:

  • Password expiry

  • Account lock/disable policies

Even if initial authentication succeeds, the account may later become unusable.
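
On Linux-based nodes where you can read the account's shadow entry as root, the aging fields can be checked with a sketch like this. It assumes the standard nine-field shadow format, and the day arithmetic is approximate at the boundaries, so treat it as a hint rather than a verdict (chage -l for the user reports the same data). Whether a locked password also blocks key-based logins depends on the sshd and PAM configuration.

```python
from datetime import date, timedelta

EPOCH = date(1970, 1, 1)


def shadow_status(shadow_line: str, today: date) -> dict:
    """Summarize lock and expiry state from one shadow-format line."""
    fields = shadow_line.strip().split(":")
    if len(fields) != 9:
        raise ValueError("expected 9 colon-separated fields")
    _name, password, last_change, _min, max_age, _warn, _inactive, expire, _ = fields
    status = {
        "password_locked": password.startswith("!"),
        "account_expired": False,
        "password_expired": False,
    }
    if expire:
        # Account expiry is stored as days since 1970-01-01.
        status["account_expired"] = today >= EPOCH + timedelta(days=int(expire))
    if last_change and max_age:
        # Password is due for change max_age days after the last change.
        due = EPOCH + timedelta(days=int(last_change) + int(max_age))
        status["password_expired"] = today > due
    return status
```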

8. Correlate with Logs

In ServiceNow:

  • Check ECC Queue for SSH/authentication errors

  • Correlate timestamps with any Avaya-side sync or policy jobs
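
For the correlation itself, the useful window is between the last successful Discovery run and the first failed one, since whatever broke the credential must have happened in between. Here is a minimal sketch, assuming you have already collected the Avaya-side events (sync runs, policy jobs, and so on) as name/timestamp pairs; the function name and input shape are illustrative only.

```python
from datetime import datetime


def suspects_between(last_success: datetime, first_failure: datetime, events: list) -> list:
    """Return (name, timestamp) events that ran after the last success
    and no later than the first failure, oldest first."""
    in_window = [
        (name, ran_at)
        for name, ran_at in events
        if last_success < ran_at <= first_failure
    ]
    return sorted(in_window, key=lambda item: item[1])
```

If the same job name shows up in that window across several failure cycles, it is a strong candidate for whatever is resetting the account or key. Make sure all timestamps are in the same time zone before comparing them.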


6 REPLIES

Tanushree Maiti
Kilo Patron

Hi @MarxA,

Probable causes:

1. Private keys must be in OpenSSH format rather than PuTTY's default .ppk format.

2. The MID Server might try to use outdated cached credentials.

3. If the MID Server was upgraded, it may no longer support older key algorithms. A key that uses one of those algorithms will fail.
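
For the first point, the format of a private key file can usually be told from its first line. The sketch below is a rough classifier based on the well-known header strings; the function name and return labels are made up for illustration, and it does not validate the key or identify its algorithm.

```python
def private_key_format(key_text: str) -> str:
    """Classify a private key by its header line."""
    stripped = key_text.strip()
    first_line = stripped.splitlines()[0].strip() if stripped else ""
    if first_line.startswith("PuTTY-User-Key-File-"):
        return "putty-ppk"  # needs conversion before use
    if first_line == "-----BEGIN OPENSSH PRIVATE KEY-----":
        return "openssh"
    if first_line.startswith("-----BEGIN ") and first_line.endswith("PRIVATE KEY-----"):
        return "pem"  # e.g. "BEGIN RSA PRIVATE KEY"
    return "unknown"
```

Running this against the key pasted into the credential record is a quick sanity check before digging into cache or algorithm issues.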

 

Refer to the following post for a solution:

 

https://www.servicenow.com/community/itom-forum/discovery-credentials-ssh-private-key/m-p/949068#:~:....

Regards
Tanushree Maiti
ServiceNow Technical Architect

Thanks, this is helpful.

We validated most of those points as well (key format, MID mapping, and account setup), but our issue appears to be more specific:

  • It only affects a subset of Avaya/telephony devices and related servers.
  • Other Linux/server groups using the same Discovery + MID setup remain stable.
  • Initial discovery works, then later SSH auth fails until we re-add the key or recreate the service account.

So this looks less like a global PEM/PPK setup problem and more like a post-discovery drift on that Avaya subset (credential overwrite, account reset, key/policy sync, or automation on endpoint side).


If anyone has seen this specifically with Avaya environments, I’d appreciate guidance on:

  1. Avaya processes that may reset service accounts or authorized_keys over time
  2. Known key/algorithm compatibility changes after patch/upgrade on telephony nodes