4/10 Q. ITOM Project Help !! (Please No Paste CHATGPT ANSWER)

SandeepKSingh
Kilo Sage

I must ensure Discovery and outbound integrations keep running even if one MID Server goes down. How can we build a high-availability and failover setup for two (or more) MID Servers on the same network, so that:

  • Jobs are load-balanced during normal operation.

  • If MID-A fails, jobs auto-failover to MID-B (and vice-versa) with minimal disruption.

  • The design remains upgrade-safe and easy to support.

2 ACCEPTED SOLUTIONS

Ravi Gaurav
Giga Sage

Hi @SandeepKSingh 
As you probably know, every MID Server sends a heartbeat to the ServiceNow instance (by default every 40 seconds).

If that heartbeat stops arriving (because the service crashed, the server rebooted, or the network connection broke), ServiceNow marks the MID Server as Down.

You can see this in MID Servers > Status (shows Up, Down, Validate).

 

 

As I understand it, you need Discovery and outbound integrations to keep running even if one MID Server goes down, with these requirements:

Jobs are load-balanced during normal operation.

If MID-A fails, jobs auto-failover to MID-B (and vice-versa) with minimal disruption.

The design remains upgrade-safe and easy to support.

 

Example:

MID-A and MID-B: Same site/subnet

Selection method: MID Server Cluster.

Cluster: the one containing MID-A and MID-B.

Keep IP ranges common to the cluster (no hard-binding to a single MID).

Configure monitoring/notifications when a MID goes Down (e.g., notify NOC if heartbeat older than X minutes).
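To make the "heartbeat older than X minutes" check concrete, here is a minimal sketch in plain Python. It is not ServiceNow code: the record shape (`name`, `last_heartbeat`), the function name `stale_mids`, and the 5-minute threshold are all assumptions for illustration. In a real instance the data would come from the MID Server records, and the alert itself would be a notification configured on the platform.

```python
from datetime import datetime, timedelta, timezone

def stale_mids(mids, now, max_age_minutes):
    """Return names of MID Servers whose last heartbeat is older than the threshold.

    `mids` is a list of dicts with hypothetical keys "name" and "last_heartbeat".
    """
    threshold = timedelta(minutes=max_age_minutes)
    return [m["name"] for m in mids if now - m["last_heartbeat"] > threshold]

# Illustrative data only: MID-A reported 30 seconds ago, MID-B 12 minutes ago.
now = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
mids = [
    {"name": "MID-A", "last_heartbeat": now - timedelta(seconds=30)},
    {"name": "MID-B", "last_heartbeat": now - timedelta(minutes=12)},
]

# With a 5-minute threshold, only MID-B would trigger a NOC notification.
alerts = stale_mids(mids, now, max_age_minutes=5)
print(alerts)
```

Pick a threshold comfortably larger than the heartbeat interval, so that a single missed heartbeat does not page the NOC.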

 

 

--------------------------------------------------------------------------------------------------------------------------


If you found my response helpful, I would greatly appreciate it if you could mark it as "Accepted Solution" and "Helpful."
Your support not only benefits the community but also encourages me to continue assisting. Thank you so much!

Thanks and Regards
Ravi Gaurav | ServiceNow MVP 2025,2024 | ServiceNow Practice Lead | Solution Architect
CGI
M.Tech in Data Science & AI

 YouTube: https://www.youtube.com/@learnservicenowwithravi
 LinkedIn: https://www.linkedin.com/in/ravi-gaurav-a67542aa/


@Ravi Gaurav 

 

As per my understanding, it is not mandatory to set up heartbeat monitoring; MID Server cluster configuration alone is enough for failover to work.

 

You can refer to the Business Rule 'MID Server Cluster Management' for more information on the logic.


https://www.servicenow.com/docs/bundle/zurich-servicenow-platform/page/product/mid-server/reference/...

 

https://noderegister.service-now.com/kb?id=kb_article_view&sysparm_article=KB0661756

 

https://www.servicenow.com/docs/bundle/zurich-servicenow-platform/page/product/mid-server/concept/c_...

 

Monitoring heartbeat, failed jobs, and job status in the ServiceNow MID Server Dashboard is useful for tracking the overall health of MID Servers.

 

If this helped to answer your query, please mark it helpful & accept the solution.

 

Thanks,

Bhuvan


11 REPLIES

You’re right that MID Server Cluster configuration alone is sufficient for ServiceNow to perform load-balancing and failover of jobs — that’s handled by the Business Rule: “MID Server Cluster Management”, which automatically re-routes work when one node in the cluster goes down.

 

But in a real-world production setup, relying only on cluster logic without health/heartbeat monitoring is risky:

 

Failover Logic: Yes, ServiceNow will shift jobs to another MID in the same cluster when one goes Down. No manual heartbeat config is needed for this — it’s built-in.

 

Operational Monitoring: However, heartbeat status is the only visibility admins have into whether a MID is healthy. Without monitoring alerts, you might not even know a node is dead until jobs start failing or users complain.

 

That’s why I suggested both — cluster config for HA and heartbeat alerts for operational assurance.
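As a purely conceptual illustration of what "load-balance while everything is Up, fail over when a node goes Down" means, here is a toy model in Python. It is not the logic of the "MID Server Cluster Management" Business Rule: the round-robin policy, the `assign_jobs` function, and the status dictionary are assumptions made for the sketch.

```python
from itertools import cycle

def assign_jobs(jobs, members):
    """Distribute jobs round-robin across cluster members whose status is 'Up'.

    `members` maps a MID Server name to its status string (hypothetical shape).
    """
    up = [name for name, status in members.items() if status == "Up"]
    if not up:
        raise RuntimeError("no MID Server in the cluster is Up")
    rotation = cycle(up)
    return {job: next(rotation) for job in jobs}

jobs = ["job-1", "job-2", "job-3", "job-4"]

# Normal operation: work alternates between both members.
normal = assign_jobs(jobs, {"MID-A": "Up", "MID-B": "Up"})

# MID-A is Down: every job lands on MID-B.
failover = assign_jobs(jobs, {"MID-A": "Down", "MID-B": "Up"})
```

The model also shows why monitoring still matters: with MID-A Down the jobs keep flowing, so nothing in the job results alone tells an admin that the cluster has lost its redundancy.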

Thanks and Regards
Ravi Gaurav | ServiceNow MVP 2025,2024 | ServiceNow Practice Lead | Solution Architect

@Ravi Gaurav 

 

Yes, that is what I meant in my explanation: heartbeat monitoring is useful, but it is not responsible for load balancing or failover.

 

When a MID Server takes a job from ServiceNow and fails before responding with the payload, the in-flight job is not automatically transferred to another MID Server in the cluster; the job is updated with a status of error/failed. We need to re-run the job manually or set up an automatic retry mechanism for such scenarios. This is where MID Server heartbeat monitoring, heartbeat failure events, and automated retry logic are helpful.
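The automatic retry idea can be sketched as follows. This is a generic Python illustration under stated assumptions, not a ServiceNow API: `run_with_retry`, `fake_execute`, and the job name are hypothetical, and a real implementation would re-queue the failed job on the instance instead of calling a function directly.

```python
def run_with_retry(job, members, execute, max_attempts=3):
    """Try a job on successive cluster members until one succeeds.

    `execute(mid, job)` is a caller-supplied function that raises RuntimeError
    when the MID Server fails before returning a payload.
    """
    errors = []
    for attempt in range(max_attempts):
        mid = members[attempt % len(members)]  # rotate through the cluster
        try:
            return mid, execute(mid, job)
        except RuntimeError as exc:
            errors.append(f"{mid}: {exc}")
    raise RuntimeError("job failed on all attempts: " + "; ".join(errors))

def fake_execute(mid, job):
    # Hypothetical stand-in: MID-A dies mid-job, MID-B completes it.
    if mid == "MID-A":
        raise RuntimeError("MID went down before returning a payload")
    return f"{job} done"

result = run_with_retry("discovery-job", ["MID-A", "MID-B"], fake_execute)
print(result)
```

Capping the number of attempts keeps a job that fails for its own reasons (bad credentials, unreachable target) from looping between MID Servers indefinitely.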

 

Thanks,

Bhuvan

I have already set up the cluster; the notification is what I still need.

 

Let me try this. I will create a notification for the heartbeat and let you know. I have already set up the cluster.