MID server selection
yesterday
We're using ITOM Discovery to populate our CMDB with data from our cloud accounts. However, most schedules are set up to use any available MID server.
This becomes an issue when we need to do any type of work on a specific MID server, as it could be in use by an active schedule. Would it cause any issues (data corruption/inconsistency) in our CMDB to just shut down a MID server and restart any failed schedules afterwards?
What would be the best approach to decommission a MID server that is constantly in use?
yesterday - last edited yesterday
The concern regarding data integrity during MID Server maintenance is common, but from a technical standpoint, the risk is minimal. Since ServiceNow uses the IRE (Identification and Reconciliation Engine), the platform handles interrupted updates gracefully. If a MID Server is shut down during a scan, the CMDB simply retains its current state until the next successful execution.
However, to ensure high availability and avoid operational friction, the best practice is to move from a manual intervention model to a resilient architecture:
1. Implementing MID Server Clusters
Instead of relying on a single 'available' server, the most effective approach is to organize MID Servers into Load Balancing Clusters. This creates a self-healing layer: if one server needs to be decommissioned or patched, the cluster automatically redistributes the workload to the remaining nodes without failing active schedules.
2. The 'Graceful Shutdown' Process
Rather than an immediate shutdown of the service, a more controlled method involves using the 'Paused' status:
- By setting the MID Server to 'Paused', the platform is prevented from assigning new tasks to that node.
- This allows any active probes or patterns currently in the ECC Queue to complete their processing.
- Once the queue is clear, the server can be safely decommissioned without impacting the discovery cycle.
3. Strategic Governance (IP Ranges and Capabilities)
From an Enterprise Architecture perspective, governance is key. Assigning specific IP Ranges and Capabilities to MID Servers ensures that maintenance on a specific segment of the infrastructure is predictable and doesn't create blind spots in cloud discovery.
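The graceful shutdown steps in point 2 can be sketched as a small drain loop. This is only an illustration in Python, not a ServiceNow API: `drain_and_stop` and its three callbacks (`pause`, `pending_count`, `stop`) are hypothetical names, and how you actually pause the server, count its outstanding ECC Queue work, and stop the service depends on your environment.

```python
import time
from typing import Callable


def drain_and_stop(
    pause: Callable[[], None],
    pending_count: Callable[[], int],
    stop: Callable[[], None],
    poll_seconds: float = 30.0,
    timeout_seconds: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Pause a MID Server, wait for its queue to drain, then stop it.

    All three callbacks are placeholders supplied by the caller
    (hypothetical, not part of any ServiceNow library).
    Returns True if the server was stopped after draining, False if the
    timeout expired first (the server is then left paused, not stopped).
    """
    pause()  # step 1: no new work should be assigned from here on
    waited = 0.0
    while pending_count() > 0:  # step 2: let in-flight work finish
        if waited >= timeout_seconds:
            return False
        sleep(poll_seconds)
        waited += poll_seconds
    stop()  # step 3: queue is clear, safe to take the node down
    return True
```

The timeout is there so that a stuck job leaves the server paused for a human to look at, rather than blocking the maintenance script forever.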
yesterday
Hey,
My general recommendation when using Discovery is to have dedicated MID servers for it. These can be grouped together in a MID server cluster record. There are two main reasons for this recommendation:
Firstly, Discovery can fill up the ECC queue of a MID server considerably, especially when Windows discovery is running. This is expected behavior by design, as the ECC queue gets filled with all actions triggered by the waterfall approach of Discovery (starting with a port scan, then queueing all further actions at once).
Secondly, putting MID servers into a cluster record (link to the documentation) makes them act as if they share an ECC queue. If a server becomes unavailable (e.g. through decommissioning), another server in the cluster can take over.
This generally also works without using MID server clusters (as you do it at the moment). However, this has the downside that a "backup" MID server might also be used for other business-relevant processes (e.g. imports, exports or other automations). Using a cluster instead makes it explicit which MID servers are to be used.
Remember, from a business perspective, ServiceNow Discovery is (usually) not time-critical. If you take down all MID servers except one, it is totally acceptable if Discovery stops running. However, event management connectors (as an example) must run at all costs to ensure infrastructure issues are still worked on.
In short: Generally, shutting down or restarting a MID server while Discovery is running won't be too much of an issue. However, if you are not using explicit MID server assignments or clusters (or restrictions through the capabilities/applications setting), you may block the operation of another MID server, which will now be occupied running Discovery.
My recommendation: Use a MID server cluster and assign all discovery schedules to run with just that cluster (or, if you are running a global discovery, clusters per region sometimes make sense as well). This way, restarting a MID server will just halt Discovery (it will pick up where it left off), but no other MID server activities will be impacted.
Generally, at least in production, your business-critical automation should run on its own MID server cluster (with at least 2 MID servers), separate from all scheduled activities. These activities are:
- Scheduled bulk imports
- Discovery activities
So in general I personally recommend at least 4 MID servers (resulting in 2 clusters) whenever you are using Discovery. Depending on additional applications (e.g. event management should run on its own 2-server cluster) or network size, more MID servers may be reasonable.
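To make the sizing rule a bit more concrete, here is a rough Python sketch that checks a planned layout against it: every cluster has at least 2 MID servers, and the business-critical cluster shares no server with any other cluster. The function name, the cluster names and the server names are all made up for illustration, and nothing here talks to a ServiceNow instance.

```python
from typing import Dict, List, Set

MIN_CLUSTER_SIZE = 2  # "at least 2 MID servers" per cluster, as recommended above


def check_layout(clusters: Dict[str, Set[str]], critical: str) -> List[str]:
    """Return a list of human-readable problems; empty means the layout passes."""
    problems = []
    # Rule 1: every cluster needs enough members to survive one restart.
    for name, members in sorted(clusters.items()):
        if len(members) < MIN_CLUSTER_SIZE:
            problems.append(f"{name}: only {len(members)} MID server(s)")
    if critical not in clusters:
        problems.append(f"{critical}: cluster not defined")
        return problems
    # Rule 2: the critical cluster must not share servers with other clusters.
    for name, members in sorted(clusters.items()):
        if name == critical:
            continue
        shared = clusters[critical] & members
        if shared:
            problems.append(
                f"{name} shares {', '.join(sorted(shared))} with {critical}"
            )
    return problems
```

For example, `check_layout({"automation": {"mid-a1", "mid-a2"}, "discovery": {"mid-d1", "mid-d2"}}, "automation")` returns an empty list, which matches the 4-server, 2-cluster minimum.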
Hope this helps.
Regards
Fabian
PS: Generally, a crash during Discovery should not affect Discovery negatively (apart from data not being up to date).
yesterday
Hey,
Since the MID server is the backbone of Discovery, creating MID server clusters and distributing the MID servers across them would be the best option.
Example:
Cluster A - MidServer1, MidServer2, MidServer3
Cluster B - MidServer4, MidServer5, MidServer6
Cluster C - MidServer1, MidServer2, MidServer4
Cluster D - MidServer4, MidServer5, MidServer1
... and so on
You can choose the combinations according to your business requirements.
Benefits of having MID server clusters:
1. A MID server cluster also takes care of load balancing: jobs are distributed across its members, so performance is not impacted.
2. Maintenance on any single MID server will not impact the discovery schedules, as the other MID servers in the cluster can take over the load.
3. Having multiple cluster combinations is recommended, so that the same set of servers is not heavily utilised and different combinations can be assigned to different schedules.
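As a quick way to reason about point 2, the following Python sketch takes the example clusters above and shows which members each cluster still has when some servers are down for maintenance. It is plain set arithmetic on the example names, not a ServiceNow API; the function names are just for illustration.

```python
from typing import Dict, List, Set

# Cluster membership copied from the example above.
CLUSTERS: Dict[str, Set[str]] = {
    "Cluster A": {"MidServer1", "MidServer2", "MidServer3"},
    "Cluster B": {"MidServer4", "MidServer5", "MidServer6"},
    "Cluster C": {"MidServer1", "MidServer2", "MidServer4"},
    "Cluster D": {"MidServer4", "MidServer5", "MidServer1"},
}


def remaining_after_maintenance(
    clusters: Dict[str, Set[str]], down: Set[str]
) -> Dict[str, List[str]]:
    """For each cluster, list the members that are still up."""
    return {name: sorted(members - down) for name, members in clusters.items()}


def clusters_left_empty(clusters: Dict[str, Set[str]], down: Set[str]) -> List[str]:
    """Names of clusters that would have no server left to run schedules."""
    remaining = remaining_after_maintenance(clusters, down)
    return sorted(name for name, members in remaining.items() if not members)
```

Taking only MidServer1 down leaves every cluster with at least two members, while taking MidServer4, MidServer5 and MidServer6 down together would leave Cluster B empty.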
Thank You
