MID Server Upgrade During Patching: Why Your Change Window Needs to Be 5+ Hours

Selva Arun · 12-16-2025

A Real Production Incident: How Manual Restarts Caused 2+ Hours of Yo-Yo Behavior

Author: Selva Arun (Meena Arun) | ServiceNow Rising Star 2024 & 2025

Published: December 2025

Environment: Healthcare Production | 6 MID Servers | Yokohama Patch 7 HF2b

📚 Continuation of My Previous Article

This article is a continuation of my previous community post: "MID Server Pre-Upgrade Readiness Checklist for Any Upgrades" (October 2025), which focused on pre-upgrade validation.

While that article helps you prepare BEFORE the upgrade, this article focuses on what happens DURING and AFTER the upgrade process, based on a real production incident we experienced in December 2025.

Together, these two articles provide a complete guide to MID Server upgrade management.

Why I Created This Article

In December 2025, our organization experienced a production incident during a ServiceNow patching event. What should have been a routine 5-10 minute MID Server upgrade turned into a 2+ hour crisis with our MID Servers going up and down multiple times (yo-yo behavior).

After extensive investigation using Event Viewer logs, wrapper.log analysis, ServiceNow heartbeat data, and team discussions, we identified the root cause and implemented solutions. I'm sharing our findings in the hope that they help someone in the ServiceNow community avoid the same pain.

🚨 The Real Production Incident: December 11, 2025

What Happened

Time	Event
17:08	Instance patched to Yokohama Patch 7 HF2b
18:45	Change window started (CHG0106263)
20:36	Change window ended ⚠️ (too early!)
20:38	MID Server started upgrade (OUTSIDE change window!)
20:39	Alerts sent to NOC (incidents created)
21:27	NOC manually restarted MID Server (1st restart)
22:16	NOC manually restarted (2nd restart)
22:25	NOC manually restarted (3rd restart)
22:34	NOC manually restarted (4th restart)
22:42	MID Server started upgrade again (left alone this time)
22:48	✅ Upgrade completed successfully (about 5 minutes!)

The Numbers Tell the Story

Total restart cycles: 5
Total disruption time: 2+ hours
Actual upgrade time (when left alone): 5 minutes
Manual stops by users: 0 (all system-initiated)
System reboots: 0

Root Cause Analysis

Finding #1: The Upgrade Process Works Correctly

When left alone, the final upgrade attempt (started at 22:42) completed in 5 minutes 19 seconds with zero errors. The wrapper.log showed: "Upgrade process completed successfully"

Finding #2: Change Window Was Too Short

Issue	What Happened
Change window duration	1 hour 51 minutes (18:45 - 20:36)
MID Server upgrade started	20:38 (2 minutes AFTER change closed!)
Result	Alerts sent to NOC (outside maintenance window)
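The arithmetic behind this finding can be checked with a few lines of PowerShell. This is only a sketch: the times are copied from the incident timeline, and the variable names are illustrative.

```powershell
# Times taken from the incident timeline (December 11, 2025).
$windowStart  = [datetime]'2025-12-11T18:45:00'
$windowEnd    = [datetime]'2025-12-11T20:36:00'
$upgradeStart = [datetime]'2025-12-11T20:38:00'

# Subtracting two DateTime values yields a TimeSpan.
$windowLength = $windowEnd - $windowStart
$overrun      = $upgradeStart - $windowEnd

# True only when the upgrade began inside the approved window.
$insideWindow = ($upgradeStart -ge $windowStart) -and ($upgradeStart -le $windowEnd)

'Window length : {0}h {1}m' -f $windowLength.Hours, $windowLength.Minutes
'Upgrade began : {0} minutes after the window closed' -f $overrun.TotalMinutes
'Inside window : {0}' -f $insideWindow
```

The same comparison could be run before a patch night by plugging in the planned window and a pessimistic estimate of when the last MID Server will start upgrading.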

Finding #3: NOC Manual Restarts Caused Yo-Yo Behavior

Each time NOC manually restarted the MID Server service, it interrupted the natural upgrade process:

NOC receives "MID Server Down" alert
NOC restarts service per standard procedure
Service starts, upgrade process begins again
Upgrade stops service to deploy files
ServiceNow marks "Down" after 100 seconds (heartbeat timeout)
NOC receives another alert
Cycle repeats...

Finding #4: Heartbeat Timeout vs Upgrade Duration Mismatch

Setting	Value	Impact
Heartbeat Interval	40 seconds	MID sends "I'm alive" every 40 sec
Heartbeat Timeout	100 seconds	Marked "Down" after 100 sec silence
Upgrade Duration	300+ seconds (5 min)	Always exceeds timeout!

⚠️ Key Insight: Since 100 seconds < 300 seconds, alerts will ALWAYS be triggered during a normal upgrade. This is expected behavior - which is why maintenance windows are critical!
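As a rough illustration of the mismatch, the values from the table can be compared directly. The sketch assumes the MID Server is silent for the entire upgrade, which is a simplification; the variable names are illustrative.

```powershell
# Values from the table above.
$heartbeatIntervalSec = 40
$heartbeatTimeoutSec  = 100
$upgradeDurationSec   = 300   # a typical 5-minute upgrade

# An alert fires whenever the silent period is longer than the timeout.
$alertExpected = $upgradeDurationSec -gt $heartbeatTimeoutSec

# Approximate number of heartbeats that never arrive during the upgrade.
$missedHeartbeats = [math]::Floor($upgradeDurationSec / $heartbeatIntervalSec)

# Approximate time the record shows "Down" before the upgrade finishes.
$secondsMarkedDown = [math]::Max(0, $upgradeDurationSec - $heartbeatTimeoutSec)

'Alert expected     : {0}' -f $alertExpected
'Missed heartbeats  : about {0}' -f $missedHeartbeats
'Time shown as Down : about {0} seconds' -f $secondsMarkedDown
```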

What We Discussed and Decided

After our investigation, we held a meeting with our ServiceNow Team, DevOps, and NOC to discuss findings and make decisions:

Decision 1: Change Window Must Be Minimum 5 Hours

Instance Patch:     Hour 0
MID Detection:      Hour 1-3 (staggered across servers)
MID Download:       Hour 1-4 (upgrade packages from install.service-now.com)
MID Upgrade:        Hour 3-5 (5-10 min per server)
All Complete:       Hour 5

Decision 2: No Manual Intervention During Upgrades

Manual restarts during the change window cause yo-yo behavior and extend downtime from 10 minutes to 2+ hours.

Decision 3: ServiceNow Team Validates (Not NOC)

Since we are the application owners, the ServiceNow Team is responsible for validating MID Server upgrade success - not NOC. NOC's role is monitoring only.

Decision 4: Post-Implementation Validation Process

At the end of the change window, the ServiceNow Team checks each MID Server for:

Status: Up
Validated: Yes
Version: Matches new patch version

What Happens During a MID Server Upgrade

Understanding the process helps explain why manual intervention causes problems:

Instance Patched → Instance upgraded to new version
MID Detection → MID Servers detect upgrade needed (hourly AutoUpgrade.3600 check)
Download → MID Servers download upgrade packages from install.service-now.com
Pre-Upgrade Check → Validates prerequisites (permissions, disk space, PowerShell)
Service Stop → MID Server stops Windows service (5-10 min downtime begins)
File Deployment → ServiceNow Platform Distribution Upgrade replaces files
Auto-Restart → Service restarts automatically via start.bat
Validation → MID Server sends heartbeat, status changes to "Up"

✅ Key Point: When left alone, this entire process completes automatically in 5-10 minutes per MID Server!

Critical Rules During Change Window

❌ DO NOT During Change Window:

Restart MID Server services manually
Respond to MID Server "Down" alerts by restarting services
Run troubleshooting scripts on MID Servers

✅ DO During Change Window:

Acknowledge alerts (but take no action)
Wait for automatic recovery
Monitor change ticket for updates

Post-Implementation Validation Steps

Step 1: Navigate to MID Servers

All > MID Server > Servers

Step 2: For each MID Server, verify:

Field	Expected Value
Status	Up
Validated	Yes ✅
Version	Matches new patch version

Step 3: Compare version to expected version in change ticket
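For teams with many MID Servers, the same three checks could be scripted against the ServiceNow Table API. The sketch below is not something we ran during the incident; it assumes the MID Server records live in the ecc_agent table with name, status, validated, and version fields, that the API returns validated as the string "true", and that basic authentication is allowed. The function names, instance URL, and expected version are hypothetical placeholders.

```powershell
# Sketch only: table/field names and the "true" value for validated are assumptions.
function Get-MidServerRecord {
    param(
        [Parameter(Mandatory)] [string] $InstanceUrl,
        [Parameter(Mandatory)] [pscredential] $Credential
    )
    # Ask the Table API for just the fields needed for validation.
    $uri = '{0}/api/now/table/ecc_agent?sysparm_fields=name,status,validated,version' -f $InstanceUrl.TrimEnd('/')
    $response = Invoke-RestMethod -Uri $uri -Credential $Credential -Headers @{ Accept = 'application/json' }
    $response.result
}

function Test-MidServerRecord {
    param(
        [Parameter(Mandatory)] $Record,
        [Parameter(Mandatory)] [string] $ExpectedVersion
    )
    # One row per MID Server with a pass/fail flag for each check.
    [pscustomobject]@{
        Name        = $Record.name
        StatusOk    = $Record.status -eq 'Up'
        ValidatedOk = $Record.validated -eq 'true'
        VersionOk   = $Record.version -eq $ExpectedVersion
    }
}

# Example usage (placeholders must be replaced):
# $records = Get-MidServerRecord -InstanceUrl 'https://<instance>.service-now.com' -Credential (Get-Credential)
# $records | ForEach-Object { Test-MidServerRecord -Record $_ -ExpectedVersion '<version from change ticket>' } | Format-Table
```

Any row with a False value is a candidate for the troubleshooting steps; the UI check remains the reference.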

Decision Tree for Troubleshooting

All MID Servers show Status=Up, Validated=Yes, Version=Expected?
│
├── YES → Change successful. Close change ticket.
│
└── NO → Which issue?
          │
          ├── Status = Down?
          │   → Wait 20 more minutes
          │   → Check wrapper.log - did upgrade complete successfully?
          │   → If upgrade completed, restart MID Server service (from UI or on server)
          │   → If upgrade failed, troubleshoot errors in wrapper.log
          │
          ├── Validated = No?
          │   → Click "Validate" button, wait 5 minutes
          │   → If still No, check MID Server issues
          │
          └── Version = Wrong?
              → Check wrapper.log for upgrade errors
              → Restart MID Server service to trigger upgrade retry
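The decision tree can also be written down as a small function, which is handy for a runbook. This is a minimal sketch: the function name and parameters are hypothetical, and treating every status other than "Up" like "Down" is an assumption that goes slightly beyond the tree above.

```powershell
function Get-MidNextStep {
    param(
        [Parameter(Mandatory)] [string] $Status,
        [Parameter(Mandatory)] [bool]   $Validated,
        [Parameter(Mandatory)] [string] $Version,
        [Parameter(Mandatory)] [string] $ExpectedVersion
    )
    # Branch 1: not Up (assumption: any non-Up status is handled like Down).
    if ($Status -ne 'Up') {
        return 'Wait 20 more minutes, then check wrapper.log before any restart'
    }
    # Branch 2: Up but not validated.
    if (-not $Validated) {
        return 'Click Validate and wait 5 minutes'
    }
    # Branch 3: Up and validated, but on the wrong version.
    if ($Version -ne $ExpectedVersion) {
        return 'Check wrapper.log for upgrade errors, then restart the service to retry'
    }
    # All three checks passed.
    'Change successful: close the change ticket'
}

# Example: a server that is up and validated but still on an old version.
Get-MidNextStep -Status 'Up' -Validated $true -Version 'old' -ExpectedVersion 'new'
```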

Troubleshooting MID Server Upgrade Issues

If your MID Server upgrade fails, here are the steps to diagnose and fix:

Step 1: Check Upgrade History in ServiceNow

All > MID Server > Upgrade History

Look for failed stages: Pre Upgrade Check, Download, Extract, Deploy Binary Files

Step 2: Check wrapper.log on the MID Server (the path below is where it is located on our servers)

D:\ServiceNow MID Server <server_name>\agent\logs\wrapper.log

Key phrases to look for:

Phrase	Meaning
"Checking to see if MID server needs to upgrade"	Upgrade check started
"Setting mid status to Upgrading"	Upgrade beginning
"Pre-upgrade validation tests successful"	Pre-checks passed
"Upgrading MID server"	File extraction starting
"Stopping MID server. Bootstrapping upgrade."	Service stopping for file deployment
"Upgrade complete"	Files deployed successfully
"Upgrade process completed successfully"	Full success ✅
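Rather than scrolling through the log by hand, the phrases in the table can be searched with Select-String. The sketch below returns the last line in the log that contains any of them, which gives a quick idea of how far the most recent attempt got. The function name is hypothetical, and the log path must be adjusted to your own install location.

```powershell
function Get-MidLastUpgradeLine {
    param([Parameter(Mandatory)] [string] $LogPath)

    # Phrases copied from the table above.
    $phrases = @(
        'Checking to see if MID server needs to upgrade',
        'Setting mid status to Upgrading',
        'Pre-upgrade validation tests successful',
        'Upgrading MID server',
        'Stopping MID server. Bootstrapping upgrade.',
        'Upgrade complete',
        'Upgrade process completed successfully'
    )

    # -SimpleMatch treats each phrase as literal text rather than a regex.
    # wrapper.log can hold several attempts, so only the final match is kept.
    Select-String -LiteralPath $LogPath -Pattern $phrases -SimpleMatch |
        Select-Object -Last 1 -Property LineNumber, Line
}

# Example usage (adjust the path to your own MID Server folder):
# Get-MidLastUpgradeLine -LogPath 'D:\ServiceNow MID Server <server_name>\agent\logs\wrapper.log'
```

If the returned line is "Stopping MID server. Bootstrapping upgrade." the upgrade is most likely still deploying files, and the service should be left alone.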

Step 3: Check Windows Event Viewer

Event Viewer > Windows Logs > System
Filter: Event ID 7036 (Service state changes)

PowerShell command to check service restarts:

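# Requires Windows PowerShell 5.1 (Get-EventLog is not available in PowerShell 7+)
# Lists service state changes (Event ID 7036) for the MID Server service over the last 24 hours
Get-EventLog -LogName System -After (Get-Date).AddDays(-1) |
  Where-Object { $_.EventID -eq 7036 -and $_.Message -like "*ServiceNow MID*" } |
  Sort-Object TimeGenerated |
  Format-Table TimeGenerated, Message -AutoSize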

Step 4: Common Issues and Fixes

Issue	Cause	Fix
MID stuck in "Upgrading" status	Upgrade process hung	Wait 30 min, then restart service once
Version not updated	Upgrade failed silently	Check wrapper.log for errors, restart to retry
"Access denied" errors	Service account permissions (PRB1547917)	Grant FullControl to service account on MID folder
Multiple restart cycles (yo-yo)	Manual intervention during upgrade	Stop intervening! Let upgrade complete naturally
"null (vnull)" capabilities	PowerShell execution policy restricted	Set execution policy to RemoteSigned
Pre-upgrade check failed	Various (permissions, disk, PowerShell)	Run my Pre-Upgrade Validation Script (previous article)
File lock errors	Antivirus or Application Experience	Whitelist MID folder, enable Application Experience

Step 5: If All Else Fails - Manual Restart

After confirming upgrade completed in wrapper.log, restart the service:

From ServiceNow UI:

MID Server record > Related Links > Restart MID

From Windows Server:

services.msc > ServiceNow MID Server_[name] > Right-click > Restart

From Command Line:

net stop "ServiceNow MID Server_MIDSERVERNAME"
net start "ServiceNow MID Server_MIDSERVERNAME"

Change Ticket Requirements

For future ServiceNow patches, ensure your change ticket includes:

1. Change Window Duration

Minimum: 5 hours
Recommended: 6 hours (with buffer)

2. Affected CIs - Include All MID Servers

MIDSERVER01
MIDSERVER02
... (all production MID servers)

3. Instructions for Operations Team

IMPORTANT: MID Server Upgrade Instructions

During this change window:
- MID Server services will stop and restart AUTOMATICALLY
- DO NOT manually restart MID Server services
- DO NOT respond to MID Server "Down" alerts by restarting services (acknowledge them, but take no action)
- Upgrade takes 5-10 minutes per server - this is NORMAL

Post-implementation (at end of change window):
- ServiceNow Team validates MID Server Status/Validated/Version

Summary: Key Takeaways

Key Information

Item	Value
Change Window	Minimum 5 hours
Manual Intervention	NOT required - upgrade is automatic
Expected Downtime	5-10 minutes per MID Server
Heartbeat Timeout	100 seconds (alerts expected during upgrade)
Root Cause of Yo-Yo	Manual restarts interrupt upgrade process
Solution	Maintenance windows + no manual intervention
