MID Server Upgrade During Patching: Why Your Change Window Needs to Be 5+ Hours
A Real Production Incident: How Manual Restarts Caused 2+ Hours of Yo-Yo Behavior
Author: Selva Arun (Meena Arun) | ServiceNow Rising Star 2024 & 2025
Published: December 2025
Environment: Healthcare Production | 6 MID Servers | Yokohama Patch 7 HF2b
Tags: MID Server | Upgrade | Patching | ITOM | Best Practices | Change Management | NOC | Troubleshooting
📚 Continuation of My Previous Article
This article is a continuation of my previous community post: "MID Server Pre-Upgrade Readiness Checklist for Any Upgrades" (October 2025), which focused on pre-upgrade validation.
While that article helps you prepare BEFORE the upgrade, this article focuses on what happens DURING and AFTER the upgrade process, based on a real production incident we experienced in December 2025.
Together, these two articles provide a complete guide to MID Server upgrade management.
Why I Created This Article
In December 2025, our organization experienced a production incident during a ServiceNow patching event. What should have been a routine 5-10 minute MID Server upgrade turned into a 2+ hour crisis with our MID Servers going up and down multiple times (yo-yo behavior).
After extensive investigation using Event Viewer logs, wrapper.log analysis, ServiceNow heartbeat data, and team discussions, we identified the root cause and implemented solutions. I'm sharing our findings in the hope that they will help others in the ServiceNow community avoid the same pain.
🚨 The Real Production Incident: December 11, 2025
What Happened
| Time | Event |
|---|---|
| 17:08 | Instance patched to Yokohama Patch 7 HF2b |
| 18:45 | Change window started (CHG0106263) |
| 20:36 | Change window ended ⚠️ (too early!) |
| 20:38 | MID Server started upgrade (OUTSIDE change window!) |
| 20:39 | Alerts sent to NOC (incidents created) |
| 21:27 | NOC manually restarted MID Server (1st restart) |
| 22:16 | NOC manually restarted (2nd restart) |
| 22:25 | NOC manually restarted (3rd restart) |
| 22:34 | NOC manually restarted (4th restart) |
| 22:42 | MID Server started upgrade again (left alone this time) |
| 22:48 | ✅ Upgrade completed successfully (5 minutes!) |
The Numbers Tell the Story
- Total restart cycles: 5
- Total disruption time: 2+ hours
- Actual upgrade time (when left alone): 5 minutes
- Manual stops by users: 0 (all service stops were system-initiated; the four restarts were NOC actions)
- System reboots: 0
Root Cause Analysis
Finding #1: The Upgrade Process Works Correctly
When left alone, the final upgrade attempt (22:42) completed in exactly 5 minutes 19 seconds with zero errors. The wrapper.log showed: "Upgrade process completed successfully"
Finding #2: Change Window Was Too Short
| Issue | What Happened |
|---|---|
| Change window duration | 1 hour 51 minutes (18:45 - 20:36) |
| MID Server upgrade started | 20:38 (2 minutes AFTER change closed!) |
| Result | Alerts sent to NOC (outside maintenance window) |
Finding #3: NOC Manual Restarts Caused Yo-Yo Behavior
Each time NOC manually restarted the MID Server service, it interrupted the natural upgrade process:
- NOC receives "MID Server Down" alert
- NOC restarts service per standard procedure
- Service starts, upgrade process begins again
- Upgrade stops service to deploy files
- ServiceNow marks "Down" after 100 seconds (heartbeat timeout)
- NOC receives another alert
- Cycle repeats...
Finding #4: Heartbeat Timeout vs Upgrade Duration Mismatch
| Setting | Value | Impact |
|---|---|---|
| Heartbeat Interval | 40 seconds | MID sends "I'm alive" every 40 sec |
| Heartbeat Timeout | 100 seconds | Marked "Down" after 100 sec silence |
| Upgrade Duration | 300+ seconds (5 min) | Always exceeds timeout! |
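The arithmetic makes the alerts unavoidable, as this minimal sketch (using the values from the table above) shows:

```powershell
# Heartbeat math: why a "Down" alert during upgrade is guaranteed.
# Values come from the settings table above.
$heartbeatTimeoutSec = 100   # platform marks the MID "Down" after 100 sec of silence
$upgradeDurationSec  = 300   # typical 5-minute upgrade; the service is fully stopped

# No heartbeats are sent while the service is stopped, so the platform
# flags "Down" 100 seconds in, long before the upgrade can finish.
$downWindowSec = $upgradeDurationSec - $heartbeatTimeoutSec
"MID is marked Down $heartbeatTimeoutSec sec into the upgrade and stays Down ~$downWindowSec sec more."
# Takeaway: a Down alert during the change window is EXPECTED, not a failure.
```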
What We Discussed and Decided
After our investigation, we held a meeting with our ServiceNow Team, DevOps, and NOC to discuss findings and make decisions:
Decision 1: Change Window Must Be Minimum 5 Hours
- Instance Patch: Hour 0
- MID Detection: Hour 1-3 (staggered across servers)
- MID Download: Hour 1-4 (upgrade packages from install.service-now.com)
- MID Upgrade: Hour 3-5 (5-10 min per server)
- All Complete: Hour 5
Decision 2: No Manual Intervention During Upgrades
Manual restarts during the change window cause yo-yo behavior and extend downtime from 10 minutes to 2+ hours.
Decision 3: ServiceNow Team Validates (Not NOC)
Since we are the application owners, the ServiceNow Team is responsible for validating MID Server upgrade success - not NOC. NOC's role is monitoring only.
Decision 4: Post-Implementation Validation Process
At the end of the change window, the ServiceNow Team checks each MID Server for:
- Status: Up
- Validated: Yes
- Version: Matches new patch version
What Happens During a MID Server Upgrade
Understanding the process helps explain why manual intervention causes problems:
- Instance Patched → Instance upgraded to new version
- MID Detection → MID Servers detect upgrade needed (hourly AutoUpgrade.3600 check)
- Download → MID Servers download upgrade packages from install.service-now.com
- Pre-Upgrade Check → Validates prerequisites (permissions, disk space, PowerShell)
- Service Stop → MID Server stops Windows service (5-10 min downtime begins)
- File Deployment → ServiceNow Platform Distribution Upgrade replaces files
- Auto-Restart → Service restarts automatically via start.bat
- Validation → MID Server sends heartbeat, status changes to "Up"
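If you want to watch these stages happen live without touching the service, a minimal read-only sketch is to tail wrapper.log (the path matches our install location described later; the server name is a placeholder):

```powershell
# Read-only: stream new wrapper.log lines as the upgrade progresses.
# Adjust the path to your install location and server name.
$log = 'D:\ServiceNow MID Server MIDSERVER01\agent\logs\wrapper.log'
Get-Content -Path $log -Tail 20 -Wait
```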
Critical Rules During Change Window
❌ DO NOT During Change Window:
- Restart MID Server services manually
- Respond to MID Server "Down" alerts by restarting services
- Run troubleshooting scripts on MID Servers
✅ DO During Change Window:
- Acknowledge alerts (but take no action)
- Wait for automatic recovery
- Monitor change ticket for updates
Post-Implementation Validation Steps
Step 1: Navigate to MID Servers
All > MID Server > Servers
Step 2: For each MID Server, verify:
| Field | Expected Value |
|---|---|
| Status | Up |
| Validated | Yes ✅ |
| Version | Matches new patch version |
Step 3: Compare version to expected version in change ticket
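If you'd rather script Steps 2-3 than eyeball six records, here is a minimal sketch against the REST Table API and the ecc_agent table. It assumes the Table API is enabled and you have a read-only account; the instance name and expected version are placeholders to take from your change ticket:

```powershell
# Sketch: verify Status / Validated / Version for every MID Server in one pass.
$instance = 'yourinstance'                      # placeholder -- your instance name
$cred     = Get-Credential                      # read-only account with ecc_agent access
$expected = 'REPLACE_WITH_VERSION_FROM_TICKET'  # exact build version from the change ticket

$uri  = "https://$instance.service-now.com/api/now/table/ecc_agent?sysparm_fields=name,status,validated,version"
$resp = Invoke-RestMethod -Uri $uri -Credential $cred -Headers @{ Accept = 'application/json' }

foreach ($mid in $resp.result) {
    $ok = ($mid.status -eq 'Up') -and ($mid.validated -eq 'true') -and ($mid.version -eq $expected)
    '{0}: Status={1}, Validated={2}, Version={3} -> {4}' -f $mid.name, $mid.status,
        $mid.validated, $mid.version, $(if ($ok) { 'PASS' } else { 'CHECK MANUALLY' })
}
```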
Decision Tree for Troubleshooting
All MID Servers show Status=Up, Validated=Yes, Version=Expected?
│
├── YES → Change successful. Close change ticket.
│
└── NO → Which issue?
│
├── Status = Down?
│ → Wait 20 more minutes
│ → Check wrapper.log - did upgrade complete successfully?
│ → If upgrade completed, restart MID Server service (from UI or on server)
│ → If upgrade failed, troubleshoot errors in wrapper.log
│
├── Validated = No?
│ → Click "Validate" button, wait 5 minutes
│ → If still No, check MID Server issues
│
└── Version = Wrong?
→ Check wrapper.log for upgrade errors
→ Restart MID Server service to trigger upgrade retry
Troubleshooting MID Server Upgrade Issues
If your MID Server upgrade fails, here are the steps to diagnose and fix:
Step 1: Check Upgrade History in ServiceNow
All > MID Server > Upgrade History
Look for failed stages: Pre Upgrade Check, Download, Extract, Deploy Binary Files
Step 2: Check wrapper.log on the MID Server. This is where it is located on our servers:
D:\ServiceNow MID Server <server_name>\agent\logs\wrapper.log
Key phrases to look for:
| Phrase | Meaning |
|---|---|
| "Checking to see if MID server needs to upgrade" | Upgrade check started |
| "Setting mid status to Upgrading" | Upgrade beginning |
| "Pre-upgrade validation tests successful" | Pre-checks passed |
| "Upgrading MID server" | File extraction starting |
| "Stopping MID server. Bootstrapping upgrade." | Service stopping for file deployment |
| "Upgrade complete" | Files deployed successfully |
| "Upgrade process completed successfully" | Full success ✅ |
Step 3: Check Windows Event Viewer
Event Viewer > Windows Logs > System
Filter: Event ID 7036 (Service state changes)
PowerShell command to check service restarts:
# List MID Server service state changes (Event ID 7036) from the past 24 hours
Get-EventLog -LogName System -After (Get-Date).AddDays(-1) |
Where-Object { $_.EventID -eq 7036 -and $_.Message -like "*ServiceNow MID*" } |
Sort-Object TimeGenerated |
Format-Table TimeGenerated, Message -AutoSize
Step 4: Common Issues and Fixes
| Issue | Cause | Fix |
|---|---|---|
| MID stuck in "Upgrading" status | Upgrade process hung | Wait 30 min, then restart service once |
| Version not updated | Upgrade failed silently | Check wrapper.log for errors, restart to retry |
| "Access denied" errors | Service account permissions (PRB1547917) | Grant FullControl to service account on MID folder |
| Multiple restart cycles (yo-yo) | Manual intervention during upgrade | Stop intervening! Let upgrade complete naturally |
| "null (vnull)" capabilities | PowerShell execution policy restricted | Set execution policy to RemoteSigned |
| Pre-upgrade check failed | Various (permissions, disk, PowerShell) | Run my Pre-Upgrade Validation Script (previous article) |
| File lock errors | Antivirus or Application Experience | Whitelist MID folder, enable Application Experience |
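Two of the fixes above are one-liners in an elevated PowerShell session on the MID host. A hedged sketch; the folder path and service account are placeholders for your environment:

```powershell
# Fix for "null (vnull)" capabilities: relax the execution policy.
Get-ExecutionPolicy -List                      # inspect current policy per scope
Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope LocalMachine -Force

# Fix for "Access denied" (PRB1547917): grant the MID service account
# FullControl on the MID Server folder, recursively.
$midFolder  = 'D:\ServiceNow MID Server MIDSERVER01'   # placeholder path
$svcAccount = 'DOMAIN\svc_midserver'                   # placeholder account
icacls $midFolder /grant "${svcAccount}:(OI)(CI)F" /T
```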
Step 5: If All Else Fails - Manual Restart
After confirming upgrade completed in wrapper.log, restart the service:
From ServiceNow UI:
MID Server record > Related Links > Restart MID
From Windows Server:
services.msc > ServiceNow MID Server_[name] > Right-click > Restart
From Command Line:
net stop "ServiceNow MID Server_MIDSERVERNAME"
net start "ServiceNow MID Server_MIDSERVERNAME"
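Equivalently, a one-line PowerShell sketch using the service display-name pattern shown above:

```powershell
# Restart every local MID Server service -- same effect as the net stop/start pair.
Get-Service -DisplayName 'ServiceNow MID Server_*' | Restart-Service -Verbose
```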
Change Ticket Requirements
For future ServiceNow patches, ensure your change ticket includes:
1. Change Window Duration
Minimum: 5 hours
Recommended: 6 hours (with buffer)
2. Affected CIs - Include All MID Servers
MIDSERVER01
MIDSERVER02
... (all production MID servers)
3. Instructions for Operations Team
IMPORTANT: MID Server Upgrade Instructions

During this change window:
- MID Server services will stop and restart AUTOMATICALLY
- DO NOT manually restart MID Server services
- DO NOT respond to MID Server "Down" alerts
- Upgrade takes 5-10 minutes per server - this is NORMAL

Post-implementation (at end of change window):
- ServiceNow Team validates MID Server Status/Validated/Version
Summary: Key Takeaways
Key Information
| Item | Value |
|---|---|
| Change Window | Minimum 5 hours |
| Manual Intervention | NOT required - upgrade is automatic |
| Expected Downtime | 5-10 minutes per MID Server |
| Heartbeat Timeout | 100 seconds (alerts expected during upgrade) |
| Root Cause of Yo-Yo | Manual restarts interrupt upgrade process |
| Solution | Maintenance windows + no manual intervention |
Related ServiceNow Documentation
- KB0696937: MID Server upgrade process - What actually happens
- KB0596459: Troubleshoot MID Server upgrade issues
- KB0713557: How to manually restore or upgrade a MID Server after failed auto-upgrade
- KB0779816: How to continue a MID Server upgrade after it has crashed
- KB1001745: MID Server fails to restart after upgrade (PRB1547917)
📝 This Is What We Learned
This article represents our perspective based on a real production incident and investigation. Every environment is different, and your experience may vary.
I'd love to hear from you:
- How do YOU handle MID Server upgrades during patching?
- Have you experienced similar yo-yo behavior?
- What's your change window duration?
- If anything has changed in your environment, please share!
Please comment below with your experience! Let's learn from each other.
💡 Did This Help You?
If you found this article helpful, please mark it as 'Helpful'. This helps other community members who might have the same question find the answer more easily.
Thank you for your consideration!
Connect with me:
🎥 NowDivas YouTube Channel: https://www.youtube.com/@TheNowDivas
🏆 ServiceNow Rising Star 2024 & 2025
💬 Feel free to ask questions in the comments below!
Tags: MID Server | Upgrade | Patching | ITOM | Best Practices | Change Management | NOC | Troubleshooting | KB0696937 | KB0596459 | Heartbeat | Maintenance Window | Yo-Yo Behavior | Production Incident | Healthcare