Upgrading and Cloning Instances in Parallel

Tim Grindlay
Kilo Sage

Hi All, 

We've been working with ServiceNow since Geneva back in 2016 and have run 90% of our upgrades and patches without the support of a service partner. Over the years we've worked on cutting down the time it takes to patch and upgrade, and we generally work to a 2-3 week timeframe from upgrading development to upgrading production.

We have three instance environments (DEV, UAT and Production), and the DEV/UAT portion of our upgrade plan looks something like this:

Day 1

  • (Morning / Afternoon) Pre-clone activities, comms etc.
  • (5pm) Lock out testers and developers. (Upgrade administrator access only)
  • (5pm) Back up all update sets in development.

Day 2

  • (3am) Scheduled clone for both DEV and UAT.
  • (5am) Scheduled upgrade for both DEV and UAT.
  • (8am) Post-clone / upgrade smoke tests.
  • (9am) Process skipped updates.

Once the skipped updates have been processed and we're happy the instances are in a good state, we'll open the environments to testers and developers.
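
For illustration, the DEV/UAT timeline above can be sketched as a small schedule to see how much time each step has before the next one starts. This is only a rough sketch: the calendar date, the `PLAN` structure, and the helper names `windows` and `lockout_hours` are hypothetical, and only the times of day come from the plan itself.

```python
from datetime import datetime, timedelta

# Hypothetical calendar date; only the times of day come from the plan above.
DAY_1 = datetime(2024, 1, 8)
DAY_2 = DAY_1 + timedelta(days=1)

# (start time, activity) pairs, in the order they are scheduled.
PLAN = [
    (DAY_1.replace(hour=17), "Lock out testers and developers"),
    (DAY_1.replace(hour=17), "Back up all update sets in DEV"),
    (DAY_2.replace(hour=3), "Scheduled clone for DEV and UAT"),
    (DAY_2.replace(hour=5), "Scheduled upgrade for DEV and UAT"),
    (DAY_2.replace(hour=8), "Post-clone / upgrade smoke tests"),
    (DAY_2.replace(hour=9), "Process skipped updates"),
]


def windows(plan):
    """Return (activity, time available before the next step starts) pairs."""
    return [
        (name, next_start - start)
        for (start, name), (next_start, _) in zip(plan, plan[1:])
    ]


def lockout_hours(plan):
    """Hours from the first step (lockout) to the start of the last step."""
    return (plan[-1][0] - plan[0][0]).total_seconds() / 3600


if __name__ == "__main__":
    for name, gap in windows(PLAN):
        print(f"{name}: {gap} until the next step")
    print(f"Lockout to skipped-update processing: {lockout_hours(PLAN):.0f} hours")
```

Laid out this way, the clone has a 2-hour window before the upgrade starts and the upgrade has 3 hours before smoke tests, so the same-day plan only works if both reliably finish inside those windows.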

We're shifting this activity to our Service Partner so our core team can focus on other work. However, they're of the opinion that the clones and upgrades for the two environments should be done on different days, as there is a risk in cloning and upgrading both environments on the same day. That would extend this portion of the process by 3 or 4 days.

What risk is there in doing the clones and upgrades on the same day?

1 ACCEPTED SOLUTION

Eric W
Tera Guru

If you've been able to follow a similar cadence since Geneva for 90% of your upgrades, you've got a pretty good sense of where things can go wrong with your upgrade.

It also suggests to me that your instances are small enough to be cloned and upgraded comparatively quickly, and that you've managed to keep customization to a minimum, making the skip list process go fairly smoothly.

As instances get larger, and customizations (and integrations) get more complex, there are more opportunities to fail -- both in the clone and the upgrade. If your organization has the ability to absorb those atypical events when they happen, I would personally recommend against *planning* to fail by pre-emptively moving clones and upgrades to different days. However, it is important to be resilient in case one of the actions does fail, and to learn and adapt for "next time."

When an instance clone by itself takes 1-3 days, and it fails on the third day, requiring a restart that takes another 3 days, and this has happened on each of your last 2 clones, then definitely plan more time next time. The same consideration applies to the upgrades.


15 REPLIES

Tim Grindlay
Kilo Sage

To close out this thread I'm going to mark Eric's response as correct as I feel it best summarises our situation by balancing risk, learning and adjusting after each iteration, and not planning to fail. I also don't feel that any of the risks mentioned in other replies would justify changing our process given the former points. Again, thank you all for your responses.