A Lean (mean) ServiceNow release process

SimonMorris · 05-30-2013

I came back from Knowledge13 two weeks ago energised, motivated and inspired by ServiceNow customers, partners and employees. I also came back with two main pieces of feedback from speaking to customers each night on the ServiceNow stand in the Expo hall.

"We love the transparency of ServiceNow"

and

"We struggle to keep up with the demand from the business to customise our instance"

Firstly - what about this transparency? Customers could walk up to the booth with a question about Asset Management and speak to Bryan, who runs the team developing that product. Or they could bring a PPM question and grab some of Mohan's time.

I see this most days on Twitter as well, where customers can get in touch with employees and ask questions or get help. Transparency rocks! And I'm glad we haven't lost that culture in the 21 months I've been in the company, during which we've grown from ~500 to ~1500 employees.

Secondly - what about these struggles in releasing customisations? It isn't particularly hard to add functionality to applications we provide or applications written on the platform. But ServiceNow is a sticky experience (I'm sure our sales people like this fact) and once IT organisations expand their Service Management capability to new processes they start to see a backlog of improvements ready to be tested and deployed.

You might start off thinking this is a nice problem to have - lots of features with business value, ready to go into production. But volume becomes a problem and customers told me that they need more advice on how to release their work safely and smoothly.

Of course, this problem amplifies once organisations start to use ServiceNow outside of the IT department - into shared services and facilities and ultimately for line of business applications.

A robust release process is a must-have.

So what could I blog about that combines our tendency for transparency and helps customers release into production smoothly?

An unhealthy release process

Rewind a little while and we suffered a release backlog problem that is similar to the issues I heard from customers a few weeks ago. We were getting swamped with requests for change from the business which we were focussing on building using Scrum - but we were really struggling to release them into production smoothly.

Signs of our dysfunctions were:

Big backlogs of finished but unreleased work.
Long periods of time between "Dev complete" and "Released".
Some releases didn't follow a repeatable process.
Confusion as to the state of releases - have they been tested yet or not?
Confusion as to who was doing what and when.
Lots of thrashing and rework, having to revisit work that was stuck in UAT.

These dysfunctions were both obvious and painful. As a development team we were feeling them and the business certainly was.

We were getting poor feedback such as:

We're not communicating well about the status of work that we've called "done"
We don't know when this feature will hit production
Our output is variable - the business doesn't get a consistent experience
You guys need to do better!!

Dear customer - does this sound familiar to you? Having a backlog of unreleased work was starting to hurt and we resolved to fix it with a new process.

Using Lean principles to define a release process

Being big fans of Agile in general and Scrum in particular we knew where to start looking for a new release process.

It's worth mentioning that Scrum does nothing special to help you release software. At the end of the Scrum process - when the story (or feature) is in a closed state and the work has been demoed to stakeholders - the release process is only just getting started.

Define an end-to-end process

Actually - a story that has been "done" by the development team and has made it out of the sprint represents NO value to end users whatsoever. They only start to care when the feature is in their hands in a production environment. Scrum is an excellent methodology for the construction of a feature but we are always going to hand off to the release process for additional work - at least User Acceptance Testing and Change Management.

A big tip is to think about your Software Development Lifecycle as an end-to-end process. It starts with a request from the business and ends with a released feature. Optimising only part of the process - i.e. Scrum - is a local optimisation and has no effect on the value delivered to the end user.

In fact - for us in Internal Systems Development - a really efficient and optimised Scrum process was hurting the overall effectiveness of the system. We were overloading a weak release process with too much work every sprint, and it wasn't efficient enough at testing and deployment, resulting in that painful backlog.

For anyone involved in building solutions - in ServiceNow or otherwise - I'd recommend getting a copy of Disciplined Agile Delivery. Reading this book a few years back helped me think about Scrum in an end-to-end context.

Disciplined Agile Delivery talks about software delivery as a 3 phase process - Inception, Construction and Transition. Scrum handles the Construction phase brilliantly. This article talks about the Transition phase. You need to think about all three.

Visualise your work

We knew we had a backlog of work but it wasn't until we wrote it all down onto paper cards and literally stuck them to the wall in lanes that we knew where in the process we had a bottleneck.

By visualising the work we could instantly see where to attack first. It took some custom development of our own to transplant the visualisation back into the tool, but for some time we maintained paper cards so that we had an instant report on how many releases we had in progress and where they were.

Kanban - the act of visualising work with the aim of controlling it and making work flow through the system - is one of the most powerful techniques I can recommend for building a release process.

Pull, don't push work

We quickly saw that we were overloading our User Acceptance Testing users by pushing work onto them before they were ready. Classic mistake.

That work was stacking up, nothing was coming out from UAT and we were dumbly pushing more into that phase.

Another Lean principle we used in the process was to pull work when ready and not push. Have clear definitions of where work is ready to be pulled and what is currently in process.
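As a rough illustration of pulling rather than pushing, here is a minimal Python sketch. It assumes each lane has a work-in-progress limit, which is a common Kanban device but not something described above; the `Lane` class, the `pull` function, the limits and the release names are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Lane:
    """One lane on the board, with a hypothetical work-in-progress limit."""
    name: str
    wip_limit: int
    items: deque = field(default_factory=deque)

    def has_capacity(self) -> bool:
        return len(self.items) < self.wip_limit


def pull(upstream: Lane, downstream: Lane) -> bool:
    """Move the oldest upstream item downstream, but only if there is room.

    Returns True if an item moved, False if nothing was pulled.
    """
    if not upstream.items or not downstream.has_capacity():
        return False
    downstream.items.append(upstream.items.popleft())
    return True


# Illustrative data: three releases waiting, room for two in UAT.
ready_for_uat = Lane("Ready for UAT", wip_limit=10,
                     items=deque(["REL-1", "REL-2", "REL-3"]))
in_uat = Lane("In UAT", wip_limit=2)

moved = [pull(ready_for_uat, in_uat) for _ in range(3)]
print(moved)                      # [True, True, False]
print(list(in_uat.items))         # ['REL-1', 'REL-2']
print(list(ready_for_uat.items))  # ['REL-3']
```

The third release stays where it is until the testers finish something, which is the whole point: the downstream phase decides when it takes on more work.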

Our Release process for pushing to production

So this is our release process for taking work from the development phase and tracking it all the way to production.

(Download the PDF version here)

I love the phrase "Scrum is simple, but not easy". That's also true of our release process - hopefully it's simple to understand but it took a few iterations to get where it is today.

To follow this process you would need:

Multiple instances - Dev, Test, UAT and Production
A ScrumMaster or Development Manager
A Product Owner
Some UAT testers
An admin to apply the update sets

Of course you might have some people performing multiple roles in the process - but I hope you have real users performing your UAT, right?

The process defines the following "Work Centers":

In Development
Dev Complete
Ready for UAT
In UAT
Ready to Deploy
Done!!

What's in a Work Center (or phase)?

In this context a Work Center represents a number of steps performed by an individual or team. A Work Center has a consistent input, consistent processing and a consistent output.

It can be defined by its Man, Method, Materials and Measurement.

The Man (and apologies for any potential latent sexism here - Man or Woman, but if you choose Man you get 4 M's in a row) defines which person or team is responsible for the activity in this phase. Taking the In Development Work Center as an example, the Scrum Master and Development Team are responsible for the work in this phase.

The Method defines which actions are taken in this phase of work. It could be as small as "Move the update set and close the change" or as big as "Build the solution in a 2 week sprint".

The Materials define which artefacts are generated and worked upon in this phase. In the "In Development" Work Center the team works with Scrum Stories and Tasks, Update Sets, test cases and Change request records.

And lastly the Measurement defines the quality checks that are performed before work leaves this phase. Using "In Development" as our example, the team's Definition of Done checklist ensures consistent quality leaving that phase.

What's nice about the Man-Method-Materials-Measurement definition is that it's easier to track what happened in the event of a failure. Let's say that something goes wrong in UAT and we want to know how to prevent that happening again. Look at the previous Work Centers using the 4 dimensions. Was it a systematic process problem in that phase (Method) or a human error (Man)? Did the UAT phase receive a crappy test case (Materials)? What quality checks (Measurements) were in place, and do they need to be improved?
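To make the four dimensions concrete, a Work Center could be written down as a small record. This is only a sketch in Python: the `WorkCenter` class and the `review_questions` helper are hypothetical names, and the values are taken from the "In Development" example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkCenter:
    """A phase of the release process described by its four M's."""
    name: str
    man: tuple[str, ...]        # who is responsible for the phase
    method: str                 # which actions are taken
    materials: tuple[str, ...]  # artefacts generated and worked upon
    measurement: str            # quality check before work leaves


IN_DEVELOPMENT = WorkCenter(
    name="In Development",
    man=("Scrum Master", "Development Team"),
    method="Build the solution in a 2 week sprint",
    materials=("Scrum Stories and Tasks", "Update Sets",
               "Test cases", "Change request records"),
    measurement="Definition of Done checklist",
)


def review_questions(wc: WorkCenter) -> list[str]:
    """Questions to ask about an upstream Work Center after a failure."""
    return [
        f"Man: was it a human error by {' / '.join(wc.man)}?",
        f"Method: was '{wc.method}' a systematic process problem?",
        f"Materials: was one of {', '.join(wc.materials)} poor quality?",
        f"Measurement: does the '{wc.measurement}' need to be improved?",
    ]


for question in review_questions(IN_DEVELOPMENT):
    print(question)
```

Writing every Work Center in the same shape means a failure review asks the same four questions of each phase that the work passed through.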

Doing lanes and Inventory lanes

The aim of the process is to allow work to flow smoothly from development to production. But we know that there will be some period of time between the two events and we're not aiming to release the day after the sprint demo.

To indicate phases where it's acceptable for work to temporarily stop we defined lanes as being "Doing" or "Inventory".

Work that is currently in an "Inventory" lane should be progressed as soon as possible, ideally by someone that has the responsibility of monitoring this queue.

Work in a "Doing" lane is permitted to wait there for some time, but we measure the time to track any lurkers.
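One way to picture the two lane types is as a lookup table plus an age check, sketched here in Python. The classification follows this article's own examples for In Development, Dev Complete, In UAT and Ready to Deploy; treating Ready for UAT as an Inventory lane is an assumption, and the day limits and names (`LANES`, `MAX_DAYS`, `is_lurking`) are hypothetical.

```python
from enum import Enum


class LaneType(Enum):
    DOING = "Doing"
    INVENTORY = "Inventory"


LANES = {
    "In Development": LaneType.DOING,
    "Dev Complete": LaneType.INVENTORY,
    "Ready for UAT": LaneType.INVENTORY,  # assumption, not stated in the text
    "In UAT": LaneType.DOING,
    "Ready to Deploy": LaneType.INVENTORY,
}

# Hypothetical limits: Doing lanes may hold work for a while,
# Inventory lanes should be emptied almost immediately.
MAX_DAYS = {LaneType.DOING: 14, LaneType.INVENTORY: 1}


def is_lurking(lane: str, days_in_lane: int) -> bool:
    """True if a release has sat in its lane longer than the lane type allows."""
    return days_in_lane > MAX_DAYS[LANES[lane]]


print(is_lurking("In UAT", 5))        # False
print(is_lurking("Dev Complete", 5))  # True
```

The same five days is fine in one lane and a warning sign in another, which is why the lane type matters when measuring waiting time.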

Some examples - we aren't going to rush work out of the "In Development" lane as it's a "Doing" lane. But as soon as it's placed into the "Dev Complete" lane it's now just waiting for the Man (in our example the ScrumMaster) to do something with it and there should be the minimum of delay.

"In UAT" is a phase that will take some time so it's a "Doing" lane. As soon as the Product Owner moves the release into the "Ready to Deploy (Inventory)" lane nothing more can happen until it's moved to production which should be as soon as possible.

Handling exceptions in the process

When we ran the process through a couple of times we found a few examples where we needed additional flows. What happens, for example, if a release can't progress because testers aren't available? Should we have a new lane for this?

Our solution for these edge cases was to have a blocked flag on each release which can be set and annotated with a reason. This is a visual indicator that the release can't move out of its current lane and that the reason should be read to understand the exception.
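A blocked flag of this kind might look like the following Python sketch. The `Release` class, its field names and the rule that a blocked release refuses to move are illustrative assumptions, not a description of the actual ServiceNow implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Release:
    name: str
    lane: str
    blocked: bool = False
    blocked_reason: Optional[str] = None

    def block(self, reason: str) -> None:
        """Flag the release as blocked; a reason is mandatory."""
        if not reason.strip():
            raise ValueError("a blocked release needs a reason")
        self.blocked = True
        self.blocked_reason = reason

    def unblock(self) -> None:
        self.blocked = False
        self.blocked_reason = None

    def move_to(self, lane: str) -> None:
        """Move to another lane unless the release is blocked."""
        if self.blocked:
            raise RuntimeError(f"{self.name} is blocked: {self.blocked_reason}")
        self.lane = lane


release = Release("REL-7", lane="Ready for UAT")
release.block("Testers unavailable until next week")

try:
    release.move_to("In UAT")
except RuntimeError as error:
    print(error)  # REL-7 is blocked: Testers unavailable until next week

release.unblock()
release.move_to("In UAT")
print(release.lane)  # In UAT
```

Keeping the exception as a flag on the release, rather than as an extra lane, leaves the board layout unchanged however many kinds of blockage turn up.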

Measuring the process

We found this release process easy to measure. The most important metric to track is the overall cycle time - how long a release takes from starting the process until it's been deployed. Knowing this metric and understanding the time taken in each lane gives us a focus in reducing the cycle time and increasing the speed of our deployments.
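Cycle time and time per lane fall out of a simple history of when a release entered each lane. The Python sketch below assumes such a history is available as a list of (lane, date) pairs ending with "Done"; the function names and the dates are made up for illustration.

```python
from datetime import date


def time_per_lane(history: list[tuple[str, date]]) -> dict[str, int]:
    """Days spent in each lane, given (lane, date entered) pairs in order.

    The last entry marks the end of the process and has no duration.
    """
    days: dict[str, int] = {}
    for (lane, entered), (_, left) in zip(history, history[1:]):
        days[lane] = days.get(lane, 0) + (left - entered).days
    return days


def cycle_time(history: list[tuple[str, date]]) -> int:
    """Days from entering the first lane to reaching the last one."""
    return (history[-1][1] - history[0][1]).days


# Illustrative history for one release.
history = [
    ("In Development", date(2013, 4, 1)),
    ("Dev Complete", date(2013, 4, 15)),
    ("Ready for UAT", date(2013, 4, 16)),
    ("In UAT", date(2013, 4, 18)),
    ("Ready to Deploy", date(2013, 4, 25)),
    ("Done", date(2013, 4, 26)),
]

print(time_per_lane(history))
# {'In Development': 14, 'Dev Complete': 1, 'Ready for UAT': 2, 'In UAT': 7, 'Ready to Deploy': 1}
print(cycle_time(history))  # 25
```

The per-lane breakdown is what points at the lane worth attacking first; the total alone only says whether things are getting better or worse.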

(Download the PDF version here)

Another measurement is the number of changes in each lane and the variance from previous days or weeks. An increase of changes in UAT shows that this phase is becoming a bottleneck. We were also able to set thresholds per lane and provide an "at a glance" health check of the process.
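An "at a glance" health check along these lines can be sketched in a few lines of Python. The threshold values, lane contents and the `health_check` function are hypothetical; the idea is only to compare today's count per lane with an earlier count and with a limit.

```python
from collections import Counter


def lane_counts(releases: dict[str, str]) -> Counter:
    """Count releases per lane, given a mapping of release name to lane."""
    return Counter(releases.values())


def health_check(today: Counter, previous: dict[str, int],
                 thresholds: dict[str, int]) -> dict[str, dict]:
    """Per lane: current count, change since the previous count, and status."""
    report = {}
    for lane, limit in thresholds.items():
        count = today.get(lane, 0)
        report[lane] = {
            "count": count,
            "change": count - previous.get(lane, 0),
            "status": "OK" if count <= limit else "OVER",
        }
    return report


# Illustrative data.
releases = {"REL-1": "In UAT", "REL-2": "In UAT",
            "REL-3": "In UAT", "REL-4": "Ready to Deploy"}
last_week = {"In UAT": 1, "Ready to Deploy": 1}
thresholds = {"In UAT": 2, "Ready to Deploy": 3}

report = health_check(lane_counts(releases), last_week, thresholds)
for lane, row in report.items():
    print(lane, row)
# In UAT {'count': 3, 'change': 2, 'status': 'OVER'}
# Ready to Deploy {'count': 1, 'change': 0, 'status': 'OK'}
```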

In summary

There you go! Releasing features into production is a critical moment for users of the solutions you build.

Using Lean principles to speed releases through the process relieves the pressure on development teams as they have less work in progress stuck in the system.

Let me know your thoughts and questions in the comments below!