98% reduction in incident noise via correlation and deduplication
67% reduction in P1 and P2 incidents via event management
25,000 hours saved annually due to automation
In productivity gains
To fulfill this promise, we need high-performance, high-availability business services. Our enterprise applications and digital infrastructure must be rock-solid. Service outages and bottlenecks aren’t acceptable. They cost too much, damage our business, and–most importantly–they let down our customers. And to accelerate digitalization, we also need an incredibly agile IT environment–one where automation turns days into minutes.
Creating visibility, increasing reliability, and improving agility
IT operations is the engine that drives service reliability and agility. By leveraging the combined power of ServiceNow IT Operations Management and ServiceNow IT Service Management, we’ve put our own IT operations engine into high gear–and we continue to move faster and faster.
In this story, we look at the challenges we faced, how we addressed them, and the benefits we’ve realized. We hope our story helps guide you on your journey into the digital future.
ServiceNow Global IT Operations
Meet Joe Corpion and his team
Joe Corpion runs ServiceNow global IT operations. He’s led his team through a far-reaching transformation, dramatically increasing business service quality, accelerating service delivery, and increasing operational efficiency.
Incident overload, slow response, and lack of service visibility
When Joe began this journey, he had many of the same challenges that other IT operations teams face. He says, “We were overloaded with P1 and P2 incidents, which put a huge strain on our resources. Because of this, we were slow to respond, which directly impacted the business. And, we operated in infrastructure silos because we had no end-to-end service visibility. That meant we had several people chasing different symptoms of the same underlying issue, creating more work and more delays.”
Too many emails and escalations
Russ Blaesing, who oversees Joe’s global operations team, agrees. “When there was an issue, everyone’s inboxes would fill up with emails from our monitoring systems. The event noise was enormous, and juggling hundreds of emails made it worse. The voice team thought there was a voice problem. The network team thought there was a network problem. The systems team thought that there was a server problem. And so on. Meanwhile, we were dealing with multiple escalations, which created chaos in our NOC. It would often take us an hour or more before we realized everything was related.”
From NOC to SRT
How did Joe and his team drive this shift from infrastructure to services?
Building the foundation
For Joe, it starts with the CMDB. He explains, “The CMDB and discovery are the foundation for everything. You need a consistent and reliable view of your infrastructure. Otherwise, you can’t manage it effectively. By adopting a standardized CMDB model and keeping it up to date with discovery, you get that visibility. If your CMDB model is customized, consider going back to the out-of-the-box CMDB model and fix any significant data issues, such as duplicates or other inconsistencies.”
Joe also stresses the importance of governance. “We put a lot of focus on defining processes, roles, and responsibilities around our CMDB. Proper governance is critical to keep your CMDB healthy. Otherwise, your CMDB will fall into disrepair, and your effort is wasted. Remember, your CMDB isn’t just an infrastructure database. For example, it contains critical process information such as CI owners, and you need to make sure that the right stakeholders keep this up to date and accurate as well.”
Creating service visibility across ITOM and ITSM
Next, Joe’s team mapped their most important business services using ServiceNow® Service Mapping, making their CMDB service-aware. Joe says, “Service maps are essential when you’re trying to diagnose and resolve service issues. And, they also tie directly into ITSM processes. For example, if you have a change request that involves taking a server in Dallas offline, how do you find out which business services will be affected? Without service maps, you just don’t know.”
“We started with our most critical services. Some of these have very complex topologies. For example, our Cisco contact center service has more than 50 types of components. However, by following best practices and working with our professional services team, we mapped this successfully. And, once we map a service, Service Mapping keeps the map up to date, so we don’t have to keep on refreshing and maintaining it–unlike with Visio.”
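The change-impact question Joe raises (which business services are affected if a given server goes offline?) is a walk up a dependency graph. The sketch below is only an illustration of that idea, not how Service Mapping is implemented: the CI names, the edge list, and the `affected_services` helper are all hypothetical.

```python
from collections import defaultdict, deque

# Hypothetical dependency edges as (dependent, depends_on) pairs.
# In practice this information would come from service maps in the CMDB.
DEPENDS_ON = [
    ("Contact Center", "call-router-app"),
    ("call-router-app", "dal-srv-01"),
    ("Master Data Management", "mdm-db"),
    ("mdm-db", "dal-srv-01"),
    ("Payroll", "nyc-srv-07"),
]
BUSINESS_SERVICES = {"Contact Center", "Master Data Management", "Payroll"}


def affected_services(ci, edges=DEPENDS_ON, services=BUSINESS_SERVICES):
    """Return the business services that depend, directly or indirectly, on ci."""
    # Invert the edges so we can walk from a CI up to everything that relies on it.
    dependents = defaultdict(set)
    for dependent, dependency in edges:
        dependents[dependency].add(dependent)

    # Breadth-first search upward from the CI being changed.
    seen, queue = {ci}, deque([ci])
    while queue:
        node = queue.popleft()
        for parent in dependents[node]:
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return sorted(seen & services)


# Taking the (hypothetical) Dallas server offline touches two services.
print(affected_services("dal-srv-01"))
# ['Contact Center', 'Master Data Management']
```

Without the dependency edges, the same question can only be answered by asking each team in turn, which is the gap Joe describes.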
Managing business service health
Now that Joe’s team had a service-aware CMDB, they were ready to transform their NOC into an SRT. They used ServiceNow® Event Management to integrate, normalize, deduplicate, and correlate data across their monitoring systems, including SolarWinds, Splunk, and SAP Solution Manager.
According to Joe, “We’ve seen a 98% reduction in incident noise, even though our monitoring systems were already relatively well optimized. And, we no longer have to deal with emails. Instead, Event Management automatically reduces all of our events into a small number of actionable incidents, which ServiceNow® Incident Management automatically routes to the right SRT team members.”
However, the benefits go far beyond noise reduction. Because Event Management uses service maps to correlate events, Joe’s team now has a real-time view of business service health. Joe says, “The Event Management dashboard gives us a single pane of glass where we see the health of all of our business services. Instead of struggling with siloed infrastructure on multiple monitoring screens, we work together as a team to quickly resolve service issues. That’s critical. It’s powered our shift from a NOC to a true SRT.”
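To make the idea of deduplication and service-based correlation concrete, here is a minimal sketch. It assumes a simplified event shape (a CI name and a metric) and a hypothetical `CI_TO_SERVICE` lookup standing in for service maps; it does not represent Event Management's actual rules or data model.

```python
from collections import defaultdict

# Hypothetical CI-to-service lookup; in practice this would be derived from service maps.
CI_TO_SERVICE = {
    "voice-gw-1": "Contact Center",
    "core-sw-3": "Contact Center",
    "cc-app-2": "Contact Center",
    "mdm-db": "Master Data Management",
}


def deduplicate(events):
    """Collapse repeated events that share a CI and metric, keeping a repeat count."""
    unique = {}
    for ev in events:
        key = (ev["ci"], ev["metric"])
        if key in unique:
            unique[key]["count"] += 1
        else:
            unique[key] = {**ev, "count": 1}
    return list(unique.values())


def correlate(alerts, ci_to_service=CI_TO_SERVICE):
    """Group deduplicated alerts by the business service their CI supports."""
    incidents = defaultdict(list)
    for alert in alerts:
        service = ci_to_service.get(alert["ci"], "Unmapped")
        incidents[service].append(alert)
    return dict(incidents)


events = [
    {"ci": "voice-gw-1", "metric": "packet_loss"},
    {"ci": "voice-gw-1", "metric": "packet_loss"},
    {"ci": "core-sw-3", "metric": "link_down"},
    {"ci": "cc-app-2", "metric": "timeout"},
    {"ci": "cc-app-2", "metric": "timeout"},
    {"ci": "mdm-db", "metric": "disk_full"},
]
alerts = deduplicate(events)
incidents = correlate(alerts)
print(len(events), "events ->", len(alerts), "alerts ->", len(incidents), "incidents")
# 6 events -> 4 alerts -> 2 incidents
```

In this toy data, the voice, network, and application symptoms all land in a single Contact Center incident instead of three separate escalations.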
50% MTTR reduction
“When we started out, our goal was to reduce MTTR by 50%,” Joe relates. “We’ve achieved that. There are a lot of contributing factors, including process improvements and skills development, but Event Management plays a huge role. Because we see the status of all our business services, we can prioritize better, respond instantly, and get the right people together quickly. And, since we can drill down from top-level business health into the underlying service maps, we can see right away which CIs are causing the issue. That means we diagnose and resolve service outages much faster.”
Thangavel Viswam, who leads Joe’s infrastructure team, gives an example. “Our Master Data Management business service handles employee and customer information across ServiceNow, so it’s critical,” he says. “In one case, we had a significant outage. We saw this on the Event Management dashboard and contacted the MDM team right away. MDM depends on many applications, so we used our service maps to help the MDM team diagnose and resolve the issue. That saved us two precious hours.”
67% reduction in high-priority incidents
With Event Management, we’ve also dramatically reduced the number of high-priority incidents. According to Joe, “When you’re dealing with fewer incidents, you can focus on what’s important and fix things faster–which is another reason why our MTTR has come down. We’ve seen our P1 and P2 incidents in the NOC reduced by 67%, which has a huge impact.”
“Part of that reduction is because we now get a single actionable incident, rather than multiple uncorrelated incidents,” Joe adds. “But, we’re also able to prevent incidents. With ITOM’s Operational Intelligence, we use machine learning to spot anomalies, such as abnormally high numbers of CPU utilization alerts. That means we can investigate and fix the issue before things get worse and cause an outage.”
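The story does not describe the machine learning models behind Operational Intelligence. As a stand-in for the general idea of flagging an abnormally high number of alerts, the sketch below uses a simple z-score against recent history. The function name, the threshold of 3.0, and the sample counts are assumptions made for illustration.

```python
from statistics import mean, stdev


def is_anomalous(history, current, threshold=3.0):
    """Flag current if it sits more than threshold standard deviations above the historical mean."""
    if len(history) < 2:
        return False  # not enough data to estimate spread
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return current > mu  # flat history: any increase stands out
    return (current - mu) / sigma > threshold


# Hypothetical hourly counts of CPU utilization alerts for one server.
hourly_cpu_alerts = [4, 6, 5, 7, 5, 6, 4, 5]

print(is_anomalous(hourly_cpu_alerts, 6))   # False: within the normal range
print(is_anomalous(hourly_cpu_alerts, 30))  # True: worth investigating before an outage
```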
Increasing efficiency and agility
With ITOM, Joe and his team are saving 25,000 hours a year. However, that's just the start. They are also using ITOM to automate IT operations processes, slashing the time it takes to get things done. In fact, work is already underway to automate more than 20 IT operations processes using ServiceNow® Orchestration, with more to come.
Joe points to a process automation success–creating virtual machines (VMs). “Creating a VM involves more than a dozen tasks,” he explains. “Before, when someone needed a new VM, we would create it manually. On average, users had to wait 40 hours to get their VM. Now, we use ServiceNow® Cloud Management to automate the entire process. Users simply enter a self-service request in the ServiceNow portal, and Cloud Management uses Orchestration to carry out all the steps. Instead of wasting 40 hours, the user gets their VM in 30 minutes.”
“Once the VM is successfully created, Cloud Management automatically closes out the change. And, if it doesn’t work, Cloud Management marks the change as failed and automatically opens an incident so we can investigate. That’s a great example of how ITOM and ITSM work together.”
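The control flow Joe describes (run every provisioning task, close the change on success, otherwise mark it failed and open an incident) can be sketched as follows. This is a generic illustration, not Cloud Management or Orchestration code: the step functions, the record fields, and `provision_vm` itself are hypothetical.

```python
def provision_vm(request, steps):
    """Run each provisioning step in order.

    Closes the change if every step succeeds; on the first error, marks the
    change as failed and returns an incident record for follow-up.
    """
    change = {"request": request, "state": "implementing"}
    for name, step in steps:
        try:
            step(request)
        except Exception as exc:
            change["state"] = "failed"
            incident = {
                "short_description": f"VM provisioning failed at '{name}': {exc}"
            }
            return change, incident
    change["state"] = "closed"
    return change, None


# Hypothetical steps; a real workflow would have a dozen or more.
def allocate_ip(req):
    req["ip"] = "10.0.0.12"


def create_vm(req):
    req["vm"] = f"vm-{req['name']}"


def attach_storage(req):
    raise RuntimeError("datastore full")


ok_change, ok_incident = provision_vm(
    {"name": "build01"},
    [("allocate_ip", allocate_ip), ("create_vm", create_vm)],
)
bad_change, bad_incident = provision_vm(
    {"name": "build02"},
    [("allocate_ip", allocate_ip), ("attach_storage", attach_storage)],
)
print(ok_change["state"], bad_change["state"])
# closed failed
```

For scale, going from a 40-hour wait to 30 minutes is an 80-fold reduction (2,400 minutes versus 30).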
What advice does Joe have for other organizations that want to transform their IT operations? He makes three key points:
The bottom line
Joe has led his team through a remarkable transformation. By leveraging the ServiceNow ITOM and ITSM portfolios, he has aligned IT operations with our business, creating service visibility and dramatically increasing service availability. The results prove that efficiency and service quality are two sides of the same coin:
The momentum continues to build. By automating IT operations processes, Joe is creating an agile IT operations environment that responds quickly and accurately to the needs of our business, slashing service delivery times from days to minutes. That’s accelerating our digitalization–so that we can continue to deliver great experiences for our customers and employees.