ServiceNow Best Practice for major outages
07-25-2018 05:03 PM
We are currently on Jakarta and are considering an upgrade to Kingston.
What is ServiceNow's best practice for handling an outage?
Definition: Outage
An issue that prevents 25% of a department from functioning properly, or one that directly affects revenue.
Currently, we are using Incidents for individual or noncritical issues. For major outages, we are using Problems.
I am aware that this is not the correct way to use them, so I would like to know what is recommended.
Labels: Incident Management
07-25-2018 06:34 PM
Always start with WHY, not HOW.
What outcome are you trying to achieve?
- Identify the sources and costs of outages so you can eliminate root causes?
- Respond faster?
The WHY of the solution will significantly inspire the HOW. I've seen no fewer than 10 "best practice" "major incident management solutions" all die days after go-live... all because the focus was on HOW, not WHY.
07-26-2018 08:21 AM
Primarily documentation and reporting.
This will be used so that the right people (those working on the outage and the people they report to) can see who is working on it and what is being done. It will also be used to document the RCA/after-action report.
Additionally, it will be used for reporting on these specific types of outages, separate from Incidents and Problems.
07-30-2018 08:51 AM
Well, there you go. None of that necessitates using Problem for outages.
The *best* scenario would be an event management solution that determines the compromised CIs, rolls them up to a business service, and creates a single Incident & Outage record. Without that level of CMDB/Event maturity, I'd be looking at the manual creation of Outage records after the outage cause was identified.
That'll keep your Incident & Problem management modules "pure" while prepping for greater CMDB / Event maturity down the road.
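To make the roll-up idea concrete, here is a rough sketch in plain Python of grouping CI-level events into one outage per business service. It is not ServiceNow code and does not use any ServiceNow API: `Event`, `Outage`, `roll_up`, and the CI-to-service mapping are all hypothetical names invented for illustration. On a real instance the mapping would come from CMDB relationships and the grouping from Event Management.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Event:
    ci: str              # name of the affected configuration item (hypothetical)
    raised_at: datetime  # when the monitoring event was raised


@dataclass
class Outage:
    business_service: str
    began_at: datetime
    affected_cis: list = field(default_factory=list)


def roll_up(events, ci_to_service):
    """Group CI-level events into a single Outage per business service."""
    outages = {}
    # Process events oldest first so began_at is the earliest event time.
    for event in sorted(events, key=lambda e: e.raised_at):
        service = ci_to_service.get(event.ci)
        if service is None:
            continue  # CI is not mapped to a business service; nothing to roll up to
        outage = outages.get(service)
        if outage is None:
            outage = Outage(service, event.raised_at)
            outages[service] = outage
        if event.ci not in outage.affected_cis:
            outage.affected_cis.append(event.ci)
    return list(outages.values())


# Example data (made up): two CIs behind one service, plus an unmapped CI.
ci_to_service = {"db01": "Online Store", "web01": "Online Store", "mail01": "Email"}
events = [
    Event("web01", datetime(2018, 7, 25, 17, 5)),
    Event("db01", datetime(2018, 7, 25, 17, 3)),
    Event("printer07", datetime(2018, 7, 25, 17, 4)),
]
outages = roll_up(events, ci_to_service)
```

With this sample data, the two events on `db01` and `web01` collapse into one outage for "Online Store", and the unmapped `printer07` event is ignored. The point is only that the quality of the roll-up depends entirely on how complete the CI-to-service mapping is, which is the CMDB maturity mentioned above.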
07-26-2018 08:37 AM
I would say use incidents to raise the outages and use problems to work on root cause analysis for major P1s.
First, provide relief through the incident; second, work on the RCA through the problem record (PRB).
Hope this helps
Shruti