What is Problem Management?

Problem management is a core component of the ITSM framework. It is the process for identifying and managing the root causes of actual and potential IT incidents.

Problem management uses preventative methods and root cause analysis to identify and manage service-affecting problems before they lead to further issues. With a structured workflow for diagnosing root causes and fixing problems, it helps eliminate recurring incidents and minimize the impact of unexpected disruptions.

Problem management can have several benefits when executed correctly.

Continuous service improvement

Taking the time to fix a problem can prevent degraded performance and further problems that could interrupt services in the future. Seamless integration between problem management and all other ITSM processes enables organizations to proactively mitigate issues and eliminate recurring incidents.

Avoid costly incidents

Incidents that result from problems can cost an organization a lot of time and money if not properly managed. Reducing incidents through effective problem management, on the other hand, saves organizations significant amounts by eliminating major issues before they can damage services, products, or a business's reputation.

Increased productivity

A company can be more productive if it doesn't spend time and resources responding to problems that can be prevented.

Decreased time to resolution

Best practices for problem analysis help teams respond to service interruptions more quickly and accurately, and reduce downtime. Use structured problem analysis to correlate problems and coordinate workflows to find the fastest path to the root cause.

Learn from underlying causes

Teams can consistently learn from incidents when they effectively practice problem management.

Increase customer and employee satisfaction

Customers and employees are more satisfied when there are fewer problems along the way. Patience can run thin if there are problems—especially if the problems are consistently the same.

Speed up service restoration

Services can benefit when there is visibility into known errors and established workarounds for IT staff.

Minimize service disruptions

Teams can detect problems before they evolve into something more critical, which prevents downtime and service interruptions. IT can proactively use built-in dashboards for service performance and configurations.

Accelerate root cause resolution

IT teams can create structured problem analyses by correlating problems and coordinating workflows. With a consolidated view of the incidents and related changes, IT can deliver faster responses and solutions.

Problem management vs. knowledge management

Knowledge management creates a repository of documentation and solutions that covers incidents and problems, and that repository is used to help solve problems. When a known error is documented, a single click generates a Known Error Article in the knowledge base, saving time and effort when fixing recurring issues in the future.

Problem management vs. incident management

Problems are potential causes of one or several incidents. While problem and incident management overlap in a few disciplines, there are key differences. If a recent deployment creates a lapse in services, it can be rolled back, which resolves the incident of service interruption. But rolling it back does not solve the problem that caused the incident. The underlying problem is still there.

Incident management has a shorter timeline, and the goal is primarily to resolve the incident and return services to their former state. Problem management is more complex; it can take longer and looks to identify what lies beneath an incident, why the incident occurred, and what can prevent the incident from occurring again.

Problem management vs. change management

Change management describes a process in which changes are planned, tracked, and released with minimal service disruption. It provides a systematic approach to controlling the life cycle of all changes, so that beneficial changes can be made with minimum disruption to IT services. In the event that a change causes a disruption, the change is analyzed during a problem management process.

Proactive problem detection

Identify problems and fix them, or find workarounds for them, before they can cause incidents.
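
One simple way to look for candidate problems proactively is to watch for incidents that keep recurring against the same service with the same symptom. The sketch below is illustrative only: the `Incident` fields, the `find_recurring` helper, and the threshold of three occurrences are assumptions made for this example, not part of ITIL or of any particular product.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Incident:
    # Hypothetical fields chosen for illustration.
    service: str
    symptom: str


def find_recurring(incidents, threshold=3):
    """Return (service, symptom) pairs seen at least `threshold` times."""
    counts = Counter((i.service, i.symptom) for i in incidents)
    return sorted(key for key, n in counts.items() if n >= threshold)


incidents = [
    Incident("email", "timeout"),
    Incident("email", "timeout"),
    Incident("vpn", "auth failure"),
    Incident("email", "timeout"),
]

# Each pair returned here is a candidate for a new problem record.
candidates = find_recurring(incidents)
print(candidates)
```

A real detection process would likely draw on monitoring data and trend analysis as well; counting repeats is just the smallest useful starting point.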

Categorize and prioritize

Keep teams organized and working on the most important problems by tracking and assessing known problems.
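
As a rough sketch of how prioritization might work, the example below ranks problems by multiplying an impact rating by an urgency rating. The 1 to 3 scales, the `Problem` class, and the multiplication rule are assumptions for illustration; organizations typically define their own priority matrix.

```python
from dataclasses import dataclass


@dataclass
class Problem:
    title: str
    impact: int   # assumed scale: 1 (low) to 3 (high)
    urgency: int  # assumed scale: 1 (low) to 3 (high)

    @property
    def score(self) -> int:
        # Higher score means the problem should be worked on sooner.
        return self.impact * self.urgency


def prioritize(problems):
    """Return problems with the highest score first; ties keep input order."""
    return sorted(problems, key=lambda p: p.score, reverse=True)


backlog = [
    Problem("Slow report export", impact=1, urgency=2),
    Problem("Payment gateway errors", impact=3, urgency=3),
    Problem("Intermittent VPN drops", impact=2, urgency=3),
]

for problem in prioritize(backlog):
    print(problem.score, problem.title)
```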

ITIL Problem Management Process

Investigate and diagnose

Identify the cause of the problem and outline the best possible course of action to remedy the problem.

Create a known error record

Recording information about problems leads to less downtime in the event that a problem triggers an incident. Creating an error record can keep information about workarounds readily available to lessen the impact of an incident or problem.
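
To make the idea concrete, here is a minimal sketch of a known error record and a lookup that surfaces workarounds by symptom. The `KnownError` fields, the `KnownErrorDatabase` class, and the substring matching are all hypothetical; a real known error database would be backed by an ITSM tool and a proper search.

```python
from dataclasses import dataclass


@dataclass
class KnownError:
    # Hypothetical record layout for illustration.
    error_id: str
    symptom: str
    root_cause: str
    workaround: str


class KnownErrorDatabase:
    def __init__(self):
        self._records = {}

    def add(self, record):
        self._records[record.error_id] = record

    def find_workarounds(self, symptom_text):
        """Return workarounds whose recorded symptom contains the given text."""
        needle = symptom_text.lower()
        return [
            r.workaround
            for r in self._records.values()
            if needle in r.symptom.lower()
        ]


kedb = KnownErrorDatabase()
kedb.add(KnownError(
    error_id="KE-001",
    symptom="Login page times out",
    root_cause="Connection pool exhausted",
    workaround="Restart the auth service",
))

print(kedb.find_workarounds("times out"))
```

When an incident with a matching symptom arrives, staff can apply the recorded workaround right away instead of diagnosing the issue from scratch.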

Create a workaround, if necessary

Always consider temporary solutions to reduce the impact of problems, and work to prevent them from becoming incidents. While workarounds aren't ideal, they can limit business impact when a problem arises whose cause can't be easily identified.

Resolve and close the problem

A closed problem has been eliminated and will not cause another incident.

Major problem review

Take the time to review the resolution of a problem, and ensure that the problem has been fully eliminated. Record lessons learned, and identify preventative actions that should be taken in the event that the problem occurs again.
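
The steps described above can be sketched as a small state machine that only allows a problem record to move forward one stage at a time. The state names, the transition table, and the `ProblemRecord` class are assumptions made for this example; the optional workaround step is modeled as a field rather than a state, and real ITSM tools define their own states and rules.

```python
from enum import Enum


class State(Enum):
    DETECTED = "detected"
    PRIORITIZED = "prioritized"
    INVESTIGATING = "investigating"
    KNOWN_ERROR = "known_error"
    CLOSED = "closed"
    REVIEWED = "reviewed"


# Assumed transitions: each state may only move to the next stage.
ALLOWED = {
    State.DETECTED: {State.PRIORITIZED},
    State.PRIORITIZED: {State.INVESTIGATING},
    State.INVESTIGATING: {State.KNOWN_ERROR},
    State.KNOWN_ERROR: {State.CLOSED},
    State.CLOSED: {State.REVIEWED},
    State.REVIEWED: set(),
}


class ProblemRecord:
    def __init__(self, title):
        self.title = title
        self.state = State.DETECTED
        self.history = [State.DETECTED]
        self.workaround = None  # optional, filled in if one is found

    def advance(self, new_state):
        """Move to `new_state`, or raise ValueError if the step is not allowed."""
        if new_state not in ALLOWED[self.state]:
            raise ValueError(
                f"cannot move from {self.state.value} to {new_state.value}"
            )
        self.state = new_state
        self.history.append(new_state)


record = ProblemRecord("Nightly batch job fails")
record.advance(State.PRIORITIZED)
record.advance(State.INVESTIGATING)
record.advance(State.KNOWN_ERROR)
record.workaround = "Rerun the job manually after the backup window"
record.advance(State.CLOSED)
record.advance(State.REVIEWED)
print([s.value for s in record.history])
```

Keeping the history alongside the current state gives the review step a complete trail from detection to resolution.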

Don’t rely on reactive, root-cause analysis

There usually isn’t a single cause behind a problem. Teams should consider all potential factors and not just identify a single factor reactively.

Encourage an open environment where problems are shared

Team members should always have an open conversation where they are encouraged to share their findings and facts without punishment or retribution.

Focus on critical services

Address the problems that affect services which deliver the most value to the organization and prioritize them over lesser problems.

Ask questions

Foster an environment where team members ask questions of each other and of the systems they work with.

Spread knowledge

Teams should always share knowledge internally and make it available so that other teams can learn from it.

Foster learning

Problem management effectively never ends, even for the best-performing teams. Teams should constantly be learning, iterating on processes, and improving them to ensure that problems have a smaller impact on customers and other teams.

Track follow-up

Develop a standardized method to stay on top of follow-ups. Use software that can help team members prioritize tasks, track progress, and follow up on problems.

Brainstorming

Assemble stakeholders in a single place and discuss the possible causes of a problem. This is a good method for teams to break down silos. Brainstorming involves:

  • Round robin discussions
  • A larger volume of ideas in a short time
  • Diverse idea generation
  • Full participation from each individual as they contribute to the problem analysis.

Ishikawa / Fishbone / Cause and Effect Analysis

This method treats the problem as the effect, maps out its possible causes, and defines how those causes relate to the effect. It covers the primary and secondary causes of a problem and groups them into categories such as people, processes, and products.
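
A fishbone diagram can be captured as nested data: categories, then primary causes, then secondary causes. The sketch below renders such a structure as a text outline. The `render_fishbone` function and the example causes are hypothetical and chosen purely for illustration.

```python
def render_fishbone(effect, diagram):
    """Render category -> primary cause -> secondary causes as an outline."""
    lines = [f"Effect: {effect}"]
    for category, primaries in diagram.items():
        lines.append(f"  {category}")
        for primary, secondaries in primaries.items():
            lines.append(f"    - {primary}")
            for secondary in secondaries:
                lines.append(f"        * {secondary}")
    return "\n".join(lines)


# Hypothetical analysis of a slow checkout page.
diagram = {
    "People": {
        "Missed alert": ["On-call rotation gap"],
    },
    "Processes": {
        "No load test before release": [],
    },
    "Products": {
        "Database slow under load": ["Missing index", "Undersized instance"],
    },
}

outline = render_fishbone("Checkout page is slow", diagram)
print(outline)
```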

Kepner Tregoe Problem Analysis

A more methodical approach that begins by identifying the problem and then describing it. Possible causes are established and tested, and then the true cause of the problem is identified and verified.

Known error

A documented problem with a known root cause and a workaround.

Known error database (KEDB)

A database created by problem management and used to manage known errors.

Problem

A cause of one or more incidents. When a problem record is created, the cause of the problem is usually not known.

Problem management report

A report that supplies problem information to other service management processes.

Problem Record

Contains all the details of a problem and documents the history of the problem from initial detection to resolution.

Suggested new known error

A suggestion to create a new entry in the KEDB.

Suggested new problem

A notification that alerts employees about a suspected problem that may lead to further investigation.

Suggested new workaround

A suggestion to add a new workaround to the KEDB.

Workaround

A temporary solution meant to reduce or eliminate the impact of a known error or problem when a full resolution is not yet available. Workarounds help minimize the impact of incidents and problems when the cause cannot be identified.

Problem management is an approach designed to effectively manage the life cycles of current and potential issues. The goal of problem management is to eliminate recurring incidents, prevent future incidents, and minimize the impact of incidents that cannot be prevented. This involves diagnosing root causes and taking the proper steps toward resolving the issue. Additionally, problem management is designed to become more effective over time by recording relevant data associated with problems and building up a database of effective solutions and workarounds.

ServiceNow brings automated workflows to problem management, allowing managers to accurately document solutions, and freeing up IT teams to focus on other relevant concerns. When unexpected issues occur, restore services quickly using structured workflows to easily identify and remediate root causes, document solutions and resolutions, and provide IT teams with a consolidated view of incidents—all from a single, cloud-based platform.

The end result? Reduced service disruptions, improved problem analysis, and less impact from issues over time.

By providing a fully consolidated view of causes, incidents, issues, and changes, organizations can deliver faster responses and quicker, more effective solutions—ServiceNow makes it all possible.

Get started with Problem Management

ServiceNow® Problem Management makes it possible to minimize disruptions, speed up service, and accelerate root cause resolution.