Problem management is a core component of the IT service management (ITSM) framework. It is the process for identifying and managing the root causes of actual and potential IT incidents.
Problem management works both reactively, by identifying the root cause of service-affecting problems that have already occurred, and proactively, by preventing issues before they occur. With a structured workflow for diagnosing root causes and fixing problems, it helps eliminate recurring incidents and minimise the impact of unexpected disruptions.
Problem management can have several benefits when executed correctly.
Taking the time to fix a problem can prevent degraded performance and further problems that would interrupt services in the future. Seamless integration between problems and all other ITSM processes enables organisations to proactively mitigate issues and eliminate recurring incidents.
Incidents caused by problems can cost an organisation a lot of time and money if they are not properly managed. Reducing incidents through effective problem management, on the other hand, saves organisations significant amounts by eliminating major issues before they can damage services, products, or a business's reputation.
A company can be more productive if it doesn't spend time and resources responding to problems that can be prevented.
Best practices for problem analysis help teams respond to service interruptions more quickly and accurately, and reduce downtime. Use structured problem analysis to correlate problems and coordinate workflows to find the fastest route to the root cause.
Teams can consistently learn from incidents when they effectively practice problem management.
Customers and employees are more satisfied when there are fewer problems along the way. Patience runs thin when problems occur, especially when the same problems keep recurring.
Services can benefit when there is visibility into known errors and established workarounds for IT staff.
Teams can detect problems before they evolve into something more critical, which prevents downtime and service interruptions. IT can proactively use built-in dashboards for service performance and configurations.
IT teams can create structured problem analyses by correlating problems and coordinating workflows. With a consolidated view of the incidents and related changes, IT can deliver faster responses and solutions.
Knowledge management creates a repository of documentation and solutions covering incidents and problems, which teams then use to solve problems faster. When a known error is documented, a single click generates a Known Error Article in the knowledge base, saving time and effort when fixing recurring issues in the future.
Problems are potential causes of one or several incidents. While problem and incident management overlap in a few disciplines, there are key differences. If a recent deployment creates a lapse in services, it can be rolled back, which resolves the incident of service interruption. But rolling it back does not solve the problem that caused the incident. The underlying problem is still there.
Incident management has a shorter timeline, and the goal is primarily to resolve the incident and restore services to their former state. Problem management is more complex; it can take longer and looks to identify what lies beneath an incident, why the incident occurred, and what can prevent the incident from occurring again.
Change management is the process by which changes are planned, tracked, and released with minimal service disruption. It provides a systematic approach to controlling the life cycle of all changes, so that beneficial changes can be made with minimum disruption to IT services. If a change does cause a disruption, the change is analysed as part of the problem management process.
Identify problems that can be fixed, or find workarounds for problems before incidents can happen.
Keep teams organised and working on the most important problems by tracking and assessing known problems.
Identify the cause of the problem and outline the best possible course of action to remedy the problem.
Recording information about problems leads to less downtime in the event that a problem triggers an incident. Creating an error record can keep information about workarounds readily available to lessen the impact of an incident or problem.
Always consider temporary solutions to reduce the impact of problems, and work to prevent them from becoming incidents. While workarounds aren't ideal, they can limit business impact when a problem arises whose cause can't be easily identified.
A closed problem has been eliminated and will not cause another incident.
Take the time to review the resolution of a problem, and ensure that the problem has been fully eliminated. Record lessons learnt, and identify preventative actions that should be taken in the event that the problem occurs again.
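The steps above can be read as a simple life cycle that a problem record moves through. The following is a minimal sketch of that idea, not how any particular ITSM tool implements it. The stage names, the `ProblemRecord` class, and the example problem are all hypothetical labels chosen for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class Stage(Enum):
    # Illustrative stage names for the steps described above (assumed, not official).
    DETECTED = "detected"
    PRIORITISED = "prioritised"
    DIAGNOSED = "diagnosed"
    KNOWN_ERROR = "known_error"
    WORKAROUND = "workaround"
    CLOSED = "closed"
    REVIEWED = "reviewed"


# In this sketch a problem may only advance to the next stage in definition order.
ORDER = list(Stage)


@dataclass
class ProblemRecord:
    summary: str
    stage: Stage = Stage.DETECTED
    history: list = field(default_factory=list)

    def advance(self, note: str) -> Stage:
        """Move to the next stage and record the stage left behind plus a note."""
        index = ORDER.index(self.stage)
        if index == len(ORDER) - 1:
            raise ValueError("problem has already been reviewed")
        self.history.append((self.stage, note))
        self.stage = ORDER[index + 1]
        return self.stage


# Hypothetical usage: a problem is detected, prioritised, then diagnosed.
problem = ProblemRecord("Checkout latency after deployment")
problem.advance("ranked against other open problems")
problem.advance("root cause identified during investigation")
```

Keeping the history alongside the current stage mirrors the idea that a problem record documents the problem from initial detection to resolution.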
There usually isn’t a single cause behind a problem. Teams should consider all potential factors and not just identify a single factor reactively.
Team members should always have an open conversation where they are encouraged to share their findings and facts without punishment or retribution.
Address the problems that affect services that deliver the most value to the organisation and prioritise them over lesser problems.
Foster an environment where team members ask questions of each other and systems.
Teams should always share knowledge with each other so that other teams can learn from their experience.
Problem management effectively never ends, even for the best-performing teams. Teams should constantly be learning, iterating on processes, and improving them to ensure that problems have a smaller impact on customers and other teams.
Develop a standardised method to stay on top of follow-ups. Use software that can help team members prioritise tasks, track progress, and follow up on problems.
Bring stakeholders together in a single place to discuss the possible causes of a problem; this is a good way for teams to break down silos. Brainstorming involves:
Cause-and-effect analysis maps a problem (the effect) to its possible causes and defines the relationships between them. It covers the primary and secondary causes of a problem, grouped into categories such as people, processes, and products.
A more logical approach that begins with the identification of the problem, then the description. Causes are established, tests are conducted, then the cause of the problem is identified and verified.
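The sequence described above (establish candidate causes, test each one, then verify the cause) can be sketched in a few lines. This is an illustrative sketch only: the function `find_verified_cause`, the candidate causes, and the observation values are all hypothetical and stand in for whatever evidence a team actually gathers.

```python
from typing import Callable, Optional


def find_verified_cause(
    candidates: dict[str, Callable[[], bool]],
) -> Optional[str]:
    """Return the first candidate cause whose test confirms it, else None."""
    for cause, test in candidates.items():
        if test():
            return cause
    return None


# Hypothetical observations gathered while describing the problem.
observations = {"disk_free_pct": 3, "recent_deploy": False}

# Each candidate cause is paired with a test against the observations.
candidates = {
    "recent deployment": lambda: observations["recent_deploy"],
    "disk nearly full": lambda: observations["disk_free_pct"] < 5,
}

cause = find_verified_cause(candidates)
```

In practice each test would be an investigation step rather than a one-line check, but the shape is the same: a cause is only accepted once a test has confirmed it.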
A documented problem with a known root cause and a workaround.
Created by problem management and applied to manage known errors.
A cause of one or more incidents. When a problem record is created, the cause of the problem is usually not known.
A report that supplies problem information to other service management processes.
Contains all the details of a problem and documents the history of the problem from initial detection to resolution.
A suggestion to create a new entry in the known error database (KEDB).
A notification that alerts employees about a suspected problem that may lead to further investigation.
A suggestion to add a new workaround to the KEDB.
A temporary solution meant to reduce or eliminate the impact of a known error or problem when a full resolution is not yet available. Workarounds help minimise the impact of incidents and problems when the cause cannot be identified.
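To make the relationship between these terms concrete, here is a minimal sketch of a known error record and a KEDB lookup. It assumes a plain in-memory store. The class names, field names, and the example entry are hypothetical and do not reflect any product's data model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class KnownError:
    # A documented problem with a known root cause and a workaround.
    problem_id: str
    root_cause: str
    workaround: str


class KnownErrorDatabase:
    """In-memory stand-in for a KEDB, keyed by a hypothetical problem ID."""

    def __init__(self) -> None:
        self._entries: dict[str, KnownError] = {}

    def add(self, entry: KnownError) -> None:
        self._entries[entry.problem_id] = entry

    def workaround_for(self, problem_id: str) -> Optional[str]:
        """Return the recorded workaround, or None if the problem is not a known error."""
        entry = self._entries.get(problem_id)
        return entry.workaround if entry else None


# Hypothetical usage: record a known error, then look up its workaround.
kedb = KnownErrorDatabase()
kedb.add(KnownError("PRB-1", "expired certificate", "route traffic via backup gateway"))
found = kedb.workaround_for("PRB-1")
missing = kedb.workaround_for("PRB-2")
```

Returning `None` for an unknown ID keeps the distinction clear between a problem that has no workaround yet and one that has been documented as a known error.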
Problem management is an approach that is designed to effectively manage the life cycles of current and potential issues. The goal of problem management is to eliminate recurring incidents, prevent future incidents, and minimise the impact of incidents that cannot be prevented. This involves diagnosing root causes and taking the proper steps towards resolving the issue. Additionally, problem management is designed to improve in effectiveness over time, by recording relevant data associated with problems, and acting as a database of effective solutions and workarounds.
ServiceNow brings automated workflows to problem management, allowing managers to accurately document solutions, and freeing up IT teams to focus on other relevant concerns. When unexpected issues occur, restore services quickly using structured workflows to easily identify and remediate root causes, document solutions and resolutions, and provide IT teams with a consolidated view of incidents—all from a single, cloud-based platform.
The end result? Reduced service disruptions, improved problem analysis, and less impact from issues over time.
With a fully consolidated view of causes, incidents, issues, and changes, organisations can deliver faster responses and quicker, more effective solutions. ServiceNow makes it all possible.
ServiceNow® Problem Management makes it possible to minimise disruptions, speed up service, and accelerate root cause resolution.