Niveditha Arula
Mega Expert

Hi,

INCIDENT MANAGEMENT:

An IT organization that follows an ITSM framework should have an incident management process in place. While it is understood that incidents should be discovered and remedied as quickly as possible, IT organizations should also track and analyze incidents as part of a continual improvement process. By understanding which incidents occur most frequently and which cost the business the most, resources – including IT, development, and vendor compliance – can be deployed to have the maximum impact.
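As a rough illustration of the tracking and analysis described above, the Python sketch below ranks incident categories by how often they occur and by their total cost. The record fields (`category`, `cost`), the `summarize` helper, and the sample data are all assumptions made up for this example; they do not come from any particular ITSM tool.

```python
from collections import defaultdict

# Hypothetical incident records; field names and values are illustrative only.
incidents = [
    {"category": "email outage", "cost": 1200.0},
    {"category": "printer failure", "cost": 50.0},
    {"category": "email outage", "cost": 800.0},
    {"category": "disk failure", "cost": 3000.0},
    {"category": "printer failure", "cost": 75.0},
    {"category": "printer failure", "cost": 25.0},
]


def summarize(records):
    """Return {category: (count, total_cost)} for the given incident records."""
    counts = defaultdict(int)
    costs = defaultdict(float)
    for rec in records:
        counts[rec["category"]] += 1
        costs[rec["category"]] += rec["cost"]
    return {cat: (counts[cat], costs[cat]) for cat in counts}


summary = summarize(incidents)

# Rank categories by number of incidents and by total cost to the business.
by_frequency = sorted(summary, key=lambda c: summary[c][0], reverse=True)
by_cost = sorted(summary, key=lambda c: summary[c][1], reverse=True)

print("Most frequent:", by_frequency[0])
print("Most costly:", by_cost[0])
```

The two rankings can disagree, which is the point: the most frequent category is not necessarily the one that costs the business the most, so both views are useful when deciding where to deploy resources.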

 

An incident will typically be caused by one or more problems. ITSM frameworks define problems in different ways, but a very nice definition comes from Rob England in The ITSM Review, where he states, “a problem is in fact the cause of zero or more incidents.” We can certainly understand that a problem can be the cause of one or more incidents: perhaps a piece of hardware failed, which caused a service outage. However, England’s definition also allows for latent problems, that is, problems that have not yet caused any incident. For example, if a printer ceases to operate outside of business hours, it is a problem but not yet an incident, since no one is around to attempt to use the printer. Similarly, the code your developers write may contain undiscovered bugs that have not yet been triggered; these constitute a problem that no one is aware of and that may or may not lead to an incident in the future.
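The "zero or more incidents" relationship can be sketched as a small data model. This is only one possible way to represent it; the class names and the `is_latent` helper are hypothetical and chosen for this example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Incident:
    description: str


@dataclass
class Problem:
    description: str
    # A problem may have caused no incidents yet, hence the empty default.
    incidents: List[Incident] = field(default_factory=list)

    def is_latent(self) -> bool:
        """True if the problem has not (yet) caused any recorded incident."""
        return not self.incidents


# The printer broke overnight: a problem exists, but nobody has been affected.
printer = Problem("Printer stopped working outside business hours")

# A hardware failure that has already caused a service outage.
hardware = Problem("Failed hardware component", [Incident("Service outage")])

print(printer.is_latent())   # no incidents recorded yet
print(hardware.is_latent())  # one incident recorded
```

Once someone tries to print in the morning, an `Incident` would be appended to `printer.incidents` and the problem would no longer be latent.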

 

PROBLEM MANAGEMENT:

An IT organization following an ITSM framework should have a problem management process in place. This process will include the discovery of root causes of problems as well as mitigation of those causes. As with incident management, problems should be tracked and analyzed so that commonalities can be discovered. Perhaps a certain brand or model of disk drive has a higher failure rate than others, or a particular IaaS or PaaS vendor is discovered to have difficulty meeting their SLAs on a consistent basis.
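To make the disk drive example concrete, here is a minimal sketch of how tracked failures might be compared across models. The model names, counts, and the `failure_rates` helper are invented for illustration; a real analysis would draw on the organization's own asset inventory and problem records.

```python
from collections import Counter

# Hypothetical inventory: number of drives installed per model.
installed = {"Model A": 200, "Model B": 150, "Model C": 50}

# Hypothetical problem log: one entry per recorded drive failure.
failures = ["Model A", "Model C", "Model C", "Model B", "Model C", "Model A"]


def failure_rates(installed, failures):
    """Return the fraction of installed drives that failed, per model."""
    failed = Counter(failures)
    return {
        model: failed[model] / count
        for model, count in installed.items()
        if count > 0
    }


rates = failure_rates(installed, failures)
worst = max(rates, key=rates.get)

for model, rate in sorted(rates.items()):
    print(f"{model}: {rate:.1%}")
print("Highest failure rate:", worst)
```

Dividing by the installed count matters here: a model with few failures in absolute terms can still have the highest failure rate if only a small number of units are deployed.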

 

Problem management can be either reactive or proactive. Reactive management takes place when problems have already caused incidents: steps must be taken to resolve the current incident and to prevent future ones. Proactive management means solving problems before service users notice them (i.e., before they cause incidents), and includes activities such as auditing code to find bugs.

CHANGE MANAGEMENT:

Change management is the process of making changes to the IT infrastructure in a standardized and systematic manner. Changes can include replacing hardware or upgrading its capacity, upgrading to a new version or rolling back to an old version of software, or switching to new vendors of IaaS and PaaS solutions. Changes can be both a response to problems and incidents and a cause of them.

 

Many organizations will have a change advisory board that is required to sign off on all changes before they take place. This board carefully examines the impact of each proposed change so as to prevent incidents stemming from it. The board may either veto a proposed change or require mitigation measures, such as standing up a fail-over site, to be put in place before the change goes ahead.
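The three possible outcomes of a board review (approve, require mitigation, veto) can be sketched as a simple decision function. In practice a change advisory board relies on human judgment; the numeric `risk` score, the thresholds, and all names below are assumptions used only to illustrate the flow.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Decision(Enum):
    APPROVED = "approved"
    MITIGATION_REQUIRED = "mitigation required"
    VETOED = "vetoed"


@dataclass
class ChangeRequest:
    summary: str
    risk: int  # hypothetical score from 1 (low) to 5 (high)
    mitigations: List[str] = field(default_factory=list)


def review(change, veto_threshold=5, mitigation_threshold=3):
    """Return the board's decision under the assumed risk thresholds."""
    if change.risk >= veto_threshold:
        return Decision.VETOED
    # Medium-risk changes need at least one mitigation measure in place.
    if change.risk >= mitigation_threshold and not change.mitigations:
        return Decision.MITIGATION_REQUIRED
    return Decision.APPROVED


upgrade = ChangeRequest("Upgrade database server", risk=3)
print(review(upgrade).value)

# After a fail-over site is stood up, the same change can be approved.
upgrade.mitigations.append("fail-over site")
print(review(upgrade).value)
```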

 

Thank you.