Summary

The purpose of Incident Resolution and Prevention (IRP) (CMMI-SVC) is to ensure timely and effective resolution of service incidents and prevention of service incidents as appropriate.

Description

The Incident Resolution and Prevention process area involves the following activities:

  • Identifying and analyzing service incidents
  • Initiating specific actions to address incidents
  • Monitoring the status of incidents, tracking their progress, and escalating as necessary
  • Identifying and analyzing the underlying causes of incidents
  • Identifying workarounds that enable service to continue
  • Initiating specific actions either to address the underlying causes of incidents or to provide workarounds
  • Communicating the status of incidents to relevant stakeholders
  • Validating the complete resolution of incidents with relevant stakeholders
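
The monitoring, tracking, and escalation activities above can be illustrated with a minimal sketch. It is not part of the model: the status values, the `Incident` record, the `escalate_if_overdue` helper, and the time limit are all hypothetical names chosen for illustration, and a real time limit would come from the terms of the service agreement.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from enum import Enum


class Status(Enum):
    # Hypothetical status values; the model does not prescribe a set.
    OPEN = "open"
    IN_PROGRESS = "in progress"
    ESCALATED = "escalated"
    RESOLVED = "resolved"  # action taken, not yet validated
    CLOSED = "closed"      # resolution validated with relevant stakeholders


@dataclass
class Incident:
    summary: str
    reported_at: datetime
    status: Status = Status.OPEN
    # Each entry is (timestamp, old status, new status), kept so that
    # progress can be tracked and communicated to stakeholders.
    history: list = field(default_factory=list)

    def set_status(self, new_status: Status, when: datetime) -> None:
        self.history.append((when, self.status, new_status))
        self.status = new_status


def escalate_if_overdue(incident: Incident, now: datetime,
                        limit: timedelta) -> bool:
    """Escalate an unresolved incident that is older than `limit`.

    `limit` is an assumed parameter standing in for whatever response
    time the service agreement specifies.
    """
    active = incident.status in (Status.OPEN, Status.IN_PROGRESS)
    if active and now - incident.reported_at > limit:
        incident.set_status(Status.ESCALATED, now)
        return True
    return False
```

In this sketch, an incident is closed only after a separate validation step, which mirrors the distinction between resolving an incident and validating its complete resolution with relevant stakeholders.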


The term “incident” is used to mean “service incident” in this process area and in other areas of the model where the context makes the meaning clear. The term “service incident” is used in the glossary and in other parts of the model to clearly differentiate this specially defined term from the everyday use of the word “incident.” (See the definition of “service incident” in the glossary.)

Incidents are events that, if not addressed, can eventually cause the service provider organization to break its service commitments. Hence, the service provider organization should address incidents in a timely and effective manner according to the terms of the service agreement.

Addressing an incident can include the following activities:
  • Removing an underlying cause or causes
  • Minimizing the impact of an incident
  • Monitoring the condition or series of events causing the incident
  • Providing a workaround


Incidents can cause or be indications of interruptions or potential interruptions to a service.

Examples of interruptions to a service include a software application that is down during normal operating hours, an elevator that is stuck, a hotel room that is double booked, and baggage that is lost in an airport.

Examples of potential interruptions to a service include a broken component in resilient equipment, a line at a counter of a supermarket with more than three people in it, and an understaffed call center.


Customer complaints are a special type of potential interruption. A complaint indicates that the customer perceives that a service does not meet his or her expectations, even if the customer is in error about what the agreement calls for. Therefore, complaints should be handled as incidents and are within the scope of the Incident Resolution and Prevention process area.

All incidents have one or more underlying causes, whether or not the service provider is aware of them. For example, each system outage has an underlying cause, whether it is a memory leak, a corrupt database, or an operator error.

An underlying cause of an incident is a condition or event that contributes to the occurrence of one or more incidents. Not all underlying causes result in incidents immediately. For example, a defect in an infrequently used part of a system may not result in an incident for a long time.

Underlying causes can be any of the following:
  • Root causes that are within the service provider’s control and can and should be removed
  • Positive or negative conditions of a service that may or may not be removed
  • Conditions that the service provider cannot change (e.g., weather conditions)


Underlying causes and root causes (as described in the Causal Analysis and Resolution process area) are not synonymous. A root cause is a type of underlying cause that begins a chain of causes for some outcome of interest. We don’t normally look for the cause of a root cause and we normally expect to achieve the greatest reduction in the occurrence of incidents when we address a root cause.

Sometimes we are unable to address a root cause for practical or budgetary reasons, so we focus instead on other, non-root underlying causes. It doesn’t always make business sense to remove all underlying causes either. Under some circumstances, addressing incidents with workarounds or simply resolving incidents on a case-by-case basis can be more effective.

Effective practices for incident resolution start with developing a process for addressing incidents with the customers, end users, and other relevant stakeholders who report incidents. Organizations can have a collection of known incidents, underlying causes of incidents, and workarounds, as well as separate but related activities designed to create the actions for addressing selected incidents and underlying causes. Processing all incidents and analyzing selected incidents and their underlying causes to define approaches to addressing those incidents are two reinforcing activities that can be performed in parallel or in sequence.
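
As an illustration only, a collection of known incidents, underlying causes, and workarounds could be sketched as follows. The model does not define such a structure; the `KnownIncident` record, its fields, and the keyword-overlap matching in `find_known_incident` are assumptions made for this example, and a real repository would likely use a more robust matching approach.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class KnownIncident:
    # Hypothetical record layout, not taken from the model.
    symptom_keywords: frozenset
    underlying_cause: str
    workaround: Optional[str]  # None when no workaround is known


def find_known_incident(report: str,
                        known: Iterable[KnownIncident]
                        ) -> Optional[KnownIncident]:
    """Return the known incident sharing the most keywords with the report.

    Matching is a naive whitespace split with lowercasing; it returns
    None when no entry shares any keyword with the report.
    """
    words = set(report.lower().split())
    best, best_overlap = None, 0
    for entry in known:
        overlap = len(entry.symptom_keywords & words)
        if overlap > best_overlap:
            best, best_overlap = entry, overlap
    return best


# Illustrative entries only; the causes echo examples used in this text.
KNOWN = [
    KnownIncident(frozenset({"login", "timeout"}),
                  "memory leak", "restart the service nightly"),
    KnownIncident(frozenset({"report", "missing", "data"}),
                  "corrupt database", None),
]

match = find_known_incident("Login page gives timeout error", KNOWN)
```

When a match carries a workaround, it can be applied while the incident is processed; when nothing matches, the incident is a candidate for the separate analysis of underlying causes.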

Thus, the Incident Resolution and Prevention process area has three specific goals. The Prepare for Incident Resolution and Prevention goal helps to ensure an approach is established for timely resolution of incidents and effective prevention of incidents when possible. The specific practices of the goal to Identify, Control, and Address Individual Incidents are used to treat and close incidents as they occur, often by applying workarounds or other actions defined in the goal to Analyze and Address Causes and Impacts of Selected Incidents.

References

Refer to the Capacity and Availability Management (CAM) (CMMI-SVC) process area for more information about monitoring and analyzing capacity and availability.


Refer to the Service Delivery (SD) (CMMI-SVC) process area for more information about establishing service agreements.


Refer to the Causal Analysis and Resolution (CAR) (CMMI-SVC) process area for more information about determining causes of selected outcomes.


Refer to the Configuration Management (CM) (CMMI-SVC) process area for more information about tracking and controlling changes.


Refer to the Risk Management (RSKM) (CMMI-SVC) process area for more information about identifying, analyzing, and mitigating risks.


Refer to the Work Monitoring and Control (WMC) (CMMI-SVC) process area for more information about providing an understanding of the project’s progress so that appropriate corrective actions can be taken when the project’s performance deviates significantly from the plan.

Contains

IRP.SG 1 Prepare for Incident Resolution and Prevention
Preparation for incident resolution and prevention is conducted.
IRP.SG 2 Identify, Control, and Address Individual Incidents
Individual incidents are identified, controlled, and addressed.
IRP.SG 3 Analyze and Address Causes and Impacts of Selected Incidents
Causes and impacts of selected incidents are analyzed and addressed.