Establish and maintain an approach to incident resolution and prevention.
The approach to incident resolution and prevention describes the organizational functions involved in incident resolution and prevention, the procedures employed, the support tools used, and the assignment of responsibility during the lifecycle of incidents. Such an approach is typically documented.
Often, the amount of time needed to fully address an incident is defined before the start of service delivery and documented in a service agreement.
In many service domains, the approach to incident resolution and prevention involves a function called a “help desk,” “service desk,” or one of many other names. This function is typically the one that communicates with the customer, accepts incidents, applies workarounds, and addresses incidents. However, this function is not present in all service domains. In addition, other functional groups are routinely included to address incidents as appropriate.
Example Work Products
- Incident management approach
- Incident criteria
1. Define criteria for determining what an incident is.
To identify valid incidents, criteria are defined that enable service providers to determine what is and what is not an incident. Typically, criteria are also defined for differentiating the severity and priority of each incident.
2. Define categories for incidents and criteria for determining which categories an incident belongs to.
The resolution of incidents is facilitated by having an established set of categories, severity levels, and other criteria for assigning types to incidents. These predetermined criteria enable prioritization, assignment, and escalation actions to be taken quickly and efficiently.
Examples of incident categories for security incidents include the following:
- Probes or scans of internal or external systems (e.g., networks, web applications, mail servers)
- Administrative or privileged (i.e., root) access to accounts, applications, servers, networks, etc.
- Distributed denial of service attacks, web defacements, malicious code (e.g., viruses)
- Insider attacks or other misuse of resources (e.g., password sharing)
- Loss of personally identifiable information
Criteria are established that enable service staff to quickly and easily identify major incidents.
Examples of incident severity levels include the following:
- Critical, high, medium, low
- Numerical scales (e.g., 1-5 with 1 being the highest)
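As a minimal sketch of how predetermined categories and severity levels might be recorded so that incidents can be ordered for handling, consider the following Python fragment. The names `Severity`, `Incident`, `triage_order`, and `is_major`, the four-level scale, and the threshold for a major incident are all illustrative assumptions, not part of the approach described here.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    """Illustrative scale: a lower number means a more severe incident."""
    CRITICAL = 1
    HIGH = 2
    MEDIUM = 3
    LOW = 4


@dataclass(frozen=True)
class Incident:
    incident_id: str
    category: str        # free-text category; the names used are assumptions
    severity: Severity


def triage_order(incidents):
    """Return incidents sorted from most to least severe."""
    return sorted(incidents, key=lambda incident: incident.severity)


def is_major(incident, threshold=Severity.HIGH):
    """Treat incidents at or above the assumed threshold as major."""
    return incident.severity <= threshold
```

Because the scale is an `IntEnum`, the same values can serve both as named levels (critical, high, medium, low) and as a numerical scale with 1 as the highest.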
3. Describe how responsibility for processing incidents is assigned and transferred.
The description should include the following:
- Who is responsible for addressing underlying causes of incidents
- Who is responsible for monitoring and tracking the status of incidents
- Who is responsible for tracking the progress of actions related to incidents
- Escalation procedures
- How responsibility for all of these elements is assigned and transferred
4. Identify one or more mechanisms that customers and end users can use to report incidents.
These mechanisms account for how groups and individuals can report incidents.
5. Define methods and acquire tools to use for incident management.
6. Describe how to notify all relevant customers and end users who may be affected by a reported incident.
How to communicate with customers and end users is typically documented in the service agreement.
7. Define criteria for determining severity and priority levels and categories of actions and responses to be taken based on severity and priority levels.
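One way to make such criteria concrete is a lookup table that maps each combination of severity and priority to a predefined response. The sketch below is only an illustration: the table contents, the 1-is-highest numbering, and the names `RESPONSES`, `DEFAULT_RESPONSE`, and `response_for` are hypothetical.

```python
# Hypothetical response table keyed by (severity, priority); 1 is the highest.
RESPONSES = {
    (1, 1): "page on-call staff and notify affected customers immediately",
    (1, 2): "page on-call staff",
    (2, 1): "assign to a resolver within the hour",
}

# Response used when no specific entry has been defined (an assumption).
DEFAULT_RESPONSE = "queue for normal handling"


def response_for(severity: int, priority: int) -> str:
    """Look up the predefined response, falling back to the default."""
    return RESPONSES.get((severity, priority), DEFAULT_RESPONSE)
```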
8. Identify requirements on the amount of time defined for the resolution of incidents in the service agreement.
Often, the minimum and maximum amounts of time needed to resolve an incident are defined and documented in the service agreement before the start of service delivery.
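A resolution-time requirement of this kind can be checked mechanically once it is written down. The following sketch assumes a hypothetical table of maximum resolution times per severity level (`MAX_RESOLUTION`, with 1 as the highest severity); the durations are invented for illustration and would in practice come from the service agreement.

```python
from datetime import datetime, timedelta

# Hypothetical maximum resolution times per severity level (1 is the highest).
MAX_RESOLUTION = {
    1: timedelta(hours=4),
    2: timedelta(hours=24),
    3: timedelta(days=3),
}


def resolution_deadline(reported_at: datetime, severity: int) -> datetime:
    """Latest time by which the incident should be resolved."""
    return reported_at + MAX_RESOLUTION[severity]


def is_overdue(reported_at: datetime, severity: int, now: datetime) -> bool:
    """True if the agreed resolution time has already elapsed."""
    return now > resolution_deadline(reported_at, severity)
```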
9. Document criteria that define when an incident should be closed.
Not all underlying causes of incidents are addressed, and not all incidents have workarounds. Incidents should not be closed until the documented criteria are met.
Often, closure codes are used to classify each incident. These codes are useful when data are analyzed further.
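The sketch below illustrates how documented closure criteria and closure codes might be checked and tallied. The specific criteria (service restored, customer confirmation, a closure code assigned), the example codes, and the names `IncidentRecord`, `can_close`, and `closure_code_counts` are assumptions made for the example only.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class IncidentRecord:
    incident_id: str
    service_restored: bool = False
    customer_confirmed: bool = False
    closure_code: Optional[str] = None  # hypothetical codes, e.g., "workaround"


def can_close(record: IncidentRecord) -> bool:
    """Assumed criteria: service restored, customer confirmed, code assigned."""
    return (
        record.service_restored
        and record.customer_confirmed
        and record.closure_code is not None
    )


def closure_code_counts(records):
    """Tally closure codes of incidents that meet the closure criteria."""
    return Counter(r.closure_code for r in records if can_close(r))
```

Counting closure codes in this way is one simple form of the further analysis that such codes support.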