The benefits of describing ITIL incidents in object-deviation format

Posted on October 18th, 2008

Those of you familiar with Kepner Tregoe problem solving will have come across the standard format used to describe problems – object-deviation. An object is the “thing” that is experiencing a deviation. A deviation occurs where there is a positive or negative difference between an expected condition and a measured condition.

An example:
An Oracle financial report normally completes within 15 minutes (the expected condition). Today it is taking 45 minutes to complete (the measured condition).
The object is the Oracle financial report (it’s helpful to be specific – let’s call it report FIN001). The deviation is that it is taking 30 minutes longer than usual to complete.

This would be written as:
Oracle financials report FIN001 taking extra 30 minutes to run
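
The arithmetic behind that line can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Deviation` class and the `summarise` function are hypothetical names invented for this example, not part of Kepner Tregoe, ITIL or any service desk tool.

```python
from dataclasses import dataclass


@dataclass
class Deviation:
    """Hypothetical record: an object plus its expected and measured condition."""
    obj: str               # the "thing" experiencing the deviation
    expected_minutes: int  # what normally happens
    measured_minutes: int  # what is happening today

    @property
    def difference(self) -> int:
        # Positive means slower than expected, negative means faster.
        return self.measured_minutes - self.expected_minutes


def summarise(d: Deviation) -> str:
    """Build a one-line object-deviation description."""
    if d.difference > 0:
        return f"{d.obj} taking extra {d.difference} minutes to run"
    if d.difference < 0:
        return f"{d.obj} taking {-d.difference} fewer minutes to run"
    return f"{d.obj} running as expected (no deviation)"


report = Deviation("Oracle financials report FIN001", 15, 45)
print(summarise(report))
```

The point of keeping the expected and measured conditions as separate fields is that the deviation is always derived from both, so neither can be left out of the description.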

It is unlikely that the customer would report the fault in this way. It is much more likely to be reported as:

  • Report stuck
  • Financials not working
  • Report is slow
  • Oracle is taking too long
With some quick and easy questions, however (based around identifying the object and the deviation), it could quickly be turned into object-deviation format:
  • Object: “What part of Oracle are you using?”
  • Object: “Which Oracle form is it?”
  • Deviation: “How long does it normally take?” (identifying expected condition)
  • Deviation: “How long is it taking today?”
(side note: it is also feasible for Service Desk agents at first point of contact to begin asking some of the basic problem solving questions such as “Which other objects are not showing this fault/deviation?” and “When was the deviation/fault first observed?”, but that’s a topic for another day)

The benefits of describing the fault using object-deviation are manifold:

  • The description is specific enough to begin clear incident resolution activities. With the object-deviation description, a technical staff member is likely to first check Oracle Financials, check the host on which it’s running for load, check the job scheduler to see if FIN001 is running and check the data FIN001 is using. Because the deviation has only been reported for FIN001, it also serves as a point of difference for the ‘distinctions’ exercise in Kepner Tregoe.
    The technical staff member doesn’t go on a wild goose chase, for example checking the server on which Oracle Financials is running (although this may be a valid activity after other more likely sources are ruled out), checking disk space etc.
  • Mean time to resolution is reduced and the customer is up and running more quickly. Because the technical staff member has a clear description, resolution activities are more focussed, meaning the cause is found (and a workaround applied) more quickly.
  • Lower misclassification rate. Because the object is known, it becomes easier to classify the service and/or configuration item. This seems like a very general or vague benefit, but it goes back to the old principle of garbage in, garbage out. How reliable is your reporting or trend analysis information?
  • Easier trend analysis. Because symptomatically related service calls are recorded with similar descriptions, it becomes easier to spot and verify trends of faults.
  • Easier error matching. Because problems are described in object-deviation format, it is easier to search problems for the deviation reported in the fault to see if there is a workaround.
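
The last two benefits can be illustrated with a rough Python sketch. Everything here is assumed for the sake of the example: the incident log, the known-error records, the workaround text and the function names `trend` and `find_workarounds` are all hypothetical, and a real service desk tool would offer its own search and reporting.

```python
from collections import Counter

# Hypothetical incident log: (object, deviation) pairs as logged by the service desk.
incidents = [
    ("Oracle financials report FIN001", "taking extra 30 minutes to run"),
    ("Thunderbird on Mac", "shows error 'Connection refused'"),
    ("Oracle financials report FIN001", "taking extra 20 minutes to run"),
]

# Hypothetical known-error records, keyed by object.
# Each entry pairs a deviation keyword with an (invented) workaround.
known_errors = {
    "Oracle financials report FIN001": [
        ("taking extra", "Invented workaround: rerun outside the batch window"),
    ],
}


def trend(log):
    """Count incidents per object so repeated faults stand out."""
    return Counter(obj for obj, _ in log)


def find_workarounds(obj, deviation):
    """Return workarounds whose keyword appears in the reported deviation."""
    return [w for keyword, w in known_errors.get(obj, []) if keyword in deviation]


print(trend(incidents).most_common(1))
print(find_workarounds(*incidents[0]))
```

Both functions only work because every record names the same object in the same way. With descriptions such as “Report stuck” or “Oracle is taking too long”, neither the count nor the match would find anything.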

The counter argument to describing faults in object-deviation format at first point of contact with the Service Desk is that the description should use the customer’s words and language. That is, by using object-deviation format, the specification of the fault will use terms unfamiliar to the customer.

There is a false assumption here – that the fault cannot be described in both customer language and object-deviation format at the same time. It is quite easy to do both. Here are some examples:

Object deviation example 1

The customer calls the service desk to report that his email is not working. At first glance, this could be logged as:

  • Email not working
  • Email not being received
  • Can’t get email
If the service desk agent asks some simple questions we can be much more specific about the object and the deviation:
  • Object: “Which email program are you using?”
  • Object: “Is that on Windows or a Mac?”
  • Deviation: “What happens when you click ‘Get Mail’?”
  • Deviation: “Can you tell me the error message?”
So, we come up with something like:
Thunderbird on Mac shows error “Connection refused to company.com” when ‘Get Mail’ is clicked
Does this use the customer’s language? No. Not the language the customer used to report the fault. Is it something the customer is likely to understand? Yes. They use Thunderbird. They use a Mac. That’s the error message that appears when they click the ‘Get Mail’ button. Therefore the customer is unlikely to feel confused, or to feel that what they have reported hasn’t been acknowledged or understood.
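
One way to keep both forms side by side is to store the customer’s original words next to the object and the deviation. The Python sketch below assumes a made-up `IncidentRecord` structure; its field names are hypothetical and are not taken from any particular service desk product.

```python
from dataclasses import dataclass


@dataclass
class IncidentRecord:
    """Hypothetical ticket fields, invented for this illustration."""
    customer_words: str  # what the customer first said, verbatim
    obj: str             # the object, identified by follow-up questions
    deviation: str       # the deviation, identified by follow-up questions

    def summary(self) -> str:
        # The summary line is the object-deviation description.
        return f"{self.obj} {self.deviation}"


ticket = IncidentRecord(
    customer_words="Email not working",
    obj="Thunderbird on Mac",
    deviation="shows error “Connection refused to company.com” when ‘Get Mail’ is clicked",
)
print(ticket.summary())
print(f"Reported as: {ticket.customer_words}")
```

Nothing the customer said is lost, and the summary line is still built only from terms they confirmed during the call.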

Object deviation example 2

The customer calls the service desk to report that their fan is noisy. Again this could be reported in a number of (non-specific) ways:

  • Noisy fan
  • Computer is noisy
  • Fan is overworked

Again, by asking some simple questions we can really start to be specific about the object and the deviation.

  • Object: “What does the sound on the computer sound like?”
  • Object: “What type of computer is it?”
  • Deviation: “When does the sound start?”
  • Deviation: “Does it normally do this?”
We arrive at something like:
Generic Computer Model 999 has noisy fan for 30 seconds on startup
Does it use the language the customer reported? No. Will the customer relate to the description? Yes. They have a Generic Model 999 in front of them. It has a noisy fan. And the fan is noisy for 30 seconds on startup. This is an accurate description of what they have told the service desk agent.

© Klog: Kathy Reid’s Blog