Incidents are usually detected because an error occurs or the system does not respond as
expected. In the simplest cases, user expectation turns out to be to blame: the system is
expected to run software that has not been installed, the user has not followed the
instructions, or the instructions on how to operate the system or software are not clear enough to
be understood.
It is important to discover whether or not an error really exists. This is where good detection can
prevent an unnecessary callout to a technician.
How to detect whether an error has occurred
The basic steps for detecting whether an error has occurred can be performed by the user.
1. Repeat the process that produced the error and see if it occurs again. For example, if you were
trying to print, click on the print document icon. This time, take note of what did or did not happen.
2. Try another way to achieve the desired result - most software will have functions operated by
clicking on an icon, selecting from a menu bar at the top of the screen or a combination of keystrokes.
3. Look at the screen to see if any error messages are displayed. Windows can often hide the error
windows if you have several applications open, so check the bar at the bottom of the screen. Are
any of the symbols flashing? If so, click on the symbol to see if an error window has been opened.
4. Write down any error messages on the incident sheet. Also write down which applications were
open at the same time. Most importantly write down when - to your knowledge - this function last
worked on this system.
5. If there are no error messages but you did resolve the incident, it may be worth letting your
service desk know if the instructions for using the system or software are difficult to understand,
don't exist, or need amendment. You could be saving someone else's time with these observations.
Use the incident sheet to let the Service Desk know.
The next step from detection is the initial incident investigation, usually performed by the Service
Desk on receipt of the completed incident sheet.
|
|
If an incident is detected it must be reported. Do not worry if someone else has reported the
incident; the service desk will be able to identify duplicate reports.
The impact of not reporting an incident can be great and lead to a lot of disruption if not resolved.
All staff have a responsibility to ensure that others have a good working system and errors cannot
be left unresolved. You wouldn't walk past a fire and hope that someone else raises the alarm; the
same approach should follow for computer system incidents and problems.
The incident reporting process should be simple enough to follow and should not impact on the
time or other resources of staff that first detect the error.
|
- Incident/request sheet
- User guide to completing the incident/request sheet
- Service Desk guide to completing the incident/request sheet
- Call log
- Service Desk guide to completing the call log
|
No technician in school
Benefits
- When a fault occurs, there is no technician to tell. Putting the details of the fault into
writing therefore removes the ambiguity or brevity that can occur during a verbal explanation.
For example, telling a technician 'it doesn't work' is not a helpful explanation.
- Clear details of a fault written down can provide time for the technician to prioritise
workload and understand the work involved in solving the fault. This can reduce time and
improve efficiency. For example, the resolution may involve a quick phone call to the school
instead of a scheduled visit.
Disadvantages
- All faults need detailed recording to ensure the best possible chance of getting a solution,
especially when you do not know who will be dealing with the call.
- Staff are not able to get a quick solution, as a technician is not on site. Quick solutions
may involve the technician knowing the solution or finding it in a knowledge base - either
available in the school or through other sources such as the Internet.

Technician in school
Benefits
- When a fault occurs, it is easy for staff to just tell the technician the fault details. The
staff feel that they have passed on the responsibility for the fault and can expect the
technician to deal with the incident.
- Technicians are immediately available to resolve high priority problems and reduce the
impact of an incident.
Disadvantages
- Ambiguity or brevity can occur during a verbal explanation. For example, telling a
technician 'it doesn't work' is not a helpful explanation.
- Verbal contact with a technician in a corridor will not ensure that the work is recorded or
prioritised.
- The technician can feel 'put upon' and this may discourage their efforts to prioritise their
workload. This will reduce the benefits of their service.
|
Initial investigation.
The incident log will be checked using key words from the incident sheet. For example, you can
search for an error message code in the call log using the find button in the spreadsheet.
With experience, the service desk will know if the resolution to the problem can be found in the
school's knowledge base or if a technician is to be contacted. The school's knowledge base will be
checked for a resolution.
If the knowledge base provides a solution, then this should be tried before contacting a technician.
This is where the system agreed by the school will be followed.
The options are these:
1. Someone in the school tries the resolution and a technician is not called.
2. The technician is contacted and given the resolution found in the knowledge base.
|
|
If a resolution has not been found, the technician will be contacted and provided with details from
the incident sheet. Again the system agreed by the school will be followed.
The options are as follows:
1. Email, post or fax the sheet to the technician.
2. Telephone the technician and discuss the incident and action taken so far.
3. Leave the incident sheet for collection by the technician at their next scheduled visit.
What to do if upgrades are required
This is where Incident Management links into Release Management and Configuration
Management.
Upgrades must be planned - even if they affect only one system. If the upgrade is not tested for compatibility
with the other software on the system, further errors could occur.
|
|
The user could perform the first stage of the investigation, if they have the diagnostics tools
available and the confidence to proceed. User self-service tools can assist with this. Otherwise
the first stage is performed by the school's single point of contact (SPOC) at the school's
Service Desk.
In the early days of a Service Desk, there will be little previous information to check, so all calls will
be passed to the person providing technical support.
After approximately three weeks of using a Service Desk, there should be enough information to
enable the SPOC to check previous calls and see if there is any similarity.
These checks should be done in the following way:
1. Check the summary of initial action taken and see if a similar incident occurred previously. If a
previous incident exists, find the incident sheet which should be filed in date order, then check the
details of the incident. If the summary of the incident is the same, where possible try to see if the
resolution can be implemented by a member of staff, before a technician is called.
2. Check any diagnostics sheets or diagnostic information supplied to the school to see if there is
any help available.
3. Report the incident to the technician.
In all cases, the action taken should be recorded on the incident sheet by ticking or circling the
appropriate box.
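The three checks above can be sketched in code. This is only an illustration, assuming the call log is held electronically; the names `PastIncident`, `triage` and `diagnostics_help` are hypothetical and do not come from the toolkit, and matching on an identical summary is a simplification of "similar incident".

```python
from dataclasses import dataclass


@dataclass
class PastIncident:
    summary: str            # summary of initial action from the incident sheet
    resolution: str         # what resolved the incident last time
    staff_can_apply: bool   # True if a member of staff can carry out the fix


def triage(summary, call_log, diagnostics_help):
    """Return the action to record on the incident sheet (hypothetical helper)."""
    wanted = summary.strip().lower()
    # Check 1: has the same incident been logged before with a fix staff can try?
    for past in call_log:
        if past.summary.strip().lower() == wanted and past.staff_can_apply:
            return "staff try previous resolution: " + past.resolution
    # Check 2: does any diagnostic information supplied to the school cover it?
    if wanted in diagnostics_help:
        return "follow diagnostics: " + diagnostics_help[wanted]
    # Check 3: nothing found, so report the incident to the technician.
    return "report to technician"
```

Whatever the function returns corresponds to the box that would be ticked or circled on the incident sheet.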
|
|
|
|
|
User self-service offers users a way of obtaining support services without direct intervention
from a technician.
The most important thing to identify is who will use the tool and what the tool is to be used for.
The tools can be
- written lists of things to check
- flow diagrams with easy to follow instructions
- on a CD or on the school network, created by the school technical staff or provider
- online through the Internet, or downloaded to the school intranet (if it has one)
- diagnostics supplied by the hardware manufacturer or software manufacturer
- telephone support.
|
|
How user self-service is implemented can vary significantly, depending upon what the school wants
to achieve and the range of services being offered:
- Users register their own requests and check on their progress.
- Users then have direct access to support information and knowledge.
- Users are able to manage support transactions themselves.
- Users can search knowledge bases for solutions.
- Users can download program updates or bug fixes.
- Users can order goods or services.
- Ease of access and speed of resolution is increased.
- Demand on support resources is reduced.
|
|
A successful user self-service strategy depends on several important factors:
School Leadership Commitment
- Any initiative that entails change within a school requires leadership support and commitment
to execute the initiative.
- See the Change Management process on how to introduce any changes within ICT to your
school.
- It is essential to put the right processes and tools in place to ensure that while the user is in
control, they are following a path that is carefully designed by the school or provider.
- Users need to know what user self-service channels are in place, along with the value and
responsibilities of using them.
- If the decision has been taken to supplement technical support with a self-help tool, users
must understand that if the system is unavailable, they should wait and try again and not pick
up the phone.
- Email contact should be used, together with online communities to share the information
obtained, where possible.
Support processes are maintained
- It is important that none of the existing Change Management and Release Management
processes are bypassed or invalidated.
- Maintain the process of completing the incident form, even if the self-service tools enable the
incident to be resolved, as time and effort were still spent on the incident.
- The effectiveness of the service is monitored by measuring what self-help services are being
requested, how often and what for.
- Feedback will be required on how effective the ideas were in resolving the incident, how well
they were presented and whether the incident recurred.
Content of the self-service system
- Any system that is not easy to use or that does not contain high-quality content will fail.
- If the users are unable to get the information they need when they need it, they will
immediately pick up the telephone next time they encounter a problem.
- In a worst-case scenario, the support team will find itself supporting yet another application -
the self-service system itself.
|
|
Buy into a provided user self-service system
- Does the system provide benefit to your school?
- Is it more cost effective to use a user self-service system by reducing staff costs?
- Can you be sure the advice is current and accurate?
- Can you carry out the advice given by the user self-service method?
Creating your own user self-service system
- Do you have the resource, both now and in the future, to plan, implement, upgrade and
maintain your own user self-service system?
- Who will support your own user self-service system?
- How long will it take to develop?
- Who is going to pay for it?
- When will it be ready?
- What if your 'experts' leave?
|
|
A known error is a problem that has previously been successfully diagnosed and for which a
workaround has been identified. For example, where the cause of the incident is an existing
problem with the version of the software, a workaround is a software 'patch' that can be installed.
The problem will only be fixed with the next release of the software by the manufacturer.
A known error can also be referred to as the root cause of a problem (or incident).
Information about known errors can be supplied by manufacturers of hardware and software.
What to do when a manufacturer notifies of an error condition
Manufacturer notification of error conditions.
- These will be cascaded through manufacturer websites, suppliers, computer magazines,
blanket emails and word of mouth.
- It may also be the outcome of a reported incident - to find out it is a known error. Usually
known errors of this type have a fix or workaround that can be applied. However, it is frustrating
to discover, after reinstalling software, that the error would occur anyway and has been
acknowledged by the manufacturer.
- It is useful to have the ability to use the internet and know how search engines work. Putting
the details of any error message or details of the error into a search field may produce several
thousand results. Filtering the results will help in finding the more useful advice on what to do
next.
- With experience and confidence the Service Desk may be able to use this approach to find
known errors before a technician is contacted.
|
|
- A workaround is a method of avoiding an incident or problem, either from a temporary fix or
from a technique that means the user is not reliant on a particular aspect of a service that is
known to have a problem.
- Workarounds are an acceptable way to resolve an incident because they achieve the aim of
incident management - to get the user working again. The Service Desk or technician must then
acknowledge that the underlying problem still needs fixing, but the time taken to fix it no
longer impacts on the user.
- This leads from a reactive situation into a proactive situation.
Workaround examples
1. One of the safest workarounds is to use a different computer or printer where possible until the
incident or underlying problem is resolved.
2. If lots of windows and applications are open on the computer, close them down and use the
software producing an error on its own to see if the incident recurs. It may be that too many
windows were open and the computer doesn't have enough memory to run lots of applications at
once. The workaround is to use one or two applications at a time until more memory can be added
to the computer.
3. If the error affects printing, try copying the file to be printed onto a floppy disk or saving it to
the file server and printing from another printer. Try to use a printer that has more memory, as printing
problems often occur when a complicated file is sent to a printer without much memory. The errors
often exhibited do not suggest a memory problem, but by successfully printing to a larger-memory
printer, you can prove that this was the cause of the incident.
4. Have a list of memory-hungry applications available to the users, to help them decide which
applications to shut down first if the computer appears slow or unresponsive.
5. Make sure everyone knows the rules for password resets or email resets. One school has
decided to reset forgotten passwords to 'diarrhoea', to encourage users to remember their original
passwords more often!
Spare equipment
Think about the following to see if this could be an effective workaround in your school.
1. Ensure that there are four spare computers. Really spare!
2. Configure two of the computers with the school's computer image (standard build, see Release
Management for further information) ready to swap out when required.
3. When an incident occurs that cannot be resolved in 15 minutes, replace the computer with one
of the spares.
4. Bring the faulty computer back to the technician's area. Do not try to fix the problem.
5. Re-image the faulty computer with the school's computer image.
6. Run a set of pre-approved tests on the re-imaged computer.
7. Make the re-imaged computer one of your two spares.
With the other two computers, configure one with any new software and create the new image.
Then test the new image on the second computer.
These four computers are not equipment for use in the school as ordinary equipment. If you need
to allocate one as an additional resource, ensure that a replacement is ordered and put back into
the group of four.
Set against lost teaching time, hours of technician time each week and unresolved incidents and
problems, these four computers should pay for themselves over and over again. But you must
stick to the rules and re-image. Do not try to resolve the incident or problem. This may
be boring - but it can be very effective.
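The swap-out rule for the two imaged spares can be sketched as a simple rotation. This is a minimal illustration, not a tool from the toolkit: the `SparePool` class and machine names are hypothetical, and putting the faulty machine back into the pool stands in for the re-image and pre-approved tests.

```python
from collections import deque

SWAP_AFTER_MINUTES = 15  # threshold taken from the steps above


class SparePool:
    """Hypothetical model of the two imaged spares kept ready for swap-out."""

    def __init__(self, ready_spares):
        self.ready = deque(ready_spares)

    def handle_incident(self, faulty, minutes_unresolved):
        """Return the name of the machine the user should work on."""
        if minutes_unresolved < SWAP_AFTER_MINUTES or not self.ready:
            return faulty
        replacement = self.ready.popleft()
        # The faulty machine is not fixed: it is re-imaged, tested and
        # then rejoins the pool as a spare.
        self.ready.append(faulty)
        return replacement
```

Using a queue means the spare that has waited longest goes out first, so both spares stay in regular use.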
|
|
It is possible to own a library of books about IT and ICT and never come across the concept of
incidents or errors, except in software programming! It is not an easy subject to describe or on
which to produce a guide to the best approach.
Once an incident is reported and passed to a technician to resolve, you then move into the real art
of technical support. Sometimes this can appear to the user as a 'black art',
the area where the person with the technical knowledge has a real grasp of the subject and no one
else can expect to achieve that end result. However, this is not true. Users can do most of the
diagnostics themselves and provide a pointer towards where the root cause of the incident really
lies. This can help in the speed of resolution and increases the productivity of the technician.
Here is one approach to incident diagnosis that you may like to try.
There are several steps but one of the most important steps to take is the pause. The pause is the
step where the decision is made about which action to take first. If action is taken before the
diagnostics, it often becomes more difficult to resolve the incident.
|
|
Use the incident diagnostics sheet - see toolkit
1. Establish current status by deciding which area is the likely cause.
- hardware
- software
- network
- user guide
- other.
2. Pause and decide which action or actions to take.
It is important not to act rashly as this could create further incidents!
3. Take action and record the results. This could be an iterative process, but it is vitally important
to record what was done.
There are many examples of what should have been a five minute fix taking several hours because
the technician failed to record the actions taken.
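If the diagnostics sheet were kept electronically, the record of actions could be as simple as the sketch below. The `DiagnosticsRecord` class is hypothetical; only the list of likely areas comes from step 1 above.

```python
AREAS = ("hardware", "software", "network", "user guide", "other")


class DiagnosticsRecord:
    """Hypothetical electronic version of the incident diagnostics sheet."""

    def __init__(self, likely_area):
        # Step 1: establish current status by choosing the likely cause.
        if likely_area not in AREAS:
            raise ValueError("unknown area: " + likely_area)
        self.likely_area = likely_area
        self.actions = []

    def record(self, action, result):
        # Step 3: every action and its result is written down, in order.
        self.actions.append((action, result))
```

Because each action is appended with its result, the next person to pick up the incident can see what has already been tried.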
|
|
In the early days of incident management, it is likely that the first checks will be in the call log
using a keyword search.
When the call log grows larger or the decision is made to use databases, it is important to ensure
that a search facility is available. The school should decide which words or type of words should be
recorded in the incident resolution. This will enable searches to be made using these words to aid
in checking the knowledge base.
Mature systems may implement the use of categories to help with quick searches of the
knowledge base. However, the incorrect use of categories can reduce the effectiveness of
searching and the overall usefulness of the knowledge base.
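A keyword search with an optional category filter might look like the sketch below. It assumes the call log has been exported as a list of records; the field names `summary`, `resolution` and `category` and the function `search_call_log` are illustrative only.

```python
def search_call_log(call_log, keyword, category=None):
    """Return entries that mention keyword, optionally within one category."""
    needle = keyword.lower()
    matches = []
    for entry in call_log:
        text = (entry["summary"] + " " + entry["resolution"]).lower()
        if needle not in text:
            continue
        # A wrongly assigned category hides an otherwise matching entry.
        if category is not None and entry.get("category") != category:
            continue
        matches.append(entry)
    return matches


call_log = [
    {"summary": "Error 0x35 when printing", "resolution": "Reinstalled driver",
     "category": "software"},
    {"summary": "Printer jams on tray 2", "resolution": "Replaced roller",
     "category": "hardware"},
]
driver_calls = search_call_log(call_log, "0x35")
```

Filtering on category is faster to read but only as reliable as the categories entered, which is the risk described above.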
|
|
The technician does not require an arsenal of diagnostics tools for incidents. More in-depth
analysis is performed in Problem Management.
Go to the toolkit to download the incident diagnostics sheet
|
|
The aim of incident resolution is to establish a resolution or work-around as quickly as possible, in
order to restore the service to users with minimum disruption to their work.
Incident Management at this stage can often be at odds with Problem Management
- Incident Management aims to get the system back up and running and a quick fix will do.
- Problem Management seeks to identify the cause of the incident to prevent it being repeated
and a quick fix will prevent the problem diagnosis required to identify the cause.
After resolution of the cause of the incident and restoration of the agreed service, the incident is
closed.
|
|
The process for handling incidents from detection through to closure is shown in the diagram below.
|
|
There are various ways to log incidents and make requests. The success of the method used
depends on the size of the school and the flexibility of those providing technical support.
|
Corridor approach
This way of logging calls is similar to the 'visit office' approach. In this instance the
technician or the person providing technical support may not even have the opportunity to
write down the details of the incident. The user is confident they have 'logged' the incident or
request and then feels let down when the call is not actioned appropriately.
|
|
Visit office
The user visits the technical support office to report an incident or make a request. This
approach inspires confidence in the user - they have discussed the problem with technical
support and know that action will ensue.
Often this approach does not benefit anyone if those providing technical support are besieged
with visitors and do not have time to prioritise their workload or start working on the incidents.
The staff providing technical support feel they are very busy, but not prioritising their work
reduces their effectiveness. This reactive situation does not embrace best practice.
|
|
Paper record of call
The user completes a paper form with details of the incident and posts it in an in-tray used by
support staff. The tray is often placed in staff rooms or near reception. Multipart copies are
useful in giving users a copy of the details they have logged.
Using this system relies on technical staff collecting the forms and allocating priorities
sufficiently quickly to encourage staff to continue using the system. It will fail if users find
their form still in the in-tray later in the day.
|
|
Registering details by phone or email
External service desks may use phone or email to speed up the process of logging calls. The
user must be armed with information about the system they are calling about, which may
include an allocated asset tag number and machine type.
The speed of response is not determined by the speed with which the call can be logged.
Users may become frustrated if they are required to supply lots of information to
the support team, only to find that the response is not what they anticipated. It is important to
make all users aware of the agreed response times with this service.
|
|
Computer interactive
The user uses a simple online form to log the incident or request. The form is easy to follow
and is automatically sent to the technical support team. Having completed the form, the user
should be confident that the call will be actioned and will wait for a response from the support
team.
Because there is no interaction with a person, the system must be proven to work, or users
will quickly avoid this method and use the 'corridor' approach instead.
|
Details to be recorded for service desk calls
- information about the incident or request
- user name and contact information
- Service Desk details (to be completed by the single point of contact)
- resolution.
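For a school that logs calls electronically, the four groups of details above could map onto a record like the one sketched here. The class and field names are assumptions chosen for illustration, not fields taken from the incident sheet itself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ServiceDeskCall:
    incident_details: str             # information about the incident or request
    user_name: str
    user_contact: str
    service_desk_notes: str = ""      # completed by the single point of contact
    resolution: Optional[str] = None  # filled in when the call is resolved

    def is_open(self) -> bool:
        """A call stays open until a resolution has been recorded."""
        return self.resolution is None
```

Leaving `resolution` empty until the end makes it easy to list the calls that are still outstanding.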
|
|
Incident closure is an important aspect of incident management and should not be overlooked.
Once the incident is resolved, closure aims to ensure that the lessons learnt are recorded for future
use. This is where recording the details of the incident and resolution contributes towards reducing
the impact of future incidents.
Category - It is more appropriate at the closure stage to assign a category than when the incident
is first reported. Once the incident has been resolved, the knowledge is available about which
component or part of the system caused the symptoms of the error.
Known errors - Once an incident has been resolved, the solution becomes a resolution or
workaround and can be passed to problem management to be logged as a known error.
Update the call log and incident sheet
- Enter the closure details in the call log including the category of incident (if used by the
school).
- Enter the closure details on the incident sheet.
- File the incident sheet in chronological order, using the date when the call was first placed.
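The closure steps can be sketched as follows for a call log kept as a list of records. The field names `reported_on`, `closed_on`, `resolution` and `category` are hypothetical, as are the two helper functions.

```python
from datetime import date


def close_incident(entry, resolution, closed_on, category=None):
    """Return a copy of a call log entry with the closure details added."""
    closed = dict(entry)
    closed["resolution"] = resolution
    closed["closed_on"] = closed_on
    # The category is assigned at closure, and only if the school uses them.
    if category is not None:
        closed["category"] = category
    return closed


def file_in_date_order(sheets):
    """Order incident sheets by the date the call was first placed."""
    return sorted(sheets, key=lambda sheet: sheet["reported_on"])


sheets = [
    {"summary": "No network in library", "reported_on": date(2024, 3, 12)},
    {"summary": "Projector will not start", "reported_on": date(2024, 3, 4)},
]
filed = file_in_date_order(sheets)
```

Sorting on the date the call was first placed, rather than the closure date, keeps the filing consistent with the call log.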