Speech recognition applications are not so much revolutionizing IVR,
defined for this article as enabling dual-tone multi-frequency
(DTMF) or TouchTone[TM] interactions, as supplanting it as the key
means of automated voice interaction.
That victory, which may be in sight, sets the stage for integrating
voice with web, e-mail, and SMS to provide a unified, user-friendly
automated solution that will reduce agent engagement time and, for an
increasing number of interactions, though far from all, eliminate agent
involvement.
The value proposition of speech rec is that it offers superior
usability compared with DTMF at a much lower cost than a live agent:
typically 50 cents compared with $5-$9 per transaction.
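As a rough illustration of that comparison, the short sketch below works out the saving per automated transaction from the figures quoted above. The function names are illustrative only, and the calculation assumes an automated call fully replaces an agent-handled one, which will not hold for every interaction.

```python
# Per-transaction costs quoted in the article, in US dollars.
SPEECH_COST = 0.50
AGENT_COST_RANGE = (5.00, 9.00)


def savings_per_transaction(agent_cost, speech_cost=SPEECH_COST):
    """Dollars saved when one agent-handled call is automated instead."""
    return agent_cost - speech_cost


def savings_fraction(agent_cost, speech_cost=SPEECH_COST):
    """Share of the agent cost that automation removes (0.0 to 1.0)."""
    return savings_per_transaction(agent_cost, speech_cost) / agent_cost


if __name__ == "__main__":
    for agent_cost in AGENT_COST_RANGE:
        saved = savings_per_transaction(agent_cost)
        share = savings_fraction(agent_cost)
        print(f"agent ${agent_cost:.2f}: save ${saved:.2f} ({share:.0%})")
```

On these figures the saving is $4.50 to $8.50 per call, or roughly 90 to 94 percent of the live agent cost.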
Implemented right, speech rec, along with text and web applications,
can cut costs and maintain, if not increase, customer service
satisfaction and retention.
Yet while speech rec technology has come a long way, it still
places significantly lower than live agents in customer satisfaction
surveys, though higher than DTMF.
"Speech rec has a customer satisfaction ranking of about 4.5
on a 10-point scale while DTMF IVR is typically between 1 and 2," points
out Bob Lyons, General Manager and Vice President, Avaya's customer
service business. "In contrast, live agent interactions score a 7
on average. The real opportunity is in finding a way to get speech
interactions to begin approaching scores seen by live
interactions."
Achieving that goal will, however, require resources. Speech rec
software and integration can cost upward of several hundred thousand
dollars and can take nine to 12 months to implement, followed by one
year of operation before achieving return on investment (ROI). A unified
solution also draws on multiple sources of rich content such as web,
voice and user-specific data. These sources are typically in application
silos, which will require large investments to integrate so that they
can present the data in the appropriate context, at the right time.
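To get a feel for what "several hundred thousand dollars" means in call volume, here is a minimal break-even sketch. The $300,000 upfront cost and the 10,000 automated calls per month are hypothetical inputs chosen for illustration, not figures from Avaya or any vendor, and the calculation ignores ongoing licensing, tuning, and the implementation period itself.

```python
import math


def break_even_transactions(upfront_cost, saving_per_transaction):
    """Automated transactions needed to recover the upfront cost."""
    if saving_per_transaction <= 0:
        raise ValueError("saving_per_transaction must be positive")
    return math.ceil(upfront_cost / saving_per_transaction)


def break_even_months(upfront_cost, saving_per_transaction,
                      automated_calls_per_month):
    """Whole months of operation needed at a steady automated call volume."""
    needed = break_even_transactions(upfront_cost, saving_per_transaction)
    return math.ceil(needed / automated_calls_per_month)


if __name__ == "__main__":
    # Hypothetical inputs: $300,000 upfront, $4.50 saved per call
    # (a $5 agent call replaced by a 50-cent speech rec call).
    print(break_even_transactions(300_000, 4.50))
    print(break_even_months(300_000, 4.50, 10_000))
```

The point of the sketch is the sensitivity: the lower the saving per call or the monthly automated volume, the longer the payback period stretches.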
"The question becomes can the technology reach a level where
customer satisfaction is high enough to offset the investments,"
says Lyons.
Speech technology developments
Speech recognition technology is slowly but inexorably moving in
that direction. Aaron Fisher, Director, Professional Services, West
Interactive, has seen marked improvements in the overall performance of
speech recognition software.
Applications are now better able to recognize callers with accents.
The speech engines are more effective at screening out ambient or
surrounding noise that is not generated by the callers' voices.
These developments have led to increased automation rates and fewer
agent opt-outs.
"In the old days, like 2003 and earlier, if a dog barked any
time during a call, the speech rec application would think this noise
might have been from the caller but wouldn't be able to make sense
of it," recounts Fisher. "Now if you have a loud dog or child,
the system has the ability to analyze the difference between spoken
noise and ambient noise and callers can achieve their tasks with higher
success rates."
To illustrate, Loquendo's Loquendo ASR 7.5 includes a new noise
compensation feature, and all supported languages have been re-trained
with additional material recorded in the presence of background noise,
including mobile. It also offers support for more complex multilingual
grammars and large vocabularies. It has differentiated timeouts to
permit utterances of fixed format and length such as credit card
numbers.
There is a continuing shift by users toward natural language speech
rec, which enables callers to speak to the computers as if they were
conversing with people, and away from directed dialogue speech rec, where
callers speak one or two words in response to a DTMF-like menu.
Natural language permits callers to obtain what they want more quickly
and more easily. They can, for example, barge into the applications and
have their requests understood because the speech engines parse through
their words and retrieve the right responses from their libraries. This
functionality leads to greater automated interaction completion rates
and fewer live agent zero-outs. Yet the solutions are more expensive and
complex to install.
"Natural language is preferable because it more closely aligns
with the users' need to have the system respond to them,"
explains Lyons. "The challenge is that natural language is not mature
enough yet to deal with the general public. You have to build extensive
libraries focused on the things that a person might ask. When you think
about the many language options along with the many accents and slang
options, it is easy to see why natural language is rather difficult to
implement successfully in many situations."
There are many applications where directed dialogue is extremely
useful. Voxeo, a provider of premise and hosted IVR and VoIP
applications, points to the example of a mail order firm where 20
percent of callers dial in to find out about their order status.
The marketer uses Voxeo's Prophecy platform to ask the callers
to say their order numbers, which is less restrictive than having them
use DTMF. It then queries an existing Web-based order status solution
and receives XML instructions to inform the customers that their orders
have been shipped. Rather than ending the calls, the platform then
queries the shipper's package tracking Web application and tells
the callers where the packages are.
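The call flow just described can be sketched in a few lines. This is not Voxeo's Prophecy API: the two lookup functions are hypothetical stand-ins for the order status and package tracking Web services, the XML layout is invented for the example, and the order number is assumed to have already been recognized by the speech engine.

```python
import xml.etree.ElementTree as ET


def fetch_order_status_xml(order_number):
    """Hypothetical stand-in for the Web-based order status solution."""
    orders = {"12345": ("shipped", "TRK-001"), "67890": ("processing", "")}
    status, tracking = orders.get(order_number, ("unknown", ""))
    return (f"<order><number>{order_number}</number>"
            f"<status>{status}</status>"
            f"<tracking>{tracking}</tracking></order>")


def fetch_package_location(tracking_id):
    """Hypothetical stand-in for the shipper's package tracking service."""
    return {"TRK-001": "Memphis, TN"}.get(tracking_id, "an unknown location")


def order_status_prompt(spoken_order_number):
    """Build the sentence the platform would speak back to the caller."""
    order = ET.fromstring(fetch_order_status_xml(spoken_order_number))
    status = order.findtext("status")
    if status == "shipped":
        # Instead of ending the call, follow up with the tracking lookup.
        location = fetch_package_location(order.findtext("tracking"))
        return f"Your order has shipped. It is currently in {location}."
    if status == "unknown":
        return "Sorry, I could not find that order."
    return f"Your order is {status}."


if __name__ == "__main__":
    print(order_status_prompt("12345"))
```

The design point is the chaining: one spoken order number drives two back-end queries, so the caller gets the tracking answer without a second prompt.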
Speech rec engines, especially those that use natural language, will
benefit from increased chip processing power driven principally by
strong demand for increasingly sophisticated computer games, reports Ian
Jacobs, senior analyst, Frost & Sullivan.
"The faster and more affordable chipsets will enable speech
rec applications to route calls quicker and handle more complex
interactions," he explains.
Advantages of speech over DTMF
These improvements are making speech rec a more effective automated
voice solution compared with DTMF-enabled IVR for most if not all
interactions.
Speech rec can bolster customers' experience with automated
voice methods by enabling them to complete transactions or obtain
information and assistance quicker by accommodating their requests,
instead of forcing them to go through long hierarchical menus as with
DTMF.
Speech rec also enhances CRM by permitting customer
personalization. When the system recognizes callers, it can then,
based on the rules you create, address them by their first names, cut
through the menus, and present customized information and offers.
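A minimal sketch of such rules follows, assuming callers are identified by phone number. The `Caller` record, the `CALLERS` lookup table, and the sample data are all hypothetical; a real deployment would draw on the CRM system instead.

```python
from dataclasses import dataclass


@dataclass
class Caller:
    first_name: str
    open_order: bool = False
    offers: tuple = ()


# Hypothetical customer records keyed by calling number.
CALLERS = {
    "+15550100": Caller("Dana", open_order=True,
                        offers=("a free shipping upgrade",)),
}


def greeting(phone_number):
    """Apply simple personalization rules to build the opening prompt."""
    caller = CALLERS.get(phone_number)
    if caller is None:
        # Unrecognized caller: fall back to the generic opening.
        return "Welcome. How can I help you today?"
    parts = [f"Welcome back, {caller.first_name}."]
    if caller.open_order:
        # Cut through the menu by offering the most likely task first.
        parts.append("Would you like the status of your open order?")
    if caller.offers:
        parts.append(f"You also qualify for {caller.offers[0]}.")
    return " ".join(parts)


if __name__ == "__main__":
    print(greeting("+15550100"))
    print(greeting("+15550199"))
```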
One literal driver of the move from DTMF to speech rec is mobile
commerce. Andrea Holko, Senior Vice President of Global Consulting
Services, Intervoice, cites the growing number of jurisdictions that
have hands-free cellphone laws.
"In environments where for safety reasons you cannot use your
hands to use a phone, like driving a car, speech rec is a
necessity," says Holko.
Also, the conversational flow in natural language speech
recognition keeps older customers in the automated applications longer
before they contact live agents.
Security is enhanced with speech recognition because it allows for
complicated and less-readily-faked passwords. These are migrating from
the common mother's maiden name to names of high schools attended
and to the names of first pets.
There are places for DTMF. It can provide a high degree of accuracy
for low level security, such as through the entry of 4-digit or 6-digit
PINs. It also permits customers to enter confidential information in
public places, to avoid it being overheard, and possibly stolen by
others. It can, in addition, process a vast number of simple calls
requiring only numeric inputs highly reliably and at low cost.
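One way to express that division of labor is a per-prompt rule for choosing the input mode. The sketch below is an assumption about how such a rule might look, not a description of any vendor's product; the flag names are hypothetical.

```python
from enum import Enum


class InputMode(Enum):
    SPEECH = "speech"
    DTMF = "dtmf"


def choose_input_mode(numeric_only, confidential):
    """Pick keypad entry where the article says DTMF still has a place."""
    if confidential:
        # PINs and similar data should not be spoken aloud in public.
        return InputMode.DTMF
    if numeric_only:
        # Simple numeric entry is reliable and inexpensive over DTMF.
        return InputMode.DTMF
    return InputMode.SPEECH


if __name__ == "__main__":
    print(choose_input_mode(numeric_only=True, confidential=True))
    print(choose_input_mode(numeric_only=False, confidential=False))
```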
If you do retain DTMF, avoid upgrading the host IVR with speech
rec, recommends Avaya's Lyons. Instead, have the speech
applications integrated directly on the routing and switching solutions.
Have the IVR connected only for those customers who wish to use the DTMF
functionality.
The Avaya executive explains that the IVR's hierarchical call
flow conflicts with the natural conversation flows in speech rec and live
agent interactions. Firms that install speech rec on the IVR therefore risk failing
to achieve ROI, such as improved customer retention and satisfaction and
shorter live agent call lengths, because more callers will zero-out than
projected.
"Installing speech rec on the IVR is the worst solution for
your customers because it doesn't allow you to change the paradigm
and permit you to create customer-friendly call flows," Lyons
points out. "All what you will have is more expensive DTMF with
poor satisfaction or call containment rates."
Speech rec as live agent adjunct or replacement
Banishing IVR to the periphery leaves speech recognition open to
take on live agents. Already it is handling more transaction types that
are at the edge of competence for DTMF IVR but which are too expensive for
live agents, such as ordering movies, products, and tickets.
Speech rec can also reduce call lengths and call handling costs by
obtaining basic routine information from callers that it then transmits
to live agents. As speech applications become more robust they will be
able to gather more data and handle more tasks, leaving less work for
live agents to carry out.
COPYRIGHT 2008 Technology Marketing Corporation. Reproduced with
permission of the copyright holder. Further reproduction or distribution
is prohibited without permission.