Speech recognition applications are not so much revolutionizing IVR,
defined for this article as enabling dual-tone multi-frequency
(DTMF) or TouchTone[TM] interactions, as supplanting it as the key
means of automated voice interaction.
That victory, which may be in sight, sets the stage for integrating
voice with web, e-mail, and SMS to provide a unified user-friendly
automated solution that will reduce agent engagement time and, for an
increasing number but far from all interactions, eliminate agent
involvement.
The value proposition of speech rec is that it offers superior
usability compared with DTMF at a much lower cost than a live agent:
typically 50 cents compared with $5-$9 per transaction.
Implemented right, speech rec, along with text and web
applications, can cut costs and maintain, if not increase, customer
service satisfaction and retention.
Yet while speech rec technology has come a long way, it still
places significantly lower than live agents in customer satisfaction
surveys, though higher than DTMF.
"Speech rec has a customer satisfaction ranking of about 4.5
on a 10-point scale while DTMF IVR is typically between 1 and 2," points
out Bob Lyons, General Manager and Vice President of Avaya's customer
service business. "In contrast, live agent interactions score a 7
on average. The real opportunity is in finding a way to get speech
interactions to begin approaching scores seen by live
interactions."
Achieving that goal will, however, require resources. Speech rec
software and integration can cost upward of several hundred thousand
dollars and can take nine to 12 months to implement, followed by one
year of operation before achieving return on investment (ROI). Speech
rec also depends on multiple sources of rich content such as web, voice,
and user-specific data. These sources are typically in application
silos, and integrating them so that they can present the data in the
appropriate context, at the right time, will require large investments.
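The per-transaction figures quoted in this article (about 50 cents for speech rec against $5-$9 for a live agent) lend themselves to a rough payback estimate. The sketch below is only an illustration: the upfront cost, monthly call volume, and automation rate are hypothetical inputs chosen for the example, not figures from any vendor, and the function names are made up for this sketch.

```python
def monthly_savings(calls_per_month, automation_rate, agent_cost, speech_cost=0.50):
    """Savings from moving a share of calls from live agents to speech rec."""
    automated_calls = calls_per_month * automation_rate
    return automated_calls * (agent_cost - speech_cost)


def payback_months(upfront_cost, calls_per_month, automation_rate, agent_cost):
    """Months of operation needed for cumulative savings to cover the upfront cost."""
    savings = monthly_savings(calls_per_month, automation_rate, agent_cost)
    if savings <= 0:
        raise ValueError("no savings: the investment never pays back")
    return upfront_cost / savings


# Hypothetical inputs: $300,000 upfront, 20,000 calls a month, 35% automated.
for agent_cost in (5.00, 9.00):
    months = payback_months(300_000, 20_000, 0.35, agent_cost)
    print(f"agent cost ${agent_cost:.2f}: payback in {months:.1f} months")
```

The estimate counts only operating savings after launch. It ignores the implementation period, ongoing tuning costs, and any revenue effect of higher or lower customer satisfaction.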
"The question becomes can the technology reach a level where
customer satisfaction is high enough to offset the investments,"
says Lyons.
Speech technology developments
Speech recognition technology is slowly but inexorably moving in
that direction. Aaron Fisher, Director, Professional Services, West
Interactive, has seen marked improvements in the overall performance of
speech recognition software.
Applications are now better able to recognize callers with accents.
The speech engines are more effective at screening out ambient or
surrounding noise that is not generated by the callers' voices.
These developments have led to increased automation rates and fewer
agent opt-outs.
"In the old days, like 2003 and earlier, if a dog barked any
time during a call, the speech rec application would think this noise
might have been from the caller but wouldn't be able to make sense
of it," recounts Fisher. "Now if you have a loud dog or child,
the system has the ability to analyze the difference between spoken
noise and ambient noise and callers can achieve their tasks with higher
success rates."
To illustrate, Loquendo's Loquendo ASR 7.5 adds a new noise
compensation feature, and all of its supported languages have been
re-trained with additional material recorded in the presence of
background noise, including on mobile phones. It also offers support
for more complex multilingual grammars and large vocabularies. It has
differentiated timeouts to permit utterances of fixed format and length
such as credit card numbers.
There is a continuing shift by users toward natural language speech
rec, which enables callers to speak to the computers like they are
conversing with people, away from directed dialogue speech rec, where
callers speak one or two words in response to a DTMF-like menu.
Natural language permits callers to obtain what they want more
quickly and easily. They can, for example, barge into the applications
and have their requests understood because the speech engines parse
their words and retrieve the right responses from their libraries. This
functionality leads to greater automated interaction completion rates
and fewer live agent zero-outs. Yet the solutions are more expensive and
complex to install.
"Natural language is preferable because it more closely aligns
with the users' need to have the system respond to them,"
explains Lyons. "The challenge is that natural language is not mature
enough yet to deal with the general public. You have to build extensive
libraries focused on the things that a person might ask. When you think
about the many language options along with the many accents and slang
options, it is easy to see why natural language is rather difficult to
implement successfully in many situations."
There are many applications where directed dialogue is extremely
useful. Voxeo, a provider of premise and hosted IVR and VoIP
applications, points to the example of a mail order firm where 20
percent of callers dial in to find out about their order status.
The marketer uses Voxeo's Prophecy platform to ask the callers
to say their order numbers, which is less restrictive than having them
use DTMF. It then queries an existing Web-based order status solution
and receives XML instructions to inform the customers that their orders
have been shipped. Rather than ending the calls, the platform then
queries the shipper's package tracking Web application and tells
the callers where the packages are.
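The two-step lookup in this example (order status first, then package tracking) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Voxeo's Prophecy API and not VoiceXML: the XML layout, the carrier name, the tracking number, and the `lookup_tracking` stand-in are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical reply from the web-based order status application.
SAMPLE_STATUS_XML = """
<order number="48213">
  <status>shipped</status>
  <tracking carrier="ExampleShip">1Z999</tracking>
</order>
"""


def lookup_tracking(carrier, tracking_number):
    """Stand-in for the shipper's package tracking web application."""
    return "in transit at the regional sorting facility"


def order_status_prompt(status_xml, tracking_lookup=lookup_tracking):
    """Turn an order status reply into the sentence spoken to the caller."""
    order = ET.fromstring(status_xml)
    number = order.get("number")
    status = order.findtext("status", default="unknown")
    prompt = f"Order {number} is {status}."
    tracking = order.find("tracking")
    # Only chain the second lookup when the order has actually shipped.
    if status == "shipped" and tracking is not None:
        location = tracking_lookup(tracking.get("carrier"), tracking.text)
        prompt += f" Your package is {location}."
    return prompt


print(order_status_prompt(SAMPLE_STATUS_XML))
```

In a real deployment the recognized order number would drive the first query, and the tracking lookup would be a network call to the shipper rather than a fixed string.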
Speech rec engines, especially those that use natural language, will
benefit from increased chip processing power driven principally by
strong demand for increasingly sophisticated computer games, reports Ian
Jacobs, senior analyst, Frost & Sullivan.
"The faster and more affordable chipsets will enable speech
rec applications to route calls quicker and handle more complex
interactions," he explains.
Advantages of speech over DTMF
These improvements are making speech rec a more effective automated
voice solution compared with DTMF-enabled IVR for most if not all
interactions.
Speech rec can bolster customers' experience with automated
voice methods by enabling them to complete transactions or obtain
information and assistance quicker by accommodating their requests,
instead of forcing them to go through long hierarchical menus as with
DTMF.
Speech rec also enhances CRM by permitting customer
personalization. When the system recognizes the callers it can then,
based on the rules you create, address them by their first names, cut
through the menus, and present customized information and offers.
One literal driver of the move from DTMF to speech rec is mobile
commerce. Andrea Holko, Senior Vice President of Global Consulting
Services, Intervoice, cites the growing number of jurisdictions that
have hands-free cellphone laws.
"In environments where for safety reasons you cannot use your
hands to use a phone, like driving a car, speech rec is a
necessity," says Holko.
Also, the conversational flow in natural language speech
recognition keeps older customers in the automated applications longer
before contacting live agents.
Security is enhanced with speech recognition because it allows for
complicated and less-readily-faked passwords. These are migrating from
the common mother's maiden name to names of high schools attended
and to the names of first pets.
There are places for DTMF. It can provide a high degree of accuracy
for low level security, such as through the entry of 4-digit or 6-digit
PINs. It also permits customers to enter confidential information in
public places, to avoid it being overheard, and possibly stolen by
others. It can, in addition, process vast numbers of simple calls
requiring only numeric inputs highly reliably and at low cost.
If you do retain DTMF, avoid upgrading the host IVR with speech
rec, recommends Avaya's Lyons. Instead, have the speech
applications integrated directly on the routing and switching solutions.
Have the IVR connected only for those customers who wish to use the DTMF
functionality.
The Avaya executive explains that the IVR's hierarchical call
flow conflicts with the natural conversation flows in speech rec and
live agent interactions. Firms that install speech rec on the IVR
therefore risk failing to achieve ROI, such as improved customer
retention and satisfaction and shorter live agent call lengths, because
more callers will zero-out than projected.
"Installing speech rec on the IVR is the worst solution for
your customers because it doesn't allow you to change the paradigm
and permit you to create customer-friendly call flows," Lyons
points out. "All you will have is more expensive DTMF with
poor satisfaction or call containment rates."
Speech rec as live agent adjunct or replacement
Banishing IVR to the periphery leaves speech recognition open to
take on live agents. Already it is handling more transaction types that
are at the edge of competence for DTMF IVR but which are too expensive
for live agents, such as ordering movies, products, and tickets.
Speech rec can also reduce call lengths and call handling costs by
obtaining basic routine information from callers that it then transmits
to live agents. As speech applications become more robust they will be
able to gather more data and handle more tasks, leaving less work for
live agents to carry out.
"Speech is taking increasingly sophisticated calls from live
agents, including diagnostics, tech support, and account
verification," reports Keith Dawson, senior analyst, Frost &
Sullivan. "I estimate that 35 percent, maybe 50 percent of calls
are so routinized that they can go to speech self-service in the near
future."
Customers can more easily choose to leave or enter speech rec
applications from live agents and other channels without reentering data
thanks to greater integration between speech rec and other interaction
types on platforms and services.
Intervoice's new CTI-enabled Intervoice Contact Portal permits
customers to choose the channel (phone, e-mail, SMS, or web chat) and
the resource (self-service or a live agent). All of the information is
routed with the contact throughout the entire session.
Also, Genesys Telecommunications Laboratories' newly-released
intelligent Customer Front Door[TM] solution incorporates Nuance's
natural-language based Nuance Call Steering application to determine
caller intent to reach either a live agent or automated system.
Avaya's Lyons sees automated speech and the web merging, with
customers sometimes talking with machines, sometimes interacting
over the web, while the applications tap the same rules and response
engines and databases.
He gives the example of a customer receiving a voicemail or SMS
from the airline saying that his flight is delayed and that he has to
rebook to get to his destination on time. He calls back and reaches the
speech engine, where he finds out flight times and then listens to
alternative departures. When the desired flight is selected, he can
automatically process seat selection, payment, and any other tasks on
the web.
"By merging speech and web together, [the customer] gains more control,
[which] could reduce the need for live agents by avoiding them
altogether or by limiting the length of conversations through moving the
interactions farther in the automated process," Lyons points out. "This
will open up automated functionality to many more and other applications
such as help desk, stock trades, and healthcare claims processing
without the need for more phone calls."
Choosing and implementing speech rec
To see if speech recognition is right for your contact center and
to get the most from the investment, Ian Jacobs recommends that you
understand what your organization's most common interaction types
are, what callers are calling about, and whether speech can automate
those. If it can, then it is worth considering; if not, then it
isn't.
In carrying out your examination look at the industries where this
makes a lot of sense and whether their benefits apply to your case. The
big speech rec adopters include the wireless, financial services, and
transportation and travel industries.
When selecting suppliers, which for speech rec can include not just
software application vendors but also consultants and systems
integrators, choose those that understand call flow and where the
technology fits in the total customer interaction picture.
At the same time, examine the option of hosted speech, whether from
dedicated firms or from teleservices firms that offer speech either
standalone or integrated with their other solutions. This alternative
lowers up-front costs and provides you with applications that have been
pre-installed.
"Hosted speech is a good opportunity to access this technology
in a pay-as-you-go model, allowing a company to better align costs with
benefit," explains Lyons. "The drawback is that without having
an integrated solution (live voice, speech, and web) it will be difficult
to move the satisfaction index beyond where it currently is."
Hosting is also a good option for DTMF IVR and for high-volume
directed dialogue applications because it minimizes technology
investments and enables contact centers to handle spikes in traffic.
Examples include new credit card launches, holiday season order
tracking, and providing basic information, including with password
support, such as in the event of a disaster. Hosting firms typically
have thousands of ports on machines at disaster-protected sites.
The principal challenge faced in putting speech recognition to work
is not so much in making the base technology work, which is reasonably
reliable, but in implementing it with the customers in mind so that they
will not mind using it.
The key is creating analytics so that customers can obtain more
control of the interactions. That includes coupling the speech rec
engines to the customers' intent quickly with minimal prompts and
mapping out the customer taxonomy so that high-value customers go
directly to live agents.
"Unfortunately there is very little analytics out there on a
production scale," reports Lyons. "This must be provided by a
systems integrator or internally by people who understand the analytics,
the application, the underlying technology and the clients'
business."
When you design your speech application, avoid trapping customers
inside the automated system. While this technique lowers costs by
cutting agent opt-outs, it also drives down satisfaction, retention,
and, in most cases, revenues.
One technique to consider is to design menus and applications that
reflect knowledge of the caller, their account information and their
most recent transactions. For instance, if your firm makes furniture and
a retailer calls every day only to check the status of their orders, then
"smart menuing" can be incorporated. You offer the order
status prompt first, and if the caller wants to do something different,
you can back off to the main menu.
"This is a simple but good example of how to demonstrate to
your customers that you value their business and time, and that
you're giving them a type of custom-tailored treatment," says
West Interactive's Fisher.
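The "smart menuing" idea described above can be sketched as a small reordering rule. This is one possible reading rather than any vendor's implementation: the menu labels, the `smart_menu` function, and the 50 percent threshold for deciding that one task dominates a caller's history are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical main menu for the furniture maker in the example.
MAIN_MENU = ["place an order", "order status", "billing", "speak to an agent"]


def smart_menu(recent_transactions, main_menu=MAIN_MENU, min_share=0.5):
    """Offer the caller's dominant recent task first, then the usual menu."""
    if not recent_transactions:
        return list(main_menu)
    task, count = Counter(recent_transactions).most_common(1)[0]
    # Promote a task only if it is on the menu and dominates the history.
    if task in main_menu and count / len(recent_transactions) >= min_share:
        return [task] + [option for option in main_menu if option != task]
    return list(main_menu)


# A retailer who almost always calls about order status hears that option first.
print(smart_menu(["order status"] * 4 + ["billing"]))
```

Callers with no history, or with no dominant task, simply get the standard menu, which matches the advice to back off to the main menu when the caller wants something different.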
The following companies participated in the preparation of this
article:
Avaya
www.avaya.com
Frost & Sullivan
www.frost.com
Intervoice
www.intervoice.com
Loquendo
www.loquendo.com
Nuance
www.nuance.com
Tellme
www.tellme.com
Voxeo
www.voxeo.com
West Interactive
www.westinteractive.com
RELATED ARTICLE: Tellme Delivers Spanish Speech Solution for
Domino's
Tellme has been delivering calls to Domino's Pizza through its
hosted speech rec application since 2005, but only for English-speaking
customers. The partnership has saved the international pizza chain money
over having calls handled by live agents, which helps keep its products
price-savvy in a competitive market while delivering improved service.
The Tellme application is fairly sophisticated; it is linked to
Domino's website and database. For example, when customers call in,
the system looks up what they have ordered before, whether online,
by live agent, or through speech rec, and offers it again.
Domino's now wanted to extend the speech rec capabilities to
its Spanish speaking customers, a growing and underserved market. It had
found that multilingual staffing was inconsistent across stores, making
the Spanish ordering experience inconsistent and sometimes impossible.
While customers could use the Tellme-provided service, which would
connect them to a Spanish-speaking agent, many of them would hang up
because Tellme would greet them in English.
Domino's then worked with Tellme to launch the first
end-to-end Spanish-language national toll-free pizza ordering service,
1-888-DOMINOS, in February 2008. The centralized ordering solution
provides full ordering capabilities on a self-service basis, using
customer data to make intelligent offers and shorten the total
interaction length.
Tellme's network enables the transfer of caller information,
phone order history, web order history, and order status directly to
Domino's team members when a Spanish-speaking agent is needed,
delivering a consistent customer experience. Full integration with
Domino's point of sale and online ordering systems further improves
the ordering experience.
The Spanish-language Tellme solution has been successful. It led
Domino's stores in Hispanic markets to grow their orders and the
revenues per order while radically reducing queues and live agent
opt-outs.
"Customer response to the new application has been very
positive," reports Brooks Crichlow, Tellme's Director of
Marketing. "Callers are greeted in the language they are most
comfortable with, and they no longer have to wait for minutes on hold
for a Spanish-speaking employee in the store to take their order. That
means they get their meal faster and with more confidence than
before."
COPYRIGHT 2008 Technology Marketing Corporation. Reproduced with
permission of the copyright holder. Further reproduction or distribution
is prohibited without permission.