The document discusses the current state of conversational interfaces such as chatbots and voice assistants, noting that while early versions were limited, recent advances in artificial intelligence, data availability, and user expectations have created new opportunities for conversational interfaces to become more useful. However, conversational interfaces still have limitations and work best when focused on simple, well-defined tasks rather than attempting to replace more complex interactions or functions better suited to humans. Designing effective conversational interfaces requires keeping interactions simple, clearly setting user expectations, and in some cases, involving human assistance.
2. “Machines should work; People should think” an excerpt from
The Jim Henson Company 1967 video “Paperwork Explosion”.
from
1967…
3. "The dream of conversational interfaces
is that they will finally allow humans to
talk to computers in a way that puts the
onus on the software—not the user—to
figure out how to get things done.”
— FastCompany, Conversational Interfaces, explained
…to 2016
4. A conversational interface is a program that you primarily
interact with through a back-and-forth dialog—using either voice
or text—instead of a more traditional graphical UI.
…at least, that’s how we think of them today.
What is a conversational UI (CUI)?
5. The two most common types of CUI are currently (text-based)
chatbots and (mostly voice-based) AI assistants. But there
are already many variations on this theme.
What kind of CUIs are there?
6. “The introduction of bots to
Facebook and other platforms has
been overhyped—and the bots
themselves often aren't very
good…[many] aren’t nearly as good
as the native apps they were
designed to replace.”
— Facebook Messenger chief David Marcus
Is this bot thing just hype?
Right now…maybe. :) There *is*
a lot of hype, and many bots
are barely useful.
But it’s important to consider
why bots and AI assistants
exist today, as this can help us
understand where they go in
the future.
8. Chatbots are not a new invention, and neither are AI assistants.
Clippy, 1997
The much hated Clippy was
annoying, because it
promised a smart, helpful
assistant, yet wasn’t
sophisticated enough to
deliver on that promise.
ELIZA, 1966
Developed at MIT, ELIZA's most famous script was
DOCTOR, a simulation of a Rogerian psychotherapist.
We’ve been here before…
9. The reason conversational
interfaces may finally go
mainstream is that we’ve
reached a combination of
human and technological
tipping points that have
created new opportunities
and expectations.
•artificial intelligence
•cloud computing + data
•mobile everywhere
•messaging everywhere
•new behaviours/expectations
•app fatigue
10. Artificial intelligence
The past few years have seen big
advances in artificial intelligence
and machine learning technologies.
These technologies enable key
aspects of CUIs, such as automatic
speech recognition (which converts
voice to text) and natural language
processing (which determines an
input’s meaning).
An example of language parsing and processing using Facebook’s open source wit.ai: text input becomes structured data output.
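To make the "text input to structured data output" step concrete, here is a minimal sketch. It is a toy, regex-based stand-in and not wit.ai's API or method: the intent names, patterns, and output shape are all assumptions invented for illustration.

```python
import re

# Hypothetical intents and patterns, invented for illustration only.
INTENT_PATTERNS = {
    "set_alarm": re.compile(r"\b(wake me|set an alarm)\b", re.IGNORECASE),
    "get_weather": re.compile(r"\b(weather|forecast)\b", re.IGNORECASE),
}
# Matches times such as "7 am" or "7:30 pm".
TIME_PATTERN = re.compile(r"\b(\d{1,2})(?::(\d{2}))?\s*(am|pm)\b", re.IGNORECASE)


def parse(text: str) -> dict:
    """Turn free text into a small structured record (intent + entities)."""
    intent = next(
        (name for name, pattern in INTENT_PATTERNS.items() if pattern.search(text)),
        None,
    )
    entities = {}
    match = TIME_PATTERN.search(text)
    if match:
        hour, minute, meridiem = match.groups()
        entities["time"] = {
            "hour": int(hour),
            "minute": int(minute or 0),
            "meridiem": meridiem.lower(),
        }
    return {"text": text, "intent": intent, "entities": entities}


if __name__ == "__main__":
    print(parse("Wake me at 7:30 am tomorrow"))
```

A real natural language service would use trained models rather than hand-written patterns, but the contract is similar: free text goes in, an intent and its parameters come out.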
11. Cloud computing + data
The widespread availability of
low-cost, “infinite storage”
through cloud computing led to a
big data explosion, and greatly
reduced the cost of the intensive
computation needed to run
machine learning.
(Many popular machine learning
APIs are in fact now combined
with a cloud offering).
cloud-based machine learning and
cognitive computing
AWS cloud computing and cloud-based
machine learning
cloud-based machine learning
12. Mobile is everywhere
Number of mobile internet device subscriptions worldwide (in billions)
Mobile now reaches half the
worldwide population, with
the largest recent and
projected gains in Asia and
countries outside Europe
and N. America.
This demographic change is
important as a mobile is
often the first or only
computer these new
internet users will own.
13. For many mobile-first users, social and messaging apps are a primary window
onto the internet. In fact—many even believe these apps are the internet.
WhatsApp: 1B
Messenger: 1B
WeChat (China): 800M
Line (Japan/APAC): 220M
kik (N America): 275M
And if you use mobile, you use messaging
14. Source: Why Southeast Asia is Leading the world’s most disruptive business models
find a social vendor
browse products
inquire via messaging (often using another app)
get payment details (digital or otherwise)
ship and confirm payment
These messaging apps were in fact the first prototypes of ‘conversational commerce’—
ad-hoc experiences assembled by users to meet a need.
15. “Most smartphone users download zero apps per month” - Quartz
“Only five apps see heavy use” - TechCrunch
Fewer apps used per month
84% of time spent on mobile is within five non-native apps
Most download zero apps per month
These trends are colliding with growing app fatigue. Although time
spent in apps is up, most people primarily use just a few apps, and
many of these are messaging apps.
17. AI assistants are services whose
job is to serve as an enabler for
different types of interactions.
Their primary means of input
tends to be voice, but a user’s
mobile is often used to output
more complex data and
responses.
AI Assistants
Apple’s Siri
(can be voice + screen)
Microsoft’s Cortana
(can be voice + screen)
Amazon Alexa
(primarily voice)
Ok Google
(can be voice + screen)
18. Most assistants have a collection of core behaviours—such as
fetching the time, setting an alarm, or sending an email—but most
are also platforms.
Core behaviours
Just a few of ‘Ok Google’s’ core behaviours
19. With each new brand that
creates a service for the
platform, the assistant (and
therefore its users) gains a
new set of skills*.
*Amazon (shown right) actually calls these
skills. Other platforms will have different names
for them.
Third party ‘skills’
20. Bots are small services
that you ‘chat’ with
through a text interface
such as Facebook
Messenger or SMS.
Chatbots (…or Bots)
The Taco Bell tacobot for Slack
21. Some bots are standalone products,
while others aim to provide a subset of
tasks from a larger service.
In this sense, bots are similar to the
‘skills’ found within assistants: single-
domain micro-applications that help
users complete a range of tasks related
to an activity—such as booking a flight
or finding an apartment.
Trim is a personal finance bot with a very
simple value proposition—help you save
money by keeping an eye on where and
how you spend.
The Expedia bot enables users to search for
hotels, and book them using expedia.com.
22. There are already quite a few
hybrid approaches. Facebook
M, for example, is an AI
assistant that uses text chat
instead of voice.
More importantly however, it’s
one of a growing number of
services that combine
automation with ‘humans in
the loop’.
Hybrid approaches
“Hi! I’m M, your
personal assistant in
Messenger”
Facebook M has human
trainers who silently
supervise, and take over
complex tasks.
Operator’s human
assistants get to know
their clients to better
curate products to their
tastes.
Clara, a scheduling AI,
is supported by
experienced
Executive Assistants.
23. Hopefully not :-)
There are many contexts where
we will still need a more traditional
graphical UI—either because the
task is just too graphical in nature,
or just because a bot doesn’t
really add to the experience.
Will everything become a bot or CUI?
24. These apps may however
soon have bots of their own.
AI-powered assistive
interfaces are starting to
appear within more complex
apps that could benefit from
smart, human-guided use of
artificial intelligence.
Embedded, assistive AIs
While not (yet) conversational, the Google Sheets Explore
panel acts as an assistant that proactively suggests
alternate data renderings for your spreadsheet.
25. An AI whose job is to watch
over us…
•to proactively problem solve,
•suggest more effective
ways to complete a task,
•provide a more ‘human’
interface through which to
collaborate (with other
people, or other bots).
Crystal provides ‘personality profiles’ for contacts, and
helps you better communicate with them.
26. …hence all the hype :)
The promise of conversational apps appears huge:
•more human and personal than a GUI
•faster and simpler to use…if the context is right
•low commitment, ephemeral…closer to the web than apps
•mobile ‘native’…born of, and uniquely suited to mobile
e.g. interaction models, contexts of use, use of sensors
29. In this section, we’ll look at common challenges, and
design considerations when building bots and
conversational services for AI assistant platforms.
30. We are in the “primordial soup” phase of bots and AI-augmented services.
Existing platforms are immature, and prone to change.
Disclaimer
32. Although bots are zero-install
(and ‘skills’ for assistant
platforms are broadly similar),
users still have to know the
service exists before they can
enable or interact with it.
In this sense, we’ve somewhat
replaced the app store discovery
problem with a bot store
discovery problem :(
Platform-level discovery
Telegram: >4000
Messenger: >30,000
Amazon Alexa: >2700
Slack: ~1000
33. Thankfully, some platforms
already offer tools that make it
easy to share a bot or embed
just-in-time discovery within
other interactions.
(This will hopefully become standard
practice, and make bots more similar to
web sites, than traditional apps).
Contextual discovery
Just in time discovery plugins
Facebook web plugins enable users
to initiate a chat conversation, or
pass information to Messenger for
onwards interaction.
Share a bot
Share Telegram and
Facebook Messenger bots
using a hyperlink*.
*A URL opens in any browser, but Messenger and Telegram bots only
function within those apps. A shame that there isn’t further interoperability.
https://telegram.me/<bot username>
m.me/<bot username>
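The two link patterns above are simple enough to generate in code. The sketch below assumes the `https://` scheme for the Messenger link (the slide shows only `m.me/...`), and the function names are hypothetical helpers rather than anything from either platform's SDK.

```python
from urllib.parse import quote


def _clean(bot_username: str) -> str:
    """Drop surrounding spaces and a leading '@', then URL-encode the rest."""
    return quote(bot_username.strip().lstrip("@"), safe="")


def telegram_link(bot_username: str) -> str:
    """Build a shareable link following the telegram.me pattern shown above."""
    return f"https://telegram.me/{_clean(bot_username)}"


def messenger_link(bot_username: str) -> str:
    """Build a shareable link following the m.me pattern shown above.

    The https:// prefix is an assumption; the slide lists the bare host only.
    """
    return f"https://m.me/{_clean(bot_username)}"


if __name__ == "__main__":
    print(telegram_link("@example_bot"))
    print(messenger_link("examplebrand"))
```

Links like these can be placed in emails, web pages, or QR codes, which is what makes bots feel closer to web sites than to installed apps.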
34. https://www.flickr.com/photos/marketingfacts/6323249188/
Just in time discovery isn’t
limited to digital platforms. A
key enabler within WeChat is QR
codes—which are often used to
initiate or complete an offline-to-
online (O2O) interaction.
kik, Facebook Messenger and
Snapchat offer similar 2D codes,
which users can scan to follow a
brand, or initiate a conversation.
...in Korea, grocery
stores are embedded on
subway platforms where
users scan QR codes to
buy items that are
delivered just-in-time for
dinner
36. KLM embeds Messenger plugins
at various stages:
• ticket purchase,
• check-in
• boarding pass retrieval
Users who opt in then receive their
confirmation, check-in notice, boarding pass
and flight status updates via Messenger.
38. Today (and for the foreseeable
future) bots and AI assistants will
remain pretty simple. Today’s
services are good at answering
simple questions, and are best
suited to completing simple,
repetitive tasks.
If your bot promises more than this,
it will likely disappoint, and this is as
much due to human factors as
technology constraints.
39. CUI proponents often
compare them to gesture
and touch based interfaces:
interfaces that are ‘natural’
because most people
already know how to scroll,
swipe, speak or type.
‘Natural’ UI…
https://www.flickr.com/photos/hams-caserotti/6160875175/
40. ‘Natural’ but not
automatically intuitive
While they may at first glance
seem intuitive, ‘natural’
interaction models often
share similar challenges.
If for example, a gesture is
completely new, it will have
to be taught, and may be
hard to discover on its own.
Dash by Bragi “a discrete personal
assistant right in your ear”
Gesture: activate touch lock
Gesture: deactivate touch lock
41. Similarly, if you don’t know
what a bot or AI assistant can
do, or how to properly ask, you
can waste a lot of time
guessing.
The simpler the bot, the easier
it will be for users to quickly
build a conceptual model of
what it can do.
This is particularly critical for voice-only services
as there’s no screen to refer to.
42. The majority of bots are also still
powered by rules (not that
different from the decision trees
we’ve used for years in telephone
systems).
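A rules-powered bot of this kind can be sketched as a small decision tree. The tree contents, node names, and keyword matching below are hypothetical, chosen only to show the shape of the technique.

```python
# Hypothetical decision tree for an imaginary support bot.
TREE = {
    "start": {
        "prompt": "Do you need help with 'billing' or 'delivery'?",
        "options": {"billing": "billing", "delivery": "delivery"},
    },
    "billing": {
        "prompt": "Would you like your 'balance' or a copy of your 'invoice'?",
        "options": {"balance": "show_balance", "invoice": "send_invoice"},
    },
    "delivery": {"prompt": "Please type your order number.", "options": {}},
    "show_balance": {"prompt": "Here is your balance.", "options": {}},
    "send_invoice": {"prompt": "Your invoice is on its way.", "options": {}},
}


def next_node(current: str, user_text: str) -> str:
    """Follow a branch if the text contains a known keyword, else stay put."""
    words = user_text.lower().split()
    for keyword, target in TREE[current]["options"].items():
        if keyword in words:
            return target
    return current


if __name__ == "__main__":
    node = "start"
    for reply in ["Billing please", "my invoice"]:
        node = next_node(node, reply)
        print(TREE[node]["prompt"])
```

Anything outside the listed keywords leaves the user on the same node, which is the brittleness that makes such bots feel like a phone menu.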
43. And although chats look like a
conversation, the bot is simply ‘slot-
filling’—asking the necessary questions
to formulate a query with set
parameters.
It can only understand certain
questions, and respond with specific,
pre-chosen commands. If a user says
the wrong thing, it won’t know what she
means.
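Slot-filling as described here can be sketched in a few lines. The slot names and prompts are assumptions for an imaginary train-ticket bot; extracting the values from what the user types is left out.

```python
from typing import Optional

# Hypothetical slots for a train-ticket query; names and prompts are invented.
REQUIRED_SLOTS = {
    "origin": "Where are you travelling from?",
    "destination": "Where are you travelling to?",
    "date": "What day would you like to travel?",
}


def next_prompt(slots: dict) -> Optional[str]:
    """Return the question for the first empty slot, or None when all are filled."""
    for name, question in REQUIRED_SLOTS.items():
        if not slots.get(name):
            return question
    return None


def build_query(slots: dict) -> dict:
    """Formulate the final query once every required slot has a value."""
    missing = [name for name in REQUIRED_SLOTS if not slots.get(name)]
    if missing:
        raise ValueError(f"missing slots: {missing}")
    return {name: slots[name] for name in REQUIRED_SLOTS}


if __name__ == "__main__":
    slots = {"origin": "Leeds"}
    print(next_prompt(slots))  # asks for the destination next
    slots.update(destination="York", date="Friday")
    print(build_query(slots))
```

The "conversation" is therefore just a loop: ask for the next missing parameter until the query is complete.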
44. Bots that use elements of
machine learning may go a step
further, as they can begin to
understand language*.
Users can therefore be less
specific with their commands, and
the system can generate its own
responses—gradually expanding
its vocabulary over time.
Next up…machine learning
Image: Isazi consulting
*to a degree, you can’t yet expect full fluency from any of these systems
45. The most useful and successful
bots (even fairly complex ones)
have one job.
They also solve real,
demonstrable problems (and
ideally, something for which a
much better alternative doesn’t
already exist).
Give the bot one job
This extremely simple bot identifies images.
46. The problem the bot solves
should be easy to convey, simple
to understand, and (hopefully)
include steps that users may be
able to guess on their own.
Bots that leverage mobile
(camera, sensors, notifications
etc.) to simplify tasks, will often
be particularly useful.
Example:
Energy company account bot
• receive monthly bills
• check balance
• get monthly reminders to submit
a meter reading
• snap a photo of the meter to
send your reading (or type it in)
48. Use any means available to
help users quickly understand
what they can do.
Hello!
Monday, 4:09 pm
Hello…
Monday, 4:09 pm
Hello…?
Monday, 4:12 pm
Hello…?
Monday, 4:15 pm
Hello…?
Monday, 4:16 pm
49. Most bots are zero-install, but users
still see a bit of information before
they begin a chat.
Facebook Messenger for example,
provides an introductory screen where
you can set basic assumptions:
• how fast does the bot respond?
• what does the bot do?
• what can you ask?
• what personal data will it see?
Onboarding
50. It’s also good practice to welcome users with a few prompts describing
the most likely starting point, and what information the bot will need
to complete that request.
I might get
confused
This is my job
Start like this
Here are terms I
understand and
can filter by
Can I interest you in
this useful thing?
51. The more constrained or well understood the task—for example booking a train
ticket—the more likely users will make correct assumptions of their own. This
is less likely if your bot does something new or bespoke to your service.
A known/fixed task?
Trim, the personal finance bot “can show you a few ways to save money”. Because
‘saving money’ isn’t binary…it must then explain what this means.
52. Platforms such as Facebook
Messenger, Telegram, and
Slack also enable you to
include custom buttons and
keyboards (in Telegram only)
that allow for faster, and
more accurate input.
Facebook quick reply buttons
Telegram custom keyboard
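As a rough sketch of what such buttons look like to a developer, the functions below build the message bodies a bot might send. The field names are assumptions based on the platforms' published message formats at the time and should be checked against current documentation. The function names and the payload naming scheme are hypothetical, and no network call is made.

```python
import json


def messenger_quick_replies(recipient_id: str, text: str, options: list) -> dict:
    """Build a Messenger-style message body offering quick reply buttons."""
    return {
        "recipient": {"id": recipient_id},
        "message": {
            "text": text,
            "quick_replies": [
                {
                    "content_type": "text",
                    "title": option,
                    # Hypothetical payload scheme: upper-case with underscores.
                    "payload": option.upper().replace(" ", "_"),
                }
                for option in options
            ],
        },
    }


def telegram_keyboard(chat_id: int, text: str, options: list) -> dict:
    """Build a Telegram-style message body with a one-button-per-row keyboard."""
    return {
        "chat_id": chat_id,
        "text": text,
        "reply_markup": {
            "keyboard": [[{"text": option}] for option in options],
            "one_time_keyboard": True,
        },
    }


if __name__ == "__main__":
    choices = ["Check balance", "Send reading"]
    print(json.dumps(messenger_quick_replies("123", "What next?", choices), indent=2))
    print(json.dumps(telegram_keyboard(456, "What next?", choices), indent=2))
```

Either way, the user taps a known option instead of typing free text, so the bot never has to guess what was meant.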
54. Apple has restricted third-party apps
within Siri to six domains: ride booking,
messaging, photo and video, payments,
VoIP and workouts.
This helps set expectations, as users are
(a bit) less likely to ask Siri for something
outside these categories.
Users also enjoy better UX as Apple can
gradually release, and optimize
vocabularies for each domain.
Third-party apps in Siri
56. Bots shouldn’t attempt
to replace what is best
left to a traditional
graphical UI.
(…and if they do, they maybe shouldn’t use
Poncho the weather bot as a role model)
vs.
glanceable, easy to understand
despite high information density
57. They also shouldn’t
attempt to replace things
that humans are really
good at…
Computers are really good at…
• data retrieval, sorting, filtering
• complex maths,
• parsing vast datasets
• doing this over and over (they won’t get bored or frustrated)
Computers are getting better at…
• analyzing human sentiment
• understanding intent outside set domains or vocabularies
• determining content and context of images, video etc.
Computers are incapable of…
• emotional intelligence
• empathy
• human reasoning
• pragmatism
• (un-scripted) persuasion
• actual conversation!
(…a partial list in all cases)
58. There are also very basic
aspects of ‘real’ human
conversation that computers
still struggle with.
This includes maintaining the
scope of a conversation,
chaining conversations
together, and differentiating a
new question from a follow-on
question.
Source: @jonesabi
This can be particularly aggravating with text chat, as there’s a visual
record of the conversation. It’s therefore easy for users to assume the
bot ‘knows’ everything that’s been said.
59. In the case of Facebook M and personal
assistants like x.ai, providing human
assistance in tandem with automation
may be purely tactical.
"M is a human-trained system:
Human operators evaluate the
AI's suggested responses, and
then they produce responses
while the AI observes and
learns from them.”
— Facebook AI Research
Other reasons to involve humans
Take over complex tasks
that can’t be automated
• “plan a birthday party”
Offer services that can’t
yet be automated
• APIs often don’t yet exist
for one AI or service to
interface with another
Generate usage data
• clarify key use cases to
inform the product
roadmap
• to train the AI
61. Edward’s design was informed by a
deep understanding of typical
guest queries. The goal was to
automate the most common and
routine queries, to free up front desk
staff for face to face interactions.
what cuisine does your restaurant serve?
Tuesday, 8:30 pm
please send me some ice
Tuesday, 8:00 pm
please don’t clean my room today
Tuesday, 7:45 am
what time do I need to check out?
Wednesday, 7:00 am
can you send me more towels?
Tuesday, 7:12 am
I’d like a paper delivered to my room
Monday, 6:00 pm
Hi…i’m Edward, Radisson Blu
Edwardian’s virtual host
Monday, 4:09 pm
“We were intrigued to find out
how many different questions
a guest can have during a stay:
153 to be precise”
— Tobias Goebel, Aspect software
Edward, the virtual host
62. Edward handles routine questions,
and automatically routes more
complex requests to appropriate staff.
Source: Aspect software
Universal template for
self-service
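The "answer the routine, route the rest" pattern might look something like the sketch below. It is not Aspect's implementation: the keyword tables, team names, and replies are hypothetical placeholders, loosely based on the guest messages shown earlier.

```python
import re

# Hypothetical tables; the answers are placeholders, not real hotel data.
FAQ_ANSWERS = {
    "check out": "<check-out time>",
    "cuisine": "<restaurant cuisine>",
}
STAFF_ROUTES = {
    "towels": "housekeeping",
    "clean": "housekeeping",
    "ice": "room service",
}
DEFAULT_TEAM = "front desk"


def handle(message: str) -> dict:
    """Answer routine questions directly; pass everything else to a team."""
    text = message.lower()
    words = set(re.findall(r"[a-z]+", text))
    for phrase, answer in FAQ_ANSWERS.items():
        if phrase in text:
            return {"handled_by": "bot", "reply": answer}
    for keyword, team in STAFF_ROUTES.items():
        if keyword in words:
            return {"handled_by": team, "reply": "I've passed your request to our team."}
    # Anything unrecognised goes to a person rather than being guessed at.
    return {"handled_by": DEFAULT_TEAM, "reply": "Let me find someone who can help."}


if __name__ == "__main__":
    for msg in ["what time do I need to check out?", "can you send me more towels?"]:
        print(handle(msg))
```

The important design choice is the last line: when the bot does not recognise a request, it defaults to a human instead of failing silently.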
64. Despite your best efforts,
users will get stuck, or
need help that’s beyond
the bot’s capabilities.
Always build in easy and
intuitive ways for users to
quit a task, start over, or
speak to a person*.
*even if the response is not immediate
Source: @superwuster
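One way to guarantee these exits is to check every incoming message for a small set of global commands before any task-specific logic runs. The phrases and action names below are assumptions for illustration.

```python
import re
from typing import Optional

# Hypothetical phrases; a real bot would tune these to its own users.
ESCAPE_PHRASES = {
    "quit": ["quit", "cancel", "stop"],
    "restart": ["start over", "restart"],
    "handoff": ["human", "agent", "speak to a person"],
}


def escape_action(user_text: str) -> Optional[str]:
    """Return 'quit', 'restart' or 'handoff' if the user asked for one, else None."""
    text = user_text.lower()
    for action, phrases in ESCAPE_PHRASES.items():
        for phrase in phrases:
            # Whole-word match, so "nonstop" does not trigger "stop".
            if re.search(rf"\b{re.escape(phrase)}\b", text):
                return action
    return None


if __name__ == "__main__":
    print(escape_action("Can I speak to a person?"))
    print(escape_action("nonstop flights to Oslo"))
```

Because the check runs first, the exits work at any point in a conversation, not only where the designer remembered to offer them.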
65. Shown when you first open the bot.
Nice! But you may forget it’s there.
I tried this, to see what would happen,
and was pleasantly surprised. Nice!
From Bot to human…
Users see this when they
directly message customer
service out of hours.
From human to bot…
A few nice examples…
67. “Pretending that bots are humans is
impersonal. If customers are in conversation
with an entity that they think is a person, but
then realise through inevitable technical
limitations that it is in fact a bot, how do you
imagine they will feel?
And how could that feeling ever be good for
business?”
— Paul Adams, Bots vs. humans
While it’s good practice
to enable users to
switch from human to
bot—obfuscating this
process may not be in
your best interest.
68. Source: @jonesa
This is down to trust, but also our tendency to anthropomorphize;
to attribute human characteristics to animals, inanimate objects,
or natural phenomena.
69. “[iRobot] regularly received calls asking for help to
fix “Rosie” or “Seamus” or “Floorence”. Customers
expressed concern when iRobot told them to mail
in their Roomba, and receive a new one in return—
as they might with another small appliance.
…They didn’t want a new vacuum…they wanted
“Rosie” to be fixed—or more to the point, healed.”
— Colin Angle, CEO of iRobot
Anthropomorphism isn’t completely understood, but can occur even
if the object has no recognizable human form.
70. …it can even occur
when a ‘thing’ has no
physical form at all.
71. As people are likely to
attribute human qualities to
your bot regardless, you
should consider what kind of
personality you’d like it to have.
“Bots are personas, whether or not
it’s intended. Every participant will
project an identity onto the bot, its
gender and personality — whether
or not it has been created
intentionally by the design team.”
— Chatbots ultimate prototyping tool, IDEO
72. Personality can be tricky to get
right. A common problem is to
misjudge how much personality
may be too much—and in what
context.
Jokes may be OK for this weather
bot, but would be exasperating if
this were an airline bot with a flight
delay message.
73. Persona mismatch?
Facebook is trying to seem friendly,
but if the context is wrong, it just feels
weird (Zach is Scott’s son).
“…we had to outlaw Howdy’s bots
from asking rhetorical questions
‘because people expect to respond
to them, even though the bot was
just being polite’.”
— FastCompany, Designing chatbot personalities
Cultural and social norms
Politeness can be deeply cultural, and
consumers in certain markets may feel
particularly compelled to respond.
Culture, social norms, and the user’s personal context are also factors.
74. People often experiment with a
bot (either to understand what
it can do, or just for fun).
Anticipating these questions is
a nice way to develop the bot’s
personality in a more neutral
context (i.e. users aren’t actively
trying to ‘get things done’…so
may be more open to chit chat).
Google Assistant, within Allo
76. Communicating with services
on a private device, and in a
more personal context, also
changes our expectations.
Any brand or organization
entering this space should
consider whether this may
create entirely new, and
unexpected interactions.
Source: Washington Post (March 2016)
Siri’s response to
‘I was raped’…
“I don’t know what that
means. If you like, I can
search the Web for ‘I was
raped.’”
Samsung S Voice:
‘I am depressed’…
“Maybe it’s time for you to
take a break and get a
change of scenery.”
77. Society is still coming to terms
with what this means, and
where the responsibility may
lie in these complex, and very
human scenarios.
A complicating factor is that
some software is no longer
taught what to say—it simply
decides on its own*.
*based on input from millions of users with
varying motivations