The rise of the robot assistant is happening. Last year, Amazon’s Echo accounted for a quarter of all speaker sales.
Siri, Cortana, M, Magic, Alexa, and Slackbots have all begun vying to fulfill your needs in various ways: Siri and Cortana take your voice input on the go, answering queries and doing basic tasks. Magic and Facebook’s M are text-based services in which bot-assisted humans help you complete more complex actions. Slackbots fulfill a wide variety of (mostly) work and team-related needs through a text-only interface. Alexa rounds out the group, taking voice input at home to answer basic queries, play music, and place orders through Amazon.
Yet every one of these consumer-level bots must be communicated with in a specific way. I can’t text Siri or Alexa. I can’t speak to Magic or M. I can’t talk to Slackbots outside of Slack. Likewise, they know nothing of each other.
General purpose robot assistants won’t truly be useful until they can work with each other, in all contexts.
When communicating with humans, we rarely only text or only speak. When talking to a friend, coworker, or family member, you switch modes depending on your context and theirs: you may choose to text them, give them a call, send them an email, or speak to them face-to-face.
The myriad ways in which we communicate with someone provide the flexibility needed for natural, ongoing conversations. Restricting these bots to a limited set of those ways, a “communication surface,” means that they, unlike a person, are unreachable outside of specific contexts. The surface area and, consequently, our ability to rely on the bot are reduced.
If I’m in a meeting, appropriate social behavior dictates that I probably shouldn’t speak to Siri and ask her to remind me to email off the earnings report before the end of today — but I could text her.
Similarly, while driving, I can’t safely text Magic and ask it to buy and deliver flowers to my home this evening — I could ask Siri to text Magic, but then I’d have two messages to proofread and a greater likelihood of miscommunication.
And at neither of these times could I ask Alexa to do anything — she’s sitting at home.
Switching bots depending on my location, social mores, and personal comfort with speaking or texting means that each time, I need to consider the capabilities of the particular bot I’m interacting with. Without an interface to guide me, I’m left to remember the functional differences on my own. Alexa can place orders. Siri cannot. Magic can do most tasks, but asking it something trivial, like Robert De Niro’s age, would get a slower reply than from the others.
While their capabilities may overlap, every bot is functionally different.
Additionally, because of the limited usage context of each one of these bots, they have little understanding of me outside of themselves. While Siri could know that my girlfriend texted me about needing onions and garlic for a dinner party this evening, Alexa and Magic don’t have any idea. Conversely, if Alexa places an Amazon order for me, Siri won’t let me know when it’s shipped or at my door.
These bots do share some knowledge; Alexa, Siri, and Magic can all look at my calendar and schedule events. That’s…about it.
Siri, did I leave my lights on?
¯\_(ツ)_/¯
Alexa, did John email me that PDF?
¯\_(ツ)_/¯
Magic, what did Matt ask me to pick up from Home Depot again?
¯\_(ツ)_/¯
Unlike a real assistant, none of the fully robotic assistants proactively takes care of your needs beyond a simple calendar reminder; only Magic offers these capabilities, and at a significant cost. I won’t be notified that my Amazon package will be arriving later today or that the temperature in my apartment has dropped 15 degrees in the past half-hour. Nor will any of them ask me if I’d like to respond to Matt’s text about tomorrow’s meeting.
Worse yet, the tools that do provide these notifications (such as Nest, or a package notification app) don’t usefully tie into our bots. While Alexa can control my Nest thermostat with some configuration, that does very little for me when I’m not home.
So, where does this leave us? Right now, we’re in a fragmented landscape, at the bottom of the uncanny valley of virtual assistants. I can speak to some, but only text others. I can do things with some but not others — until they add that feature (and then it’s up to me to keep up with release notes). Almost none of them will do anything for me proactively beyond reminders.
Until these bots can be used more flexibly, with expanded usage contexts and shared knowledge, we’ll continue to see them as very limited curiosities. Anything like Her is still a far-off dream.