Letter from America: Voice Control Takes Command
If you’ve read any of the reports from CEDIA Expo 2016 here on HiddenWires or were there yourself, you know that “Voice Assist” was the major topic and point of interest for attendees. Only a few weeks ago there were basically two widespread platform options.
One was Apple’s Siri, which uses an iPhone or iPad as the input device to provide either a direct search response via voice or on the screen (“Siri: Where is the nearest Metro Station?”), or it uses voice commands to initiate an action on a compatible device through HomeKit or an app (“Siri: Open the garage door.”).
The other, of course, is everyone’s current star child, Amazon’s Echo, Tap and Dot products, also known by their more popular “wake word”: “Alexa”. Unlike Siri, they are, for now, limited to input through the units themselves, other than using the FireTV app on your phone. But as was obvious to anyone who walked the aisles at CEDIA, the use of the Amazon products as a voice input device and event trigger is a big deal. Amazon’s expansive plans to make it easier for companies to add more “skills” to the Alexa library clearly have wide appeal.
However, starting later this month, Siri and Echo will no longer be the only widespread voice control options. For example, along with Amazon’s command presence at CEDIA, US cable television giant Comcast was not only promoting voice control capabilities for its Xfinity X1 set-tops, but was also giving away remotes that allow Comcast subscribers to “tell” the cable box what to do.
Next, starting on October 20, the new version of Amazon’s FireTV Stick will come with a voice control remote, as the standard-sized FireTV already does. That remote, along with the FireTV app already available for iOS and Android, triggers search and the same set of commands and skills programmed to your Echo products. Curiously, although the phone app initiates the voice command, the results do not appear on the phone as they would with Siri or “OK Google”. Rather, they appear on the TV screen and any audio is heard through the TV’s speakers. This now gives the user the choice of near-field voice entry (the phone app or remote) or far-field (the Echo, Tap or Dot).
A potentially disruptive change will come in early November when Google releases a suite of products built around the Google Assistant: the new Pixel phones, replacing the current Nexus models; a new 4K/UHD Chromecast “Ultra” with HDR-10 and Dolby Vision; a new Wi-Fi access point line; and the long-awaited Google Home.
With far-field microphones coupled to voice command, Wi-Fi and a speaker system that should produce very competitive audio quality, at first glance one might see Google Home as simply a slightly different take on what Amazon already does. True at a basic level, but look closely and it becomes clear that there is more than meets the eye.
Tying all the new Google products together is the “listen – understand/analyse/interpret – search – respond” process by which the Google Assistant moves mere voice control to “voice-initiated intelligence”. Or, as Google said during the introduction event, the ability to turn mere search into a personalised, two-way conversation. Indeed, Google described this as a move from “direct actions” to “conversation items”. In the former, the device and the software behind it know what you’ve asked and answer or react accordingly. The benefit of the latter is more than just the ability to do a deeper search, based on knowledge about the user, their preferences, previous searches, photos and similar.
As the notion of a “conversation” conveys, the Google Assistant takes search to a new level. For example, after asking for a restaurant in a particular area you can then select one, ask the Assistant to make a reservation for a specific time and party size, get feedback in the form of a reservation and, if needed, get directions or order an Uber. For home control, it potentially allows you not only to tell the HVAC control to raise or lower the temperature, but to first ask what the temperature is, and then tell it to adjust to a specific level at a given time.
The implications of this embrace entertainment, through Chromecasts and a new range of Chromecast-embedded speakers and displays; traditional information queries, from the score of a sporting event to the closing price of a stock or tomorrow’s weather forecast; and, of course, home monitoring, control and automation.
One interesting aspect of Google Home is the ability to link Home units, Chromecasts and GoogleCast-enabled speakers to create multiroom audio in the vein of what Sonos, Heos, Play-Fi, Bluesound, MusicCast and others offer. All of these have combined to change the way “whole house/distributed audio” is installed in the 21st century.
However, it’s the last item, of course, that interests us the most, as Siri and Echo, and to a lesser degree Microsoft’s Cortana, can already access and play entertainment content with varying degrees of success. The head start comes both from the benchmark stake in the ground set by the thousands of “skills” already available for Amazon’s products and from Apple’s HomeKit-enabled product suite. With third-party products such as the “Echo with a built-in video display” from Nucleus, other Echo-enabled products to come, and Apple-based systems such as startup Josh.ai, that lead is not to be ignored.
The use of the new Pixel phones and Google Home as a portal to the Google Assistant is a start, but more is needed to make Google’s initiative truly competitive. The GoogleCast speakers from a number of leading brands and forthcoming GoogleCast TVs are part of the puzzle, but having all of them perform actions beyond the items the Assistant can search out is critical.
Again, matching the programs Amazon has announced that allow integrators to react to and act on Echo commands, Google will launch an Open Development Platform/Open API in December. Along with that, an “embedded SDK” coming next year will not only allow you to create actions and responses, it will enable you to use platforms such as the Raspberry Pi as the “response engine” that will, in turn, initiate the action in a device that might not otherwise be able to process commands.
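Since neither the Open API nor the embedded SDK has shipped, here is only a rough, hypothetical sketch of what such a “response engine” might do: take the text of an already-recognised voice command and dispatch it to a device action, of the kind one might run on a Raspberry Pi. Every name in it (handle_command, INTENTS, the stand-in device functions) is invented for illustration and comes from no announced Google interface.

```python
import re

# Stand-ins for real hardware control (e.g. GPIO pins or a serial relay)
# on a device that cannot process voice commands on its own.
def set_thermostat(degrees):
    return f"Thermostat set to {degrees} degrees"

def report_temperature():
    return "It is 21 degrees inside"

# Hypothetical intent table: a pattern for the recognised utterance,
# paired with the action to run when it matches.
INTENTS = [
    (re.compile(r"what(?:'s| is) the temperature"),
     lambda m: report_temperature()),
    (re.compile(r"set the temperature to (\d+)"),
     lambda m: set_thermostat(int(m.group(1)))),
]

def handle_command(utterance):
    """Dispatch a recognised utterance to the first matching intent."""
    text = utterance.lower()
    for pattern, action in INTENTS:
        match = pattern.search(text)
        if match:
            return action(match)
    return "Sorry, I don't know how to do that"
```

The two-step exchange described above ("ask what the temperature is, then tell it to adjust") is just two successive calls: `handle_command("What is the temperature?")` followed by `handle_command("Set the temperature to 22")`.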
The concept of voice command is as old as a Star Trek character talking to “Computer”, Dave talking to HAL in 2001: A Space Odyssey, or early attempts at voice control for home automation such as the Video Butler. With all the attention in recent weeks to what is loosely called voice command, one should not forget that the real magic behind all of this is how deep the search behind a command goes and how complicated a conversation can follow the initial command. With Google’s new products at the front end and the power of the Google Assistant at the back end, our industry will have yet another tool set to consider and, literally, call upon. This is going to be interesting!
Michael Heiss is a technology consultant and journalist, CEDIA Fellow, CEDIA ESC 2 Certified, and US correspondent for HiddenWires magazine. You can contact Michael via the HiddenWires LinkedIn Group. Follow him on Twitter: @captnvid.