Voice technology is a hype — for now

By Krzysztof Szymanski and Dennis Krüger

At Project A, we are constantly looking for ways to give the startups we work with an operational edge. Naturally, we have been keeping an eye on the development of voice technologies and are keen to find use cases for our portfolio companies that actually generate value. Here is how we currently see the topic.

In the United States, research estimates that nearly one in five households has access to a smart speaker (about 50 million adults). In Germany, over 11 million people already own one. Add to that the number of people who have access to a smart assistant through their smartphone, and it is fair to conclude that general awareness of voice technology is very high and is predicted to keep growing.

There are, however, a couple of reasons that still keep us from being overly enthusiastic about the current value-adding potential of voice. For one, there is the discovery problem of voice skills on smart speaker devices. Consumers are not actively anticipating new skill releases; rather, they stumble upon them by chance or as a result of intensive marketing efforts. At the moment, the Skills and Actions marketplaces resemble the App Store in its early days. The vast majority of skills in Amazon’s store are either quite useless or don’t work as expected. Most still seem more like hobby projects that are hard to monetize, and this is the area where we expect the most rapid change in 2019. Google, for example, seems to have made slightly more progress in providing value for real use cases. Its ability to differentiate between the voices of multiple users of the same device shows that it understands more of the user’s context during an interaction.

Second, since there is no visual representation of the application on a screen (think of the icons on your phone), every skill has to be triggered by voice. Setting data security issues aside, just imagine having 10 skills and having to remember all their commands. In commerce, especially in industries like fashion, beauty or travel, users are used to seeing what they buy or book. Both Google and Amazon are attempting to find workarounds for this problem (Google lets you send the information to your Android phone, Amazon can forward some information to your email). Additionally, both already offer voice-controlled, tablet-like devices (Echo Show and Google Home Hub) that accommodate more use cases for the industries mentioned above. Our assumption is that the future of voice technology lies in voice-controlled devices enriched with some form of visual representation. The number of devices combining assistants with screens that were presented at CES 2018 further supports this.

User retention for skills and actions is currently proving difficult because users need to either add the skill to their morning Flash Briefing or, as mentioned before, trigger it by voice every time they want to use it. Ideally, a Flash Briefing becomes part of a customer’s daily routine as they get ready for work or school, or wind down after their day. This is an exciting opportunity for brands that can provide an interesting piece of daily content delivered directly to users’ devices. However, one certainty that persists across all the products we see is that convenience is key to adoption. The way in which technology is handed to the user eventually needs to lead to a superior customer experience. In our opinion, the fields in which voice is currently being applied with a real benefit to users are limited. The biggest players still have a long way to go to make voice applications more intuitive and user-friendly.

Outside of the smart speaker environment, we see a more dynamic rate of change. Millennials and older generations may already use digital assistants like Siri or Alexa to set alarms, place calls or even google the occasional question, especially in situations where their hands are busy, such as driving a car. However, these types of interactions can hardly be counted as really driving business value. As mentioned before, the key to retaining users will be to provide more personalized voice experiences that show a contextual understanding of the user’s particular situation. Delivering a weather forecast for a location is one thing; doing so in response to a question like “Will it snow in the next 3 days?” is another. We will see a lot more of these use cases being presented in 2019 and 2020.

If we look to Gen Z for a glimpse into what the future might hold, the outlook becomes even more enticing. Statistics show that the adoption of voice technology in this age group is much more advanced. Talking to inanimate objects seemingly comes a lot more naturally to this generation, who are not just digital natives but also “connectivity natives”. As computers, enabled by more advanced deep learning methods, become increasingly able to understand the context of natural language, they will also be able to have more elaborate conversations and assist with more complex tasks, such as making travel arrangements. Until then, Google’s demo at its 2018 I/O conference, during which its voice assistant masterfully placed a phone call to make an appointment, serves as a reminder of the slumbering potential of this technology.

So what does this mean for brands and organizations right now?

First, the trajectory of these technology advancements shows the critical importance of optimizing your website for voice search. The impact on search engine optimization (SEO) will be especially interesting to follow. As people increasingly use search terms that are closer to natural language patterns, the ability to optimize for longer, question-based searches will gain importance. One of our Project A portfolio companies, uberall, a provider of SaaS solutions for location marketing and data management, recently published the Voice Search Readiness Report 2019. The study highlights the increasing use of voice search in Germany: almost 60 percent of respondents consider voice search a relevant topic for the future, and 22 percent of German consumers conduct a voice search at least once a week. The report also includes a checklist of things to consider when it comes to so-called “Voice Search Readiness”.


Lastly, as interactions with our devices and with companies continue to become more conversational, using voice technology in products and in marketing those products may well become a new standard. Innovators hoping to attract younger audiences should begin to keep this in mind when exploring new ways to provide value to their users. These companies need to be prepared to manage an even more fragmented and complex systems landscape. Products will need to be developed to work across even more platforms and to be communicated across more channels.

Despite the barriers and obvious roadblocks presented above, we at Project A believe that voice is on the path to adding another dimension of interaction possibilities to our collective mobile experiences. The data clearly indicates that users are warming up to the idea of using voice-operated technology in their day-to-day lives. If the rate of smart speaker purchases is anything to go by, the adoption of voice technologies is certainly on the rise. For now, however, the potential to add value to these interactions is limited, and therefore the topic remains hype.
