HAL 9000
2001: A Space Odyssey (1968)
If you were influenced by the HAL 9000 in Stanley Kubrick’s 2001: A Space Odyssey (1968), then you’re likely to have disconnected your Alexa, hopefully with the determination of Dave…
The Information, citing Amazon’s internal sources, reported that only 2% of Alexa owners made purchases with their voice. Of that group, only 10% did so again, indicating that there is a large gap between what Amazon expects from Alexa and the value the market gets from it.
Voice, like any other innovation that interacts with people, succeeds or fails not only because of its ability to resolve a set of logical functions, but also because of its capacity to make us feel. It’s now evident that voice technologies aren't triggering the kind of feelings their investors aspired to.
What do we expect from voice technologies?
As with any consumer-facing technology, there is likely to be a gap between the vision of the makers and the perception of the market, a distance that is especially significant in the case of voice.
Nowadays, consumers of smart speakers and the like continue to expect more than they are getting, and the challenge lies in the human longing behind these expectations.
When we encounter voice technologies for the first time, we naturally fall into perceiving them as something that is alive, something that is present, something we can relate to. Then, when we realise these “things” are inanimate and anything but “smart”, frustration and disappointment kick in.
Can human-like technology be created?
Elevating technologies like voice with human-like qualities leads nowhere, as it implies creating technology capable of behaving in a human-like way.
Emulating human-like form first and inexorably demands that we elucidate, in neurobiological terms, the many aspects of the mind-body phenomena that orchestrate humans, and these, for the most part, will remain a mystery for years or even decades to come.
Can AI do the trick?
Creating technology that is capable of understanding, expressing and conversing in a human-like form needs more than machine learning. It inevitably involves the creation of technologies that encompass software (mind) and interface (body) as a twofold unity that holds a notion of integrity and is therefore both aware and conscious.
Such human-like technology would have to be capable of generating a flow of mental contents in the way we do (caused by emotional responses), and capable of identifying, triggering and executing emotional states which can be considered to be the motive and source of intelligence.
Such human-like technology would have to become capable of understanding and expressing beyond the recognition of sentences, words or images. It would have to become capable of “thinking”, and as such it is a holy grail for AI's “general intelligence”, one that sits well beyond the statistical approach that machine learning follows today.
How will voice technologies evolve?
The future of human-like technologies such as voice is uncertain and yet, given the vast research in the field, it’s likely that progress will continue to be made. That said, we should recognise that such progress will occur only in highly concrete domains where potential user frustrations can be minimised.
Gene Munster of the investment firm Loup Ventures estimates a massive $5bn annual investment by Google, Apple, Facebook and Microsoft in voice technologies. This is a formidable force, one that I find particularly revealing in Amazon's investments in start-ups like Bamboo Learning, Endel and Aiva’s patient assistant tool (with investment also from Google).
What is interesting to observe in these innovations is that they are aimed at specific domains where frustrations are minimised and the perception of value maximised.
Think of a digital assistant designed to guide a beginner through an introduction to piano, or a tool aimed at assisting a Y7 student with algebra. All these tools exist today and will soon evolve to allow voice interactions as well as images and text, providing an exchange that takes search from mere query-results mapping to richer interactions: one that sits beyond the notion of a simple information system and is better defined as a communication system (sender-message-receiver).
The future is always uncertain, and the evolution of conversational technologies and the machine learning that powers their capabilities is unknown. Yet the evolution of voice technologies will continue, from tools designed to launch simple commands to a new value proposition: interacting with a person through various objectives towards an expected outcome.
The many things that interacting with machines will be able to accomplish will not take the form of human-like voice conversations but of machine-like voice-image-text interactions. These will take place within highly specific contexts where these tools can better relate to known goals.
So, can Voice Technology deliver on its promise?
Yes, but only within extremely specific domains where the natural amplitude of conversational trails is reduced to a minimum.
Will we ever converse with our devices as Dave did with HAL 9000?
Replace the 2001 spacecraft with a car, HAL 9000 with a McDonald's drive-through and an answer with a McMenu, and that is as far as Voice will be able to deliver.
0 Response to "Can Voice Technology Deliver Its Promise? - Forbes"