This new technology puts AI in touch with its emotions — and yours

A new “empathic voice interface,” launched today by Hume AI, a New York-based startup, makes it possible to add a range of emotionally expressive voices, plus an emotionally attuned ear, to large language models from Anthropic, Google, Meta, Mistral, and OpenAI, heralding an era in which AI helpers may more routinely get in touch with our feelings.

“We specialize in building empathic personas that speak in ways humans would speak, not in stereotypes of AI assistants,” says Hume AI co-founder Alan Cowen, a psychologist who has co-authored a number of research papers on AI and emotions and who previously worked on emotional technologies at Google and Facebook.

Like ChatGPT, Hume is far more emotionally expressive than most conventional voice interfaces. If you tell it that your pet has died, for example, it will adopt an appropriately somber and sympathetic tone. (Also, as with ChatGPT, you can interrupt Hume mid-stream, and it will stop and adapt with a new response.)

OpenAI hasn’t said how much its voice interface tries to gauge user emotions, but Hume’s is explicitly designed to do so. During interactions, Hume’s developer interface displays values indicating a measure of things like “determination,” “anxiety,” and “happiness” in the user’s voice. If you talk to Hume in a sad tone, it will pick up on that too, something ChatGPT doesn’t appear to do.

Hume also makes it easy to deploy a voice with a specific emotional tone by adding a prompt in its user interface. Here’s how it sounded when I asked it to be “sexy and flirty”:

Hume AI’s “sexy and flirty” response.

And here it is when told to be “sad and gloomy”:

Hume AI’s “sad and gloomy” response.

And here’s the particularly unpleasant response when it was asked to be “angry and rude”:

Hume AI’s “angry and rude” response.

The technology did not always seem as polished and smooth as OpenAI’s, and it sometimes behaved in odd ways; at one point, for example, the voice suddenly sped up and spewed gibberish. But if the voice can be refined and made more reliable, it has the potential to help make humanlike voice interfaces more common and more varied.

The idea of recognizing, measuring, and simulating human emotions in technological systems dates back decades and is studied in a field known as “affective computing,” a term coined by Rosalind Picard, a professor at the MIT Media Lab, in the 1990s.

Albert Salah, a professor at Utrecht University in the Netherlands who studies affective computing, was impressed by Hume AI’s technology and recently demonstrated it to his students. “What EVI seems to do is assign emotional valence and arousal values [to the user], and then modulate the agent’s speech accordingly,” he says. “It’s a very interesting twist on LLMs.”
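In other words, the approach Salah describes amounts to a simple loop: estimate valence and arousal from the user’s speech, then adjust the assistant’s delivery to match. The short Python sketch below illustrates that idea in the abstract; it is purely hypothetical, and the names (EmotionEstimate, reply_style) and the thresholds are invented for illustration rather than drawn from Hume’s actual system.

# Hypothetical illustration only; not Hume's actual API. It sketches the idea
# Salah describes: estimate the user's emotional valence and arousal from their
# speech, then modulate the agent's speaking style to suit.

from dataclasses import dataclass

@dataclass
class EmotionEstimate:
    valence: float  # -1.0 (very negative) to 1.0 (very positive)
    arousal: float  #  0.0 (calm) to 1.0 (highly energized)

def reply_style(user: EmotionEstimate) -> dict:
    """Map the user's estimated emotional state to speech-synthesis settings."""
    if user.valence < -0.3:
        tone = "somber" if user.arousal < 0.5 else "reassuring"
    elif user.valence > 0.3:
        tone = "cheerful" if user.arousal >= 0.5 else "warm"
    else:
        tone = "neutral"
    # Loosely mirror the user's energy so the agent's delivery doesn't feel jarring.
    return {"tone": tone, "speaking_rate": 0.8 + 0.4 * user.arousal}

# A sad, low-energy utterance would produce a slower, somber reply style.
print(reply_style(EmotionEstimate(valence=-0.7, arousal=0.2)))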
