MIT’s AlterEgo Headset Lets You Silently Converse with Voice-Controlled Devices

3 May 2018 11:00am, by

The idea of using one’s voice to tell a computer what to do is a pretty common one in the sci-fi genre, but it’s also something we’re actually seeing now in voice-activated assistants like Siri, Alexa and Cortana. But the big drawback to these platforms is that these are not private exchanges; those who are within earshot can also hear whatever conversations you’re having with your devices.

But that could change with a device being developed by researchers at the Massachusetts Institute of Technology’s Media Lab. Dubbed the AlterEgo, it’s a stealthy tool that allows users to interface with computers using commands that are verbalized silently, by interpreting the neuromuscular signals in the jaw and face that occur when one ‘talks’ to oneself.

“The motivation for this was to build an IA device — an intelligence-augmentation device,” said Arnav Kapur, an MIT graduate student and one of the lead authors of the research paper. “Our idea was: Could we have a computing platform that’s more internal, that melds human and machine in some ways and that feels like an internal extension of our own cognition?”

AlterEgo is conceived of as a “wearable silent speech interface” where users can not only issue commands and queries in natural language, but also hear answers silently. The design takes advantage of what is known as subvocalization, or the natural process of “silent speech” that one typically engages in while reading. This silent act of articulating speech before it becomes audible sound creates tiny movements in the larynx and other muscles, generating electrical signals that can be detected by machines. To track these minute electrical signatures, the AlterEgo device employs four electrodes that are placed on strategic points along one side of the face and jaw, where these signals can be reliably picked up.

These electrical impulses from the user’s internal verbalizations are then processed by a convolutional neural network that has been trained to classify and translate these signals into words. AlterEgo can then respond in natural language as well, via bone-conduction headphones that vibrate into the user’s inner ear, allowing them to hear the answer without drowning out other auditory information from the user’s environment. In addition, the interface’s neural network can retrain itself to adapt to the idiosyncrasies of each user’s neurophysiology.
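
To make the classification step a little more concrete, here is a minimal sketch of the forward pass of a one-dimensional convolutional classifier over a four-channel signal, one channel per electrode. It is not the researchers’ model: the filters, the dense layer, the two-word vocabulary and the sample signals are all made up for illustration, and a real network would learn its weights from training data.

```python
# Toy forward pass of a 1D convolutional classifier over multi-channel signals.
# Every weight, label and sample value below is hypothetical.

def conv1d_relu(signal, kernels):
    """Slide each filter along the time axis and apply ReLU.

    signal:  list of channels, each a list of floats of equal length
    kernels: list of filters; each filter holds one weight list per channel
    returns: one feature map (list of floats) per filter
    """
    length = len(signal[0])
    maps = []
    for kernel in kernels:
        width = len(kernel[0])
        feature_map = []
        for start in range(length - width + 1):
            total = 0.0
            for channel, weights in zip(signal, kernel):
                for offset, w in enumerate(weights):
                    total += w * channel[start + offset]
            feature_map.append(max(0.0, total))  # ReLU activation
        maps.append(feature_map)
    return maps


def classify(signal, kernels, dense, labels):
    """Convolve, max-pool each feature map, score each label, pick the best."""
    features = [max(m) for m in conv1d_relu(signal, kernels)]  # global max pool
    scores = [sum(w * f for w, f in zip(row, features)) for row in dense]
    best = max(range(len(scores)), key=scores.__getitem__)
    return labels[best]


# Two hand-written filters: each reacts to a rising edge on one electrode.
KERNELS = [
    [[-1.0, 1.0], [0.0, 0.0], [0.0, 0.0], [0.0, 0.0]],  # watches channel 0
    [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0], [-1.0, 1.0]],  # watches channel 3
]
DENSE = [[1.0, 0.0], [0.0, 1.0]]  # one row of weights per label
LABELS = ["yes", "no"]            # hypothetical two-word vocabulary

FLAT = [0.0] * 6
signal_a = [[0.0, 0.0, 1.0, 1.0, 0.0, 0.0], FLAT, FLAT, FLAT]
signal_b = [FLAT, FLAT, FLAT, [0.0, 1.0, 1.0, 0.0, 0.0, 0.0]]

print(classify(signal_a, KERNELS, DENSE, LABELS))  # prints "yes"
print(classify(signal_b, KERNELS, DENSE, LABELS))  # prints "no"
```

The point of the sketch is only the shape of the pipeline: filters slide along the time axis of each electrode channel, the strongest response per filter is kept, and a final layer maps those responses to a word.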

The team tested their prototype with 10 participants by getting them to use the device for specific tasks, namely arithmetic computations and the playing of chess, using the standard notation system to report moves across the game board. The researchers chose these tasks in particular as these could be performed with a relatively limited range of vocabulary words. On average, the device was able to accurately transcribe signals 92 percent of the time, with an average latency of 0.427 seconds.
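
As a rough illustration of how figures like these are computed, the sketch below averages accuracy and latency over a log of trials. The `summarize` helper and the four trials are hypothetical and are not data from the study; they only show the arithmetic behind an accuracy percentage and a mean latency.

```python
def summarize(trials):
    """Return (accuracy, mean latency in seconds) for a list of trials.

    Each trial is a tuple: (expected_word, predicted_word, latency_seconds).
    """
    correct = sum(1 for expected, predicted, _ in trials if expected == predicted)
    accuracy = correct / len(trials)
    mean_latency = sum(latency for _, _, latency in trials) / len(trials)
    return accuracy, mean_latency


# Hypothetical trial log, made up for illustration only.
trials = [
    ("e4", "e4", 0.40),
    ("Nf3", "Nf3", 0.45),
    ("plus", "plus", 0.50),
    ("seven", "eleven", 0.25),  # a misrecognized word
]

accuracy, mean_latency = summarize(trials)
print(f"accuracy: {accuracy:.0%}")            # accuracy: 75%
print(f"mean latency: {mean_latency:.3f} s")  # mean latency: 0.400 s
```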

There are a lot of interesting possibilities here: not only would we be able to converse with our machines privately without anyone eavesdropping, but such an interface would also create a much more seamless user experience. Rather than having to type or switch between applications on a smartphone, for instance, one could just silently converse with one’s device to perform a task, without having to divert one’s attention too much from what is going on in one’s environment.

This smoother workflow with machines could apply not only to interacting with various applications, IoT or smart devices, but also to workplaces where some auditory discretion is required when working with collaborative robots, such as in the medical field or in military operations.

On a broader level, the AlterEgo’s creators envision something else emerging altogether. Faced with the rise of super-intelligent machines that will likely overtake humans in many areas of life, tools like the AlterEgo could help level the playing field by significantly augmenting human abilities, creating a more “natural human-machine symbiosis” and advancing the field of “silent computing.”

“AlterEgo aims to combine humans and computers — such that computing, the internet, and AI would weave into human personality as a ‘second self’ and augment human cognition and abilities,” said the research team.

Others like Elon Musk have proposed similar brain-computer interfaces that would permit humans to gain access to greater computational power and the collective knowledge of humanity. However, AlterEgo would potentially be a less invasive way to achieve the same end.

The idea of that symbiotic ‘second self’ that is an interweaving of human and machine is far off yet, but in the meantime, we still haven’t completely untangled all the ethical and philosophical implications of such a possibility. For now, Kapur says that the goal is to improve the system’s performance further with more training data, which will also help develop a more generalized silent speech recognition model, and build a wider range of applications with larger vocabularies. Ultimately, the plan is to refine it so the interface can approach a more conversational level — meaning that one day, such devices might very well get many of us talking to ourselves on a regular basis.

Read the rest of the paper.

Images: MIT Media Lab

Feature image via Pixabay.
