May 7, 2019

Scientists create speech using brain signals

At a Glance

  • Scientists used brain signals recorded from patients with epilepsy to program a computer to mimic natural speech.
  • This advance could one day help people who are unable to speak to communicate.
Spectrograms of the sound frequencies when a patient spoke the sentence "Is this seesaw safe?" (top) and a synthesized version of the same sentence (bottom). The synthetic speech is similar to the spoken sentence, but with less detail. Chang lab

Losing the ability to speak can have devastating effects for people whose facial, tongue, and larynx muscles have been paralyzed due to stroke or other neurological conditions.

Technology has helped these patients communicate through devices that translate head or eye movements into speech. Because these systems require selecting individual letters or whole words to build sentences, communication is slow: they can generate at most about 10 words per minute, while people speak at roughly 150 words per minute.

Instead of recreating sounds based on individual letters or words, a team led by Dr. Edward Chang at the University of California, San Francisco, looked for a way to synthesize speech directly from brain activity. The team studied five patients with epilepsy who'd had electrodes placed in their brains to detect seizures prior to surgery. They focused on the brain activity involved in controlling the more than 100 muscles needed to produce speech. The research was supported in part by an NIH Director's New Innovator Award, the NIH BRAIN Initiative, and the National Institute of Neurological Disorders and Stroke (NINDS). Results were published on April 24, 2019, in Nature.

The researchers first recorded signals from the brain area that produces language while participants read hundreds of sentences out loud. Using this data, the team then mapped out the vocal movements the participants used to make the different sounds, including how they moved their lips, tongue, jaw, and vocal cords. Next, the researchers programmed this information into a computer with machine-learning algorithms to decode the brain activity data and produce synthetic speech.
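The pipeline described above has two stages: brain activity is first decoded into vocal-tract movements, and those movements are then converted into sound. The sketch below illustrates that two-stage structure with toy linear models; all array shapes, weights, and function names are assumptions for illustration, not the study's actual decoder.

```python
# Illustrative two-stage decoding sketch: neural signals -> articulator
# movements -> spectrogram frames. Dimensions and models are invented
# placeholders, not the architecture used in the study.
import numpy as np

rng = np.random.default_rng(0)

def stage1_neural_to_kinematics(neural, w1):
    """Map neural activity (time x electrodes) to vocal-tract movements
    of the lips, tongue, jaw, and vocal cords (time x articulators)."""
    return np.tanh(neural @ w1)

def stage2_kinematics_to_spectrogram(kinematics, w2):
    """Map articulator movements to acoustic features that can be
    rendered as audio (time x frequency bins)."""
    return kinematics @ w2

# Toy dimensions: 100 time steps, 256 electrodes, 33 articulator
# features, 32 spectrogram bins.
neural = rng.standard_normal((100, 256))
w1 = rng.standard_normal((256, 33)) * 0.05
w2 = rng.standard_normal((33, 32)) * 0.05

kinematics = stage1_neural_to_kinematics(neural, w1)
spectrogram = stage2_kinematics_to_spectrogram(kinematics, w2)
print(spectrogram.shape)  # (100, 32)
```

In the actual study these mappings were learned with machine-learning algorithms trained on the recorded sentences; the point here is only the intermediate articulatory step between brain signals and sound.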

Video: Speech synthesis from neural decoding of spoken sentences. UCSF

Volunteers were then asked to listen to 325 synthesized single words or 101 synthesized sentences and transcribe what they heard. When choosing from a list of 25 possible words, they accurately identified more than half of the synthesized words, and they transcribed the computer-spoken sentences without any errors 43% of the time. Accuracy varied with the number of syllables in a word and the number of possible words in a sentence.
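The listening-test numbers above can be put in context against chance performance. With 25 candidate words, random guessing would succeed only 4% of the time; the snippet below works through that arithmetic. The 4% chance level is simple calculation, while the word and sentence figures are the study's reported results.

```python
# Chance level vs. reported performance in the 25-choice listening test.
pool_size = 25
chance = 1 / pool_size        # probability of picking the right word at random
word_accuracy = 0.5           # "more than half" of single words identified
sentence_error_free = 0.43    # share of sentences transcribed with no errors

print(f"chance: {chance:.0%}")                            # 4%
print(f"words vs chance: {word_accuracy / chance:.1f}x")  # 12.5x
```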

The decoded vocal movement maps were similar across the volunteers, suggesting that this step may be generalizable across patients. That may make it easier to apply these findings to multiple individuals.

"For the first time, this study demonstrates that we can generate entire spoken sentences based on an individual's brain activity," Chang says. "This is an exhilarating proof of principle that with technology that is already within reach, we should be able to build a device that is clinically viable in patients with speech loss."

The researchers plan to design a clinical trial involving patients who are paralyzed and have speech impairments to determine whether this approach could be used to produce synthetic speech for them.

References: Anumanchipalli GK, Chartier J, Chang EF. Speech synthesis from neural decoding of spoken sentences. Nature. 2019 Apr;568(7753):493-498. doi: 10.1038/s41586-019-1119-1. Epub 2019 Apr 24. PMID: 31019317.

Funding: NIH's National Institute of Neurological Disorders and Stroke (NINDS), NIH BRAIN Initiative, and NIH Director's New Innovator Award; New York Stem Cell Foundation; William K. Bowes Foundation; Howard Hughes Medical Institute; and Shurl and Kay Curci Foundation.