Posts Tagged ‘speech’

By Alison George

Human speech contains more than 2000 different sounds, from the ubiquitous “m” and “a” to the rare clicks of some southern African languages. But why are certain sounds more common than others? A ground-breaking, five-year investigation shows that diet-related changes in human bite led to new speech sounds that are now found in half the world’s languages.

More than 30 years ago, the linguist Charles Hockett noted that speech sounds called labiodentals, such as “f” and “v”, were more common in the languages of societies that ate softer foods. Now a team of researchers led by Damián Blasi at the University of Zurich, Switzerland, has pinpointed how and why this trend arose.

They found that the upper and lower incisors of ancient human adults were aligned, making it hard to produce labiodentals, which are formed by touching the lower lip to the upper teeth. Later, our jaws changed to an overbite structure, making it easier to produce such sounds.

The team showed that this change in bite correlated with the development of agriculture in the Neolithic period. Food became easier to chew at this point, which led to changes in human jaws and teeth: for instance, because it takes less pressure to chew softer, farmed foods, the jawbone doesn’t have to do as much work and so doesn’t grow to be so large.

Analyses of a language database also confirmed that there was a global change in the sound of world languages after the Neolithic era, with the use of “f” and “v” increasing dramatically in recent millennia. These sounds are still not found in the languages of many hunter-gatherer people today.
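As a rough illustration of the kind of cross-linguistic tally described above (not the team’s actual analysis), the short Python sketch below assumes a PHOIBLE-style table of phoneme inventories in a hypothetical file phoneme_inventories.csv, with made-up columns language, phoneme and subsistence, and asks what share of languages in each subsistence category has /f/ or /v/.

    import pandas as pd

    # Hypothetical inventory table: one row per (language, phoneme), plus a label
    # for how the society traditionally obtains food.
    inv = pd.read_csv("phoneme_inventories.csv")  # columns: language, phoneme, subsistence

    labiodentals = {"f", "v"}

    # Does each language have at least one labiodental in its inventory?
    has_fv = (
        inv.assign(labiodental=inv["phoneme"].isin(labiodentals))
           .groupby(["language", "subsistence"])["labiodental"]
           .any()
           .reset_index()
    )

    # Share of languages with /f/ or /v/, split by subsistence type
    # (e.g. "hunter_gatherer" vs "agricultural").
    print(has_fv.groupby("subsistence")["labiodental"].mean())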

This research overturns the prevailing view that all human speech sounds were present when Homo sapiens evolved around 300,000 years ago. “The set of speech sounds we use has not necessarily remained stable since the emergence of our species, but rather the immense diversity of speech sounds that we find today is the product of a complex interplay of factors involving biological change and cultural evolution,” said team member Steven Moran, a linguist at the University of Zurich, at a briefing about this study.

This new approach to studying language evolution is a game changer, says Sean Roberts at the University of Bristol, UK. “For the first time, we can look at patterns in global data and spot new relationships between the way we speak and the way we live,” he says. “It’s an exciting time to be a linguist.”

Journal reference: Science, DOI: 10.1126/science.aav3218

https://www.newscientist.com/article/2196580-humans-couldnt-pronounce-f-and-v-sounds-before-farming-developed/

“Foreign Accent Syndrome” (FAS) is a rare disorder in which patients start to speak with what sounds like a foreign or regional accent. This striking condition is often associated with brain damage, such as stroke. Presumably, the lesion affects the neural pathways by which the brain controls the tongue and vocal cords, producing strange-sounding speech.

Yet there may be more to FAS than meets the eye (or ear). According to a new paper in the Journal of Neurology, Neurosurgery and Psychiatry, many or even most cases of FAS are ‘functional’, meaning that the cause of the symptoms lies in psychological processes rather than a brain lesion.

To reach this conclusion, authors Laura McWhirter and colleagues recruited 49 self-described FAS sufferers from two online communities to participate in a study. All were English-speaking. The most common reported foreign accents were Italian (12 cases), Eastern European (11), French (8) and German (7), but more obscure accents were also reported, including Dutch, Nigerian, and Croatian.

Participants submitted a recording of their voice for assessment by speech experts and answered questions about their symptoms, other health conditions, and personal situation. McWhirter et al. classified 35 of the 49 patients (71%) as having ‘probably functional’ FAS, while only 10 of the 49 (20%) were judged likely to have a neurological basis, with the rest unclear.

These classifications are somewhat subjective, as there are no hard-and-fast criteria for functional FAS. None of the ‘functional’ cases reported hard evidence of neurological damage from a brain scan, and even among the ‘neurological’ cases only 50% reported such evidence. The presence of other ‘functional’ symptoms, such as irritable bowel syndrome (IBS), was higher in the ‘functional’ group.

In terms of the characteristics of the foreign accents, patients with a presumed functional origin often presented with speech patterns that were inconsistent or variable: one patient, for instance, pronounced ‘cookie jar’ as ‘tutty dar’ yet could correctly produce the ‘j’, ‘k’, ‘g’ and ‘sh’ sounds in other words.

But if FAS is often a psychological disorder, what is the psychology behind it? McWhirter et al. don’t get into this, but it is interesting to note that FAS is often a media-friendly condition. In recent years there have been many news stories dedicated to individual FAS cases. To take just three:

American beauty queen with Foreign Accent Syndrome sounds IRISH, AUSTRALIAN and BRITISH
https://www.mirror.co.uk/news/weird-news/you-sound-like-spice-girl-11993052

Scouse mum regains speech after stroke – but is shocked when her accent turns Russian
https://www.liverpoolecho.co.uk/news/liverpool-news/scouse-mum-regains-speech-after-15931862

Traumatic car accident victim has Irish accent after suffering severe brain injury
https://www.irishcentral.com/news/brain-injury-foreign-accent-syndrome

http://blogs.discovermagazine.com/neuroskeptic/2019/03/09/curious-foreign-accent-syndrome/#.XI58R6BKiUn

by George Dvorsky

Using brain-scanning technology, artificial intelligence, and speech synthesizers, scientists have converted brain patterns into intelligible verbal speech—an advance that could eventually give voice to those without.

It’s a shame Stephen Hawking isn’t alive to see this, as he may have gotten a real kick out of it. The new speech system, developed by researchers at the Neural Acoustic Processing Lab at Columbia University in New York City, is something the late physicist might have benefited from.

Hawking had amyotrophic lateral sclerosis (ALS), a motor neuron disease that took away his verbal speech, but he continued to communicate using a computer and a speech synthesizer. By using a cheek switch affixed to his glasses, Hawking was able to pre-select words on a computer, which were read out by a voice synthesizer. It was a bit tedious, but it allowed Hawking to produce around a dozen words per minute.

But imagine if Hawking didn’t have to manually select and trigger the words. Indeed, some individuals, whether they have ALS, locked-in syndrome, or are recovering from a stroke, may not have the motor skills required to control a computer, even by just a tweak of the cheek. Ideally, an artificial voice system would capture an individual’s thoughts directly to produce speech, eliminating the need to control a computer.

New research published today in Scientific Reports takes us an important step closer to that goal, but instead of capturing an individual’s internal thoughts to reconstruct speech, it uses the brain patterns produced while listening to speech.

To devise such a speech neuroprosthesis, neuroscientist Nima Mesgarani and his colleagues combined recent advances in deep learning with speech synthesis technologies. Their resulting brain-computer interface, though still rudimentary, captured brain patterns directly from the auditory cortex, which were then decoded by an AI-powered vocoder, or speech synthesizer, to produce intelligible speech. The speech sounded very robotic, but nearly three in four listeners were able to discern the content. It’s an exciting advance—one that could eventually help people who have lost the capacity for speech.

To be clear, Mesgarani’s neuroprosthetic device isn’t translating an individual’s covert speech—that is, the thoughts in our heads, also called imagined speech—directly into words. Unfortunately, we’re not quite there yet in terms of the science. Instead, the system captured an individual’s distinctive cognitive responses as they listened to recordings of people speaking. A deep neural network was then able to decode, or translate, these patterns, allowing the system to reconstruct speech.

“This study continues a recent trend in applying deep learning techniques to decode neural signals,” Andrew Jackson, a professor of neural interfaces at Newcastle University who wasn’t involved in the new study, told Gizmodo. “In this case, the neural signals are recorded from the brain surface of humans during epilepsy surgery. The participants listen to different words and sentences which are read by actors. Neural networks are trained to learn the relationship between brain signals and sounds, and as a result can then reconstruct intelligible reproductions of the words/sentences based only on the brain signals.”

Epilepsy patients were chosen for the study because they often have to undergo brain surgery. Mesgarani, with the help of Ashesh Dinesh Mehta, a neurosurgeon at Northwell Health Physician Partners Neuroscience Institute and a co-author of the new study, recruited five volunteers for the experiment. The team used invasive electrocorticography (ECoG) to measure neural activity as the patients listened to continuous speech sounds. The patients listened, for example, to speakers reciting digits from zero to nine. Their brain patterns were then fed into the AI-enabled vocoder, resulting in the synthesized speech.
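To give a sense of the architecture, here is a minimal sketch in Python/PyTorch of a decoder of this kind. It is not the authors’ actual model: a small feed-forward network regresses from a window of ECoG features to one frame of vocoder parameters for the speech being heard, and the electrode count, window length, parameter count and random training tensors are all placeholders.

    import torch
    import torch.nn as nn

    # Assumed sizes, for illustration only.
    N_ELECTRODES, CONTEXT_FRAMES, N_VOCODER_PARAMS = 128, 9, 32

    class ECoGToVocoder(nn.Module):
        """Maps a window of ECoG features to one frame of vocoder parameters."""
        def __init__(self):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(N_ELECTRODES * CONTEXT_FRAMES, 512), nn.ReLU(),
                nn.Linear(512, 512), nn.ReLU(),
                nn.Linear(512, N_VOCODER_PARAMS),
            )

        def forward(self, x):  # x: (batch, electrodes * context frames)
            return self.net(x)

    # Dummy training loop on random tensors standing in for real ECoG/speech pairs.
    model = ECoGToVocoder()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.MSELoss()
    ecog = torch.randn(256, N_ELECTRODES * CONTEXT_FRAMES)   # neural features
    target = torch.randn(256, N_VOCODER_PARAMS)              # heard-speech frames
    for epoch in range(10):
        opt.zero_grad()
        loss = loss_fn(model(ecog), target)
        loss.backward()
        opt.step()
    # At test time, predicted parameter frames would be passed to a vocoder
    # to synthesize audio for listeners to evaluate.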

The results were very robotic-sounding, but fairly intelligible. In tests, listeners could correctly identify spoken digits around 75 percent of the time. They could even tell if the speaker was male or female. Not bad, and a result that even came as “a surprise” to Mesgarani, as he told Gizmodo in an email.

Recordings of the speech synthesizer can be found here (the researchers tested various techniques, but the best result came from the combination of deep neural networks with the vocoder).

The use of a voice synthesizer in this context, as opposed to a system that can match and recite pre-recorded words, was important to Mesgarani. As he explained to Gizmodo, there’s more to speech than just putting the right words together.

“Since the goal of this work is to restore speech communication in those who have lost the ability to talk, we aimed to learn the direct mapping from the brain signal to the speech sound itself,” he told Gizmodo. “It is possible to also decode phonemes [distinct units of sound] or words, however, speech has a lot more information than just the content—such as the speaker [with their distinct voice and style], intonation, emotional tone, and so on. Therefore, our goal in this particular paper has been to recover the sound itself.”

Looking ahead, Mesgarani would like to synthesize more complicated words and sentences, and collect brain signals of people who are simply thinking or imagining the act of speaking.

Jackson was impressed with the new study, but he said it’s still not clear if this approach will apply directly to brain-computer interfaces.

“In the paper, the decoded signals reflect actual words heard by the brain. To be useful, a communication device would have to decode words that are imagined by the user,” Jackson told Gizmodo. “Although there is often some overlap between brain areas involved in hearing, speaking, and imagining speech, we don’t yet know exactly how similar the associated brain signals will be.”

William Tatum, a neurologist at the Mayo Clinic who was also not involved in the new study, said the research is important in that it’s the first to use artificial intelligence to reconstruct speech from the brain waves involved in generating known acoustic stimuli. The significance is notable, “because it advances application of deep learning in the next generation of better designed speech-producing systems,” he told Gizmodo. That said, he felt the sample size of participants was too small, and that the use of data extracted directly from the human brain during surgery is not ideal.

Another limitation of the study is that the neural networks would have to be trained on a large number of brain signals from each participant in order to do more than just reproduce the digits zero through nine. The system is patient-specific, as we all produce different brain patterns when we listen to speech.

“It will be interesting in future to see how well decoders trained for one person generalize to other individuals,” said Jackson. “It’s a bit like early speech recognition systems that needed to be individually trained by the user, as opposed to today’s technology, such as Siri and Alexa, that can make sense of anyone’s voice, again using neural networks. Only time will tell whether these technologies could one day do the same for brain signals.”

No doubt, there’s still lots of work to do. But the new paper is an encouraging step toward the achievement of implantable speech neuroprosthetics.

https://gizmodo.com/neuroscientists-translate-brain-waves-into-recognizable-1832155006

https://www.nature.com/articles/s41598-018-37359-z

An automated speech analysis program correctly differentiated between at-risk young people who developed psychosis over a two-and-a-half year period and those who did not. In a proof-of-principle study, researchers at Columbia University Medical Center, New York State Psychiatric Institute, and the IBM T. J. Watson Research Center found that the computerized analysis provided a more accurate classification than clinical ratings. The study, “Automated Analysis of Free Speech Predicts Psychosis Onset in High-Risk Youths,” was recently published in npj Schizophrenia.

About one percent of the population between the ages of 14 and 27 is considered to be at clinical high risk (CHR) for psychosis. CHR individuals have symptoms such as unusual or tangential thinking, perceptual changes, and suspiciousness. About 20% will go on to experience a full-blown psychotic episode. Identifying who falls in that 20% before psychosis occurs has been an elusive goal. Early identification could lead to intervention and support that could delay, mitigate or even prevent the onset of serious mental illness.

Speech provides a unique window into the mind, giving important clues about what people are thinking and feeling. Participants in the study took part in an open-ended, narrative interview in which they described their subjective experiences. These interviews were transcribed and then analyzed by computer for patterns of speech, including semantics (meaning) and syntax (structure).

The analysis established each patient’s semantic coherence (how well he or she stayed on topic) and syntactic structure, such as phrase length and use of determiner words that link the phrases. A clinical psychiatrist may intuitively recognize these signs of disorganized thought in a traditional interview, but a machine can augment what is heard by precisely measuring the variables. The participants were then followed for two and a half years.

The speech features that predicted psychosis onset included breaks in the flow of meaning from one sentence to the next, and speech characterized by shorter phrases with less elaboration. Strikingly, the speech classifier developed in this study to automatically sort these specific, symptom-related features achieved 100% accuracy: the computer analysis correctly differentiated between the five individuals who later experienced a psychotic episode and the 29 who did not. These results suggest that this method may be able to identify thought disorder in its earliest, most subtle form, years before the onset of psychosis. Thought disorder is a key component of schizophrenia, but quantifying it has proved difficult.
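As a rough illustration of how features like these can be computed (not the study’s actual pipeline), the Python sketch below approximates semantic coherence as the cosine similarity between TF-IDF vectors of consecutive sentences, adds simple proxies for phrase length and determiner use, and feeds them to a linear classifier under leave-one-out cross-validation. The transcripts and labels are toy examples.

    import numpy as np
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity
    from sklearn.model_selection import LeaveOneOut, cross_val_score
    from sklearn.svm import SVC

    DETERMINERS = {"the", "a", "an", "this", "that", "these", "those"}

    def transcript_features(sentences):
        """Coherence, phrase-length and determiner-use features for one interview."""
        tfidf = TfidfVectorizer().fit_transform(sentences)
        # Semantic coherence proxy: similarity of each sentence to the previous one.
        sims = [cosine_similarity(tfidf[i], tfidf[i + 1])[0, 0]
                for i in range(len(sentences) - 1)]
        words = [w.lower().strip(".,") for s in sentences for w in s.split()]
        det_rate = sum(w in DETERMINERS for w in words) / len(words)
        mean_len = float(np.mean([len(s.split()) for s in sentences]))
        return [min(sims), float(np.mean(sims)), mean_len, det_rate]

    # Toy interview transcripts (lists of sentences) and toy outcome labels:
    # 1 = later converted to psychosis, 0 = did not.
    transcripts = [
        ["I took the bus to the clinic.", "The clinic was crowded.", "I waited an hour to be seen."],
        ["My sister visited on Sunday.", "We cooked dinner together.", "She left before dark."],
        ["The radio talks to me sometimes.", "Blue is a kind of Tuesday.", "Wires follow people at night."],
        ["Numbers outside mean something else.", "Sky water keeps thoughts in.", "They watch through the walls."],
    ]
    labels = [0, 0, 1, 1]

    X = np.array([transcript_features(t) for t in transcripts])
    y = np.array(labels)

    # Leave-one-out cross-validation with a linear classifier, standing in for
    # the study's classifier; accuracy on toy data is not meaningful.
    scores = cross_val_score(SVC(kernel="linear"), X, y, cv=LeaveOneOut())
    print("leave-one-out accuracy:", scores.mean())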

For the field of schizophrenia research, and for psychiatry more broadly, this opens the possibility that new technology can aid in prognosis and diagnosis of severe mental disorders, and track treatment response. Automated speech analysis is inexpensive, portable, fast, and non-invasive. It has the potential to be a powerful tool that can complement clinical interviews and ratings.

Further research with a second, larger group of at-risk individuals is needed to see if this automated capacity to predict psychosis onset is both robust and reliable. Automated speech analysis used in conjunction with neuroimaging may also be useful in reaching a better understanding of early thought disorder, and the paths to develop treatments for it.

http://medicalxpress.com/news/2015-08-psychosis-automated-speech-analysis.html