Machine Translates Thoughts into Speech in Real Time (physorg.com)
111 points by unignorant on Dec 26, 2009 | 15 comments


Remarkable. Here's what seems to be the key point (reordered a bit for clarity):

The study supported our hypothesis [...] that the premotor cortex represents intended speech as an 'auditory trajectory,' that is, as a set of key frequencies (formant frequencies) that vary with time in the acoustic signal we hear as speech. [...] In an intact brain, these frequency trajectories are sent to the primary motor cortex where they are transformed into motor commands to the speech articulators. [...We] had to interpret these frequency trajectories in order to translate them into speech. [...] In other words, we could predict the intended sound directly from neural activity in the premotor cortex, rather than try to predict the positions of all the speech articulators individually and then try to reconstruct the intended sound [...]
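
To make the "auditory trajectory" idea concrete, here's a toy sketch (not the authors' decoder) of classifying a vowel from a sequence of (F1, F2) formant pairs by nearest-neighbor matching against fixed vowel targets. The target frequencies are illustrative textbook values for an adult male voice, not numbers from the paper:

```python
# Hypothetical (F1, F2) formant targets in Hz -- illustrative values only.
VOWEL_TARGETS = {
    "a": (730, 1090),
    "i": (270, 2290),
    "u": (300, 870),
}

def classify_frame(f1, f2):
    # Nearest target in (F1, F2) space, squared Euclidean distance.
    return min(
        VOWEL_TARGETS,
        key=lambda v: (f1 - VOWEL_TARGETS[v][0]) ** 2
                    + (f2 - VOWEL_TARGETS[v][1]) ** 2,
    )

def classify_trajectory(frames):
    # A trajectory is a time series of (F1, F2) frames; take a majority
    # vote over the per-frame decisions.
    votes = [classify_frame(f1, f2) for f1, f2 in frames]
    return max(VOWEL_TARGETS, key=votes.count)

# A wobbly trajectory near the /i/ targets still classifies as "i".
print(classify_trajectory([(260, 2300), (290, 2250), (310, 2100)]))
```

The point of the quoted passage is that decoding targets in this acoustic space is simpler than estimating the positions of every articulator and re-deriving the sound from those.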

Also remarkable (but maybe this is old hat to people who know about this stuff?) is that the signals they're interpreting come from neurites that actually started growing into the electrode months after it had been implanted.

I suppose there is a big difference between being able to interpret pre-speech frequencies in a normal brain (i.e. of a person who hasn't used this device before) and someone training themselves to communicate with this device over time. Given how adaptable the brain is, it's the latter that would seem to be the big win (and the article does vaguely imply this). Of course the device presumably wouldn't work at all if it weren't rooted in normal speech function.


The key facts:

In the current study, only three vowel sounds were tested. The test subject's average hit rate increased from 45% to 70% across sessions, reaching a high of 89% in the last session.


Holy cow! Is this 1st April? Unbelievable. They implanted an electrode in a disabled guy's brain, powered it wirelessly, and it wirelessly sent back signals representing the audio frequencies of what the guy wanted to say. Decoded on a computer with 50ms latency, 89% accuracy on vowels.


I work as an EEG technician, with a background in electronics and computer programming. This technology is not just possible, it's already a reality. The results come from a combination of classical conditioning and cognitive neurology. http://en.wikipedia.org/wiki/Brain%E2%80%93computer_interfac...

This is not a huge leap from other devices like cochlear implants. Some of the newer implants use coils to avoid having wires pass through the skull. http://en.wikipedia.org/wiki/Cochlear_implant


Yeah, I think I'll have to see more evidence before I'll believe this exists.


The physorg article links to the original research paper, which is published in an open access journal that anyone can view. Perhaps you would be interested in video S1 in the supporting information section:

http://www.plosone.org/article/info:doi/10.1371/journal.pone...


It's not that wild really. You don't need a whole lot of information to get different vowels. Consonants are harder, but only in the sense that you need more electrodes and training time. The synthesis of speech (as opposed to the sampling of phonemes you usually hear in computer speech) is well understood and can be managed with (IIRC) about 16 parameters.
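
To illustrate how little it takes to get distinguishable vowels, here's a toy two-formant cascade synthesizer (a sketch in the spirit of classic source-filter synthesis, not any particular production system): a pitch-period impulse train filtered through second-order resonators at the formant frequencies. The formant and bandwidth values are illustrative assumptions:

```python
import math

def apply_resonator(signal, freq_hz, bandwidth_hz, sr):
    # Second-order all-pole resonator centered on the formant frequency.
    r = math.exp(-math.pi * bandwidth_hz / sr)
    a1 = -2.0 * r * math.cos(2.0 * math.pi * freq_hz / sr)
    a2 = r * r
    gain = 1.0 - r  # rough amplitude compensation
    y1 = y2 = 0.0
    out = []
    for x in signal:
        y = gain * x - a1 * y1 - a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out

def synth_vowel(f1, f2, dur=0.3, f0=120, sr=16000):
    # Impulse train at the pitch period stands in for the glottal source.
    n = int(dur * sr)
    period = int(sr / f0)
    source = [1.0 if i % period == 0 else 0.0 for i in range(n)]
    # Cascade two formant resonators, then normalize to unit peak.
    y = apply_resonator(source, f1, 80.0, sr)
    y = apply_resonator(y, f2, 120.0, sr)
    peak = max(abs(v) for v in y) or 1.0
    return [v / peak for v in y]

# Typical adult-male formant targets for /a/ -- illustrative values.
samples = synth_vowel(730, 1090)
```

A real-time decoder only has to steer a handful of such parameters (pitch, a few formant frequencies and bandwidths, amplitude) to produce intelligible vowels, which is why a small number of decoded channels goes a long way.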


I'm hoping it's real, but we should keep in mind that the source is physorg.com.


Unfortunately I can't read the story from my iPhone. They redirect a perfectly good site to a minimal site that doesn't properly follow links, so I'm redirected to the current story list. Clicking on Full Site switches to the proper display but then the story isn't in the list.


http://pda.physorg.com/speech-speechsynthesizer-frequencytra... should work. In fact, I prefer this layout to the 'normal' one.



That is quite amazing. It's a pity you posted this on what is probably one of the lowest traffic days of the year for HN, but thank you anyway.


I wonder how it distinguishes a person talking to himself (without actually speaking) from thinking while speaking to somebody.

If I had to use one of these I'd surely need a 'mute' feature :)


Polygraph replacement incoming?


I want one.



