Slashdot Mirror

Extracting Audio From Visual Information

Posted by samzenpus on Monday August 4, 2014 @01:38AM from the what-the-bag-says dept.

rtoz writes: Researchers at MIT, Microsoft, and Adobe have developed an algorithm that can reconstruct an audio signal by analyzing minute vibrations of objects depicted in video. In one set of experiments, they were able to recover intelligible speech from the vibrations of a potato-chip bag (video) photographed from 15 feet away through soundproof glass.

Scary by Anonymous Coward · 2014-08-04 01:48 · Score: 3, Interesting

This is cool, yet scary stuff.
I wonder how loud the original audio has to be in order to be recovered in this manner? It sounded to me like the spoken words were being shouted, and we have no way of knowing how loud the music was played. I didn't see any mention of that in the linked article.
The linked article has additional technical(ish) information that's not in the video.

Requires a very high speed camera by tepples · 2014-08-04 01:51 · Score: 4, Interesting

The YouTube video captions state that this technique requires a camera capable of a few thousand frames per second. Thus this is pretty much using a camera to follow the vibrations, little different from a laser mic. What would impress me more is if they were able to pick up different frequencies from different parts of the bag with different resonant frequencies and reconstruct from standard 30 fps video using the bag as a transducer.
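
To put rough numbers on the frame-rate point: a camera sampling at a given frame rate can only follow vibrations directly up to half that rate (the Nyquist limit), and anything faster folds down to a lower apparent frequency. The sketch below is only an illustration of that sampling arithmetic, not the researchers' method. The 2000 fps figure stands in for "a few thousand frames per second" and the 440 Hz tone is a hypothetical test signal; neither value comes from the article.

```python
def nyquist_hz(fps: float) -> float:
    """Highest vibration frequency a camera at `fps` can follow without aliasing."""
    return fps / 2.0


def aliased_hz(tone_hz: float, fps: float) -> float:
    """Apparent frequency of a pure tone once it is sampled at `fps` frames/s."""
    folded = tone_hz % fps
    # Frequencies above fps/2 mirror back into the 0..fps/2 band.
    return min(folded, fps - folded)


if __name__ == "__main__":
    tone = 440.0  # hypothetical test tone, not from the article
    for fps in (30.0, 2000.0):  # 2000 is an assumed "few thousand fps" value
        print(f"{fps:g} fps: limit {nyquist_hz(fps):g} Hz, "
              f"{tone:g} Hz tone appears at {aliased_hz(tone, fps):g} Hz")
```

Under these assumptions, plain frame-by-frame sampling at 30 fps tops out at 15 Hz, well below speech, which is why recovering audio from standard video would need some extra trick beyond simply following the vibrations.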