Researcher Turns HDD Into Rudimentary Microphone (bleepingcomputer.com)
An anonymous reader writes from Bleeping Computer: Speaking at a security conference, researcher Alfredo Ortega has revealed that you can use your hard disk drive (HDD) as a rudimentary microphone to pick up nearby sounds. This is possible because of how hard drives are designed to work. Sounds or nearby vibrations are nothing more than mechanical waves that cause HDD platters to vibrate. By design, a hard drive cannot read or write information to an HDD platter that moves under vibrations, so the hard drive must wait for the oscillation to stop before carrying out any actions. Because modern operating systems come with utilities that measure HDD operations up to nanosecond accuracy, Ortega realized that he could use these tools to measure delays in HDD operations. The longer the delay, the louder the sound or the more intense the vibration that causes it. These read-write delays allowed the researcher to reconstruct sound or vibration waves picked up by the HDD platters. A video demo is here.
"It's not accurate yet to pick up conversations," Ortega told Bleeping Computer in a private conversation. "However, there is research that can recover voice data from very low-quality signals using pattern recognition. I didn't have time to replicate the pattern-recognition portion of that research into mine. However, it's certainly applicable." Furthermore, the researcher also used sound to attack hard drives. Ortega played a 130Hz tone to make an HDD stop responding to commands. "The Linux kernel disconnected it entirely after 120 seconds," he said. There's a video of this demo on YouTube.
https://www.youtube.com/watch?...
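The summary describes timing HDD operations and treating longer delays as louder sound. A minimal sketch of that measurement loop might look like the following. This is not Ortega's tool: the function name `time_reads` and its parameters are made up for illustration, it assumes a Unix-like system where `os.pread` is available, and it ignores the page cache, which would hide real disk latency unless the reads are uncached (for example via `O_DIRECT` on Linux, not shown here).

```python
import os
import time


def time_reads(path, count=100, block_size=4096):
    """Return the latency, in nanoseconds, of `count` reads from `path`.

    Hypothetical helper: on a real spinning disk with uncached reads,
    spikes in these numbers would be the "signal" the summary describes.
    """
    delays = []
    fd = os.open(path, os.O_RDONLY)
    try:
        # st_size is 0 for block devices; max(..., 1) keeps the modulo valid,
        # in which case every read simply starts at offset 0.
        size = max(os.fstat(fd).st_size, 1)
        for i in range(count):
            offset = (i * block_size) % size
            start = time.perf_counter_ns()
            os.pread(fd, block_size, offset)
            delays.append(time.perf_counter_ns() - start)
    finally:
        os.close(fd)
    return delays
```

Calling `time_reads("/path/to/some/file")` returns one latency per read; plotting that list over time is roughly what the delay graph in the demo video appears to show.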
*"Cogito Ergo Liberalis"*
Remember: NEVER SHOUT AT YOUR JBOD!.
It's not yelling, if it's yelling?
- http://www.milkme.co.uk
The original finding from 2008 is here:
https://www.youtube.com/watch?...
No idea why anybody thinks this is worth a talk now.
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
Thank you.
End of Line.
Before all the silly conversations begin about "omg anyone's computer can be turned into an eavesdropping device!!!1" ... remember that if you can compromise a computer to the point where you can make low-level manipulations to the hard disk ... you can also simply turn on the microphone.
Tired of FB/Google censorship? Visit UNCENSORED!
I'm sure some nice men from the TLAs are now queuing outside his place offering money and facilities for him to conduct this 'further research' that you diss. One day headlines will say that terrorists have been caught using HDD sound recording technology.
I'm the original author.
First, you are kind of rude for calling me an idiot, especially if you didn't even read the friendly article.
Second, have you even looked at the video? No, the disk doesn't "temporarily park". The delay is proportional to the vibration amplitude, meaning you can sense sound volume at a low rate. The sample rate is about 50 Hz; it can't reconstruct a kHz signal, but voice is in the ~300 Hz range, and you don't need to reconstruct the complete signal to recognize it. You don't need to recognize a conversation, you need to recognize the patterns that the conversation causes. In the original article I provided a link to research that does exactly that with the gyroscopes in mobile devices.
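To illustrate the point about sensing volume at a low rate, here is a toy simulation, not the author's code. It assumes the read delay in each 20 ms window is proportional to the RMS amplitude of the sound in that window, which is a simplification of "delay is proportional to the vibration amplitude". The 3 Hz loudness pattern and all constant and function names are invented for the example; only the 50 Hz sample rate and ~300 Hz tone come from the comment.

```python
import cmath
import math

FINE_RATE = 6000    # Hz, only used to simulate the underlying sound
SENSOR_RATE = 50    # Hz, the sample rate quoted in the comment
TONE_HZ = 300       # Hz, roughly the voice frequency quoted in the comment
ENVELOPE_HZ = 3     # Hz, hypothetical loudness pattern (assumption)
DURATION_S = 2


def simulated_delays():
    """One 'delay' per 20 ms window, modeled as the RMS amplitude in that window."""
    window = FINE_RATE // SENSOR_RATE
    delays = []
    for w in range(SENSOR_RATE * DURATION_S):
        total = 0.0
        for i in range(window):
            t = (w * window + i) / FINE_RATE
            loudness = 1.0 + 0.5 * math.sin(2 * math.pi * ENVELOPE_HZ * t)
            sample = loudness * math.sin(2 * math.pi * TONE_HZ * t)
            total += sample * sample
        delays.append(math.sqrt(total / window))
    return delays


def dominant_frequency(samples, rate):
    """Frequency (Hz) of the strongest non-DC bin of a plain DFT."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]
    best_bin, best_mag = 0, 0.0
    for k in range(1, n // 2):
        acc = sum(c * cmath.exp(-2j * math.pi * k * i / n)
                  for i, c in enumerate(centered))
        if abs(acc) > best_mag:
            best_bin, best_mag = k, abs(acc)
    return best_bin * rate / n
```

`dominant_frequency(simulated_delays(), SENSOR_RATE)` should come out at the loudness-pattern frequency rather than the 300 Hz tone: at 50 samples per second nothing above 25 Hz can be represented, but the slow pattern the sound causes is still there.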
Let me know when you can do the same thing with a microwave oven.
#DeleteChrome
I would like to apologize on behalf of people with dismissive attitudes. It is a real problem not just with anonymous posts, but even at the workplace, especially among "half-technical" people, who are smart enough to understand jargon and comment but not enough to understand a reasoned argument. I've seen countless times where someone will quote from stackoverflow or some other source out of context, and several times where the source itself they quote from is utterly wrong to begin with, even in context. I might prove something with complex numbers, and they'll just quote someone saying you can't take a square root of negative numbers. Even after I convince them, they'll just laugh saying Intel CPUs don't support complex numbers, and I have to show them the Intel CPU spec for hardware acceleration of complex numbers (and even without hardware support, it can be easily emulated in software). I've learned to stop trying, half-technical people are impediments to innovations.
Now, after that apology is done, I would like to bring up some academic research that may relate to your study of signal processing. There was some research done a while back (early 2000s, I think), that found that keyboard keystrokes leaked information on electricity draw. And even though they could not directly tell which key was hit, they were able to apply a model of qwerty keystroke cadence, since people tend to be faster or slower with keystrokes depending on the sequence of keys. Applying that model with a roughly 60Hz electrical tap, they were able to successfully reconstruct full text input at a 90% confidence. Because the model relied heavily on predictive modeling, it is not good for high-entropy signals like 8-character passwords, but it is excellent for low-entropy signals like a legal memo with several paragraphs explaining one point. You also mentioned a study directly applying to low SNR audio, for speech. However, I wonder if the vibrations for keystrokes are enough to disrupt HDD latency, and if so, a bivariate model using both HDD signal and electricity signal may yield a far superior reconstruction than electricity on its own, especially since the two 60Hz signals are likely out-of-phase. My 2 cents.
> I've learned to stop trying, half-technical people are impediments to innovations.
It's the internet. They are assholes, you just have to have thick skin :)
> I wonder if the vibrations for keystrokes are enough to disrupt HDD latency
Yes, they do. I saw it myself, the HDD is much more sensitive to vibrations transmitted by the chassis than sound. You might be onto something great here. I will quote you if I ever do something like this in the future.
They already have his research, and plenty of smart people in their own labs. No further work by the original researcher needed for their purposes.
Just junk food for thought...
This is the BlackHat pdf / powerpoint from 2009, by Andrea Barisani and Daniele Bianco, titled "Side Channel Attacks Using Optical Sampling Of Mechanical Energy and Power Line Leakage": https://www.blackhat.com/prese...
It appears it is less about predictive modeling of keystroke cadence and more about the data cable itself being poorly shielded and leaking onto the +5V and GND power cables.
I still think a multivariate model using multiple low-SNR signals can be quite useful even if no univariate model of a single low-SNR signal has enough fidelity to reconstruct conversations or keystrokes. Speaking of which, how orthogonal are the signals from different HDDs in a JBOD? Will signals from 12 HDDs in the room provide sufficient signal strength for a multivariate model? If you're able to sample at 60Hz, sound travels roughly 5.7 meters in 1/60th of a second, so HDDs separated by 2.5m should provide considerable phase-shift. Even at 1m separation, the signals should be fairly orthogonal, and having 12 HDDs at varying distances from the audio source should give you nearly 10x the sampling frequency.
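As a quick back-of-the-envelope check on those distances, the sketch below converts a separation between two drives into a delay measured in 60 Hz sample periods. The speed of sound (343 m/s, dry air at about 20 °C) is an assumption, as is the idea that the sound travels straight along the line joining the drives; the names are made up for the example.

```python
SPEED_OF_SOUND_M_S = 343.0   # assumption: dry air at about 20 °C
SAMPLE_RATE_HZ = 60.0        # sample rate used in the comment


def delay_in_sample_periods(separation_m):
    """Arrival-time difference between two drives, in units of one sample period."""
    return (separation_m / SPEED_OF_SOUND_M_S) * SAMPLE_RATE_HZ


# Distance sound covers in one sample period, then the two separations discussed.
print(f"per sample: {SPEED_OF_SOUND_M_S / SAMPLE_RATE_HZ:.2f} m")
for separation in (1.0, 2.5):
    print(f"{separation} m -> {delay_in_sample_periods(separation):.3f} sample periods")
```

Under these assumptions a 2.5 m separation shifts the signal by a bit under half a sample period and 1 m by a bit under a fifth, so drives at different distances would indeed sample the wave at noticeably different phases.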
Hey I remember your project. Or something similar. You needed like 8 SDRs right? it's basically a radar. It was great.
Hey, it's research, not engineering. Our work is to prove that it's possible; the engineers' work is to make it practical.
Regarding multi-variate / multi-signal modeling, LIGO used the same approach to successfully detect gravitational waves. They used multiple low-SNR signals from different detectors (Washington State and Louisiana) since their noise is highly orthogonal and the signal is highly correlated with the correct phase-shift applied (solve for phase-shift using SSE minimization, then extract a high-SNR signal from the newly aligned signals). Some similar approach with multiple HDDs may work if the noise is less about ambient room noise and more about internal HDD initial-head location, other HDD geometric properties, and OS reporting error due to jiffies and NMIs (these are the sort of noise that should be very non-correlated / orthogonal across multiple HDD/CPU sources).
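The align-then-combine idea can be sketched in a few lines. This is a toy illustration, not LIGO's pipeline and not anything tested on real HDD data: the clean signal, the noise level, the 4-sample delay, and every function name are invented for the example, and the noise on the two recordings is assumed to be independent.

```python
import math
import random


def clean_signal(i):
    """Hypothetical underlying signal shared by both sensors."""
    return math.sin(0.5 * i) + 0.7 * math.sin(0.13 * i)


def make_recordings(n, delay, sigma, seed=0):
    """Two noisy recordings; the second lags the first by `delay` samples."""
    rng = random.Random(seed)
    a = [clean_signal(i) + rng.gauss(0, sigma) for i in range(n)]
    b = [clean_signal(i - delay) + rng.gauss(0, sigma) for i in range(n)]
    return a, b


def mean_squared_error(a, b, shift):
    """Squared error between a[i] and b[i + shift], averaged over the overlap."""
    pairs = [(a[i], b[i + shift]) for i in range(len(a)) if 0 <= i + shift < len(b)]
    return sum((x - y) ** 2 for x, y in pairs) / len(pairs)


def estimate_delay(a, b, max_shift):
    """Pick the shift that minimizes the (overlap-normalized) squared error."""
    return min(range(-max_shift, max_shift + 1),
               key=lambda s: mean_squared_error(a, b, s))


def align_and_average(a, b, shift):
    """Average the two recordings after undoing the estimated shift."""
    return [(a[i] + b[i + shift]) / 2 for i in range(len(a)) if 0 <= i + shift < len(b)]


def rms_error(signal):
    """RMS distance from the clean signal, assuming signal[0] lines up with index 0."""
    return math.sqrt(sum((v - clean_signal(j)) ** 2
                         for j, v in enumerate(signal)) / len(signal))
```

For example, with `a, b = make_recordings(2000, 4, 0.4)`, `estimate_delay(a, b, 10)` searches shifts from -10 to 10, and averaging the aligned pair should leave less noise than either recording alone. Whether real HDD noise is independent enough across drives for this to help is exactly the open question raised above.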
The practicality of any sort of potential vulnerability must be considered. In a datacenter, even a human ear can generally not hear things. While someone will say 'well I tore my laptop apart and tore out the microphones and still have a spinning disk', this is a vanishingly small portion of the userbase.
XML is like violence. If it doesn't solve the problem, use more.
You can make this work even if you are inside a VM on your notebook. In fact, I did the whole talk while listening to sound from inside a VirtualBox VM, with no access to the physical mic.
You mean inside a VM that just happens to have a real (not emulated) disk dedicated to it, and with user privileges that allow direct access to said disk (IOW, root), right? And with both guest and host being very lightly loaded so little details like task switches don't completely hose your timing.
Hey anonymous,
Here's the complete kernel log of one of my tests. The HDD disconnect starts at line 156. Maybe it helps you.
https://pastebin.com/K22qc2Ju
Regards,
Alfred
You provide great information. For the complete paper I might use some of it; for example, I didn't look at the SMART parameters, but they might provide some info.
BTW the firmware just completely blocks; as you can see in the video, it doesn't even answer hdparm -I. But in my tests, I was also accessing the HDD constantly (this is to draw the delay graph shown in the video), so it might be that a read() command is queued and blocks, waiting for the vibrations to stop, and it blocks all other commands being sent to the HDD.
Another piece of information missing from the article: I managed to permanently damage an HDD. It didn't completely stop responding, but now the read delay is much bigger than before. While I was testing it at high vibrations, the HDD made some loud mechanical noises, so apparently the HDD did try to park itself multiple times. That HDD is now unusable for tests because it randomly delays reads by over 10 ms (normally the read syscall takes about 500 ns).
I wonder if this would be more useful as a seismograph?
I think this website help us: http://www.fanatik.com.tr/2014...