Sampling Short Sequences From Long MP3 Recordings?
mehl writes "I am a professor of social psychology at the University of Arizona, and I am looking for help finding or developing a special program. In my research, I ask participants to carry around a digital voice recorder while they go about their normal lives. The voice recorder tracks the ambient sounds in their environments and produces an 'acoustic log' of a person's day. We then use these ambient sound recordings as source data for various person perception studies. For privacy reasons, we are required to sample brief snippets of ambient sound instead of recording an entire day continuously ('Big Brother is listening to you...'). So far, we have achieved this by modifying the hardware of a digital voice recorder (triggering it with an external microchip). With the high turnover in player models, however, this strategy has turned out to be short-sighted (every half a year we have to build a new chip). I am thinking about switching strategies: recording continuously in the first place (no problem with the current generation of flash memory) and then sampling (random) snippets after the fact from the continuous recordings. Does anybody know of an existing program that can randomly (or pseudo-randomly; e.g., 30 sec every 10 min) and automatically sample short sequences from a day-long (18 hours) mp3 recording? What would it entail to develop such a program (for Windows)?"
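As a rough illustration of the "30 sec every 10 min" scheme described in the question, the snippet times can be worked out before any audio is touched. This is only a sketch: the function name `snippet_schedule`, its parameters, and the choice to place one snippet at a random position inside each 10-minute window are assumptions for illustration, not anything the submitter specified.

```python
import random

def snippet_schedule(total_s, window_s, snippet_s, seed=None):
    """Return a list of (start, end) offsets in seconds.

    One snippet of snippet_s seconds is placed at a random position
    inside each consecutive window of window_s seconds.
    (Hypothetical helper; all names here are illustrative.)
    """
    rng = random.Random(seed)  # a fixed seed makes the schedule reproducible
    schedule = []
    for window_start in range(0, total_s - window_s + 1, window_s):
        # Latest start that still keeps the whole snippet inside its window.
        latest_start = window_start + window_s - snippet_s
        start = rng.randint(window_start, latest_start)
        schedule.append((start, start + snippet_s))
    return schedule

# An 18-hour recording, 10-minute windows, 30-second snippets.
schedule = snippet_schedule(total_s=18 * 3600, window_s=600, snippet_s=30, seed=1)
print(len(schedule))  # 108 windows in 18 hours
```

Each (start, end) pair can then be handed to whatever tool does the actual cutting.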
I work for an acoustics company and we use either MATLAB or Cool Edit Pro to analyze waveforms. Given the size of your data, it could be difficult, though. You would probably want to break the input down into hour-long segments.
Might want to check with an acoustics lab.
Try http://www.ee.sunysb.edu/~cspv/CSPV.html or something similar.
Perhaps I didn't get the question right, but it seems very trivial to me and just about any semi-professional audio editing tool could do it.
But have you also considered using Matlab? Its DSP Toolbox (and other ones as well) is fantastic and manipulating audio data is almost a trivial thing to do (assuming you've already found a way to have your data recorded into a readable format -- which includes many popular choices). Matlab is also suitable for all kinds of manipulations with vectors and you can do an enormous amount of things (FFT, etc.) with some very simple and intuitive commands. Resampling, random number generation, time commands -- all these are provided.
So... check out www.mathworks.com -- the drawback is the price (but you can get very good deals for educational purposes).
PS: Of course, I am not affiliated in any way with Mathworks Inc.
Doomie
That was one of my issues. First, for privacy reasons, they decided to sample only randomly by turning the device off and on for short periods. Then, because it's easier for them, privacy gets tossed: they record the entire day and sample randomly from that.
It's just a stream. You should be able to use dd, like
dd if=/home/britney/oops.mp3 of=/home/kazaa/sample.mp3 bs=(frame size in bytes) skip=(frames to skip) count=(duration, in frames)
Meh, whatever, you get my point. You specify the frame size in the bs= parameter, put the ID3 etc. offset in the "where to start" parameter (skip=), and go from there.
Or just use one of the metric assload of utilities out there that already do this.
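The byte-slicing idea from this comment can be sketched in a few lines. This is a rough illustration under loud assumptions: the file is constant bitrate, the size of any leading ID3 tag is already known and passed in as `header_bytes`, and the player will resynchronise on the first complete frame it finds (the cut does not land on exact frame boundaries, so a brief glitch at the start of the clip is possible). The function name and parameters are hypothetical.

```python
def cut_cbr_snippet(src, dst, start_s, duration_s, bitrate_kbps, header_bytes=0):
    """Copy a time range out of a constant-bitrate file by byte offset.

    ASSUMPTIONS: constant bitrate, header_bytes is the known size of any
    leading tag, and cutting mid-frame is acceptable.
    Returns the (offset, length) in bytes that was requested.
    """
    bytes_per_s = bitrate_kbps * 1000 // 8      # e.g. 128 kbps -> 16000 bytes/s
    offset = header_bytes + start_s * bytes_per_s
    length = duration_s * bytes_per_s
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        fin.seek(offset)                        # same role as dd's skip=
        fout.write(fin.read(length))            # same role as dd's bs= * count=
    return offset, length

# Usage (file names are placeholders):
# cut_cbr_snippet("day.mp3", "clip.mp3", start_s=600, duration_s=30, bitrate_kbps=128)
```

For variable-bitrate recordings the time-to-byte mapping is no longer linear, so a frame-aware tool is the safer choice.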
the OP said as much
compare, for example, to the latest (federal) medical privacy rules, www.hhs.gov/ocr/hipaa/
You ask a person to carry a voice recorder for 18 hours... I'm just curious what batteries you use.
To solve your problem(s) here are a few suggestions.
You are at a university.
Is there a music lab?
Is there a Computer Science dept?
Is there an electronics dept?
If you can answer yes to all 3 of these questions, most of your problems can solve themselves.
Talk to the Deans of those departments, explain your needs and suggest that the students in these depts may participate in the construction of what you need for their labs or projects.
I'm sure you would have students banging down the door to work on a project.
My point is, use what you have available.
I read a few posts down that someone "couldn't handle, dropped out of college because of professors like this". Well, to me this sounds like a lesson with a deeper meaning than just some sort of useless project.
Perhaps it is an effort to show how even the simplest of experiments present difficult logistical problems.
It's a sort of 'if there's nobody in the forest, the sound was never heard' type of solution.
I've built a simple Java app that parses an mp3 file and can do exactly this. You feed it three parameters (the begin time, the duration, and a filename) and it outputs the requested cut.
I originally did this for a *large* organization which had a huge number of sound recordings and they wanted specific cuts and already had the offsets and durations.
Drop me an email for details.
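The same three-parameter interface (begin time, duration, filename) can be approximated without a custom parser by driving the ffmpeg command-line tool. To be clear, this is not the Java app described above, just a sketch with a swapped-in tool: it assumes ffmpeg is installed and on the PATH, and the function names are hypothetical. The `-ss`, `-t`, `-i` and `-c copy` options seek, limit the duration, name the input, and copy the audio without re-encoding.

```python
import subprocess

def ffmpeg_cut_command(src, start_s, duration_s, dst):
    """Build the argument list for cutting one clip (nothing is executed here)."""
    return [
        "ffmpeg",
        "-ss", str(start_s),     # seek to the begin time, in seconds
        "-t", str(duration_s),   # keep this many seconds
        "-i", src,               # input recording
        "-c", "copy",            # copy the stream, no re-encoding
        dst,
    ]

def cut_clip(src, start_s, duration_s, dst):
    """Run ffmpeg; raises CalledProcessError if ffmpeg reports failure."""
    subprocess.run(ffmpeg_cut_command(src, start_s, duration_s, dst), check=True)

# Usage (file names are placeholders):
# cut_clip("day.mp3", 600, 30, "clip_0010.mp3")
```

Keeping the command construction separate from the call makes it easy to print and inspect the commands for a whole day's schedule before running any of them.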
Basically, what you want to do is some time compression. You can do that by means of granulation or an FFT. Most audio applications already have time compression/expansion plugins built into them. Sound Forge, Pro Tools Free, Live and Cool Edit are some of the commercial programs that come to mind. You could also build a standalone program fairly easily with Csound, Max/MSP or Pure Data. These are audio programming/scripting languages. Csound and Pure Data are free. You just need to know a little about digital audio to make a program with any of those languages.
I can't find anything on his web page about it, but Greg Abowd, a professor at Georgia Tech, has been working on continuous capture. His group has been working on some PDA/cell phone software that allows for continuous capture of audio. He also knows a lot about the laws regarding such recording. Not all states/provinces allow it, but many do.
I think his goals are more along the lines of automating segmentation and indexing of the audio for easy searching of your entire last day/week/year/decade of conversations with people.
Anyway, you might be interested in the kinds of things he's doing. But actually picking out random snippets of mp3 audio should be a trivial coding task. I'm sure a dozen libraries/scripting tools/command-line solutions have already been proposed in previous posts.