Ask Slashdot: Synchronizing Sound With Video, Using Open Source?
An anonymous reader writes: I have a decent video camera, but it lacks a terminal for using an external mic. However, I have a comparatively good audio recorder. What I'd like to do is "automagically" synchronize sound recorded on the audio recorder with video taken on the video camera, using Free / Open Source software on Linux, so I can dump in the files from each, hit "Go," and in the end I get my video, synched with the separately recorded audio, in some sane file format. This seems simple, but maybe it isn't: the 800-pound gorilla in the room is PluralEyes, which evidently lots of people pay $200 for, and which doesn't have a Linux version. Partly this is that I'm cheap, partly it's that I like open source software for being open source, and partly it's that I already use Linux as my usual desktop, and resent needing to switch OS to do what seems intuitively to be a simple task. (It seems like something VLC would do, considering its Swiss-Army-Knife approach, but after pulling down all the menus I could find, I don't think that's the case.) I don't see this feature in any of the Open Source video editing programs, so as a fallback question for anyone who's using LiVES, Kdenlive, or another free/Free option, do you have a useful workflow for synching up externally recorded sound? I'd be happy even to find a simple solution that's merely gratis rather than Free, as long as it runs on Ubuntu.
What you want is a simple Hollywood-style clapperboard. Use any editor that shows two audio tracks plus video, and visually line up the audio spike with the video frame where the board closes.
That is what the movie claps you see in documentaries are for. You produce a clap with your hand, resulting in a spike in your external mic and in the video camera's mic. That way, you know where to sync the two in your favorite video editing tool and you're fine. No need for special software or something similar.
Unless the devices themselves have some kind of common sync like wordclock or the like, they will drift out of sync. So if you sync the audio at any given point, it'll be out of sync later. That's why studios have all kinds of gear to slave everything to a master clock.
So you either have to have something that can do an advanced auto-sync and make sure the sync gets corrected in multiple places, or you'll need to do it manually. Depending on how long the recording is that may not be too bad and you may not need to adjust it that many times, but it is really all you can do.
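To get a feel for how quickly two free-running clocks drift apart, here is a rough back-of-the-envelope sketch. The 50 ppm clock mismatch and the 40 ms tolerance (about one frame at 25 fps) are illustrative assumptions, not measurements from any particular camera or recorder.

```python
def drift_ms(duration_s: float, clock_error_ppm: float) -> float:
    """Offset in milliseconds accumulated after duration_s seconds when
    two clocks disagree by clock_error_ppm parts per million."""
    return duration_s * clock_error_ppm * 1e-6 * 1000.0


def seconds_until_noticeable(clock_error_ppm: float, tolerance_ms: float = 40.0) -> float:
    """How long a take can run before the drift exceeds tolerance_ms."""
    return (tolerance_ms / 1000.0) / (clock_error_ppm * 1e-6)


if __name__ == "__main__":
    # Assumed mismatch of 50 ppm between camera and recorder.
    for minutes in (5, 30, 60):
        print(minutes, "min:", round(drift_ms(minutes * 60, 50.0), 1), "ms of drift")
```

Under those assumed numbers a single sync point holds for roughly a quarter of an hour, which is why short clips often get away with one clap while long takes need re-syncing.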
Now of course if your gear has some kind of clock input and output you can slave it together, but I'm guessing it doesn't.
Finally, your request for Linux stuff makes it really hard. The Linux video editing scene is, well, really really bad. There are no good tools that I've come across. Granted, I haven't looked in a while, but last time I did, all I found were things that were incomplete, or buggy, or not very useful (or all 3).
Something like Sony Vegas Movie Studio would do the trick and make it pretty easy to do what you needed manually, but it does cost money and isn't for Linux.
I'd love to find that same tool. Lately, I've been shooting up to 4 cameras and 2 audio-only sources, and while Sony Vegas (commercial software) has a tool to synchronize media based on embedded timestamps (or timecode if one's using pro cameras), I haven't been able to get it to work (probably my own stupidity). Still, it relies on having the clocks synced between cameras.
I'd also love to find a tool/plug-in which would automatically trim out all my quick pans, zooms, focuses from my close-up shots on the editing timeline. That would save a lot of time.
I haven't used it in a long time, but would Avidemux do the trick?
See Subject. It's not completely automagic, but it can probably do what you want.
I think FFmpeg can mux audio and video from separate files. If not, search for muxing tools, and you'll have your answer. Of course, bear in mind that sample rates can drift, so the longer the video, the less likely this is to work without resampling. And you'll need to edit the start points to be precisely the same.
Your issue is very similar to what Twitch streamers go through with delay between audio and video. I'd suggest checking out OBS; there are quite a few how-to videos on YouTube that show you how to sync.
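For what it's worth, here is a minimal sketch of what such a mux could look like when driven from Python. The file names and the 1.25 s offset are made up for illustration; the offset itself still has to come from a clap or some other sync point. It uses ffmpeg's `-itsoffset` input option to shift the external audio and `-map` to take video from the first input and audio from the second.

```python
import subprocess


def build_mux_command(video, audio, output, audio_offset_s=0.0):
    """Build an ffmpeg command line that keeps the camera's video stream and
    replaces its sound with the external recording, shifted by audio_offset_s."""
    return [
        "ffmpeg",
        "-i", video,                        # input 0: camera file
        "-itsoffset", str(audio_offset_s),  # applies to the next input only
        "-i", audio,                        # input 1: external recorder
        "-map", "0:v:0",                    # video from the camera file
        "-map", "1:a:0",                    # audio from the recorder
        "-c:v", "copy",                     # do not re-encode the video
        "-shortest",                        # stop at the shorter stream
        output,
    ]


def mux(video, audio, output, audio_offset_s=0.0):
    """Run the command; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(build_mux_command(video, audio, output, audio_offset_s), check=True)


if __name__ == "__main__":
    # Hypothetical file names; prints the command instead of running it.
    print(" ".join(build_mux_command("camera.mp4", "recorder.wav", "synced.mkv", 1.25)))
```

This only applies one constant offset, so it does nothing about drift over a long take.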
The video editor / compositor in Blender is quite nice. I'm sure with a little setup in Blender (and using a ClapperBoard at the start of every one of your videos) you could automate this process down to just having to drag an audio clip back and forth to match a spike in the audio with a frame of video.
If you REALLY have to do a LOT of videos and don't want to take the time to do it manually, then maybe you should think about dropping the $200 for PluralEyes. I have never seen or heard of any free / open source program that will do this automagically.
Actually I'm not even sure what visual / audio processing algorithms / methods PluralEyes would use to be reliable enough to do this every time, unless they also require you to use a clapperboard...
The PiTiVi editor will do something like this; it's free and open source. I think OpenShot will do something similar. The nice thing about both programs is you can break audio/video files up and move them around if you decide you want to edit either audio or visuals later.
Don't sync to the video, sync to the audio. Your video recorder records the audio "reference" track and you just sync your externally recorded audio to the reference track. Kdenlive has this feature. Other editors may as well.
~~~~~~~
"You are not remembered for doing what is expected of you." - Atul Chitnis
It's not clear how many pieces of video you want to sync, but it is pretty straightforward in any video editing program to sync up uncut chunks of video with uncut audio. Simply put the video (with audio from your camera) into the timeline, then add an audio track and insert your microphone audio roughly in sync. If you have a clap or marker at the beginning this is easier, but even without one you can usually find some sound or spoken phrase to identify a "sync point". When you cut it in close to the correct spot and play the sound, you will hear it double up or "echo". Now nudge the microphone audio forward or backward until the "echo" turns into something more like a reverb; eventually the two audio tracks line up and sound normal, or at least "phasey" like a flange effect. This is how professional audio editors synchronize sound that has no timecode or slate. It may sound complicated, but it is very easy and fast. I use PluralEyes when I have lots of complicated little clips, but if I have a dozen big chunks of video to synchronize, I just do it manually; it's simpler than setting up PluralEyes.
NOTE: You MAY end up with sync drift depending on what sample rate you recorded your audio at, since it doesn't sound like you have an audio recorder that is capable of recording audio at "pulled down" (24p vs 23.976p) speed.
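The size of that pull-down mismatch is easy to work out. The sketch below assumes the usual 1000/1001 pull-down ratio and a nominal 48 kHz recorder; whether the original poster's gear is actually affected is not known from the thread.

```python
from fractions import Fraction

# Standard NTSC-style pull-down ratio (assumed here): 24 -> 23.976...
PULLDOWN = Fraction(1000, 1001)


def pulled_down(rate):
    """Rate (frames or samples per second) after pull-down, as an exact fraction."""
    return Fraction(rate) * PULLDOWN


def drift_seconds(duration_s):
    """Offset accumulated over duration_s when one stream runs at the nominal
    rate and the other at the pulled-down rate."""
    return Fraction(duration_s) * (1 - PULLDOWN)


if __name__ == "__main__":
    print("24p pulled down:", float(pulled_down(24)))
    print("48 kHz pulled down:", float(pulled_down(48000)))
    print("drift after 10 minutes:", float(drift_seconds(600)), "s")
```

That works out to one second of slip for every 1001 seconds of footage, far more than ordinary crystal drift, so it shows up within the first minute of a take.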
mencoder will do exactly what you need, muxing separate video and audio tracks into one container file of your choice, with tunable offsets for each track.
You can sync up the motion to the sound in post edit
https://en.wikipedia.org/wiki/...
Blender has quite good capabilities for non-linear video editing, and the audio-video syncing will be a breeze. I was able to do a short (3m) product demo movie without having any prior experience with Blender, just by watching a couple of tutorials on YouTube.
https://github.com/allisonnico...
I've done this with OpenShot. It's fairly easy and intuitive to use. Save often -- it crashed frequently for me, but when it worked, it worked well.
It might "seem intuitively to be a simple task" to you but if there is a $200 piece of software that does that and nothing else (and lots of people buy it) it's obviously not that simple. Either do it manually in a normal editor, write (and open-source) your own version, or pay for the commercial version.
Do you have a video editor that shows the audio waveform?
Start your audio and video recordings. Stand in front of the camera (optional) and clap your hands.
Load both recordings into your editor, look at the waveforms, and line up the spike in them from when you clapped.
If you stood in front of the camera, you can use the visual cue of your hands to line things up if the camera audio is bad ... this is why clapboards you see on film sets have the stripes. It's to make them easier to see.
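The "line up the spike" step can also be done numerically. This is only a sketch: it assumes both tracks are already decoded to lists of samples normalized to the range -1..1 at the same sample rate, and that the clap is the first thing in each track loud enough to cross the (arbitrarily chosen) threshold.

```python
def first_spike(samples, sample_rate, threshold=0.5):
    """Time in seconds of the first sample whose magnitude reaches threshold,
    or None if nothing in the track is that loud."""
    for i, s in enumerate(samples):
        if abs(s) >= threshold:
            return i / sample_rate
    return None


def clap_offset(camera, recorder, sample_rate, threshold=0.5):
    """Seconds by which the recorder track must be delayed so its clap lands
    on the camera's clap (negative means it must be advanced)."""
    cam_t = first_spike(camera, sample_rate, threshold)
    rec_t = first_spike(recorder, sample_rate, threshold)
    if cam_t is None or rec_t is None:
        raise ValueError("no clap found above the threshold in one of the tracks")
    return cam_t - rec_t


if __name__ == "__main__":
    # Toy signals at 100 samples per second: clap at 3.0 s and 1.0 s.
    camera = [0.0] * 300 + [0.9] + [0.0] * 100
    recorder = [0.0] * 100 + [0.9] + [0.0] * 300
    print(clap_offset(camera, recorder, 100))
```

A door slam or cough before the clap would fool this, which is one reason doing it by eye on the waveform is still the safer habit.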
I use avidemux and audacity to add Rifftrax to movies. Basically copying out the existing audio, merging the rifftrack with the movie audio into one track and then putting it back in as a separate audio track so the original audio is there too if you want to watch the movie without jokes (pretty rare for me actually, but it's nice to have options ;)).
Someone else already suggested OBS; I've used that too for recording gaming video. It can be pretty intensive and it takes some fiddling to make it workable but it's not bad.
Okay, so are you going to buy him the copy of Windows he needs to run it, since as he said, it doesn't run on Linux?
If you're doing this for surveillance, I use iSpy. You can define a region of the camera's view that starts recording when motion is detected, and you can choose how far back the pre-record buffer reaches and how long it waits before it stops. It can be triggered by audio too; I've found it to be very nice for watching maintenance at my apartment and people trying to break into our cars.
They do charge for remote access features but you can "roll your own" for that if you need it (I didn't). It can use an external mic as a source or one on the camera either way.
I've done that a couple of times with kdenlive for some amateur skydiving videos. It probably wouldn't work so well for production quality stuff, but it works fine for what I'm doing. Most of my videos are less than 10 minutes, though.
I'm trying to teach myself to set people on fire with my mind... Is it hot in here?
http://thinkinghard.com/blog/H... http://www.hecticgeek.com/2013... http://alien.slackbook.org/blo... There are lots more blogs, websites, etc. that have solutions.
This is normally done with timecode. You "jam-sync" the sound recorder with the camera at the start of the day, using time-of-day timecode on each. They should stay reasonably close. For greater precision, you need the slate, as mentioned above, since timecode only gives you one-frame accuracy, and sound that is half a frame out-of-sync is still noticeable. Anyway, pretty much anything that can be used to edit video should be able to sync picture and track if they both have good timecode on them, and you can manually use the slate to fine-tune it.
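To put numbers on the "one-frame accuracy" point, here is a small sketch that converts timecode to seconds and shows how big half a frame is. It assumes plain non-drop-frame `HH:MM:SS:FF` timecode and an integer frame rate; the timecode values are invented for the example.

```python
def timecode_to_seconds(tc, fps):
    """Convert non-drop-frame 'HH:MM:SS:FF' timecode to seconds."""
    hh, mm, ss, ff = (int(part) for part in tc.split(":"))
    if not 0 <= ff < fps:
        raise ValueError("frame number out of range for this frame rate")
    return hh * 3600 + mm * 60 + ss + ff / fps


def timecode_offset(camera_tc, recorder_tc, fps):
    """Seconds between the two start timecodes (camera minus recorder)."""
    return timecode_to_seconds(camera_tc, fps) - timecode_to_seconds(recorder_tc, fps)


def half_frame_ms(fps):
    """Worst-case error, in milliseconds, left after a frame-accurate sync."""
    return 1000.0 / fps / 2.0


if __name__ == "__main__":
    print(timecode_offset("10:00:05:00", "10:00:03:12", 24), "s apart")
    print("residual error up to", round(half_frame_ms(24), 1), "ms at 24 fps")
```

The residual of roughly 20 ms at 24 fps is what the slate is there to clean up.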
It's actually not that hard with the new video tag and RT5 standard.
Set up a web server (I recommend nodejs), build a simple website, add in the well-documented JavaScript to activate your camera/mic and then record it to a file. When done, save it.
Voila.
http://stackoverflow.com/questions/12938581/ffmpeg-mux-video-and-audio-from-another-video-mapping-issue
There's a reason so many are "stuck" using Windows, no matter how onerous Microsoft gets (or OSX, no matter how walled garden Apple is) - some industry standard software simply doesn't run on other systems. Compared to the cost of someone to do the work for you, buying a dedicated Windows machine and the software to run it is less than a single edit session.
Or, as they say, use a clapperboard and do it yourself.
Is it just my observation, or are there way too many stupid people in the world?
I can't help with the automagic part, but...
I capture with this method too, except when working with pre-recorded studio performances. I highly recommend rolling the audio and video capture device for a few seconds before "action" to give you time to clap or tap an object in front of the camera, as close to the sound source (not necessarily the microphone) as is practical. You will then have a definitive reference point in your audio and video streams against which you can synchronize in your editor of choice. There are many FOSS editors that can help with this. The one I am currently using is OpenShot.
"Nine times out of ten, starting a fire is not the best way to solve the problem." - my wife
should be a pretty good hint that you didn't read it and just responded to the title.
He's screwed for what he wants: No cost Linux software that does auto-sync. There is none.
Just one of the big reasons I use OSX. Even iMovie can make a surprisingly professional movie. A lot of times when under insane time pressure, I used it instead of FCP.
And I used a clapboard. Quick and efficient. And it works.
Caveat: if I were making a long single-cut video, I'd find a professional tool to keep everything in sync. But that never happened.
The shepherds did so well protecting the flock that the sheep no longer believed that wolves existed.
Have you thought about modding the device to add a mic input? If you check the web, someone might have already done it, or the schematic might be available, or it might be obvious if you take it apart.
I'm no expert at video editing, but did you look at Cinelerra? It has existed for years (I mean, before VLC existed, before mplayer, even before Xine if I remember correctly... Who remembers aviplay?) and apparently it is still (kind of) maintained.
For Debian it's available on Christian Marillat's multimedia repository.
reaper has added video editing capability recently. while windows/osx only, the devs have made efforts to make it run well under wine. should allow for syncing/editing your audio into your video. i believe it acts largely as a front end for ffmpeg for a purpose like this. there should be a lot of help at the reaper forum. the demo is fully functional and not time-limited. BabaG
If you haven't tried VirtualDub you need to go do so. That nobody here even speaks of it is mounting testament to the declining mental health of this community.
Still waiting on Serviscope_minor to wake up to fucking reality and realize that Jessica Price isn't going to fuck him.
This can all be done with open source tools.
Use Audacity with the noise removal filter to clean up the audio clip, if needed.
Use Audacity to save the audio clip in a format that can be imported into Kdenlive or OpenShot.
If the video needs to be stabilized, use Transcode before using a video editor. Here is an example using the file MVI_0511.MOV:
transcode -J stabilize -i ./MVI_0511.MOV -y null,null -o dummy; transcode -J transform -i ./MVI_0511.MOV -y raw -o ./MVI_0511S.mov
Import the video clip using Kdenlive or OpenShot. Mute the audio from the video clip.
Add the audio clip to the video clip with the video editor and drag the clips down to the empty rows at the bottom of the interface.
Align the audio from each clip. Set an audio reference point and align to it using the video editor. Alternatively, stretch or compress the audio using Audacity and reinsert the changed clip into the video editor.
Render the file in the format of your choosing.
Good luck!
Non-free options: Lightworks, which has a free version and a Pro version at US$350.
Among free software, the best product IMHO is Shotcut: http://www.shotcut.org/
Otherwise, avconv on the command line (with a small problem between interlaced and progressive formats).
Best regards.
Use mplayer. The keys - and = will adjust A-V sync by tenths of a second in real time while the video is playing.
Have you looked at pitivi?
http://arjunpe.blogspot.com/2012/06/how-to-combine-video-and-audio-files-in.html
Given that both Blender's video editor and the Ardour DAW support the JACK transport protocol, syncing sound and image becomes almost trivial.
Hit record x2, CLAP, sync in post. You really can't zoom in & line up two tracks like the big boys?
Could you not have just answered the question like a normal person instead of being a douchebag about it? Or were you worried you weren't going to feel superior to anyone today?
systemd is Roko's Basilisk.
What hardware do you use and what sort of performance do you get with KDEnlive? Do you use ffmpeg?
I'm trying to work with 4K GoPro video and performance sucks: an H.264 to H.265 transcode runs at only 4 fps. Eight times slower than real time on a quad-core processor sucks.
Could you not be the stereotypical white knight wannabe SJW trying to protect the feels of the pathetic?
Seriously, this is the internet, take your "re-invented" emo shit to your momma. Don't be a fucking hypocrite about someone else feeling superior. You're pathetic.
Thanks to those who offered support without the need to retaliate against differing opinions.
I have a little Sony handycam and made a video of a headless 8 string acoustic guitar I built. Using the Blender video sequence editor, the audio kept drifting. So I just split the audio into two minute chunks and sync'd each one. I figured I was doing something wrong, but it appears this is a widespread problem.
Different device clocks drift at different rates, mostly due to heat. Oven-controlled crystal oscillators are employed in more expensive devices to provide more reliable clocking.
Sources that are simply synchronized at the start will typically drift and lose sync quickly. Unfortunately the drift is often not linear. So if you sync them at the start, and then stretch or shrink one source via re-sampling, they may sync at the start and end, but not in the middle.
A good solution for the OP's problem would compare the audio streams, identifying common reference points. The sources would then be adaptively corrected, matching those reference points. I've always thought that would be a very fun project, and useful to many people, but haven't been able to align my clocks to make it happen.
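One possible shape for that kind of adaptive correction, assuming the common reference points have already been found somehow (by ear, by claps, or by correlation): map recorder time onto camera time with a piecewise-linear function, so each segment between two reference points gets its own stretch factor. The anchor values below are invented, and this is a sketch of the idea rather than how any existing tool does it.

```python
from bisect import bisect_right


def make_time_map(anchors):
    """anchors: (recorder_time, camera_time) pairs in seconds, with distinct
    recorder times. Returns a function mapping recorder time to camera time."""
    pts = sorted(anchors)
    if len(pts) < 2:
        raise ValueError("need at least two reference points")
    xs = [p[0] for p in pts]

    def to_camera_time(t):
        # Pick the segment containing t; extrapolate from the end segments.
        i = max(0, min(bisect_right(xs, t) - 1, len(pts) - 2))
        (x0, y0), (x1, y1) = pts[i], pts[i + 1]
        return y0 + (t - x0) * (y1 - y0) / (x1 - x0)

    return to_camera_time


def segment_stretch_factors(anchors):
    """Stretch factor to apply to the recorder audio within each segment."""
    pts = sorted(anchors)
    return [(y1 - y0) / (x1 - x0) for (x0, y0), (x1, y1) in zip(pts, pts[1:])]


if __name__ == "__main__":
    anchors = [(0.0, 2.0), (100.0, 102.0), (200.0, 202.5)]  # made-up sync points
    to_camera = make_time_map(anchors)
    print(to_camera(150.0), segment_stretch_factors(anchors))
```

Splitting the audio into chunks and syncing each one by hand, as described a few comments up, is the manual version of the same thing.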
There is an open source solution I used a few months back called shenidam. It's a little slow, but it works! The best use case is onboard camera audio and separate sound. You give shenidam source video and separate audio, and it matches and muxes automagically. The result: a video file with separate audio synced perfectly. Check out the website and source code on github.
I've run into a problem with the sound getting out of sync the further a video goes. It can be a problem with the audio having a longer playtime than the video. Using a program like Audacity, we can change the Tempo without changing the Pitch, thus squeezing the audio down into a shorter play time without making the sounds play faster and resulting in a chipmunks effect. Fine tuning it can be a real pain though.
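The fine tuning gets less painful if you compute the tempo change from the two durations instead of guessing. A tiny sketch, with made-up lengths; the speed factor is the kind of number ffmpeg's `atempo` filter takes, and the percentage is what I'd expect a "percent change" tempo dialog to want, though check how your editor defines it.

```python
def tempo_factor(audio_len_s, video_len_s):
    """Speed factor that makes the audio last exactly as long as the video
    (greater than 1 means the audio must play faster and become shorter)."""
    return audio_len_s / video_len_s


def tempo_percent_change(audio_len_s, video_len_s):
    """The same correction expressed as a percent change in tempo."""
    return (tempo_factor(audio_len_s, video_len_s) - 1.0) * 100.0


if __name__ == "__main__":
    # Hypothetical: audio runs 1.2 s longer than a 10-minute video.
    print(tempo_factor(601.2, 600.0), tempo_percent_change(601.2, 600.0))
```

This is one uniform stretch over the whole file, so it only helps when the drift is close to linear.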
I record presentations where the camera audio is usually quite horrible, so I also record audio with an XLR interface on my laptop. I use the file modification times to calculate a rough time offset (the camera is easily 2 seconds off, FAT32 and manually synced clock). Once I have a rough estimate, I roughly cut out the audio into wav files using ffmpeg and then praat can find the exact offset.
Download praat from: http://www.fon.hum.uva.nl/praat/
#!/bin/bash
cat > "correlate_audio_offset.praat" <<~~~
form Cross Correlate two Sounds
sentence Input_sound_1
sentence Input_sound_2
real start_time 0
real end_time 30
endform
Open long sound file... 'Input_sound_1$'
Extract part: 0, 60, "no"
Extract one channel... 1
sound1 = selected("Sound")
Open long sound file... 'Input_sound_2$'
Extract part: 0, 60, "no"
Extract one channel... 1
sound2 = selected("Sound")
select sound1
plus sound2
Cross-correlate: "peak 0.99", "zero"
offset = Get time of maximum: 0, 0, "Sinc70"
writeInfoLine: 'offset'
~~~
praat correlate_audio_offset.praat camera_audio_original.wav laptop_audio_original.wav
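For anyone curious what a cross-correlation search for the offset boils down to, here is a deliberately naive pure-Python sketch. It is not what praat runs internally, and the sign convention is my own. It assumes both tracks are lists of samples at the same sample rate, and it is far too slow for full-length audio; real tools use FFT-based correlation.

```python
def best_lag(reference, other, max_lag):
    """Lag in samples, within +/- max_lag, at which `other` best matches
    `reference`. A positive lag means the shared sound occurs that many
    samples later in `other` than in `reference`."""
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, r in enumerate(reference):
            j = i + lag
            if 0 <= j < len(other):
                score += r * other[j]  # plain dot product at this shift
        if score > best_score:
            best, best_score = lag, score
    return best


def offset_seconds(reference, other, sample_rate, max_lag_s):
    """The same result expressed in seconds."""
    return best_lag(reference, other, int(max_lag_s * sample_rate)) / sample_rate


if __name__ == "__main__":
    burst = [1.0, -1.0, 0.5]                      # toy stand-in for a shared sound
    camera = [0.0] * 10 + burst + [0.0] * 20
    laptop = [0.0] * 15 + burst + [0.0] * 15
    print(best_lag(camera, laptop, 10))
```

Searching only a few seconds either side of the rough file-time estimate, as the post does, keeps the search small.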
I do this audio sync in kdenlive for every video I make. Takes about 30 secs to align either the better sound track to video or the reverse. You pick which is stationary.
Precisely. I read this post, and heard the voice of Christine Baranski in my head..."So needy." If you're cheap, and really want to promote open source software, write it yourself. You say it's a simple task; just do it, and then share. If you're not willing to pay for the simplicity and ease of something like iMovie (free with every Mac since 1998 or so, so the barrier to entry is below $100), you need to pony up.