Slashdot Mirror


A Distributed DivX Ripper?

RJ asks: "I know much about Java/C++ and sockets programming and I'd like to use this knowledge to build a distributed program to rip a DVD into DivX. It will work by breaking the job into chunks, sending chunks to other computers to encode, then patching it back together. Despite my searching efforts, I have been unable to find a decent resource to teach me how to program the DivX core to encode mpeg2 and re-join parts. I'm hoping that some readers of slashdot can point me in the right direction?"
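
For the "send chunks out, patch them back together" part of the question, here is a rough sketch of how chunk messages might be framed and reordered. The wire format (a chunk id plus a length prefix) and every function name are assumptions made for illustration, not part of any DivX API, and joining real encoded video would still need a proper muxer rather than plain concatenation.

```python
import io
import struct

# Hypothetical wire format (an assumption, not from the post):
# 4-byte chunk id, 4-byte payload length, both big-endian unsigned, then the payload.
HEADER = struct.Struct("!II")


def pack_chunk(chunk_id: int, payload: bytes) -> bytes:
    """Frame one chunk so the receiver knows where it ends."""
    return HEADER.pack(chunk_id, len(payload)) + payload


def read_exact(stream, n: int) -> bytes:
    """Read exactly n bytes; a socket-backed stream may return fewer per call."""
    buf = bytearray()
    while len(buf) < n:
        part = stream.read(n - len(buf))
        if not part:
            raise EOFError("stream closed mid-message")
        buf.extend(part)
    return bytes(buf)


def unpack_chunk(stream):
    """Read one framed chunk and return (chunk_id, payload)."""
    chunk_id, length = HEADER.unpack(read_exact(stream, HEADER.size))
    return chunk_id, read_exact(stream, length)


def reassemble(results: dict) -> bytes:
    """Join payloads in chunk-id order, whatever order they arrived in."""
    return b"".join(results[i] for i in sorted(results))


# Simulate two workers replying out of order over one byte stream.
wire = io.BytesIO(pack_chunk(1, b"second") + pack_chunk(0, b"first-"))
results = dict(unpack_chunk(wire) for _ in range(2))
print(reassemble(results))
```

With a real socket, `sock.makefile("rb")` gives a stream object that `unpack_chunk` can read from in the same way.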

32 comments

  1. I'm not an expert, by Skyfire · · Score: 3, Informative
    but it seems to me like there are several problems that slow down DVD to Divx reencoding:
    • Reading around 4 gigs of data in a fairly fast time: not much of an issue on a normal computer, but distributed computing might introduce issues with speeds of the network.
    • Deencoding the mpeg2: this seems to me to be the biggest slowdown. The problem seems to be one of not having a fast enough decoder that is capable of good image output. Distributed computing would help, but it seems to me to be somewhat of a macgyvered solution to the real problem.
    • Resizing, cropping etc: not much of an issue, but of course distributed computing would help.
    • Mpeg4 encoding: Again, one of the major problems seems to be fairly slow encoding algorithms, so distributed computing would help here.
    I'm sure setting up a beowulf cluster of computers or some other setup would be nice and all that, but unless you're reencoding several DVDs a day, a normal Athlon would probably do the trick, and working on the code for better decoding and encoding in the various codecs seems to me a more efficient way of improving speed. I know that I can reencode a DVD at half real time with 1-pass DivX 5 encoding and flaskmpeg, and at 1/4 real time with 2-pass encoding, and I've only got an Athlon 1.4. So unless you are reencoding Das Boot quite a bit, one normal high-powered consumer computer should be plenty.
    --
    Do not go gentle into that good night. Rage, rage against the dying of the light.
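
To put the "half real time" and "1/4 real time" figures above in wall-clock terms, a small back-of-the-envelope calculation helps. The 120-minute running time is a made-up example, and the cluster estimate assumes perfect scaling with no transfer or splitting overhead, which a real setup would not achieve.

```python
def encode_minutes(movie_minutes: float, speed_vs_realtime: float) -> float:
    """Wall-clock minutes to encode, with speed given as a fraction of real time."""
    return movie_minutes / speed_vs_realtime


def ideal_cluster_minutes(single_machine_minutes: float, machines: int) -> float:
    """Best case for N identical machines, ignoring all network overhead."""
    return single_machine_minutes / machines


movie = 120  # hypothetical running time in minutes

one_pass = encode_minutes(movie, 0.5)    # half real time
two_pass = encode_minutes(movie, 0.25)   # quarter real time

print(one_pass, two_pass)                  # 240.0 480.0
print(ideal_cluster_minutes(two_pass, 4))  # 120.0
```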
  2. Re:I'm not an expert by showboat · · Score: 1

    Please, it's "decode," not "deencode."

    Like my stupid high school programming teacher who yankee-ily kept saying "disenable," when it's really "disable," you don't have to include BOTH positive and negative prefixes -- only one.

    Anyway, you're pretty much right. Though I think the original intent was a bit more distributed than a cluster (wan), since such a thing (with oh, just a couple athlons) is beyond most would-be rippers, I'd assume.

    However, if you had a good-sized pool of cohorts, and they all had cable modems that can sustain 100k+ between them both ways, then the networking part seems more feasible than one might at first think. It's not something I'd release and let any old Joe 56k try, of course, but I wouldn't (personally) consider it McGyver-ish.
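
For a sense of scale on the networking side, here is a quick estimate of how long it takes to move a whole disc's worth of data at the rates mentioned in this thread. It assumes decimal units, reads "100k+" as 100 kB/s, and treats a 56k modem as its nominal 56 kbit/s. In a real setup each peer would only receive its own share of the data, but the sender's upstream link would be shared among all of them.

```python
def transfer_hours(size_bytes: float, bytes_per_second: float) -> float:
    """Hours needed to move size_bytes at a sustained rate."""
    return size_bytes / bytes_per_second / 3600


DVD_BYTES = 4e9        # "around 4 gigs", decimal units assumed
CABLE_BPS = 100e3      # "100k+" read as 100 kB/s
MODEM_BPS = 56e3 / 8   # 56 kbit/s nominal, i.e. 7 kB/s

print(round(transfer_hours(DVD_BYTES, CABLE_BPS), 1))  # 11.1
print(round(transfer_hours(DVD_BYTES, MODEM_BPS), 1))  # 158.7
```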

  3. Distributed encoding... by karnal · · Score: 3, Interesting

    I've tried to do something similar before -- namely, breaking up .wav files so that I could distribute the pieces to other machines to encode.

    I ran into a snag.

    It seems the encoder I was using at the time (bladeenc) was inserting silence at the end of each mp3, to keep it to spec. What I can imagine is that even with DVD encoding, you'd need a "master" that would give out file chunks to the worker bees. But -- it would have to be intelligent enough to know when you wanted a new keyframe, and split up the .avi / .vob in that sense.

    In other words, you may as well just build one heck of a fast machine, and try to get 30-40fps encoding out of it, rather than try to put together something to distribute it and encode it. That's my 2 cents, and I may be wrong....

    --
    Karnal
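
The "master that knows where the keyframes are" idea could look roughly like the sketch below. It assumes the master already has a sorted list of keyframe positions from the source (how to extract that list from a .vob is not shown), and the function name and chunk-length parameter are invented for the example.

```python
def split_at_keyframes(keyframes, total_frames, target_len):
    """Return (start, end) frame ranges that each begin on a keyframe.

    keyframes: sorted frame indices of keyframes; the first must be 0.
    target_len: preferred minimum chunk length in frames.
    """
    chunks = []
    start = 0
    for kf in keyframes[1:]:
        # Only cut once the current chunk is long enough, and only on a keyframe.
        if kf - start >= target_len:
            chunks.append((start, kf))
            start = kf
    chunks.append((start, total_frames))  # whatever is left goes in the last chunk
    return chunks


keyframes = [0, 12, 24, 36, 48, 60, 72, 84]
print(split_at_keyframes(keyframes, 96, 30))
# [(0, 36), (36, 72), (72, 96)]
```

Each range can then be handed to a worker, since decoding it does not depend on frames from the previous range.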
    1. Re:Distributed encoding... by Rolo+Tomasi · · Score: 1
      Well, MP3 audio consists of frames, so if your .wav files are not the exact size to match MP3 frame boundaries, the rest of the frame will be padded with zeroes. Same problem when trying to play two tracks without gaps in between. Go to www.r3mix.net, they got a lot of info on this (and more).

      The solution is to cut your .wav files to the correct size, and all will be fine.

      --
      Did you know you can fertilize your lawn with used motor oil?
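
Cutting the .wav on frame boundaries, as suggested above, comes down to making every chunk except the last a whole number of MP3 frames. MPEG-1 Layer III uses 1152 samples per frame; the function below is an illustrative sketch with made-up names, and it only addresses the end-of-chunk padding described in this thread, not any other delay an encoder may add.

```python
SAMPLES_PER_FRAME = 1152  # MPEG-1 Layer III frame size in samples per channel


def aligned_cuts(total_samples, approx_chunk_samples):
    """Return (start, end) sample ranges whose lengths are whole MP3 frames.

    Only the final range may be a partial frame, so only it gets padded.
    """
    frames_per_chunk = max(1, round(approx_chunk_samples / SAMPLES_PER_FRAME))
    step = frames_per_chunk * SAMPLES_PER_FRAME
    return [(s, min(s + step, total_samples))
            for s in range(0, total_samples, step)]


# One minute of 44.1 kHz audio, cut into chunks of roughly ten seconds.
cuts = aligned_cuts(60 * 44100, 10 * 44100)
for start, end in cuts:
    print(start, end, (end - start) % SAMPLES_PER_FRAME)
```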
  4. dvd::rip has a cluster mode by Vito · · Score: 5, Informative

    Watch this post get modded up, and not my qualified response to the From Coder to Game Designer question. Humbug!

    Anyway, as brought up in the last Ask Slashdot remotely similar to this one (Archiving DVD's with Linux), dvd::rip, which is a Perl+GTK front-end to transcode, has a fairly insecure cluster mode, whereby it will split up the video transcoding task among however many machines you can coerce into doing it, and rip and mux the audio with the video on the host machine.

    Sounds like just what the doctor ordered. Now someone go mod up that other answer of mine. Please?

    1. Re:dvd::rip has a cluster mode by Phexro · · Score: 2

      Allegedly, transcode has some sort of clustering stuff as well. I can't testify for it, but dvd::rip's cluster mode is great, except that it won't do AC3 audio pass-through.

      However, I've found that the XviD codec is fast enough to encode 2:1 on my P3 866. That is, it takes 2x as long to encode as the movie lasts - it gets around 12fps. DivX 4.02 put out slightly better quality, at 4-6fps on the same system. I haven't done anything with DivX 5.xx yet, so I can't speak for that.

  5. Well by Anonymous Coward · · Score: 0

    > I know much about Java/C++ and sockets

    That's impressive, but did you know that BSD is dying ?

    1. Re:Well by Anonymous Coward · · Score: 0

      Are BSD sockets dying as well?

    2. Re:Well by Anonymous Coward · · Score: 0

      No they are not. They currently reside within WindowsXP. The 'X' in XP represents the cross, as a memorial to BSD.

  6. Er? by Wakko+Warner · · Score: 2, Funny

    You want us to help you break the law?!? What do you think we are?!

    --
    "Remember when the U.S. had a drug problem, and then we declared a War On Drugs, and now you can't buy drugs anymore?"
    1. Re:Er? by Anonymous Coward · · Score: 0

      You have to know that American laws (read DMCA) do not apply outside of the USA... As such, lots of people coming to slashdot and living outside of the USA can participate in this discussion... then of course, as slashdot is hosted in the USA you can not do it due to DMCA...

      I see two solutions : host slashdot outside of the USA or drop DMCA... What is the easiest solution ?

    2. Re:Er? by fishebulb · · Score: 2

      umm I remember a person in europe was arrested for DMCA violations. DeCSS.

      You are technically right, the DMCA doesn't exist outside the USA, but laws just like it do.

  7. Distributed Encoding by Stinson · · Score: 1

    Yeah, I've been thinking of something like that for a bit now. On a local network, you wouldn't have to worry much about speeds (some guy above said that). One of the things I was thinking of was more of a distributed encoding system where the servers (the ones the chunks are being sent to) could have something like plugins for different media types...so you could encode many types at once.

  8. why use sockets ? by dario_moreno · · Score: 2

    when developers have put years of work
    in PVM or MPI. I do not know if
    "mpi_allgather" and "mpi_allscatter"
    would stand a 2 GB array like those found on DVDs,
    but at least this would put several 1M$
    beowulves I know of to a somewhat useful
    purpose (besides cracking /etc/passwd and SSL
    sniffs, of course), instead of boring
    quantum chemical computations or climate simulations.

    --
    Google passes Turing test : see my journal
    1. Re:why use sockets ? by benjamindees · · Score: 2, Interesting

      As the author of Transcode explained to me, using a binary Divx encoder with PVM/MPI/Mosix is impossible. I don't know what state open source Divx encoders are in, but I agree this would be a much better solution than chopping up a DVD and encoding all the pieces separately.

      --
      "I assumed blithely that there were no elves out there in the darkness"
  9. Ask /. by Anonymous Coward · · Score: 0

    The Ask /. section seems like it's generally moderation-neglected, particularly when the article doesn't make the front page (like the Programmer To Designer article).

  10. This already exists: Vidomi by Anonymous Coward · · Score: 5, Informative

    Vidomi is a badass little program to turn mpg, vob, ... into DivX. One of the recently added features is "Distributed Encoding" (read: Scalability via network slaves).

    This answer your question?

    1. Re:This already exists: Vidomi by Anonymous Coward · · Score: 0

      Or the open source way :

      www.theorie.physik.uni-goettingen.de/~ostreich/transcode

  11. Re:I have a better idea by Sloppy · · Score: 3, Insightful

    Did you read the same Ask Slashdot that I did? There's nothing in his question that even remotely hints that he's copying someone else's DVD instead of doing this to DVDs he already bought.

    Maybe that's what he's doing, but you're really jumping to conclusions. It's sort of like when someone asks, "Where can I buy a screwdriver?" and you fly off the handle that maybe he should stop stabbing people with screwdrivers.

    --
    As copyright owner of this comment, I authorize everyone to defeat any technological measure which limits access to it.
  12. Breaks compression? by Hard_Code · · Score: 1

    Well, I'm not sure exactly how DivX compression works, but in general, the more data the compression algorithm can see at once, the more redundancy it can find, and therefore the more it can compress. Chopping a piece of data up into bits will at some point start reducing the seen compression ratio.

    --

    It's 10 PM. Do you know if you're un-American?
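
The general point about chunking and compression ratio can be seen with an ordinary lossless compressor. The sketch below uses zlib purely as a stand-in; it is not how DivX works, and the synthetic data is contrived so that the redundancy spans chunk boundaries. For video the analogous cost would come from each chunk having to start fresh rather than referring back to earlier frames.

```python
import random
import zlib

# Synthetic data: one random 2000-byte block repeated 50 times, so almost
# all of the redundancy is between repeats rather than inside a block.
rng = random.Random(0)
block = bytes(rng.randrange(256) for _ in range(2000))
data = block * 50


def size_whole(buf: bytes) -> int:
    """Compressed size when the compressor sees everything at once."""
    return len(zlib.compress(buf))


def size_chunked(buf: bytes, chunk_size: int) -> int:
    """Total compressed size when each chunk is compressed independently."""
    return sum(len(zlib.compress(buf[i:i + chunk_size]))
               for i in range(0, len(buf), chunk_size))


print("whole:  ", size_whole(data))
print("chunked:", size_chunked(data, 10_000))
```

Each independent chunk has to store the block again from scratch, so the chunked total comes out larger than the single-pass result.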
    1. Re:Breaks compression? by Anonymous Coward · · Score: 0

      What if the DVD was owned by others, then everyone would have access to the info and compression wouldn't be broken

  13. MOSIX. by Anonymous Coward · · Score: 0

    Obviously, the ripping part can't be broken up - But if you get a good encoder it should be SMP capable. Then all you need is to use MOSIX.

  14. Dedicated audio encoder by yerricde · · Score: 1

    > It seems the encoder I was using at the time (bladeenc) was inserting silence at the end of each mp3, to keep it to spec.

    Dedicate one machine to running LAME on the audio, and you won't have this problem.

    --
    Will I retire or break 10K?
  15. Load balancing by heroine · · Score: 2

    The hardest part is load balancing. How do you make sure the slowest computer doesn't get the last job and force the faster computers to wait forever for the last job?
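
One common way to limit the "slowest machine holds everyone up" problem is to cut the work into many small jobs and let machines pull the next job whenever they become free, instead of assigning equal shares up front. The simulation below is a sketch under invented numbers (two fast machines, one slow one, speeds in arbitrary work units per minute), not a measurement of any real encoder.

```python
import heapq


def static_split_makespan(total_work, speeds):
    """Equal shares handed out up front; the slowest machine sets the finish time."""
    share = total_work / len(speeds)
    return max(share / s for s in speeds)


def pull_queue_makespan(total_work, speeds, n_jobs):
    """Machines pull equal-sized small jobs from a shared queue as they free up."""
    job = total_work / n_jobs
    free_at = [(0.0, i) for i in range(len(speeds))]  # (time free, machine index)
    heapq.heapify(free_at)
    finish = 0.0
    for _ in range(n_jobs):
        t, i = heapq.heappop(free_at)   # next machine to become idle
        done = t + job / speeds[i]
        finish = max(finish, done)
        heapq.heappush(free_at, (done, i))
    return finish


speeds = [4, 4, 1]   # hypothetical: two fast machines, one slow
work = 120           # arbitrary work units

print(static_split_makespan(work, speeds))
print(pull_queue_makespan(work, speeds, 60))
```

With small jobs, the worst case is that the slow machine takes one last small job, so the delay is bounded by that job's duration rather than by a third of the whole movie. The trade-off is that very small jobs mean more cut points in the video.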

  16. Re:I have a better idea by 42forty-two42 · · Score: 1
    "Where can I buy a screwdriver?"

    Ever heard of google? Search before you ask! :)

  17. Re:I have a better idea by mumblestheclown · · Score: 0

    It boggles my mind that "Sloppy"'s reply was modded insightful. Oh wait, this is slashdot. No, it doesn't. For all the self-righteous talk, we all know that the non-infringing use of this technology is likely less than 1%, and perhaps less than .1%. Regardless of whether you consider the whole concept of intellectual property to be morally bankrupt and practically corrupt, the screwdrivers analogy is oversimplistic and patronizing. Hey, didn't we have this debate already with Napster, et al?

  18. transcode does this already by Anonymous Coward · · Score: 0

    There's a set of utilities called transcode, that already does something similar, perhaps you should take a look at that. Just do a google search for transcode.

  19. Re:I have a better idea by Sloppy · · Score: 2

    I don't think it's fair to compare this to Napster. The non-infringing use of these tools is far greater than it was for Napster. The audio equivalent of these kinds of tools would be mp3 and vorbis encoders, not trading software like Napster.

    You don't seem to understand that some people (e.g. me) really don't like those shiny discs. They get scratched, possibly lost, and once you have a few hundred of them, they are inconvenient to physically manage. Compared to files on a drive array, they are as hard to use as stone knives and bear skins. I don't even play my new audio CDs once; they get ripped and ogg encoded, and then they go into a box, where I don't know if I'm ever going to access them again. (Time will tell, I guess.) And no, my resulting collection is not shared.

    Eventually it's going to be like that for my movies too (although at present, the storage demands are still a little high for me, and I also think DivX is too lossy for action scenes). This technology has very substantial non-infringing use, and the comparison to Napster is unjustified.

    --
    As copyright owner of this comment, I authorize everyone to defeat any technological measure which limits access to it.