Use BitTorrent To Verify, Clean Up Files
jweatherley writes "I found a new (for me at least) use for BitTorrent. I had been trying to download beta 4 of the iPhone SDK for the last few days. First I downloaded the 1.5GB file from Apple's site. The download completed, but the disk image would not verify. I tried to install it anyway, but it fell over on the gcc4.2 package. Many things are cheap in India, but bandwidth is not one of them. I can't just download files > 1GB without worrying about reaching my monthly cap, and there are Doctor Who episodes to be watched. Fortunately we have uncapped hours in the night, so I downloaded it again. md5sum confirmed that the disk image differed from the previous one, but it still wouldn't verify, and fell over on gcc4.2 once more. Damn." That's not the end of the story, though — read on for a quick description of how BitTorrent saved the day in jweatherley's case.
jweatherley continues: "I wasn't having much success with Apple, so I headed off to the resurgent Demonoid. Sure enough they had a torrent of the SDK. I was going to set it up to download during the uncapped night hours, but then I had an idea. BitTorrent would be able to identify the bad chunks in the disk image I had downloaded from Apple, so I replaced the placeholder file that Azureus had created with a corrupt SDK disk image, and then reimported the torrent file. Sure enough it checked the file and declared it 99.7% complete. A few minutes later I had a valid disk image and installed the SDK. Verification and repair of corrupt files is a new use of BitTorrent for me; I thought I would share a useful way of repairing large, corrupt, but widely available, files."
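For anyone wondering what the client's recheck is doing, here is a minimal sketch. It assumes, as BitTorrent does, that the torrent carries one SHA-1 hash per fixed-size piece; the helper names, the 1 KiB piece size and the stand-in data are made up for illustration and are not taken from Azureus.

```python
import hashlib

def piece_hashes_of(data: bytes, piece_len: int) -> list[bytes]:
    """Hash each fixed-size piece independently, as a torrent file does."""
    return [hashlib.sha1(data[i:i + piece_len]).digest()
            for i in range(0, len(data), piece_len)]

def bad_pieces(data: bytes, expected: list[bytes], piece_len: int) -> list[int]:
    """Return the indices of pieces whose SHA-1 does not match the expected hash."""
    bad = []
    for i, want in enumerate(expected):
        piece = data[i * piece_len:(i + 1) * piece_len]
        if hashlib.sha1(piece).digest() != want:
            bad.append(i)
    return bad

# Hypothetical stand-in for the disk image: 16 KiB split into 1 KiB pieces.
PIECE_LEN = 1024
good_image = bytes(range(256)) * 64
expected_hashes = piece_hashes_of(good_image, PIECE_LEN)

# Flip one byte to simulate a corrupt download.
corrupt_image = bytearray(good_image)
corrupt_image[5000] ^= 0xFF
corrupt_image = bytes(corrupt_image)

damaged = bad_pieces(corrupt_image, expected_hashes, PIECE_LEN)
```

Only the pieces listed in `damaged` need to be fetched again, which is why a mostly intact file shows up as nearly complete.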
Awesome idea. I've done this in the past with stuff. If a corrupt version was on one tracker, I'd save the files, get a new torrent and import the old files. Saves wasting a lot of bandwidth.
If I happen to see a stuck torrent (many leechers, no seeds), sometimes I can find a good version of the file I already have - so I start the torrent, stop it, replace the single good file (sometimes you need more if the file is smaller than the part size), and upload a few Kb to finish the torrent. Then sit back and watch as everyone fills up.
Those of us who use BitTorrent for *ahem* illegal purposes have been doing this since the beginning. The only way to get rare and complete downloads was to take the files to other trackers and match them against another md5 to finish the download.
It's like getting parity files over on usenet to fix that damned .r23 file which is just a bit too short for some reason :)
Fiesta Online
For heavy BT users this tactic is very common, provided the file(s) you are willing to download is fairly well available from different sources.
Are there even MD5 hashes on Apple's download pages for such large files? Judging by how the article was written and the lack of hashes on the QuickTime and iTunes download sites, it doesn't seem like they even bother.
I asked the same question. Wikipedia answered it.
One should be more concerned as to why your files are becoming corrupted.
I'd say it's a safe bet that the files from apple.com are in perfect condition.
Which means it either became corrupted in transit to, or on arrival at, your machine.
Which raises the question: is your memory defective?
Run memtest86 to check your memory.
http://www.memtest86.com/
Check if your hard drives have SMART and are reporting anything. A disk checker would also be a good idea.
The other idea that springs to mind is that you're behind some proxy with the above problems, although I doubt anyone would want to proxy a 1.5 GB file.
Fact is, if files are being corrupted on your disk, it's just a matter of time before something more important is hit by corruption.
To avoid criticism; Say nothing, Do nothing, Be nothing.
It's networking - shit happens. Some of his bits got thrown out of a router somewhere as heat, or maybe a packet timed out and didn't quite make it.
Obligatory blog plug: http://www.caseybanner.ca/
I've used bittorrent for this purpose many times in years gone by.
:)
Especially with our slow links, or worse yet, on dialup (if I go enough years back) in Australia.
Before bittorrent I would use rsync. That required me to download the large file to a server in the US on a fast connection, then rsync my copy to the server's copy to fix what is corrupt in my copy.
It works beautifully.
You can tell how powerful someone is by the magnitude of the crime they can commit and be able to get away with.
Those who have never developed P2P software might never understand why they all need to use strong checksums to detect data corruption, and why bad blocks actually do appear in the wild; frequently.
You'd be shocked - SHOCKED - at how much data gets corrupted routinely - by errant antivirus software, flaky network equipment, plain ol' line noise that the checksums don't detect (which will happen much more often than you expect, see also birthday paradox), or misbehaving routers that think that any occurrence of 0xC0A80102 obviously must be an internal IP address and needs to be changed to your external one. Even if that's in the middle of a ZIP file. Oops.
Encryption actually aids this somewhat, as the same byte patterns don't get repeated, so if there's an errant IDS changing things for example, it tends not to fire the second time.
I've done this before for file repairs. Works a treat, but you sort of wish that torrent used a Merkle hash tree such as the modified THEX standard Tiger Tree Hash. SHA-1's so last century.
We have been doing this for ages for certain high-demand game files that we mirror. While offering torrents for some of our download mirrors is only mildly useful (as we're in Australia we're trying to keep bandwidth on-shore to cut down international traffic, and BT doesn't really help this), it is extremely helpful for the VAST number of users that appear to either have massively crazy Internet problems or are simply unable to drive an HTTP-based downloader and resume downloads.
When a large number of users are having problems downloading or resuming a particular file, I simply create a torrent for them and give them some vague instructions about how to resume it and then generally I never hear from them again. They're happy because they don't have to download a 4 GB game client again from scratch, they don't have to worry about resuming/corrupt downloads, and because it's a torrent it probably feels like they're getting something for free that they shouldn't be.
Big deal, I do this all the time. It also helps when you're downloading files via Torrent and supplement with pieces from the newsgroups. This combination works well because newsgroups often have RAR'd binaries that are missing files. Find a similar package available on a Torrent site and fill in the missing files. Hell you can start the Torrent first and do a Force Check as you add each piece. Why not just download the whole thing via Torrent then? Well nntp is local and much faster... Had I known this was worthy of a slashdot submission I would have done it a long time ago.
For even more fun, if you have two differently-corrupted copies of a file and a torrent to go with it, then you can have BitTorrent stitch them together into a valid file without involving any third parties.
I used Azureus's internal tracker ability and two computers on a local network with the torrent modified to track on one of the machines, and one corrupted copy of the file on each.
Obviously only works if they don't have corruption in common, but it also doesn't require the original torrent file tracker to work anymore.
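The stitching trick can be sketched without any tracker at all. This assumes you have the per-piece SHA-1 hashes from the torrent and that both copies are the right length; the function name and the toy data are hypothetical.

```python
import hashlib

def stitch(copy_a: bytes, copy_b: bytes,
           piece_hashes: list[bytes], piece_len: int) -> bytes:
    """Build a valid file by taking each piece from whichever copy hashes correctly."""
    out = bytearray()
    for i, want in enumerate(piece_hashes):
        start = i * piece_len
        for candidate in (copy_a[start:start + piece_len],
                          copy_b[start:start + piece_len]):
            if hashlib.sha1(candidate).digest() == want:
                out += candidate
                break
        else:
            # Neither copy has a good version of this piece.
            raise ValueError(f"piece {i} is corrupt in both copies")
    return bytes(out)

# Toy data: a 4 KiB "file" in 512-byte pieces, corrupted in different places.
PIECE_LEN = 512
original = bytes(range(256)) * 16
hashes = [hashlib.sha1(original[i:i + PIECE_LEN]).digest()
          for i in range(0, len(original), PIECE_LEN)]

copy_a = bytearray(original)
copy_a[100] ^= 0x01      # damages piece 0
copy_b = bytearray(original)
copy_b[3000] ^= 0x01     # damages piece 5

repaired = stitch(bytes(copy_a), bytes(copy_b), hashes, PIECE_LEN)
```

As the comment says, this fails as soon as the same piece is damaged in both copies, which is the `ValueError` branch.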
Using bit torrent for its actual legal intended use. I love it!!!
Bit torrent has received a bad reputation because of pirates. There are legitimate uses though. I do believe that Doctor Who episodes aren't public domain, so shame on you for that. Might want to be careful what you admit to on /.
I'm not a lawyer though. I just hope it doesn't violate Apple's NDA. Please please please follow the rules. Don't want to see you in prison or slapped with a large fine.
Are you kidding? Sure, some episodes are slow or don't really work, but the second episode of the first "series" (that's "season" in the US) of the new Dr. Who is in my top five favorite sci-fi TV episodes of all time, including all the Star Treks and Babylon 5.
The TCP checksum offloading on nForce 4 motherboards (I have one) was notorious for corrupting TCP packets and allowing them to be received by the application. That's the most likely kind of failure that would be able to reproduce this problem.
09F91102 no, 455FE104 nope, F190A1E8 uh-uh, 7A5F8A09 that's not it, C87294CE no. Ah! 452F6E403CDF10714E41DFAA257D313F.
It's obvious you have no clue how the Internet actually works. Shit happens, but the Internet is designed for it. Dropped packets cause retransmission, not corrupted data; the Internet drops packets *by design* and the entire system is designed around that. Flipped bits happen, but they are detected by multiple checksums which make it astronomically unlikely for corrupt data to remain undetected. Nope; if you receive corrupt data, the blame is squarely on some piece of software fiddling with your packets and changing the checksums to match. Maybe it's the crappy cheap NAT router, or the ISP's deep-packet-inspection P2P filter, or their (not so) transparent HTTP proxy. But whatever the cause, it's almost certain that software is to blame.
I'd bet $100 that if he did the same download over HTTPS, thus preventing software meddling of the packet contents, it would come out perfect.
The first season is written by a schizophrenic that likes lesbian porn. Too many contradictory episodes and only two good ones. They need to get rid of the immortal, the generic asian chick and the geek, do something about the cop that doesn't know how to use a real gun and put a spine in her and flush the traitor down a toilet as organic residue. They really need to drown some of the writers.
They could be the first series to kill off all but one of their franchise characters and several of the support crew!
I'd go on a Vegan diet but the delivery time from Vega is too long. --brownkitty
Oh, and TCP checksumming isn't perfect.
Don't thank God, thank a doctor!
I had the same problem. What's really terrible is that I don't think they ever fixed the problem. That drove me nuts for a few weeks trying to figure out why all my downloads were corrupted.
Anthropic principle: We see the universe the way it is because if it were different we would not be here to see it.
Thanks a bunch. FYI /. users, I took the first one; the other 4 are free!
I'm here for the experience, not the Hyperbole.
I wrote this bash script to do basically the same thing. It uses openssl (built into most Unix systems, and OS X in particular) to create 1 MB check files, basically the same as torrent files. Follow the instructions and it's easy to fix a corrupt download from someone that has a good copy, with the minimum required data transfer. The person with the bad file runs option 1 to make the check file and sends that to the person with the good file. They run option 2, which identifies bad chunks and exports them, which they send back to the first person. Run option 3 and the exports are patched into their download and it's fixed.
Last time I used it, we repaired a 3.8gb transfer by exchanging 11mb of data. (the transfer had been resumed multiple times and apparently one of the transfers glitched its offset or something)
This is easier than BT because using BT can have a bit of a learning curve for seeding. Beta but appears stable. Feedback encouraged.
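The script itself isn't shown here, so the following is only a rough Python sketch of the three options as described, not the author's code. It assumes the check file is a list of per-chunk hashes over 1 MB chunks, that SHA-1 is the hash (the comment only says openssl), and that both files have the same length; all function names are invented.

```python
import hashlib

CHUNK = 1024 * 1024  # 1 MB chunks, as described in the comment

def make_check(data: bytes, chunk: int = CHUNK) -> list[str]:
    """Option 1: the person with the bad file hashes every chunk."""
    return [hashlib.sha1(data[i:i + chunk]).hexdigest()
            for i in range(0, len(data), chunk)]

def export_bad_chunks(good: bytes, check: list[str],
                      chunk: int = CHUNK) -> dict[int, bytes]:
    """Option 2: the person with the good file exports the chunks that differ."""
    good_check = make_check(good, chunk)
    return {i: good[i * chunk:(i + 1) * chunk]
            for i, (theirs, ours) in enumerate(zip(check, good_check))
            if theirs != ours}

def apply_patches(bad: bytes, patches: dict[int, bytes],
                  chunk: int = CHUNK) -> bytes:
    """Option 3: overwrite the damaged chunks with the exported ones."""
    fixed = bytearray(bad)
    for i, payload in patches.items():
        fixed[i * chunk:i * chunk + len(payload)] = payload
    return bytes(fixed)

# Tiny demonstration with 4-byte chunks instead of 1 MB.
good_file = b"abcdefghijkl"
bad_file = b"abcdXfghijkl"
check = make_check(bad_file, 4)
patches = export_bad_chunks(good_file, check, 4)
fixed_file = apply_patches(bad_file, patches, 4)
```

Only the check list travels one way and the differing chunks travel back, which is how a multi-gigabyte repair can cost a few megabytes.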
I work for the Department of Redundancy Department.
I'm pretty sure they fixed the problem a while ago. Download the latest drivers and you should be ok, assuming you still use the motherboard that is.
09F91102 no, 455FE104 nope, F190A1E8 uh-uh, 7A5F8A09 that's not it, C87294CE no. Ah! 452F6E403CDF10714E41DFAA257D313F.
To be fair, very few British cops know how to use guns. At least, if the gun control advocates on my side of the pond can be believed.
Can you be Even More Awesome?!
Oh please, some of the episodes like the whole John Hart ones are just incredibly poor. And they have the most absurd sexual relationships ever and I don't mean the gay thing. They go from deep kissing to completely psychotic let's kill each other mode in two seconds flat. And to let the guy who almost killed your crew and would have killed you not be immortal just go. I much prefer the "normal" characters over the man himself, the less of him the better. I'd like a good action/sci-fi/csi flick which it is at times, but Jack seems to be the "particle of the day" of the series.
Doctor Who is good fun, light entertainment. It's a guy flying around in a blue police box and you're not supposed to take it so seriously, particularly since there's time paradoxes cropping up all over the place. And I think the series shows why I'm soooooooooo glad Star Trek didn't go with a time agency series, and why they should have kept it out of Enterprise too. You go from self-healing time to self-destroying time to being prevented from certain events to changing much bigger events to anti-time to changes that ripple through time slowly/quickly/not at all and you'll never be self-consistent.
It's fun for the odd episode but destroys the whole logic. For example now at the end of Stargate Atlantis, Dr. McKay returned Colonel Sheppard from the future - and only sent with him the location of where to find Teyla. WTF? He could have given him 25 years of science and technology, all wrapped up on the data crystal (the same kind that contain for example the entire replicator code...). Why didn't he? Because he "can't" use that solution. It's like the convenient non-interference with the timeline in Star Trek. It allows you to actually do a little time travel without creating so many issues.
Live today, because you never know what tomorrow brings
I actually saw this happen once ... the astronomically unlikely [1]. TCP accepted the corrupt packet. I'm sure it will never happen again. Fortunately, rsync caught it in the next run.
One problem I ran into once with a certain Intel NIC was that a certain data pattern was always being corrupted. TCP always caught it and dropped the packet. There was no progress beyond that point because the hardware defect always corrupted that data pattern. Turns out there was a run of zeros followed by a certain data byte (I tried a different data byte and with different run lengths and those never got corrupted). What the NIC did was drop 4 bytes, and put 4 bytes of garbage at the end. I suspect it was a clocking synchronization error. I got around the problem by adding the -z option to rsync (which I normally would not have done with an ISO of mostly compressed files). Another way would have been to do the rsync through ssh, either as a session agent (like rsync itself can do) or as a forwarded port (how I do it now for a lot of things).
[1] ... approximately 1 in 2^31-1 chance that the TCP checksum will happen to match when the data is wrong (variance depending on what causes the error in the first place) ... which approaches astronomically unlikely. Take 1 Terabyte of random bits. Calculate the CRC-32 checksum for each 256 byte block. Sort all these checksums. You will find 2 (or more) data blocks with the same checksum (or a repeating pattern in your RNG). Why? Because CRC-32 has 2^32-1 possible states, and you have 2^32 random checksums.
"But whatever the cause, it's almost certain that software is to blame."
Agreed. Since it is at least software's responsibility to detect and fix it, if the problem happens, the famous finger of fault points at the software.
"I'd bet $100 that if he did the same download over HTTPS, thus preventing software meddling of the packet contents, it would come out perfect."
Your $100 is safe.
now we need to go OSS in diesel cars
Obviously you didn't see the finale of series 2. Two of your wishes were fulfilled.
Torchwood is pretty silly (especially in supernatural episodes with ghosts, Death Incarnate, zombies), but still watchable.
This is not really anything BitTorrent specific, but good use of available tools. However, I hope you then checksum verified the completed file with an MD5 from Apple or somebody who has downloaded directly from them. While you probably weren't a target of an attack, you did download software from an unknown source. An attacker could download the SDK, insert malicious code, compute a new set of MD5 sums for the torrent file, upload to pirate bay or some tracker, and then seed the torrent expecting that nobody will attempt an external verification.
I had a shitty old hard drive that was failing CRC (cyclic redundancy checks) but the file I had downloaded was 4 gigs, and there were a few corrupt pieces, but by copying it to another hard drive, and replacing just the corrupt pieces I saved myself a shit load of bandwidth.
Orbis terrarum est non altus satis
Yeah, it's fixed with the latest driver, but they had to disable most of the TCP offloading. I had the same problem on my NF4 board. I chucked the NV Active Armor firewall software and never had a problem since.
http://techreport.com/discussions.x/9483
Thanks! The top code worked for me, and the bottom one was already used... (:
Whoever stated that signature sizes should be limited to one hundred and twenty characters can just go ahead and kiss my
OK, maybe not tonight-at-eleven news, but this is a totally clever hack, which is exactly what many people on Slashdot live for.
On a related note, I came up with a roundabout way to do something similar to help a friend who was having trouble moving large files. On the remote end, split the file into small chunks. Then md5 them all and save those results into a text file. Then, ftp them, and when they arrive, md5 them all again and compare your values to what's in the text file. If any don't match, re-download them; else cat them all together and you should be good.
I don't think this would have worked for the submitter, even if he knew someone with a known-good copy of the file, because I imagine these things work linearly, so if the bad part of the file was at the halfway mark, every chunk after that would have the wrong checksum. His method was very, very clever.
Dear Slashdot: next time you want to mess with the site, add a rich-text editor for comments.
Got the second one, thanks.
blog & fiction: jd87
TCP has a 16 bit checksum. That means there's a 1 in 2^16 chance of an error getting by the checksum. Let's assume, for a moment, that the packets were sent 1kb at a time (ethernet max is greater than this, but it's an easy number). In a 1.5Gb file (assuming base 10 throughout for simplicity), this means a total of 1,500,000 packets must be transmitted. Using only the TCP checksum, 22 of these packets would be corrupt, but allowed through. Even though there are additional checks at layer 2, the fact is that when dealing with large amounts of data, relying on TCP for data integrity is not enough.
I have been using Torrents for this very reason.
I was being required to copy sometimes 10-20GB of Virtual Machine Image Files from Server to PC or PC to PC on up to 40 machines at one time.
This was taking way too long and copies were not perfect.
Restoration of VM images presented the same problem.
Updating a VM meant redistribution of the entire file to all machines again.
Using (Micro) Torrent and my own tracker changed all that.
I came up with the following solution using all available resources.
First I started by copying all images to workstations to a separate partition. (about 200GB of VM's.)
Then I created my own internal Tracker and Web Page to host torrents.
The results were:
1. Extremely efficient use of all available network hard drive space.
2. Utilizes every machine on the network to distribute the files.
3. Works extremely well restoring or redistributing the VM's to any one machine or several machines at once. (The more the better)
4. 100% accuracy in distribution.
5. The ability to quickly modify any one image on any machine, recreate the torrent(hash) and then update that image across hundreds of machines very quickly.
In other words, modifying a file only means that the machines only have to download the bits that changed not the whole image again.
6. With Micro Torrent any machine can be used as the tracker.
7. The Tracker is also the "master" file server, however any machine can be used to modify and upload a change.
Just recreate and re-upload the new torrent replacing the old one. Remember that a torrent file serving network is Not a server centric file sharing system.
I used to download Linux ISO files directly from FTP or web sites.
Nothing upset me more than downloading an ISO only to find out that after I burned it to CD/DVD, it had CRC errors and random lockups during an install.
After BitTorrent with error correcting, the problem was solved. It works for other things as well.
Commercial software companies can offer ISO downloads via BitTorrent trackers and send the install CD Key via email. That way customers just burn the CD/DVD and install the key they got in email.
Same thing with media files: download via BitTorrent, then enter an unlock key you get via email when you buy it.
Businesses are stupid if they ignore the benefits of BitTorrent.
Even piracy doesn't hurt that much as most people want to try the software before they buy it. It is like kicking the tires before buying a car and taking it out for a test drive before signing the papers to buy it.
Remember, Slashdot does not have a -1 disagree moderation, and no, troll, flamebait, and overrated are not substitutes.
No, I've seen this as a requirement for a few private trackers. It put me off on posting as I'm not going to waste my time.
I've used it to finish up the last 3% of a jigdo build when I was missing a file or two. Worked great.
I wonder if you could legitimately argue that you were verifying the data in a personal backup of media that you had?
Unless I am mistaken, it is perfectly legal to make a backup of data that you own right? So, if you already own an item, would downloading it to have a backup be a legal thing to do?
And if that's the case, I wonder what the legal implications are in cases where the RIAA comes down on people who have been "participating in file sharing" activities.
Moved to http://soylentnews.org/. You are invited to join us too!
Assuming you can find a source that serves a known-good file via rsync, it's a very efficient way to fix up a damaged copy.
I once had to download a CD image over a dialup connection when I was at a client site in Mexico. I did the initial download via FTP, but it got corrupted and the MD5 sum didn't match the correct value. It had taken almost two full days to download the first time (over a weekend, so shipping a CD wouldn't have been faster), but rsync was able to find and correct the corrupted sections in less than five minutes.
Rsync is also an unbeatable tool for making incremental backups. I use it (rather, I use rdiff-backup, which uses rsync) to back up a server with almost 30 GiB of data, nightly, over a standard cable modem connection. Last night's, for example, took 57 minutes to run, found 527 changed files totaling 1.36 GiB of 26.2 GiB total. I don't know how much it actually downloaded, but I'm sure it was much less than 1.36 GiB.
Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
I agree TCP/IP has problems with raw file transfers.
However, a Torrent system ensures the delivery of the file based on the file's hash value.
This is very beneficial if an update or recovery of the original file needs to be made.
Simply recreate the torrent, upload the updated torrent.
Once the clients get the new torrent they only download the changes to that file.
For instance.
A 10GB virtual image file needs to be changed.
Make the changes needed, recreate the torrent, upload the new torrent.
Clients download the new torrent for the same file.
Restart the download of that file to the same location.
The client makes a hash check.
This time according to the hash value only 12% of the file has changed.
Only the pieces needed to match the hash are downloaded.
Not only that, but because it is an asynchronous file transfer across multiple machines, on a large network, the update occurs incredibly fast.
This works 100% of the time.
Since a central file server is not needed, and any machine on the network can act as the tracker, hardware failure is maybe the biggest concern.
But then again a failed component is always inevitable.
With a torrent system corrupted data transmission is no longer a real problem.
It's called NVidia ForceWare Network Access Manager (NAM) now on 680 and 780 boards. And it's still a piece of crap. At least 2 third-party products (Azureus was one) mention in their FAQs to uninstall NAM to avoid crashes. I had Azureus crashes all the time, prevented (sort of) only by going into Task Manager and setting the affinity to the second processor, until I uninstalled NAM.
Good motherboards, bad motherboard drivers.
The first rule of Usenet: don't talk about Usenet.
( Redundancy is ) ^ n
Most do, but you may have to enable it in the program's options.
What wouldn't Jesus do?!
The End of the World? Not a bad episode, but its main purpose was to demonstrate that the new Doctor Who has a decent budget. 'Blink!' from season three is one of the best episodes ever; classic series included.
--
Reverse outsourcing: it's the future
You'll lose your $100
The checksum on a TCP packet is only 16 bits; that's 1 in 65536. You can only get about 100Mbytes in 65536 packets!
The lower layers normally have better CRCs so this is rather unlikely to cause a real problem but if something is chattering the TCP CRC is likely to fail too.
BTW: it's also a rather poor redundancy check even for 16bits.
Transparent proxies also kill large downloads; especially when the browser is not IE. I hear "not IE" also included IE7!
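On the point that the TCP checksum is a poor check even for 16 bits: it is a ones' complement sum of 16-bit words (RFC 1071 style) rather than a true CRC, so some kinds of damage cancel out. A minimal sketch, covering only a payload and ignoring the pseudo-header that real TCP also sums; the sample byte strings are arbitrary:

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement of the ones' complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"              # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:               # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

original = b"\x12\x34\xab\xcd\x00\x01"

# Two 16-bit words swapped: the sum does not depend on word order.
swapped = b"\xab\xcd\x12\x34\x00\x01"

# One word incremented, another decremented: the changes cancel.
offsetting = b"\x12\x35\xab\xcc\x00\x01"

same_after_swap = internet_checksum(original) == internet_checksum(swapped)
same_after_offset = internet_checksum(original) == internet_checksum(offsetting)
```

Both comparisons come out equal even though the data differs, which is the kind of damage a per-piece SHA-1 would catch.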
Besides rsync & torrents, you can also repair files with metalinks, which require nothing extra on the server, and are not blocked like p2p in some places.
This is why so many distributions use them for ISO downloads, so you don't have to restart large downloads from the beginning.
I've been doing this with linux ISOs for quite some time. Never thought it could be unknown to anyone.
...yesterday I used BitTorrent to repair an Ubuntu Studio iso that I downloaded from my local ftp firehose. The MD5SUMS mismatched, so I fetched the matching torrent file, fired up KTorrent and pointed it at the dir I downloaded the iso into. Only 1 block needed repairing, saving me a helluva long download.
The Hacker's Guide To The Kernel: Don't panic()!
First, as rdebath argues, you only get 16 bits of CRC on TCP headers.
And furthermore, if you start calculating CRCs off random data, chances are good (close to 40%) that you will get a collision (two chunks of data with the same CRC) by the 256th try, and better than even by around the 300th (this is known as the "birthday paradox" in cryptography). Of course, to be really sure to get a collision you will need to try at most 65537 values; but you will reach a very high probability of clash much sooner than intuition may tell you.
See birthday attack for the math.
Please people. It's very easy. Just go into your settings and look for something that says Protocol Encryption and say 'Enabled'. If everyone gets into this habit, we will all live in a far better world. In fact, encrypt any application (that traverses the Net) you can. Application layer is nobody's business but your own.
"One problem I ran into once with a certain Intel NIC was that a certain data pattern was always being corrupted. TCP always caught it and dropped the packet. There was no progress beyond that point because the hardware defect always corrupted that data pattern."
I'm pretty sure I had this exact bug with World of Warcraft. The game would get stuck retransmitting the same packet over and over again. It could be fixed by turning off hardware checksumming on the NIC.
Rsync also works nicely for "upgrading" CD images of beta Ubuntu releases to the final version, and for, say, making a Kubuntu Live CD out of the normal GNOME-based Ubuntu one. It has the advantage that it can spot blocks that have moved around in the new version but are still the same, even if they're no longer on block boundaries.
I was sick of multipart files in 1991, ha!
All your points are solved by software; split rars are a hack on deficient protocols or routers that limit BW per TCP connection.
Oh and, what is it with these stupid long ass crap file names, S05E03-XDVD-HPEP-LOL-FUKME.avi
This is not 1972 COBOL days dudes; if it's unlikely to be a hit like Friends, stick to one digit for seasons, S3, E03 is ok.
Kill the lame postfix acronyms, except sensible ones not in caps as they take more pixels; (dvd) or (ts) is smaller.
As Gordon Ramsay says, "you guys are fuking tossers, you're a shit head".
TvShowName-S4ep23.avi is nicer; I always rename because they are TOO DAMN long on HTPC systems. Again, this ain't 1200bps modem days. (they weren't this bad btw)
Oh and another pet peeve of mine to you so-called elites: stop resizing 720 rips or TV shows to 604 or 624. If it's done only to compress better, or to play on PSP, then why should 90% suffer? 720 original is best on 42in LCDs. Stop resizing because you own a crap 12in CRT. Or want to watch TV shows on a PSP; cartoons are ok, but not good TV shows. Don't give me this 624 is NTSC in USA shit, only trailer trash own CRTs. If you can afford to download, you own an LCD. If you own a shit TV, well you'll get a better quality anyway. Again, read my lips, 624 or 608 sucks 1980s style.
Liberty freedom are no1, not dicks in suits.
Or are they too 1.0 for the kids of today?
I am trolling
There is a chance of 1 in 2^16 to get a bad packet that produces the right checksum, but that doesn't mean that one in 2^16 packets will be corrupt (on average).
For a packet to be bad and undetected, two conditions must hold: it must be corrupt to begin with, and it must be lucky enough to produce the right checksum. The probability depends on the error rate of the connection as well as the weakness of the checksum.
In some cases you'd get a lot more errors (picking up the phone on a dial-up connection produces many undetected errors, a lot more than one in 2^16), and in some cases a lot less (running the server and the client on the same machine).
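To put rough numbers on that: the expected count of bad-but-accepted packets is the packet count times the chance a packet arrives corrupt times the chance the checksum still matches. The sketch below assumes a corrupt packet passes a 16-bit checksum with probability 1 in 2^16 (a simplification, since real error patterns are not uniformly random), and the one-in-a-million corruption rate is a made-up figure for illustration.

```python
def expected_undetected(packets: int, p_corrupt: float,
                        checksum_bits: int = 16) -> float:
    """Expected number of corrupt packets that still pass the checksum."""
    return packets * p_corrupt / 2 ** checksum_bits

PACKETS = 1_500_000  # 1.5 GB sent in 1 kB packets, as in the earlier comment

# Worst case: every single packet arrives corrupt. This reproduces the
# "22 packets" figure, which is why that figure overstates the problem.
all_corrupt = expected_undetected(PACKETS, 1.0)

# Hypothetical link where one packet in a million reaches TCP corrupted.
rare_corruption = expected_undetected(PACKETS, 1e-6)
```

With the hypothetical one-in-a-million rate the expected count is far below one per download, while a badly broken NIC or router pushes it back toward the worst case.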
I tried using it on our current administration. It showed up as being 29% complete, but unfortunately nobody's seeding the uncorrupted parts that we're missing. :(
For your security, this post has been encrypted with ROT-13, twice.
Funny how you complain about how bad others do your dirty work while you apparently save enough money because of it to watch your favorite shows on expensive hardware.
If there is one thing to be learned on slashdot, it has to be sarcasm.
I have also noticed that the P2P softwares as a group seem to offer excellent features in the area of moving files, large and small, and not corrupting said files, even in high noise/disconnect environments. It's a feature set that should make its way into web browsers/common downloaders, but seems to just not happen. Anytime I see a file to download and it is over 300MB, I'm like, "oh-boy this could be an adventure"
:-)
The birthday paradox involves a population in which finding ANY two (or more) of the same is considered a match. That does not apply to a TCP header checksum because the comparison needs to be made against ONE SPECIFIC checksum (e.g. the one the packet in question has). You get a packet and it has a checksum. You calculate a checksum from the data. Do they match or not when the data is corrupted? That's not a birthday paradox.
The birthday paradox DOES apply in cases where you want to create TWO packets with the same checksum, but it doesn't matter which checksum that is. You can create two messages with the same hash in the case of cryptography where there is a weak hash. But in the case of error checking, it's not about creating any pair of matching checksums; it's about creating one checksum that matches one you already have that you cannot change. In birthday terms, it's about finding someone in the population that has the same birthday as you do.
OK, it's 16 bits. My bad. TCP bad. But birthday paradox does not apply here.
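The two situations can be compared numerically. A sketch assuming checksums are uniformly distributed over 2^16 values (real checksums of real data are not perfectly uniform); the function names are mine:

```python
def p_any_pair(n: int, space: int = 2 ** 16) -> float:
    """Birthday case: chance that ANY two of n random checksums collide."""
    p_all_distinct = 1.0
    for k in range(n):
        p_all_distinct *= (space - k) / space
    return 1.0 - p_all_distinct

def p_specific(space: int = 2 ** 16) -> float:
    """Error-checking case: chance one corrupted packet matches ITS OWN checksum."""
    return 1.0 / space

at_256 = p_any_pair(256)       # a little under 40% by this calculation
at_300ish = p_any_pair(310)    # past the 50% mark
one_packet = p_specific()      # about 0.0015%, however many packets came before
```

The first number grows quickly with n because any pair counts; the second stays fixed per packet, which is the distinction being drawn above.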
now we need to go OSS in diesel cars
The security implications of that have always bothered me.
I wonder, does the current Disk Utility app phone home to check the hash? Not that that provides more than a speed-bump for the middleman.
Of course, it is somewhat useful for checking file integrity for issues other than crafted corruption.
Computer memory is just fancy paper, CPUs just fancy pens with fancy erasers; the 'net is just a fancy backyard fence.
Actually, I'm thinking I may have just undone a perfectly good disk swap recently when the problem might have been at Apple's end.
I guess I need to test that disk.
Computer memory is just fancy paper, CPUs just fancy pens with fancy erasers; the 'net is just a fancy backyard fence.