Slashback: Flashmob, Currency, Verification
Reminder of your scheduled spontaneous appointment. Zero_K writes "As previously posted on Slashdot and the NY Times, the University of San Francisco's Computer Science department is building a 'flash mob' supercomputer on April 3rd. On their newly updated official web-site (Main Site, ISO's) the team has now posted the ISO image of their custom morphix that will be used to boot all the computers into the cluster, documentation is on the website (under 'downloads') and on the CD (index.html). I personally plan on downloading and testing this ISO tonight. And after the cluster is taken off line, there will be a massive LAN PARTY (Possibly one of the biggest in San Francisco...) On a 10-Gigabit LAN...Oh sweetness ... So if you are in or around the SF Bay Area on April 3rd, be sure to sign up and bring your laptop or desktop to campus and help make history."
Whaddya mean, "no pun intended"? Rudiger writes "After the dust (no pun intended) has settled around the whole Operation Dust Bunny thing, McAfee updates their signature database classifying Dust Bunny as an application. To be more specific: 'This program is detected as a "potentially unwanted application."' They also say 'This is not a virus or trojan.' Should we leave it to the experts this time?"
Would you read Atlas Shrugged on this screen? An anonymous reader writes "The so-called 'electronic paper,' being a high-clarity monochrome display to become a foundation for comfortable and inexpensive 'electronic papers,' has finally shown its face. The new electronic paper, which looks a bit like an iPod, has 10MB memory, keyboard, Memory Stick PRO slot, voice recorder, speaker, and headphones output, and USB2.0 interface."
(We mentioned the device yesterday, but this link provides better images of it.)
Now they're Pragmatic Publishers as well -- much success! AndyHunt writes "As you may have heard, the Pragmatic Programmers have started their own publishing company (see Slashdot reviews here and here). We've just signed our first outside author: Mike Clark, editor of the JUnit FAQ and developer of JUnitPerf and JDepend. He'll be writing the eagerly-anticipated Pragmatic Project Automation book, the third volume in our Jolt Productivity award-winning series."
Exactly how many bits, Ma'am? And in what order, did you say? jlcooke writes "Two months (almost to the day) after getting slashdotted for an innocent post to sci.crypt - the MD5CRK project has launched. The aim is to get the thousands of applications and websites to drop MD5 for SHA-1 or SHA-256 by finding a counter-example of a security requirement in MD5. Press Release is here."
How to take criticism, by example. slashdot_commentator writes "Eric S. Raymond has recently written a wonderful piece explaining to the Linux zealot why it may not be the operating system of choice of all users. (Or what user aspects open source developers need to focus on to further Linux World Domination.) The op-ed specifically focuses on the CUPS printing system. (But it would be a mistake to dismiss it as a screed against CUPS.) The CUPS authors surprisingly acknowledged ESR's points, and he wrote a followup to the article."
Hitting them where it figuratively hurts. Ian Wilson writes with a followup to the Slashdot post earlier this month on "website thieves stealing content and designs from others, taken from silicon.com. Well, now silicon.com is reporting that it has contacted the offending site's advertisers and forced them to stop paying ad revenues - thus effectively crippling the illegal site - after all, no revenue, no reason to run the site."
Express your appreciation with PizzaPal. Chuck writes "After you guys published the article on $20 bills exploding when microwaved, a co-worker of mine went to put his soup in the microwave and found a $20 bill in it. Too bad it was an older one, but someone around the office must have left it in there after reading your article. The co-worker then took me out to lunch. Thanks, Slashdot!"
I've seen that before... it's when I get modded -1 Flamebait within 30 seconds of posting!
Hmm, just went upstairs and checked my own microwave for cash. Nothing. Maybe I should get my dimwitted roommates to start reading Slashdot.
Kip Hawley is an idiot.
The other day, there was a bitTorrent link in the article, and I realized that I didn't have Bit Torrent installed. So when I went to download it, McAfee told me it was Spyware.
Bit Torrent is spyware?
Yet another reason for me to hate McAfee.
well, that would be fine if you could say that you don't want "potentially unwanted programs" and it was a clear option. being that I don't have the program, I can't say if that's the case for sure.
people are starting to realize now that they do indeed have many "potentially unwanted programs" that they in fact do not want, and I think that they would recognize such an option if they saw it.
on the subject of whether or not dustbunny is actually a virus, I think it's no different than many other pieces of spyware that windows users typically use adaware or spybot to exterminate. so let's leave it to those programs, which seem to be pretty popular with even average users these days.
Hmm... put an 802.11b interface on this thing, and it won't matter that it has a trivially small amount of memory...
"Freedom means freedom for everybody" -- Dick Cheney
clicked just after my boss walked by. AFTER thank god.
a co-worker of mine went to put his soup in the microwave and found a $20 bill in it.
He found a $20 bill in his soup?
At least we slashdotted their site. So I guess there is probably a gap in there where they didn't get all the data they were looking for.
So, let me get this straight. He went to microwave his soup and found a $20 in it? That's better than a fly, I suppose.
This saddens me. I just finished implementing an md5 password hashing routine for a web application.
At least it's not production yet, so I can switch it over.
See? This is why my bosses should let me read Slashdot at work.
If they can do that, make notes using handwriting easy (no recognition required), I'd love that...
But I bet the main opponents to this would be book publishers who charge exorbitant amounts for "new editions" where hardly anything was changed. oh well.
If we are trying to get people to move away from MD5 sums, what do we use? CRC?
There are only 10 kinds of people in this world... those who understand binary and those who don't
I had 10,000 assholes on my screen and so many being launched I couldn't stop them.
Welcome to Slashdot.
Opinions on the Twiddler2 hand-held keyboard?
That's all for today.
Good night.
I couldn't help but think back to how that article on microwaving stuff just turned to chaos with all of the discussion on what we could microwave for fun. I bet a bunch of us broke microwaves and somewhere, someone is keeping track saying "Wow, there's a big demand on microwaves this week." Now it's going to happen all over again bringing this article back...
Microwave my 802.11b card...I wonder....
Anyone wanna outsource the infrastructure and SW for the Lan party to us indians? ;-)
Jokes apart, I'd really like to fly down to USA to be a part of the lan party and see how those guys manage things. It's one thing to have a lan party with 100 ppl but using up complete subnets is one different league!
Lord of the Binges.
SHA-1 isn't really "their" message digest algorithm, they're just recommending it as a replacement for MD5, which they're trying to crack.
You have three different "MD5 sum" utilities that all give different checksums for the same data? If so, then at least two of them aren't actually MD5 utilities, in the sense that they don't compute MD5 sums. *cough*
Too bad nobody agrees on how MD5 should be calculated.
Wow, really, you know, someone should like, write an RFC for it or something, then maybe they could all agree!
The point is that, unlike a command tool for techies that should give them lots of choices, the goal of a GUI is to present the user with as few decision points as possible.
Remember the Macintosh dictum that the user should never have to tell the machine anything that it knows or can deduce for itself.
this is as clueful as it gets. Most app designers should heed him
Time flies like an arrow, fruit flies like a banana.
I see this every single day. The open source community (as it were) is full of people who want to use and like operating systems like Linux and BSD but are just too fucking afraid of even uttering anything that might reveal their ignorance (and I don't use that word in a negative sense) of whatever it is they're trying to accomplish with their computers.
Slashdot and USENET are full of endless threads about how easy it is to do this-or-that and if you haven't figured it out you must be supremely stupid and lazy. "What, you want it in a fucking silver plate?". Normal people (the ones not buying into open source right now) are petrified at this. They eventually either figure out how to do it ($deity bless Google) or just give up.
Without gross generalizations of course, I can't claim that everyone is this way. But there seems to be a troubling majority of zealots who are just so fantastically out there in their claims that [insert technology here] is so easy to use that even a "brain dead Windoze luser" must be able to figure it out, so they just cannot figure out why everyone hasn't dumped "M$". I mean, it's all so easy and effortless.
Maybe this will indeed be a wake up call for everyone.
You're trolling. It is a relatively simple algorithm and an old standard.
ESR just jumped A LOT of points in my book. I haven't read anything so dead on in the community in ages. But add to that his level of tact and his *gasp* sympathy for the user. Wow. Definitely worth the read.
Quack, quack.
Use the IP address rather than the domain name in the redirect, it will make it far less obvious *grin*
Good luck brother!
You won't be sorry!
Or grab the nifty new (v1.1 released today) md5deep. Computes MD5, works recursively and most any platform too.
If you wrote code to generate the checksum(s) and it's not working then you have a problem between the keyboard and chair, not with the algorithm. That's a standard that is not OS, platform or language specific.
Be extraordinarily careful when trying to take a MD5 sum of a text file. Most operating systems will give you different file contents for a text file, depending on how you ask to open and read the file. If you have MD5 utils that aren't explicitly requesting all files in binary mode, then they are being sloppy.
You also have to be careful with text files that they aren't being modified on the fly when being transferred between machines.
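To make the binary-mode point concrete, here is a minimal Python sketch. The file names and the helper `md5_of_file` are made up for illustration; the only claim is that the same text with LF versus CRLF line endings is different bytes, so it hashes differently, and that reading in binary mode hashes exactly what is on disk.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=65536):
    """Hash the raw bytes of a file. Mode 'rb' prevents newline translation."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns the empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# The same two lines of text under two line-ending conventions.
unix_bytes = b"line one\nline two\n"
dos_bytes = unix_bytes.replace(b"\n", b"\r\n")

with tempfile.TemporaryDirectory() as tmp:
    unix_path = os.path.join(tmp, "unix.txt")  # hypothetical file names
    dos_path = os.path.join(tmp, "dos.txt")
    with open(unix_path, "wb") as out:
        out.write(unix_bytes)
    with open(dos_path, "wb") as out:
        out.write(dos_bytes)
    unix_sum = md5_of_file(unix_path)
    dos_sum = md5_of_file(dos_path)

print("LF  :", unix_sum)
print("CRLF:", dos_sum)
print("digests differ:", unix_sum != dos_sum)
```

If a transfer tool or a text-mode read silently rewrites line endings, the digest changes even though the file "looks" the same in an editor.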
...or does ESR come across like he has the biggest ego whenever he writes something? What with all the adoring emails he includes the whole thing sounds like an exercise in self-gratification...
That's strange -- are you feeding it the same data?
I have a few implementations of MD5 that I use for various apps that ALL give the same results. Sometimes you have to make sure that character sets and otherwise are being processed the same way, and it all comes out the same way.
Let's see -- I have the PHP builtin function, a perl implementation (for systems that don't have it built into the OS), a Javascript one and one that was for just plain ASP (not the .NET -- never used it yet). Hell, I use it to pass off authentication between these languages when I can't get away with using the same language throughout. All work exactly the same...and I'm not even that great of a programmer...
I've been using this one and it seems to work. At least it generates the same digests as md5sum. The problem is it's Windows-only, but it's also extremely fast. I was using another one that had a gui (!) but it was so excruciatingly slow that I had to dump it.
I'll give yours a try. Cheers.
Not a flame or anything, but did you check the source for the Bittorrent client you downloaded? SpywareInfo shows there is a Bittorrent client floating around with an infection of spyware.
Just for grins, I checked my machine and McAfee ( Virusscan Enterprise 7.0.0, virus defs 4341) didn't complain about ABC [Yet Another Bittorrent Client] 2.6.5 being on my machine. (Nor did AdAware 6.0.) So McAfee doesn't go after all Bittorrent clients.
Yeah. The Daily Show on Comedy Central had this last week.
Let's jump the gun and win the race ahead of those guys. That'll show them!
So you spend all these resources to find one collision amongst 2^128 combinations.... not really that useful. Sure it is significant, but does it really bring down the entire MD5 infrastructure?
To really destroy MD5, you need to either be able to reverse the plaintext from the hash, or build a lookup table where you can get the plaintext from the hash.
Both of these seem infeasible, especially the lookup table, so things like Paypal using MD5, which the web site uses as an example, doesn't seem quite true.
But it was a "Real" story, and a scary one I might add.
If the dollar is an "I owe you nothing", then the Euro is a "Who owes you nothing." - Doug Casey
Why is this shit moderated up?
You say people are too lazy to click a button, but you expect them to read the manual?
That's fucking hilarious!
Support for anti-banknote technology so that none of these Linux criminals/pirates are going to steal money from Lexmark with superior technology!
Given how much whining I read on this site about outsourcing and monopolies, I seriously doubt most Slashdotters would read Atlas Shrugged in any format! (ducks...)
for a 7.5" by 5" device with 800x600 4-tone grayscale and 10 megs they want how much??? Damn thing probably doesn't even have a decent processor, can't do 1/10th the things a 5 yr old Palm could do and they're charging $400?!? Did I warp back to 1984? Sure it's not a Mac?
Let Dell copy it and sell them for $149.
my karma will be here long after I'm gone
To really destroy MD5, you need to either be able to reverse the plaintext from the hash, or build a lookup table where you can get the plaintext from the hash.
Exactly which plain text are you finding? There are (for the purposes of this at least) an infinite number of plain texts for each MD5 hash.
If so, tell somebody at xbox-linux
Apart from this, does it support any other format? I'd love to have something like this to read the countless PDF and HTML books I have, but if I had to buy them again in BBeB format, it's not quite as cool.
I frequently use MD5 in my code, for verifying a file's integrity. I do not use SHA-1 or SHA256, because they run a lot slower than MD5, without providing a realistically better guarantee that a file contains what it did at the time of its creation (if 128 bits leaves a significant chance of collision, you have bigger problems than choice of hashing algorithms... Such as how to store over a trillion yottabytes, which corresponds to one bit per 10 picograms assuming you used the entire Earth as a storage device).
Now, cryptographically, MD5 does not have the same "strength" as the SHA256. If you want to prevent tampering, you should most certainly switch to an SHA. But to just check the validity of a large block of data (such that a mere CRC doesn't suffice), MD5 works beautifully.
Additionally, I would point out to those who seem to believe finding a single MD5 collision would invalidate the whole algorithm - BS. For SHA256, going through every possible 257 bit block, you can guarantee a collision. For any hashing algorithm, that will hold true. I don't care if someone came up with a quantum hash (pulled from my posterior, since quantum-blah seems like the word of the day for magical guarantees of computational perfection), you'll still have at least one collision in N+1 bits, where the hash generates N bits.
So can we drop the SHA elitism that seems to have infected people lately? If you want to waste time in your code, go right ahead. But don't fault those of us who actually understand that, outside the realm of hard cryptography, MD5 more than suffices as an all around good hashing algorithm.
LOL. You mention in your own post that MD5 is 128 bits long. If you just restrict yourself to documents that are, say, 10mb big, that means there are 2^81920 possible plaintext documents for each MD5 hash. Granted, only some of them will look remotely like english, STILL... 2^81920 is quite enough to come up with many plaintext documents per hash. If you restrict yourself to keys
As far as I've understood it, the primary purpose is to demonstrate that cracking MD5 is realistic. If this project can, then anyone with decent resources (the MD5CRK FAQ claims $100,000 would be enough) can do it. Also, additional collisions will most likely be found soon after the first one (the probability of finding collisions increases), and the data collected from the search can be used for future efforts (e.g. for analysis that might reveal actual statistical flaws in the algorithm).
OVERRATED
See for yourself.
Dead as of this comment posting.
Looks like yanking their revenue stream actually worked. Good job, guys, and thanks to Webclients for doing the right thing and pulling the ads.
In Korea, long hair is for old people!
Cause politics suck. Frankly, I'm tired of hearing about politics and business/lawsuits and all that junk. Not everyone cares for that shit.
MD5 is standardized and portable.
Perhaps some of the utilities you are using consider file metadata when generating the checksum?
Also beware of implicit conversions being done to your data by your I/O libraries, as other posters have noted.
DNA just wants to be free...
That is an incorrect assumption. The fundamental requirement is: It is hard (next to impossible) to find two inputs which produce the same digest (and still make sense
The message digest is usually shorter than the message, so this means that the digest contains less "information" than the message. Which means there will be more than one message for the same digest. This loss of "information" means also that you can't reverse a hash to get the original message and be 100% certain you have the right message. There is an infinite number of messages that produce that hash.
Don't shout, they'll all want one...
[.@.tumbleweed...@...]
Sorry.
I'll get my coat.
PS. Aw damn. Just noticed your nick. That's subliminableness for you.
PPS Bushism intended.
So yeah, someone has to start a slashdot team. I mean, we owe it to them for destroying their site a while back.
Join!
SAILING MISHAP
The argument boils down to this:
- A cryptographic hash function must meet three criteria: non-invertible, 1st image collision resistance (given m, finding m' such that h(m) = h(m')) and 2nd collision resistance (finding m and m' such that h(m) = h(m')).
- There are some applications where 1st or 2nd collision resistance is not required - file integrity, web certificate verification and several others are not among them.
- If I can find over $100,000USD worth stealing by producing a collision in MD5 (inspect your bank's website certificate, most US firms use MD5) then it's a business proposition, not an egghead research idea.
- Is a 56bit key secure? Bet you can't find the one I'm thinking of in the next 24 hrs. Is a 128bit hash secure when its effective strength is 64 bits? If you're a bank, no. If you're joe slashdotter, yes.
Almost forgot your comment about speed. SHA-1 is slightly slower than MD5. SHA-256 is slightly slower than SHA-1. SHA-384/512 use 64 bit operations so it is much slower on 32bit systems. In short, your concerns about speed are unfounded. Read on.
...
Run this command:
openssl speed md5 sha1
I get:
The 'numbers' are in 1000s of bytes per second processed.
type       16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
md5       13426.71k    46361.18k   124663.83k   222340.64k   286203.62k
sha1      11175.12k    30058.96k    69783.42k   104107.06k   121809.96k
I also ran "time md5sum file94mb" and "time sha1sum file94mb" 3 times in succession. The performance is much closer.
a959b7de4f11fe89ba57ecc6fe2f6a07 file94mb
real 0m1.070s
user 0m0.860s
sys 0m0.060s
a959b7de4f11fe89ba57ecc6fe2f6a07 file94mb
real 0m1.070s
user 0m0.850s
sys 0m0.070s
a959b7de4f11fe89ba57ecc6fe2f6a07 file94mb
real 0m1.071s
user 0m0.810s
sys 0m0.110s
5d926755ef975a8900b89b514feac9ded29c4477 file94mb
real 0m1.538s
user 0m1.260s
sys 0m0.060s
5d926755ef975a8900b89b514feac9ded29c4477 file94mb
real 0m1.524s
user 0m1.270s
sys 0m0.040s
5d926755ef975a8900b89b514feac9ded29c4477 file94mb
real 0m1.520s
user 0m1.280s
sys 0m0.030s
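For anyone without openssl handy, a rough equivalent of the comparison above can be sketched with Python's standard hashlib. This is only an illustration: the 8 MB buffer, the round count, and the helper name `throughput_mb_per_s` are arbitrary choices, and the figures it prints depend entirely on the machine and the library build, so no expected numbers are given.

```python
import hashlib
import time


def throughput_mb_per_s(algorithm, payload, rounds=5):
    """Hash the same in-memory payload several times and return MB/s."""
    start = time.perf_counter()
    for _ in range(rounds):
        hashlib.new(algorithm, payload).digest()
    elapsed = time.perf_counter() - start
    return len(payload) * rounds / elapsed / 1e6


# 8 MB of zero bytes: large enough that per-call overhead is negligible.
payload = bytes(8 * 1024 * 1024)

results = {
    name: throughput_mb_per_s(name, payload)
    for name in ("md5", "sha1", "sha256")
}

for name, rate in results.items():
    print(f"{name:8s}{rate:10.1f} MB/s")
```

Hashing from memory isolates the algorithm cost; the md5sum and sha1sum timings above also include reading the file, which is one reason those wall-clock numbers sit closer together.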
Suddenly, I'm not so worried about Indians taking my job.
Are the MD5CRK folks trolling, smoking crack, or just not explaining themselves very well?
They "aim to disprove one of the fundamental requirements of a secure message digest: No two inputs can be found which produce the same digest - this is also known as a collision."
MD5 gives a 128-bit digest. There are more than 2^128 possible messages. Of course there are collisions. What MD5 claims is that the difficulty of coming up with two messages having the same message digest is on the order of 2^64 operations, and that the difficulty of coming up with any message having a given message digest is on the order of 2^128 operations.
No digest algorithm can claim to be free of collisions; they are many-to-one mappings.
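The square-root cost mentioned above (about 2^64 work for a 128-bit digest) is easy to see at toy scale. The sketch below truncates MD5 to 32 bits, which is purely an assumption to make the search finish in a moment, and stores every digest seen until two different messages land on the same value. The helper names are invented for this example.

```python
import hashlib
from itertools import count


def truncated_md5(data, bits=32):
    """Return the top `bits` bits of the MD5 digest as an integer."""
    full = int.from_bytes(hashlib.md5(data).digest(), "big")
    return full >> (128 - bits)


def find_collision(bits=32):
    """Birthday search: remember every digest until one repeats."""
    seen = {}
    for i in count():
        message = str(i).encode()
        tag = truncated_md5(message, bits)
        if tag in seen:
            return seen[tag], message, i + 1
        seen[tag] = message


first, second, tries = find_collision(32)
print(first, second, "collide after", tries, "messages")
print("2**16 =", 2 ** 16, "for comparison")
```

The number of messages tried should come out on the order of 2^16, not 2^32. Scaling the same argument to the full 128 bits gives the 2^64 figure, and also shows why the table-based approach here cannot be used directly: storing 2^64 digests is not practical.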
Tom Swiss | the infamous tms | my blog
You cannot wash away blood with blood
Almost forgot your comment about speed. SHA-1 is slightly slower than MD5. SHA-256 is slightly slower than SHA-1.
By the numbers you gave (which running the suggested test on my own system more-or-less supported), for more than 16 byte blocks (ie, anywhere you'd use it, otherwise the idea of a "hash" doesn't mean a whole lot), MD5 performs roughly twice as fast as SHA-1.
I do not consider that insignificant. Perhaps not enough of a difference to matter in most cases, but why make a program slower for no good reason?
I do completely agree with your statement about the improved security of SHA; I don't believe I ever claimed otherwise. But I think you may have missed my entire point - Namely, "better" counts as a relative term. Better for crypto does not necessarily mean better for something like verifying a file, or even for a packet on an already-secure network. Yes, I most certainly want my bank using SHA, preferably even SHA512. No, a datafile from my mathematical recreation program of the week doesn't need an untamperable hash, it just needs a quick way to detect errors.
Unfortunately, looking at ESR's followup, it's going to be pretty difficult (without taking away perfectly valid functionality, anyway) to do what he's talking about. How exactly can you verify that there's not a Windows print server on a non-local subnet that you want to use? Or CUPS, or LPD? What is your machine going to do? Scan the entire IPv4, or IPv6, address space every time you want to add a stupid printer?
I mean, if the Windows print servers are local, and you can see the broadcasts, or you use SLP on your network, or you're using NetWare (NCP) printing, you can pick up on printers on your own network. And what about older printers that don't do IEEE-1284 bidi communication? It's not like they have the ability to tell anyone they're there.
I can see saying "these are the printers I can see just by checking", and narrowing the list by default, with a "Show all available communication methods" button (aka "Advanced..." or similar), but autodetection isn't perfect, and in certain cases isn't feasible (figuring out that every UNIX host on your network is running CUPS, as a condition for disabling printing via LPD? what if a host is packet filtering, so your scans are inconclusive? what then?), so taking away the ability to access those services just because the software can't automagically detect them is a mistake.
Sam: "That was needlessly cryptic."
Max: "I'd be peeing my pants if I wore any!"
-Cool
/back story of your choice.... Moderate as appropriate.
-Duh!
-huh?
-Whoa!
Please feel free to apply to comment of your choice, to the
> to...be able to reverse the plaintext from the hash
THE plaintext? Firstly, there cannot be only one plaintext. By the pigeonhole principle, a few byte sum cannot be unique for all multi-megabyte texts.
Besides, if that were possible, MD5 would not be destroyed; it would become the world's best compression.
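The pigeonhole argument can be checked by brute force at a small scale. In this sketch the "digest" is just the first byte of MD5 (an assumption chosen so every input can be enumerated): 65,536 two-byte messages go into at most 256 buckets, so some bucket must hold at least 256 of them.

```python
import hashlib
from collections import Counter


def one_byte_digest(message):
    """Toy digest: the first byte of the real MD5 digest."""
    return hashlib.md5(message).digest()[0]


# Every possible 2-byte message, sorted into buckets by its 1-byte digest.
buckets = Counter(
    one_byte_digest(n.to_bytes(2, "big")) for n in range(65536)
)

largest = max(buckets.values())
print("distinct digests used:", len(buckets))
print("messages in the fullest bucket:", largest)
```

Given only a digest value, there is no way to say which of the messages in its bucket was the original, which is the point about "THE plaintext" above.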
ESR says, "Let's go back to the queue type selection screen. Remember that one? It looks like this: Locally connected, Networked CUPS (IPP), Networked Unix (LPD), Networked Windows (SMB), Networked Novell (NCP), Networked JetDirect". He then goes on to say that all of this should be autodetected and then the irrelevant options grayed out. According to him, each host would do a Christmas tree scan (!!) of the local network to see what printer types to prompt for.
First of all, he'd better stay the hell away from my network. I thank goodness that no other (non-script-kiddie) application on this planet performs unprompted scans like this. DHCP, of course, doesn't count.
Second, what if the printer is currently down? Or I'm configuring a machine to be installed offsite? I can think of any number of scenarios where I'd want to configure a network printer that isn't currently on the network.
A program should NEVER think that it's smarter than the user. What if CUPS doesn't detect "wvlan0" as a network interface? Well, it would gray out all the network printer options. But that's clearly wrong -- the user *knows* that the machine is networked. If CUPS allowed him to configure the network printer, everything would just work. Note that CUPS probably should put up a warning dialog "Warning: I could not detect a network -- do you want to continue," but it should not prevent or restrict anything.
ESR's solution relies on too much magic and will cause support nightmares. It is too system-dependent -- it might work on Red Hat, but it'll probably break on SuSE. Or an ARM-based machine. Or a token ring network. Etc. And when it breaks, the user will be surprised and have no other recourse than to consult the documentation.
Incidentally, graying something out is almost always wrong because it gives no indication as to why it's grayed out! You should let the user select it, then put up an informative dialog telling the user that what he's doing doesn't make sense, and what he or she might do to fix it. Always, always, always tell WHY.
Yes, the CUPS UI is flawed ("client-error-forbidden! client-error-forbidden!"), but ESR's proposal is even worse. It's a measly six-item menu! If Easy Software did try to implement it, after a ton of programmer time they'd have an interface that is more surprising, less informative, and more fragile. Not a step in the right direction.
The proper way to fix this unfriendly menu is to create a wizard. The first page would allow you to select a locally-connected printer or, if there are no unconfigured local printers, a network printer (possibly launching a Samba browser to help). Wizards are great for reducing perceived complexity without reducing functionality.
Creating a good user interface is hard. I think that ESR just proved this.
[100% ISO 646 Compliant]
SVM, ERGO MONSTRO.
thank you for posting that. I like it when people think outside the box. people criticized his flawed logic.
You turn his mistaken logic into a possible revolution in compression.
From the GNU text utilities man page, correct? If you read the texinfo document (like the man page suggests), you get more information:
While ryanr's comment about files transferred between systems being modified is valid (used to be a big problem with ftp and binary files in the old days), the one saying most operating systems will mangle files on the fly is false. Microsoft is the only one who I recall have ever done that by default, and then only because DOS text files end with a special character.
Unless my roommate's poison made me lose more memory than I thought, there was no conversion of anything--all text mode would do is make the OS stop reading after the end of file character, even if there was more content in the file.
Yeah, there are systems which convert to their native cr/lf order, but to do so by default would create a big mess...especially considering how hard it is to accurately detect if a file is text or binary.
For example, Linux's FAT (MS filesystem) driver has options to assume everything is text and convert it, and also an option to try and autodetect by file extension, but the default is to assume everything is binary, because that is the most sane one. Under Linux fopen's text mode flag doesn't do anything, because fopen is implemented in libc not the kernel, so it has no way of knowing if the file is on a native filesystem or an alien one...
No, not correct. From the program itself. Read what you copied there. ryanr's comment was completely correct with regard to text files.
Basically the point is that if people can do it in a smallish amount of time, then governments are probably already doing it about 10 times faster. I would personally not be concerned with most uses of MD5 at the moment (as collisions with downloadable files are almost certainly not going to be trojans, just big useless files), but if you're trying to hide from governments and big corporations, you'd better start using SHA-256 =)
Peachy. Where were you going to put the lookup table for that? 2^81920 is on the order of 10^25000. If you could store one of those documents on an atom (attach it with a little dab of glue, okay?) you'd have enough plaintext documents for every atom in this universe...and for every atom to have its own universe of attached atoms...and still have enough documents to be short several orders of magnitude of storage space. Generating the table is left as an exercise for the reader. Cheers.
~Idarubicin
As described in their FAQ, they need a cycle that contains a Distinguished Point. But it is not guaranteed - there might as well be a simple fixed point or a small cycle that does not contain any DPs. They do not address this in the FAQ at all! The clients may be stuck in loops without sending anything to the server (having effectively found a collision), but the organizers will have no idea.
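Here is a toy sketch of a distinguished-point search, to show where the worry about DP-free cycles bites. Everything about it is an assumption for illustration: MD5 is truncated to 24 bits, a point counts as distinguished when its low 6 bits are zero, and trails are abandoned after a fixed cap. The cap is one common way to avoid spinning forever in a cycle with no DP; whether the MD5CRK clients do anything like this is not stated in the post.

```python
import hashlib
import random

BITS = 24                        # toy digest width (assumption)
DP_BITS = 6                      # distinguished: low 6 bits are zero
MAX_TRAIL = 20 * (1 << DP_BITS)  # abandon trails much longer than average


def step(x):
    """One iteration: 24-bit truncated MD5 of the 24-bit value x."""
    digest = hashlib.md5(x.to_bytes(3, "big")).digest()
    return int.from_bytes(digest[:3], "big")


def is_distinguished(x):
    return x & ((1 << DP_BITS) - 1) == 0


def run_trail(start):
    """Walk until a DP is hit. Return (dp, length), or None at the cap."""
    x = start
    for length in range(1, MAX_TRAIL + 1):
        x = step(x)
        if is_distinguished(x):
            return x, length
    return None  # possibly stuck in a cycle that contains no DP


def locate_collision(start_a, len_a, start_b, len_b):
    """Two trails reached the same DP: find x != y with step(x) == step(y)."""
    a, b = start_a, start_b
    while len_a > len_b:         # line both walkers up at equal distance
        a, len_a = step(a), len_a - 1
    while len_b > len_a:
        b, len_b = step(b), len_b - 1
    if a == b:
        return None              # one start lay on the other trail
    while step(a) != step(b):
        a, b = step(a), step(b)
    return a, b


def search(seed=1):
    rng = random.Random(seed)
    trails = {}                  # dp -> (start, length)
    abandoned = 0
    while True:
        start = rng.getrandbits(BITS)
        result = run_trail(start)
        if result is None:
            abandoned += 1       # the "server" at least learns of the stall
            continue
        dp, length = result
        if dp in trails:
            found = locate_collision(start, length, *trails[dp])
            if found:
                return found, abandoned
        else:
            trails[dp] = (start, length)


(x, y), abandoned = search()
print(f"step({x}) == step({y}) == {step(x)}; abandoned trails: {abandoned}")
```

Without `MAX_TRAIL`, a walker that falls into a DP-free cycle would loop silently, which is exactly the failure mode described above. Counting abandoned trails is one way the coordinator could notice it.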
By my calculations, at the current rate they'll take over 500 years to produce a collision. They need about a hundred times as many people on board to get anywhere.
The sum I did is
sqrt(-l(0.5)*2*2^128)/(1.325*10^9*86400*365)
519.78646399116343804161
N=2^128 is the space they're looking for a collision in. The expected number of collisions found after k items have been produced is very close to k^2/2N, so the probability zero have been found is exp(-k^2/2N) by the Poisson distribution. Assume exp(-k^2/2N) = 0.5 and solve for k, then divide by their declared rate of 1.325 gigaMD5s a second.
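The same estimate can be reproduced in a few lines of Python. This is a sketch of the formula as described in the comment, with the 128-bit space and the 1.325 gigaMD5/s rate taken from it; the function name and the 365-day year are just conveniences.

```python
import math


def years_to_even_odds(hash_bits=128, rate_per_second=1.325e9):
    """Years until P(no collision) = exp(-k^2 / 2N) drops to 0.5."""
    n = 2.0 ** hash_bits
    # Solve exp(-k^2 / (2N)) = 0.5 for k, the number of digests computed.
    k = math.sqrt(-math.log(0.5) * 2 * n)
    seconds_per_year = 86400 * 365
    return k / (rate_per_second * seconds_per_year)


years = years_to_even_odds()
print(f"{years:.2f} years at the declared rate")
print(f"{years_to_even_odds(rate_per_second=1.325e11):.2f} years at 100x the rate")
```

The first figure should land near the "over 500 years" quoted above, and the second shows why roughly a hundred times the participation brings it down to a handful of years.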
I don't know whether this inclines me to give the whole thing up or to climb on board. The latter is probably more fun.
Incidentally, the algorithm they're using to do the search efficiently is pretty cool. Paul C van Oorschot and Michael J Wiener, Parallel Collision Search with Cryptanalytic Applications (pdf)
Xenu loves you!
Well, now silicon.com is reporting that it has contacted the offending site's advertisers and forced them to stop paying ad revenues - thus effectively crippling the illegal site - after all, no revenue, no reason to run the site.
And a good slashdotting to screw them all over.
That is bullshit. Of course two inputs can be found which produce the same message digest. This is the pigeonhole principle. Now the MD5CRK developers seem like smart people, and so it's more likely that they just haven't explained it very well.
They go on to say
But I don't see what that would achieve either: two strings of gibberish that happen to have the same MD5 sum. Find a way to produce two documents which both have meaning (perhaps two pieces of source code, or two different school reports) and have the same signature, and that would be impressive.
-- Ed Avis ed@membled.com
if you're trying to hide from governments and big corporations, you'd better start using SHA-256
Except that SHA-256 was developed by the NSA, which means it may have been designed with some intentional obscure "shortcut" that could be exploited by the NSA. The SHA message digest functions have been scrutinized quite heavily, though, so it seems unlikely in my humble opinion.
For the most part, I have no problem with what these essays say--better user interfaces are needed and so is documentation that ordinary users have a chance of understanding if they ever get around to reading it. But I think one of the conclusions toward the end is remarkably unproductive:
I've never seen people "congratulating [them]selves [...] on their dedication to freedom" on Slashdot or in anything from the open source movement. From both of these groups I've seen calls endorsing non-free software if that software is perceived to get people on with their task, and I've seen much maligning of RMS (usually coming from posters who apparently haven't read or heard what he actually endorses). It's ironic that ESR's self-described rant will only be taken seriously and/or fixed because of software freedom. If this were proprietary software he were complaining about, the most skilled hackers could do nothing but wait for the proprietors to make things better. Fortunately we are dealing with free software. If the people whose feedback he lists really think that the issue of freedom is so important and these problems with CUPS are crucial, they can write the software to fix the interface and improve the documentation, or they can hire someone to do these jobs for them. With a completely free software system you can do that, no matter what part of the system you're dealing with.
This also calls for an unnecessary ordering of attention (first we must stop paying attention to this, then we must start paying attention to this other thing) because there's no reason why we should drop software freedom in exchange for some practical technical advance. It's the open source movement (which ESR and others started over a decade after the free software movement began) that encourages users to dismiss software freedom for a development methodology. There's nothing wrong with having both software freedom and a better UI with applications that figure out your setup so you don't have to.
I appreciate the complaints he's making because I've raised similar ones myself in other forums (unlike him, I have experienced a great deal of trouble with printing with MacOS X and scanning with Microsoft Windows, while printing and scanning with Fedora Core 1 has been plug-and-play for my printer and scanner). I don't want anyone to stop raising issues and writing well-worded complaints (such as ESR's is). At the same time, I see far too little software freedom talk and I don't think we need to stifle freedom talk to get to the heart of the problem on improving UIs and documentation. GNOME hackers had demonstrated their commitment to improving their UI well before ESR's rant was written and it looks like Project Utopia will make things even better.
Digital Citizen
I'm confused. You say there is little speed difference between SHA1 and MD5, and then post figures to support your claim showing SHA1 to be 50%-100% slower than MD5. Eg processing 8k SHA1 is 122MB/s and MD5 286MB/s. Your processor time is 1.53s for SHA1 and 1.07s for MD5. Did you mean that the parent poster's speed fears are actually founded? Or am I misreading your figures, as they appear to show SHA1 significantly slower?
Phillip.
Property for sale in Nice, France
For all those interested in the MD5 signing of a message and how "impossible" it is - take a look at www.cryptool.org and the demonstration under "Individ. Procedures" -> "Attack on the hash value of a signature". You may be (unpleasantly) surprised about how easy it is to match two completely different documents to have the same MD5.
Two wrongs may not make a right, but three
If you hash a document, and then change it by a few bytes, then the hash is likely to be very different. The point is, that while it may be possible to come up with large numbers of plaintext documents for any given hash, how likely is it that the file will be remotely related to the original? Not very I'd say.
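The "change a few bytes, get a very different hash" behaviour is easy to see with Python's hashlib. A small sketch; the helper names and the sample sentence are just for illustration.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Hex MD5 digest of data."""
    return hashlib.md5(data).hexdigest()


def differing_bits(a: bytes, b: bytes) -> int:
    """Number of bit positions (out of 128) where the two MD5 digests differ."""
    da = int.from_bytes(hashlib.md5(a).digest(), "big")
    db = int.from_bytes(hashlib.md5(b).digest(), "big")
    return bin(da ^ db).count("1")


original = b"The quick brown fox jumps over the lazy dog"
tweaked = original + b"."          # a one-byte change

print(md5_hex(original))
print(md5_hex(tweaked))
print(differing_bits(original, tweaked), "of 128 bits differ")
```

For a well-mixed hash, about half of the 128 bits should flip on average, which is why a colliding document found by brute force is overwhelmingly likely to be unrelated gibberish.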
If you are using the hash function for authentication then this is clearly not an issue, as anything that produces the same hash will allow you access.
The term "significant" is relative. If a mathematical algorithm is 100% slower when dealing with purely CPU-bound data and in real life you use it on I/O-bound data - I don't consider it to be significant.
10 mins producing 1000s of hashes of files vs 13-15 mins isn't going to kill you IMHO.
3DES is 200% slower than DES (1 + 200% == 3). Yet people have accepted that penalty. Why not 50% or 100%?
For the record - if you're really sensitive about performance and not concerned with cryptographic level of security - you should be using MD4, which is faster than MD5 and provides 128 bits of hash.
Google for MD4 collisions, you'll see people have in fact inverted MD4 for certain inputs.
looks a bit like an iPod
Apple must be doing something right in their marketing if anything that's white and electronic with an LCD screen is described as "looks a bit like an iPod". It looks more like a white PDA, or a small, white tablet PC. But an iPod? That's pushing it a bit.
In other words, md5 likely does not provide 128 bits of integrity. However, as far as the public knows, no attacks have been found yet.
It is very likely that the first 128 bits of a sha-n digest make a more secure hash function than md5. I'm actually pretty surprised that this is not more widely publicized.
Copyright Violation:"theft, piracy"::Anti-Trust Violation:"thermonuclear price terrorism"<-Overly dramatic language.
Even most software gurus don't understand the various properties of hash functions (weak collision resistance, strong collision resistance, one-way ness, good pseudo-randomness, etc.) and which are important in which situations. It's safest just to tell programmers "md5 is bad, use sha-n instead".
Also, as I mentioned in another post, there is good evidence that md5 is broken in the sense that it is possible to find attacks against md5 that are more efficient than the birthday attack.
I think the point is that the vast vast majority of people implementing security don't have a strong cryptographic background. Most people don't know about the evidence that md5 isn't strongly collision resistant. Furthermore, many people don't understand that even if md5 were strongly collision resistant, strong collisions could be found with a work factor of 2**64 (ignorance of birthday attacks). It's also easy to fool yourself into a false sense of security by using known weak methods and saying "almost good enough is good enough".
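For anyone who hasn't seen a birthday attack in action, here is a scaled-down sketch: it looks for two inputs whose MD5 digests agree in only the top `bits` bits. The truncation, the counter-based messages, and the function names are assumptions for the demo; nothing here attacks full MD5.

```python
import hashlib
from itertools import count


def truncated_md5(data: bytes, bits: int) -> int:
    """Top `bits` bits of the MD5 digest, as an integer."""
    full = int.from_bytes(hashlib.md5(data).digest(), "big")
    return full >> (128 - bits)


def find_collision(bits: int = 24):
    """Hash 0, 1, 2, ... until two inputs share a truncated digest.

    Returns (first_message, second_message, number_of_hashes_computed).
    """
    seen = {}
    for i in count():
        msg = str(i).encode()
        h = truncated_md5(msg, bits)
        if h in seen:
            return seen[h], msg, i + 1
        seen[h] = msg


a, b, tries = find_collision(24)
print(a, b, tries)   # tries is typically a few thousand, not ~2**24
```

The work grows like the square root of the output space: on the order of 2**12 hashes for a 24-bit digest, and by the same scaling on the order of 2**64 for an ideal 128-bit one.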
If you want fast and strong file integrity checking and are not concerned about willful deceit, I would suggest a concatenation of (Adler32, file modulo 2**32-1, CRC64). This will be significantly faster than md5. You could replace CRC64 with CRC32, 64-bit addition of all 64-bit blocks, or XOR of all 64-bit blocks, but this will reduce the strength of your integrity checks.
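A rough sketch of that kind of concatenated check in Python follows. Two things are assumptions: "file modulo 2**32-1" is read as the whole file treated as one big-endian integer, and CRC32 stands in for CRC64 because the standard library has no CRC64 (which, as noted above, weakens the check). The function name is made up.

```python
import zlib

MOD = 2 ** 32 - 1


def quick_fingerprint(data: bytes) -> tuple:
    """Non-cryptographic integrity fingerprint: (Adler32, data mod 2**32-1, CRC32).

    Only meant to catch accidental corruption; it offers no protection
    against someone deliberately crafting a matching file.
    """
    adler = zlib.adler32(data)
    residue = int.from_bytes(data, "big") % MOD   # assumed reading of "file modulo 2**32-1"
    crc = zlib.crc32(data)                        # stand-in for CRC64
    return (adler, residue, crc)


print(quick_fingerprint(b"some file contents"))
print(quick_fingerprint(b"some file contentz"))   # differs in every component
```

For large files you would feed the data in chunks (zlib.adler32 and zlib.crc32 both accept a running value as a second argument) instead of loading everything into memory.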
md5 is about 33% slower than md4. If you want something pre-implemented that does 128-bit checks fast and kinda-sorta-cryptographically robust, md4 is an option. However, if speed doesn't matter to the point that you're not using something fast like the concatenation I mentioned above, you might as well go right to sha-256 or sha-512. If you want cryptographic security then use it and while you're paying for it in CPU cycles, pay the little extra to do it right. Either you need security or you don't.
There have been collisions found in the md5 round function, but I believe these rely on getting the chaining variables set to some class of weak values. The design of md5 depends on its round function being strongly collision resistant for all values of the chaining variables. This does not mean that md5 is broken yet, but it is not good news.
md5 still looks perfectly good as a one-way pseudo-random function (uses like entropy gathering and password files). It also appears to provide 128-bit weak collision resistance in the strict cryptographic sense. This means it still looks okay for file integrity checks. However, the weaknesses found in the round function suggest it does not provide 128-bit strong collision resistance and should not be used for electronic signatures. (Okay, there are situations where a weakly collision resistant hash function is acceptable in digital signatures, but you really have to know what you're doing. It's best to play it safe.) I'm not sure if any definitive work has been done regarding the consequences of the round function weaknesses on the weak collision resistance of md5. Personally, I would only use md5 as a one-way pseudorandom function and assume it is not even weakly collision resistant.
Well obviously you're so good at it you don't have to try. You've completely ignored the substance of the article and taken the usual "it's the user's fault" approach that ESR was warning against. Like it or not, software is meant to be used and if that use requires hours of research in documentation then most people will just use something else. People don't expect to have to read a manual to learn how to change the channel on their TV and ESR argues that they don't want to have to read a manual to do what is otherwise a trivial task in another operating system. It's entirely possible for it to be a trivial task if some thought was put into using basic UI design principles and that is the big problem: there's no reason for the software to be so difficult to use! ESR sees this, most reasonable programmers see this (including the head of the CUPS team) and for some reason the idea is heretical to many of the bread-and-butter members of the Linux community.
I especially like how you ignored in your RTFM argument that the manuals in this case both lacked information and were often wrong or just plain misleading. Is it CUPS's fault that ESR couldn't set up a printer? Of course it is! What is the point of software except to be used!? It's the fault of the UI design and of the inconsistent documentation, and it's a simple problem to fix!
You may be too shortsighted to see it, but the drive to replace Windows with Linux is a good thing for EVERYONE. EVERYONE is affected by the proliferation of shoddy, insecure software across Internet connected desktops, whether through spam, DDoS attacks, or just a sluggish connection. As such, by creating and distributing a superior software platform we will help create an online environment that is more useful and less frustrating.
In the end the major point is: There is nothing wrong with making software easy to use. A programmer isn't required to, but if they want people to use their software then they should at least make the effort. If you as a programmer don't care if people use your program or not that's fine, don't read ESR's article. He wasn't talking to you. The people he was talking to were those developers who (out of some sick desire apparently) actually WANT people to use their stuff. The rest of you can stay a bunch of elitist dinosaurs off in your own corners, snapping and snarling at users as they pass.
Maxim: People cannot follow directions.
Increases in truth directly with the length of time spent explaining them
Jean-Luc Cooke is a founder of CertainKey CryptoSystems, purveyors of finer digest algorithms, cryptographic, and security wares and services, as well as a primary organizer of and developer for the MD5CRK effort.
While this fact doesn't reveal any obviously untoward behavior, having it discovered independently and reported by a tinfoil-hat-sporting fella could easily cast it in a negative light.
Recommendations:
Make obvious your involvement with MD5CRK (add it to the FAQ, not just the obscure link at the bottom of the pages). Make obvious your involvement in CertainKey from your info on the MD5CRK site.
Clarify your motivation for the project. It's more or less clear to you, sure, but if you look at the MD5CRK site, there is no obvious indication of the point of the project. FAQ item #1 is "About the Magic Button?" Huh? But the FAQ is called "Frequently asked questions about MD5CRK?" IMO, #1 should be "What is MD5CRK?" Essential parts of the answer to that include motivation. How about:
It seems, from the inability of the MD5CRK effort's pages and posts to appeal to me, that you, as a driving force, are suffering from either technical myopia or underhandedness. I doubt it's underhandedness, but here are ways it could be technical myopia:
If your interest really is better general security through use of practically-same-cost algorithms, then I wish you luck.
How does the phrase "they don't distinguish between binary and text files" have nothing to do with text files??? Or are you saying DOS and Windows qualify as "most operating systems"? Or are you saying there are other systems which differentiate when switching between the text and binary modes of fopen? If so, please tell me which ones they are. The only one I've seen do it is DOS.