US District Court Says Calculating a Hash Value = Search
bfwebster writes "Orin Kerr over at The Volokh Conspiracy (a great legal blog, BTW) reports on a US District Court ruling issued just last week which finds that doing hash calculations on a hard drive is a form of search and thus subject to 4th Amendment limitations. In this particular case, the US District Court suppressed evidence of child pornography on a hard drive because proper warrants were not obtained before imaging the hard drive and calculating MD5 hash values for the individual files on the drive, some of which ended up matching known MD5 hash values for known child pornography image and video files. More details at Kerr's posting." Update: 10/28 16:23 GMT by T : Headline updated to reflect that this is a Federal District Court located in Pennsylvania, rather than a court of the Commonwealth itself.
The courts are finally getting up to speed on technology.
"Ein Volk, ein Reich, ein Führer." -Adolf Hitler
"We are one Nation, we are one People." -The One 'leader'
you can't generate md5s w/o actually looking at all of the data in the file.
This sounds like the worst possible way to search for kiddie porn, because a suspect who wanted to conceal his activities could just change a single pixel, and the entire hash would change. They would need a signature method that doesn't change dramatically when a single bit changes, like something based on a frequency analysis.
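To make the single-pixel point concrete, here is a minimal sketch using Python's standard hashlib. The byte string is just a stand-in for image data (an assumption for illustration, not a real picture), and the helper names are made up for this example. It flips one bit and counts how many of the 128 output bits differ.

```python
import hashlib

def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of data as 32 hex characters."""
    return hashlib.md5(data).hexdigest()

def bit_difference(a_hex: str, b_hex: str) -> int:
    """Count the bits that differ between two hex digests."""
    return bin(int(a_hex, 16) ^ int(b_hex, 16)).count("1")

# Stand-in for file contents (hypothetical data, not a real image).
original = bytes(range(256)) * 16

# Flip the lowest bit of a single byte, like nudging one pixel value.
altered = bytearray(original)
altered[100] ^= 0x01
altered = bytes(altered)

h1 = md5_hex(original)
h2 = md5_hex(altered)
diff_bits = bit_difference(h1, h2)

print(h1)
print(h2)
print(f"{diff_bits} of 128 digest bits differ")
```

Typically about half of the digest bits change, which is why an exact-match hash list only catches unmodified copies.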
Palm trees and 8
The guy whose computer was searched abandoned the computer and gave up any rights at that point; the person who found the porn was the computer's new owner. Just like any trash tossed out becomes public domain, there should have been zero expectation of privacy at that point. I am not a legal scholar, but I do not see how the 4th Amendment applies here. It would be no different than if this were a diary in a different language and the person who inherited the diary found a translator; upon finding criminal evidence, it would be fully admissible.
I will not be pushed, filed, stamped, indexed, briefed, debriefed or numbered. My life is my own.
Calculating hash values isn't search. Calculating them and comparing them to a database is. Not only is it quite clearly search (searching for files that match known MD5 signatures), it's hard to imagine another way to describe it without being deliberately obfuscatory.
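The distinction between computing hashes and comparing them to a database can be sketched in a few lines. This is only an illustration under assumed names (`md5_of_file`, `find_matches`, and the `known_hashes` set are hypothetical), not a description of any real forensic tool.

```python
import hashlib
from pathlib import Path

def md5_of_file(path: Path, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def find_matches(root: Path, known_hashes: set[str]) -> list[Path]:
    """Return every file under root whose MD5 appears in known_hashes.

    Step 1 (md5_of_file) reads every byte of every file.
    Step 2 (the membership test) is the comparison against the list.
    """
    return sorted(
        path
        for path in root.rglob("*")
        if path.is_file() and md5_of_file(path) in known_hashes
    )
```

Note that even the first step has to read the full contents of each file; the second step is what turns the digests into a lookup against a list.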
"some of which ended up matching known MD5 hash values for known child pornography image and video files." Wait, so law enforcement has a database of kiddie porn and kiddie porn md5's? Some perverted bureaucrat found himself the right job.
"A claim for equality of material position can be met only by a government with totalitarian powers." Hayek
When I submitted this story, I gave it the headline "US Court:...". Someone changed that to "PA Court Says...". That's wrong. This is a ruling from a US District (Federal) court, not a Pennsylvania state court, and so carries much more weight. ..bruce..
Bruce F. Webster (brucefwebster.com)
Or maybe get a proper warrant and follow procedures properly? Sorry, I am no fan of kiddie abusers but if we bent the rules the way you'd like them for this instance then what comes next? I break down your door as an officer, find nothing, and suffer a fine for having made a mistake? Sorry, the officers must follow rules same as you and I or they will become simple bullies. Oh wait....
Better a few guilty men go free on a technicality than allow officers to become a law unto themselves.
Build it, Drive it, Improve it! Hybridz.org
Quite honestly, the judicial tradition of suppressing evidence entirely because it was produced without a proper warrant is absurd.
So you're saying you have no problem with warrantless searches? Shall we continue this thought to its logical extreme conclusion?
There's a reason the judicial system has the structure it does: so there's a strong trail of evidence, to ensure the rights of everyone involved have not been broken by law enforcement, to ensure nothing has been tampered with.
The law HAS to follow the law, otherwise what authority does it really have to enforce it?
How would you feel about this man if it was your child's photograph on this man's notebook.
How would you feel if it was your laptop that was seized without a warrant? "Oh I don't have child porn" you say. Sure...but without that warrant the cops may just plant the evidence. Now what say you?
Or, that friend you let borrow your machine last week, remember him? Yeah, he's not the church-going, fun-loving person you thought. On that USB key with all of his work-related stuff was a nice folder of child porn. It's a good thing he copied everything to your machine so you could work together on that big project the boss is asking about.
Or, that teenager in your house, yeah, dirty young man. He's out browsing the internet looking for pictures. He accidentally clicks on a link with underage "actors". Fortunately, he's a good kid and backs out of the site right away. Didn't look at anything, didn't mean to go there. Hell, you've even trained him well enough to erase cookies and temporary files. Hear that knocking? Yeah, that's the police showing up without a warrant and taking your machine. Oh look, they just found deleted child porn images on your computer. You sick bastard.
Without the warrant you have one more leg to stand on to fight these charges. It's there to protect the innocent.
What evidence? Some md5 hashes that happen to match hashes from a select number of images? Odds are if we hash out every file on your hard drive we will also find matches to that same list.
Actually, odds are the hashes will not match...
Bad police work is bad police work, no matter the criminal.
Here's a clue: be upset with the stupid officers, who could've followed procedure and actually nabbed the guy instead of being lazy and screwing up the case, rather than with the judge for enforcing the law.
These are YOUR freedoms too.
- Michael T. Babcock (Yes, I blog)
"The law exists to serve the public good"
No, it doesn't. Government exists to uphold rights, and the law exists to provide government one of the tools to do that. Rights belong to individuals, not "the public".
What makes a child pornographer a criminal is the concrete harm he does to an individual -- not some abstract harm to "the public good".
The system is designed around that. The Bill of Rights gives weight to the rights of the accused for two reasons. First, it is the job of the justice system to protect everyone's rights -- to defend the rights of the victim while still respecting the rights of the accused. Second, when we don't respect the rights of the accused, we tend to conflate "accused" with "guilty", and then nobody's rights (including the victim's) are protected.
If you don't respect the rules of the system even when they make it harder to catch the bad guy, then you're really asking for a rule-less system that enforces your will. But watch out -- yours isn't the will that's going to prevail if the system heads that way.
"With this decision, the courts have just given license to all of those who kidnap or exploit children to make this pornography"
No, they haven't. They have not made child porn legal; they have reminded the authorities that they still have to do their job according to the rules even when it's a job that really needs to be done.
"How would you feel about this man if it was your child's photograph on this man's notebook."
If we left 'justice' in the hands of how those harmed by the crime feel, it would be revenge (which is not the same thing -- and which incidentally doesn't serve the "public good", either).
"the judicial tradition of suppressing evidence entirely because it was produced without a proper warrant is absurd. The man was clearly guilty and the evidence was there. Instead, fine the police for doing the wrong thing"
Here, I agree -- to a point. It doesn't change the fact that in the context of the system as it exists, the court's action is correct, though; today the remedy for illegal search is suppression of evidence.
But yes, I think holding law enforcement personally responsible when they violate the rights of the accused would be more just than penalizing the victim (and any potential future victims) by preventing a conviction when the accused really is guilty -- if such a system can be made to work.
There are two problems with that, though, which I don't know how to resolve:
1) Having performed an illegal search, which results in the conviction of a child pornographer, a police officer goes on trial. What jury will convict him? If the answer is none and that's ok with you, then you're really saying that the accused shouldn't have had rights in the first place.
2) Being personally liable for mistakes can create an incentive to do less work. I'm not saying this justifies a lack of personal accountability in general, but you do have to have a system in which the police are confident "if I do the right thing, I won't be punished". That's harder than it sounds.
Odds yes.
But no guarantee.
A better check is hash plus file size, since it is even less likely for two files of the same size to share a hash by chance. That is especially true for compressed images or videos, where files of the same dimensions compress to different sizes.
Hash and file size checks are useful for checking if a file is intact and possibly not altered. They are great for lookups.
But, in the end, you still need the file to validate the correct item is found. Hashmaps store both the key and hash for this very reason. The hash is a quick lookup, but the key is needed to verify the right element has been found.
Unless the hash is the same size as the key.....
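A rough sketch of the hash-plus-size lookup followed by verification, in the same spirit as a hashmap that keeps the key next to its hash. The `ReferenceIndex` class and its method names are hypothetical, and it assumes the reference files themselves are available for the final byte-for-byte comparison.

```python
import hashlib

def fingerprint(data: bytes) -> tuple[str, int]:
    """Quick lookup key: (MD5 hex digest, size in bytes)."""
    return hashlib.md5(data).hexdigest(), len(data)

class ReferenceIndex:
    """Maps fingerprints to the full reference contents.

    Like a hashmap, the fingerprint narrows the search and the stored
    content (the "key") is what confirms the match.
    """

    def __init__(self) -> None:
        self._by_fingerprint: dict[tuple[str, int], list[bytes]] = {}

    def add(self, data: bytes) -> None:
        self._by_fingerprint.setdefault(fingerprint(data), []).append(data)

    def candidate(self, data: bytes) -> bool:
        """Fast check: does any reference share this hash and size?"""
        return fingerprint(data) in self._by_fingerprint

    def confirmed(self, data: bytes) -> bool:
        """Slow check: byte-for-byte comparison against the candidates."""
        return any(
            ref == data
            for ref in self._by_fingerprint.get(fingerprint(data), [])
        )

index = ReferenceIndex()
index.add(b"abc")
print(index.candidate(b"abc"), index.confirmed(b"abc"))
print(index.candidate(b"abd"))
```

The fast check only says "worth a closer look"; the byte comparison is what settles it.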
I rarely read replies, it's my opinion and if you thought about your opinion a little more, I'm OK with that.
Each character is a hex digit, not any alphanumeric, so it's 16^32=2^128 possibilities instead of 36^32. That's 186 billion times smaller, but it's still a lot.
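The arithmetic is easy to check with Python's arbitrary-precision integers (the variable names here are just for illustration):

```python
# 32 hex digits versus 32 alphanumeric characters (0-9, a-z).
hex_space = 16 ** 32
alnum_space = 36 ** 32

# 16 = 2^4, so 16^32 = 2^(4*32) = 2^128.
assert hex_space == 2 ** 128

ratio = alnum_space / hex_space  # equals (36/16)^32 = 2.25^32
print(f"hex space:   {hex_space:.3e}")
print(f"alnum space: {alnum_space:.3e}")
print(f"ratio:       {ratio:.3e}")
```

The ratio comes out at roughly 1.86 * 10^11, i.e. about 186 billion.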
>>>"Oh I don't have child porn" you say. Sure...but without that warrant the cops may just plant the evidence. Now what say you?
Even if they don't plant evidence, who wants to go through the hassle of losing their PC for one or two months while the cops scan it for hidden porn (or even stashed drugs)? It's not about dishonesty by police, but about stopping harassment of citizens. Nobody wants one or two months of their lives wasted just because government agents have nothing better to do than grab private property.
"[the British government] has erected a multitude of new offices by a self-assumed power, & sent hither swarms of officers to harrass our people & eat out their substance;" - Declaration of Independence, 1776
FOX NEWS.com should be BANNED from television and internet. Have the Congress take it over and give us Truespeak.
so you mean you're scared of living in an environment that everyone not on the right has been living in from 2000-2006?
turn up the jukebox and tell me a lie
Not only did they search the drive without a warrant, but they also got the defendant to confess to putting the files there by questioning him without reading him his rights and telling him that he didn't need an attorney. Genius.
Even dumber: Based on the testimony of the guy who originally found the child porn, they could have gone to a magistrate and gotten a warrant. Then there would have been no issue of a warrantless search.
BTW, for those considering the abandoned-property angle -- the court goes into that. It wasn't a legal eviction and the defendant hadn't abandoned his stuff; he merely hadn't removed it all yet.
Yes, that's the birthday paradox. I'm not sure offhand how big the NCMEC database is, which is usually what they're comparing against, but let's try some math.
Let's say your hard drive has N files and the database has M items (so, comparing a list of N hashes to another list of M hashes). Your hard drive doesn't actually contain any of the files used to generate the "bad" hash list. There are N*M cross comparisons, each matching by chance with probability 2^-128, so the probability of at least one false match is approximately P = 1 - exp( -N*M / 2^128 ). Assuming the value in the exponent is small, this is approximately P = N*M/2^128. 2^128 is about 3.4 * 10^38. In order for you to have a one in a billion (10^-9) chance of a false positive, the product N*M would have to be ~3 * 10^29. If the hash list has a billion items (I think it's smaller than that, by quite a lot), you'd need about 3 * 10^20 files on your disk -- well beyond the capacity of readily-available desktop storage.
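For anyone who wants to plug in their own numbers, here is a small sketch of the two-list estimate. It assumes every one of the N*M cross comparisons is independent and matches by chance with probability 2^-128, so the exponent is N*M/2^128; the function names and the example sizes are made up for illustration.

```python
import math

HASH_SPACE = 2 ** 128  # number of possible MD5 values

def false_match_probability(n_files: int, m_known: int) -> float:
    """P(at least one accidental match) ~ 1 - exp(-N*M / 2^128).

    expm1 keeps precision when the exponent is tiny.
    """
    return -math.expm1(-(n_files * m_known) / HASH_SPACE)

def files_needed(target_probability: float, m_known: int) -> float:
    """Invert the formula: how many files give this false-match chance?"""
    return -math.log1p(-target_probability) * HASH_SPACE / m_known

# Hypothetical sizes: a million files on disk, a billion listed hashes.
p = false_match_probability(10**6, 10**9)
n = files_needed(1e-9, 10**9)
print(f"chance of a false match: {p:.2e}")
print(f"files needed for a one-in-a-billion chance: {n:.2e}")
```

With those made-up sizes the chance of an accidental match is on the order of 10^-24.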
MD5 hashes are useful because accidental collisions, even birthday collisions, are vanishingly unlikely at this scale. What they're not resilient to, it turns out, is someone intentionally creating two files with the same MD5 hash. (Even then, the known attacks construct both colliding files together; producing a new file that matches the hash of a file that already exists is still infeasible.)
To exceed a .1% chance of finding a match with MD5 (a 128-bit hash) you would need to compare
n(p;H) ~ sqrt( 2*H*ln (1/(1-p)) )
n(.001;2^128) ~ 2^59.5
pictures. So to have a .1% chance of finding a collision between a legitimate picture and a malicious picture in the FBI database one would have to compare about 830,000,000,000,000,000 pictures (8.3*10^17). You don't understand what it means to say that "MD5 is broken." Please leave the cryptography to the cryptographers.
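The birthday-bound approximation n(p;H) ~ sqrt( 2*H*ln (1/(1-p)) ) can be evaluated directly, taking H = 2^128 for a 128-bit hash. This is a quick sketch with an illustrative function name, not anything from a crypto library.

```python
import math

def birthday_bound(p: float, hash_space: int) -> float:
    """Approximate number of random items needed before the chance of
    any two sharing a hash reaches p: sqrt(2 * H * ln(1 / (1 - p)))."""
    return math.sqrt(2 * hash_space * math.log(1 / (1 - p)))

n = birthday_bound(0.001, 2 ** 128)
print(f"n ~ {n:.2e} pictures (about 2^{math.log2(n):.1f})")
```

That evaluates to roughly 8.3 * 10^17, a little under 2^60.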
I apologize for interrupting the false dilemma here, but would it be a reasonable option to prosecute both the criminal who was caught and the cop who violated the Constitution to catch him? I know, I know, we've got two guilty people on our hands, and our natural, rational instinct is "let them both go unpunished, then set fire to our own hair"... but perhaps there's a way to disincentivize police excesses without giving criminals a get-out-of-jail-free card.
I suppose there's an argument that anyone who would violate the Fourth Amendment can't be trusted as part of a chain of evidence... but in that case, shouldn't the guilty cop be kicked off the force entirely, not just distrusted regarding a single case?
Those are just thoughts in general, though, not necessarily a recommendation for this particular case. Even if it was admissible, I'm not sure I'd want to prosecute someone with evidence like "Look at what we found on his computer, thanks to the help of some guys who felt cheated by him, took his computer, reported incriminating files to us, and totally pinky swear that neither of them put them there themselves."