US District Court Says Calculating a Hash Value = Search
bfwebster writes "Orin Kerr over at The Volokh Conspiracy (a great legal blog, BTW) reports on a US District Court ruling issued just last week which finds that doing hash calculations on a hard drive is a form of search and thus subject to 4th Amendment limitations. In this particular case, the US District Court suppressed evidence of child pornography on a hard drive because proper warrants were not obtained before imaging the hard drive and calculating MD5 hash values for the individual files on the drive, some of which ended up matching known MD5 hash values for known child pornography image and video files. More details at Kerr's posting." Update: 10/28 16:23 GMT by T : Headline updated to reflect that this is a Federal District Court located in Pennsylvania, rather than a court of the Commonwealth itself.
The courts are finally getting up to speed on technology.
"Ein Volk, ein Reich, ein Führer." -Adolf Hitler
"We are one Nation, we are one People." -The One 'leader'
you can't generate md5s w/o actually looking at all of the data in the file.
This sounds like the worst possible way to search for kiddie porn, because a suspect who wanted to conceal his activities could just change a single pixel, and the entire hash would change. They would need a signature method that doesn't change dramatically when a single bit changes, like something based on a frequency analysis.
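To make the single-pixel point concrete, here is a minimal sketch using Python's standard `hashlib`. The byte string is just a stand-in for image data (an assumption for illustration, not anything from the case); the sketch flips one bit and counts how many of the 128 digest bits change.

```python
import hashlib

# Stand-in for image data; any byte string behaves the same way.
original = bytes(range(256)) * 16

# Flip the lowest bit of the first byte, leaving everything else untouched.
modified = bytes([original[0] ^ 0x01]) + original[1:]

digest_a = hashlib.md5(original).hexdigest()
digest_b = hashlib.md5(modified).hexdigest()

# Count how many of the 128 output bits differ between the two digests.
differing_bits = bin(int(digest_a, 16) ^ int(digest_b, 16)).count("1")

print(digest_a)
print(digest_b)
print(f"{differing_bits} of 128 bits differ")
```

For a well-mixed hash you would expect roughly half the bits to differ, which is why an exact-match hash list only catches unmodified copies.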
Palm trees and 8
Hash is ~$30/gram depending on quality. Seems like those folks in PA have been smoking something else if they thought they needed to calculate an emmm-dee-five.
The guy whose computer was searched abandoned the computer and gave up any rights at that point; the person who found the porn was the computer's new owner. Just like any trash tossed out becomes public domain, there should have been zero expectation of privacy at that point. I am not a legal scholar, but I do not see how the 4th amendment applies here. It would be no different than if this was a diary in a different language and the person who inherited the diary found a translator; upon finding criminal evidence, it would be fully admissible.
I will not be pushed, filed, stamped, indexed, briefed, debriefed or numbered. My life is my own.
Calculating hash values isn't search. Calculating them and comparing them to a database is. Not only is it quite clearly search (searching for files that match known MD5 signatures), it's hard to imagine another way to describe it without being deliberately obfuscatory.
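The two steps described here (hash every file, then compare against a list) can be sketched roughly as follows. This is only an illustration, not the forensic tool used in the case; the function names `md5_of_file` and `find_matches` and the `known_hashes` set are hypothetical.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Every byte of the file passes through the hash function.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_matches(root, known_hashes):
    """Return paths under root whose MD5 digest is in known_hashes."""
    matches = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and md5_of_file(path) in known_hashes:
            matches.append(path)
    return matches
```

Note that `md5_of_file` on its own reveals nothing about content; it is the membership test against `known_hashes` that turns the pass over the drive into a lookup for specific files.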
"some of which ended up matching known MD5 hash values for known child pornography image and video files." Wait, so law enforcement has a database of kiddie porn and kiddie porn md5's? Some perverted bureaucrat found himself the right job.
"A claim for equality of material position can be met only by a government with totalitarian powers." Hayek
Yes, I would qualify parsing someone's file system into file sized chunks and processing them bit by bit and feeding that data into a hashing algorithm as searching.
When I submitted this story, I gave it the headline "US Court:...". Someone changed that to "PA Court Says...". That's wrong. This is a ruling from a US District (Federal) court, not a Pennsylvania state court, and so carries much more weight. ..bruce..
Bruce F. Webster (brucefwebster.com)
The problem I have here is I would think that this would come under reasonable cause.
Someone calling the police and saying "Hey I found kiddie porn on this computer." seems to be reasonable cause to me.
See my blog http://ilovecookes.blogspot.com/ for light hearted technical information.
Or maybe get a proper warrant and follow procedures properly? Sorry, I am no fan of kiddie abusers but if we bent the rules the way you'd like them for this instance then what comes next? I break down your door as an officer, find nothing, and suffer a fine for having made a mistake? Sorry, the officers must follow rules same as you and I or they will become simple bullies. Oh wait....
Better a few guilty men go free on a technicality than allow officers to become a law unto themselves.
Build it, Drive it, Improve it! Hybridz.org
Quite honestly, the judicial tradition of suppressing evidence entirely because it was produced without a proper warrant is absurd.
So you're saying you have no problem with warrantless searches? Shall we continue this thought to its logical extreme conclusion?
There's a reason the judicial system has the structure it does: so there's a strong trail of evidence, to ensure the rights of everyone involved have not been broken by law enforcement, to ensure nothing has been tampered with.
The law HAS to follow the law, otherwise what authority does it really have to enforce it?
How would you feel about this man if it was your child's photograph on this man's notebook.
How would you feel if it was your laptop that was seized without a warrant? "Oh I don't have child porn" you say. Sure...but without that warrant the cops may just plant the evidence. Now what say you?
Or, that friend you let borrow your machine last week, remember him? Yeah, he's not the church going fun loving person you thought. On that USB key with all of his work related stuff was a nice folder of child porn. It's a good thing he copied everything to your machine so you could work together on that big project that boss is asking about.
Or, that teenager in your house, yeah dirty young man. He's out browsing the internet looking for pictures. He accidentally clicks on a link with under age "actors". Fortunately, he's a good kid and backs out of the site right away. Didn't look at anything, didn't mean to go there. Hell, you've even trained him well enough to erase cookies and temporary files. Hear that knocking? Yeah, that's the police showing up without a warrant and taking your machine. Oh look, they just found deleted child porn images on your computer. You sick bastard.
Without the warrant you have one more leg to stand on to fight these charges. It's there to protect the innocent.
The man was clearly guilty and the evidence was there.
What evidence? Some md5 hashes that happen to match hashes from a select number of images? Odds are if we hash out every file on your hard drive we will also find matches to that same list. Therefore, by your own logic, we should arrest you, put your name on the sexual offenders list, and drag you into court, all without a warrant.
If you really want to live in a country with that much legitimate power in the government, there are numerous flights to China every day.
In short:
Good: Civil liberties defended.
Bad: Possible case against alleged child porn possessor blown.
Worse: Cops too f'ing incompetent/lazy/ill-trained to get a freaking warrant.
The problem here is not civil liberties getting in the way of prosecution, it's the prosecution failing to follow the law.
-Rick
"Most people in the U.S. wouldn't know they live in a tyrannical state if it walked up and grabbed their junk." - MyFirs
Even if the hard drive has a couple of million files on it and there are a few thousand known hashes of illegal files, the odds of having a different file with a matching hash are in the neighborhood of 10^28 to 1 against.
"How would you feel about this man if it was your child's photograph on this man's notebook."
How would you feel if it was your notebook I said had a picture of a child in it?
If our judicial system doesn't work right, we should fix it; I'm not taking a position on whether it works right in general. But let's assume we carefully figure out a set of rules and get our judicial system to work right for all manner of crimes from shoplifting to murder; rules that properly balance the rights of the (possibly innocent) accused. Turning around and throwing those rules aside for certain crimes is madness. That's what we mean by "think of the children" stuff: it doesn't help children any to do an intentionally bad job running the justice system for crimes related to children.
What evidence? Some md5 hashes that happen to match hashes from a select number of images? Odds are if we hash out every file on your hard drive we will also find matches to that same list.
Actually, odds are the hashes will not match...
Better a few guilty men go free on a technicality than allow officers to become a law unto themselves.
The largest US gang has a well documented record that would seem to indicate your statement is out of date.
As another everyday example, here's a big surprise, no?
I'm not intending to troll/flamebait here, but MY perception is there is very little accountability for the 'on the job' crew in blue amongst themselves. It is also my perspective that there is very little integrity once one subscribes to the original meaning of the thin blue line.
Bad police work is bad police work, no matter the criminal.
Here's a clue: be upset with the stupid officers, who could've followed procedure and actually nabbed the guy instead of being lazy and screwing up the case, rather than with the judge for enforcing the law.
These are YOUR freedoms too.
- Michael T. Babcock (Yes, I blog)
"The law exists to serve the public good"
No, it doesn't. Government exists to uphold rights, and the law exists to provide government one of the tools to do that. Rights belong to individuals, not "the public".
What makes a child pornographer a criminal is the concrete harm he does to an individual -- not some abstract harm to "the public good".
The system is designed around that. The bill of rights gives weight to the rights of the accused for two reasons. First, it is the job of the justice system to protect everyone's rights -- to defend the rights of the victim while still respecting the rights of the accused. Second, when we don't respect the rights of the accused, we tend to conflate "accused" with "guilty", and then nobody's rights (including the victim's) are protected.
If you don't respect the rules of the system even when they make it harder to catch the bad guy, then you're really asking for a rule-less system that enforces your will. But watch out -- yours isn't the will that's going to prevail if the system heads that way.
"With this decision, the courts have just given license to all of those who kidnap or exploit children to make this pornography"
No, they haven't. They have not made child porn legal; they have reminded the authorities that they still have to do their job according to the rules even when it's a job that really needs to be done.
"How would you feel about this man if it was your child's photograph on this man's notebook."
If we left 'justice' in the hands of how those harmed by the crime feel, it would be revenge (which is not the same thing -- and which incidentally doesn't serve the "public good", either).
"the judicial tradition of suppressing evidence entirely because it was produced without a proper warrant is absurd. The man was clearly guilty and the evidence was there. Instead, fine the police for doing the wrong thing"
Here, I agree -- to a point. It doesn't change the fact that in the context of the system as it exists, the court's action is correct, though; today the remedy for illegal search is suppression of evidence.
But yes, I think holding law enforcement personally responsible when they violate the rights of the accused would be more just than penalizing the victim (and any potential future victims) by preventing a conviction when the accused really is guilty -- if such a system can be made to work.
There are two problems with that, though, which I don't know how to resolve:
1) Having performed an illegal search, which results in the conviction of a child pornographer, a police officer goes on trial. What jury will convict him? If the answer is none and that's ok with you, then you're really saying that the accused shouldn't have had rights in the first place.
2) Being personally liable for mistakes can create an incentive to do less work. I'm not saying this justifies a lack of personal accountability in general, but you do have to have a system in which the police are confident "if I do the right thing, I won't be punished". That's harder than it sounds.
Odds yes.
But no guarantee.
A better check is hash and file size, since it is more difficult for two files of the same size to have the same hash by chance. That is especially true for compressed images or videos, since files of the same dimensions compress to different sizes.
Hash and file size checks are useful for checking if a file is intact and possibly not altered. They are great for lookups.
But, in the end, you still need the file to validate the correct item is found. Hashmaps store both the key and hash for this very reason. The hash is a quick lookup, but the key is needed to verify the right element has been found.
Unless the hash is the same size as the key.....
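A minimal sketch of that "hash for lookup, full data for confirmation" idea, assuming the reference files themselves are available for comparison. The `fingerprint` helper and `KnownFiles` class are hypothetical names made up for this illustration.

```python
import hashlib


def fingerprint(data):
    """Cheap lookup key: (size in bytes, MD5 hex digest)."""
    return (len(data), hashlib.md5(data).hexdigest())


class KnownFiles:
    """Index of reference files keyed by fingerprint, keeping the full bytes."""

    def __init__(self):
        self._by_fingerprint = {}

    def add(self, data):
        self._by_fingerprint.setdefault(fingerprint(data), []).append(data)

    def contains(self, candidate):
        # Step 1: fast lookup by size and hash.
        bucket = self._by_fingerprint.get(fingerprint(candidate), [])
        # Step 2: confirm with a byte-for-byte comparison, the same way a
        # hash map compares keys after the hash has selected a bucket.
        return any(candidate == stored for stored in bucket)
```

The fingerprint narrows the search to a handful of candidates; only the byte comparison in step 2 actually establishes that the two files are the same.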
I rarely read replies, it's my opinion and if you thought about your opinion a little more, I'm OK with that.
Each character is a hex digit, not any alphanumeric, so it's 16^32=2^128 possibilities instead of 36^32. That's 186 billion times smaller, but it's still a lot.
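For anyone who wants to check the arithmetic, Python's arbitrary-precision integers make it a quick sketch (the variable names are just for illustration):

```python
# An MD5 digest is written as 32 characters.
hex_space = 16 ** 32      # each character is one of 16 hex digits
alnum_space = 36 ** 32    # each character is one of 0-9 or a-z

# 16 = 2^4, so 16^32 = 2^128.
assert hex_space == 2 ** 128

# How much smaller the hex space is than the alphanumeric one.
ratio = alnum_space / hex_space

print(f"hex digests:   {hex_space:.3e}")
print(f"alphanumeric:  {alnum_space:.3e}")
print(f"ratio:         {ratio:.3e}")
```

The ratio is (36/16)^32 = 2.25^32, which comes out in the neighborhood of 1.9 x 10^11.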
>>>"Oh I don't have child porn" you say. Sure...but without that warrant the cops may just plant the evidence. Now what say you?
Even if they don't plant evidence, who wants to go through the hassle of losing their PC for one or two months while the cops scan it for hidden porn (or even stashed drugs). It's not about dishonesty by police, but stopping harassment of citizens. Nobody wants one or two months of their lives wasted just because the government agents have nothing better to do than grab private property.
"[the British government] has erected a multitude of new offices by a self-assumed power, & sent hither swarms of officers to harrass our people & eat out their substance;" - Declaration of Independence, 1776
FOX NEWS.com should be BANNED from television and internet. Have the Congress take it over and give us Truespeak.
so you mean youre scared of living in an environment that everyone not on the right has been living in from 2000-2006?
turn up the jukebox and tell me a lie
Odds of one innocent file's md5 hash matching one identified file's md5 hash are insignificant. But in this case we are talking about an entire hard drive's worth of files compared to a database of all known digital kiddie porn.
Take a PC that has been in heavy use for a few years, you might have a couple hundred thousand files, each of which could collide with any of the hundreds of thousands (millions?) of hashes for every known kiddie porn related file on the internet.
Think of it like rolling dice. Rolling a double 6 on a pair of 6 sided dice is a 1/36 chance, but rolling any doubles is a 1/6 chance.
The odds of any single file on your hard drive matching any single file they have on record is significantly better than a specific file on your hard drive matching a specific file they have on record.
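The dice numbers are easy to confirm by brute force. A quick sketch that enumerates all 36 rolls (variable names are illustrative only):

```python
from fractions import Fraction
from itertools import product

# All 36 ordered outcomes of rolling two six-sided dice.
rolls = list(product(range(1, 7), repeat=2))

# One specific match: both dice show 6.
p_double_six = Fraction(sum(1 for a, b in rolls if a == b == 6), len(rolls))

# Any match at all: both dice show the same face.
p_any_double = Fraction(sum(1 for a, b in rolls if a == b), len(rolls))

print(p_double_six)  # 1/36
print(p_any_double)  # 1/6
```

The same "any match versus one specific match" scaling applies to hash lists, although how large the resulting probability is depends on the size of the hash space, which the dice analogy does not capture.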
-Rick
"Most people in the U.S. wouldn't know they live in a tyrannical state if it walked up and grabbed their junk." - MyFirs
Not only did they search the drive without a warrant, but they also got the defendant to confess to putting the files there by questioning him without reading him his rights and telling him that he didn't need an attorney. Genius.
Even dumber: Based on the testimony of the guy who originally found the child porn, they could have gone to a magistrate and gotten a warrant. Then there would have been no issue of a warrantless search.
BTW, for those considering the abandoned-property angle -- the court goes into that. It wasn't a legal eviction and the defendant hadn't abandoned his stuff; he merely hadn't removed it all yet.
Chain of custody. Very important in forensics.
The landlord and his friend might have had a motive to lie about the guy that was behind on his rent payments. From the blurb from the article, it doesn't seem that his landlord had completed the eviction procedure yet, and was anxious to get Crist out of his house and a new tenant in. The eviction process is not immediate. So he gives Crist's computer to his friend, his friend backdates the clock, and his friend puts kiddie porn on there and turns it over to the cops.
The fact is that the police cannot be certain of the chain of custody in this case without a warrant. With a warrant, they take affidavits in support of chain of custody before they go poking around. It's clear and documented using established procedure. The landlord and his friend can still lie, but they're now subject to the penalties for filing a false statement. Without that supporting documentation and especially because of the nature of the case and the possible motives of the landlord and his friends, it makes the chain of custody issue important.
How would you feel about this man if it was your child's photograph on this man's notebook
If you're going to bring emotion, kicking and screaming, into a discussion on legal procedure, let's go all the way: How would you feel if it was your Constitution being ignored?
If I have been able to see further than others, it is because I bought a pair of binoculars.
According to the article, the computer was removed from the defendant's residence by his landlord's friend because the landlord was in the process of evicting the defendant for non-payment of rent. This computer was not found abandoned on the side of the road with the trash. There's no clear indicator that the defendant gave the computer to the landlord's friend, which means the computer is the defendant's property. Therefore, the landlord's friend does not have the right to consent to a search of the computer. This means that the police need a warrant to search that computer, and given the evidence that the landlord's friend had, they would have likely gotten a warrant without any issue.
It's a procedural screwup on the part of the police. It happens. They're human.
I'd like to just add on to your post, because I think otherwise part of your point may be missed. The reason this judgement is good is not because it protects people who have child pornography, but because it protects people who don't have it.
If you make an exception and say that it's ok to do otherwise illegal searches so long as you're looking for child pornography, then you've opened a back door for police to search *any* computer under the guise of looking for child porn. So then, some day in the future, some police officer would be able to take your computer without a warrant, scan your hard drive, and then say, "Well, we were looking for child pornography, so what we did was legal, but we found instead this other information. Since the search was completely legal, we can use that information against you."
In effect, it would mean that they wouldn't need a warrant to search computers anymore.
The problem I have here is I would think that this would come under reasonable cause.
Someone calling the police and saying "Hey I found kiddie porn on this computer." seems to be reasonable cause to me.
It seems that way to me as well, and had they tried to get a warrant based on probable cause, it probably would have succeeded.
Conducting the search without a warrant, however, isn't going to fly unless there are also "exigent circumstances". Which in this case would mean the police have reason to believe that any potential evidence on the laptop would vanish before they could acquire a warrant. Since the laptop was in the possession of the 3rd party who called the police to report the crime, that seems unlikely.
So not getting the warrant was a big mistake, and it's likely a criminal will walk as a result. Even though it's sad, this has to happen. Failing to get a conviction and having the perp walk free is the only thing that motivates police to follow all the correct procedures and guarantee all the suspect's rights. Now the police know that a warrant is not optional when searching a laptop. So in the future the cops won't make this mistake, perps will be caught using proper rules of evidence, and our rights will be more secure.
The enemies of Democracy are
But the recent civil forfeiture provisions for copyright infringement they're trying to get signed (maybe already signed?) into law will allow them to do the same thing. The Feds can already seize your property on the mere suspicion that it is being used for illegal drug activity, and are not required to even file charges. When said seizure happens, the burden of proof is on the owner to prove that it wasn't used for illegal activity.
The society for a thought-free internet welcomes you.
It wasn't a legal eviction, that's why that argument wasn't invoked. In effect, the computer was stolen. The defendant even reported it as stolen way before the child porn was reported.
It is a massive cock up by the police. If they had done things by the book it might have been alright, but to be honest I think the chain of custody of a stolen PC would pretty much erase any case they would have.
The biggest issue I see with going 4th amendment rights on this is the fact that the defendant doesn't own the computer anymore. From the article he lost it because of problems not paying rent. It changed hands to an uninvolved third party who noticed the files were on what was now his computer.
It's also impossible to prove that anything found on the computer is anything to do with the defendant. The rent dispute is an obvious motive for the former landlord to take all sorts of malicious actions.
Yes, that's the birthday paradox. I'm not sure offhand how big the NCMEC database is, which is usually what they're comparing against, but let's try some math.
Let's say your hard drive has N files and the database has M items (so, comparing a list of N to another list of M hashes). Your hard drive doesn't actually contain any of the files used to generate the "bad" hash list. Each of the N*M pairs matches with probability 1/2^128, so the probability of a hash collision is approximately P = 1 - exp( -N*M / 2^128 ). Assuming the value in the exponent is small, this is approximately P = N*M/2^128. 2^128 is about 3.4 x 10^38. In order for you to have a one in a billion (10^-9) chance of a false positive, the product N*M would have to be ~3 x 10^29. If the hash list has a billion items (I think it's smaller than that, by quite a lot), you'd need ~3 x 10^20 files on your disk -- well beyond the capacity of readily-available desktop storage.
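If you want to plug in your own N and M, here is a rough sketch. It assumes MD5 outputs behave like independent uniform random 128-bit values and counts each of the N*M cross-list pairs once, so the exponent is N*M/2^128. The function names are made up for this example.

```python
import math

HASH_SPACE = 2 ** 128  # number of possible MD5 digests


def false_match_probability(n_files, m_known, hash_space=HASH_SPACE):
    """Chance that any of n_files unrelated files matches any of m_known hashes.

    Assumes hashes behave like independent uniform random values, so each of
    the n_files * m_known pairs matches with probability 1 / hash_space.
    """
    expected_matches = n_files * m_known / hash_space
    # -expm1(-x) equals 1 - exp(-x) but stays accurate when x is tiny.
    return -math.expm1(-expected_matches)


def files_needed(target_probability, m_known, hash_space=HASH_SPACE):
    """Approximate file count at which the false-match chance reaches target."""
    return -math.log1p(-target_probability) * hash_space / m_known


# A couple of million files against a few thousand known hashes.
print(false_match_probability(2_000_000, 3_000))

# Files needed for a one-in-a-billion false positive against 10^9 hashes.
print(f"{files_needed(1e-9, 10**9):.1e}")
```

`math.expm1` and `math.log1p` are used because the probabilities involved are far below floating-point epsilon, where a naive `1 - exp(-x)` would round to zero.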
MD5 hashes are useful because they're resilient to even birthday collisions. What they're not resilient to, it turns out, is intentionally creating two files with the same MD5 hash. (Even then, it is infeasible to generate two files with the same MD5 hash and the same size.)
False. MD5 has the property that if you can find two bytestreams that collide, appending identical data to the end will continue to produce two different files that collide. Furthermore, the collision-finders are able to take an arbitrary prefix, and then append random data to that prefix until a collision is found.
What does this mean? It means you can take a file with a blob of random data in the middle, then generate two files with identical hashes but different random blobs of data in the middle.
This, in turn, allows you to do things like create applications, postscript files, HTML files, and other things which hash identically but act or display completely differently. (You embed both behaviors in the file, then switch depending on the contents of the random data. A close examination will turn up the "bad" side, lying inactive, but simply opening the file will make it appear that all is well.)
It's certainly not as good as being able to match an arbitrary hash, but MD5 collisions are entirely practical to take advantage of today.
At this point, MD5 should be considered to be a checksum, not a validator. MD5 is still very good at detecting random noise injected into a data stream but it should no longer be considered to have any real utility for detecting malicious changes.
If you mod me Overrated, you are admitting that you have no penis.
It's worth pointing out that literally 99.999999% of child porn is made by people who have legitimate possession of the children. Either actual guardians or people temporarily watching them.
There are almost no instances of child porn being made with kidnapped children, and it is extremely unlikely someone would kidnap children for that, as opposed to incidentally doing it to children they already kidnapped.
Hence, 'child porn' is not placing anyone's children in danger. Not people possessing it, and not even people making it.
The danger is child abuse. It is not child kidnapping, and it's certainly not the entirely hypothetical 'child kidnapping to make porn'.
Child abuse happens almost entirely by people who are entrusted with children, not random strangers.
If corporations are people, aren't stockholders guilty of slavery?
To exceed a .1% chance of finding a match with MD5 (a 128-bit hash) you would need to compare
n(p;H) ~ sqrt( 2*H*ln (1/(1-p)) )
n(.001;2^128) ~ 2^60
pictures. So to have a .1% chance of finding a collision of a legitimate picture and malicious picture in the FBI database one would have to compare about 830,000,000,000,000,000 pictures (8.3*10^17). You don't understand what it means to say that "MD5 is broken." Please leave the cryptography to the cryptographers.
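The birthday-bound approximation n(p;H) ~ sqrt(2*H*ln(1/(1-p))) is simple to evaluate directly. A small sketch, taking H = 2^128 for a 128-bit hash and assuming uniformly random outputs (the function name `birthday_bound` is just for illustration):

```python
import math


def birthday_bound(p, hash_space):
    """Approximate number of random hashes needed for collision probability p."""
    return math.sqrt(2 * hash_space * math.log(1 / (1 - p)))


n = birthday_bound(0.001, 2 ** 128)

print(f"{n:.2e}")                # about 8.3e17
print(f"2^{math.log2(n):.1f}")   # just under 2^60
```

This covers accidental collisions only; it says nothing about collisions that someone constructs deliberately.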
I wish I could remember the author and book name but I can't so take this as anecdotal until someone comes up with references.
A while back, there was a book getting some attention on CSPAN and in the literary and legal press that posited warrants were not conceived as common things. A warrant, so the thinking went, would indemnify the police from damages if they searched an innocent party. If the police searched someone without first getting a warrant and that person turned out to be guilty, then the search was fine in a "no harm, no foul" sense. If the police did not get a warrant and searched someone innocent, then the person searched would take legal action and be directly awarded large penalties from the police.
The position of the book was that warrants were originally conceived to be rare things, only gotten when there was an edge case where the police reasonably suspected wrongdoing but weren't absolutely sure of their facts. Supposedly, if the police were absolutely sure, they should be free to go ahead and kick in doors. Generally, though, the police were assumed to be unwilling to do so in any but the most obvious cases because to do so incorrectly would bring major penalties down on their heads.
The book cited old English and colonial cases where police made mistakes and courts then ordered the police to directly pay damages to the former suspect.
Such a system could have worked back in the day. Nowadays, not so much. So much of what is illegal these days is invisible or not easily discernible that the need for warrants, even under the old criteria, is huge. Add to that the common practice of police not acting with integrity (I came of age in Houston, Texas in the 1970s. If you learned to deal with cops in that time and place, you'll never, ever, ever trust any cop to tell the truth about anything. You will forever assume that any evidence found by cops was planted. Period.), and the whole "Cops won't hurt innocents because they're afraid of the repercussions" notion simply falls apart.
I said all that to say this - I have some appreciation of the reasonableness of the attitude that if evidence of a crime is found, it doesn't really matter how it was obtained. On balance, I don't agree with that position but I do believe that it should not be dismissed out of hand. It has some theoretical merit. It has no practical utility these days, but the theory isn't all crap.
I apologize for interrupting the false dilemma here, but would it be a reasonable option to prosecute both the criminal who was caught and the cop who violated the Constitution to catch him? I know, I know, we've got two guilty people on our hands, and our natural, rational instinct is "let them both go unpunished, then set fire to our own hair"... but perhaps there's a way to disincentivize police excesses without giving criminals a get-out-of-jail-free card.
I suppose there's an argument that anyone who would violate the Fourth Amendment can't be trusted as part of a chain of evidence... but in that case, shouldn't the guilty cop be kicked off the force entirely, not just distrusted regarding a single case?
Those are just thoughts in general, though, not necessarily a recommendation for this particular case. Even if it was admissible, I'm not sure I'd want to prosecute someone with evidence like "Look at what we found on his computer, thanks to the help of some guys who felt cheated by him, took his computer, reported incriminating files to us, and totally pinky swear that neither of them put them there themselves."
"maybe-underage girls" reminds me of an old joke I heard once.
A cop is driving his beat when he sees a car parked in a location commonly used by young couples for various forms of irresponsible behavior. He pulls up beside it, and walks up, and taps on the window.
Rather than finding the occupants engaged in anything untoward, he finds a man with a magazine and a young woman with a spool of yarn.
He asks the young man: "What are you two doing up here tonight?"
The young man replies: "Well, I'm catching up on the Financial Times, and she's knitting a sweater."
The cop is a bit confused by this, but follows through with another standard question:
"And how old are the both of you?"
The young man replies rather glibly: "Well, I'm 25, and in about 15 minutes from now she'll be 18."
Bull.
The fact is that there are so many laws on the books that, no matter how clean you are, someone could probably find some evidence that maybe you committed some kind of crime, even if only by technicality.
The protection against unreasonable searches is to prevent harassment. Without that protection, the police could just search your home and your computer on a regular basis, just because they didn't like you or didn't agree with your politics. If someone does that long enough, they can find some obscure law that you've technically broken even if it's something so innocuous that they wouldn't normally prosecute it, and go ahead and arrest you.
The purpose of the 4th amendment isn't particularly to protect criminals who are rightfully under investigation, and though it might protect law abiding citizens from embarrassment, that's not particularly the purpose either. The purpose is specifically to prevent agents of the government from harassing citizens, targeting particular people and digging through their lives looking for acts that might possibly be stretched to be criminal.
Law enforcement can't investigate people they think are bad until they find crimes, but instead they have to go the other direction-- investigating particular crimes until they find the guilty parties.