Federal Judge Says Internet Archive's Wayback Machine A Perfectly Legitimate Source Of Evidence


Posted by msmash on Thursday May 19, 2016 @04:05AM from the reason-to-live-by dept.

Tim Cushing, reporting for TechDirt (condensed): Those of us who dwell on the internet already know the Internet Archive's "Wayback Machine" is a useful source of evidence. So, it's heartening to see a federal judge arrive at the same conclusion, as Stephen Bykowski of the Trademark and Copyright Law blog reports. From the report: The potential uses of the Wayback Machine in IP litigation are powerful and diverse. Historical versions of an opposing party's website could contain useful admissions or, in the case of patent disputes, invalidating prior art. Date-stamped websites can also contain proof of past infringing use of copyrighted or trademarked content. From TechDirt: The defendant tried to argue that the Internet Archive's pages weren't admissible because the Wayback Machine doesn't capture everything on the page or update every page from a website on the same date. The judge, after receiving testimony from an Internet Archive employee, disagreed. He found the site to be a credible source of preserved evidence -- not just because it captures (for the most part) sites as they were on relevant dates but, more importantly, because it does nothing to alter the purity of the preserved evidence.

54 comments


interestingly by Anonymous Coward · 2016-05-19 04:12 · Score: 1

this means archive.is isn't as good a source, since it heavily alters pages in the process of storing them.
1. Re:interestingly by MrLint · 2016-05-19 04:29 · Score: 1
  
  I don't see that conclusion in this documentation of this ruling. Please be wary of armchair extrapolation.
2. Re:interestingly by Applehu+Akbar · 2016-05-19 05:03 · Score: 1
  
  "this means archive.is isn't as good a source, since it heavily alters pages in the process of storing them."
  And yet the same legal system still insists that we use faxed signatures instead of digital signatures.
3. Re:interestingly by Anonymous Coward · 2016-05-19 05:12 · Score: 0
  
  this means archive.is isn't as good a source
  Of course, what credibility does an "Islamic State" archive have?
4. Re:interestingly by Anonymous Coward · 2016-05-19 05:42 · Score: 0
  
  How so, exactly?
5. Re:interestingly by pseudorand · 2016-05-19 06:16 · Score: 1
  
  Digital signatures (assuming you mean the cryptographic ones rather than a .jpg of your handwriting) require that the user appropriately protect his or her private key. The average person doesn't know how to do that, and even for those of us who do, it's inconvenient and error-prone. (I assume my personal PC hasn't been hacked, but I have no way to know for sure.) Digital signatures are therefore not really much more secure than the paper+ink+analog phone line sort.
  Since they involve what the public and many judges think of as "computer magic", using them runs the very high risk of treating them as a form of non-repudiation even when the limited ability to ensure the secrecy of the private key makes that inappropriate.
Except that evidence can and has been destroyed by Anonymous Coward · 2016-05-19 04:13 · Score: 3, Interesting

Just submit a DMCA request. Poof!
1. Re:Except that evidence can and has been destroyed by sims+2 · 2016-05-19 04:24 · Score: 4, Informative
  
  even better park the domain with a robots.txt
  User-agent: *
  Disallow: /
  and archive.org will promptly nuke the site from its archive. :(
  
  --
  Minimum threshold fixed. Thanks!
2. Re:Except that evidence can and has been destroyed by OakDragon · 2016-05-19 04:33 · Score: 1
  
  even better park the domain with a robots.txt User-agent: * Disallow: /
  and archive.org will promptly nuke the site from its archive. :(
  It will respect that retroactively?!
  
  --
  Dark Reflection
3. Re:Except that evidence can and has been destroyed by TypoNAM · 2016-05-19 04:35 · Score: 1
  
  Unfortunately yes it does. :(
  
  --
  This space is not for rent.
4. Re:Except that evidence can and has been destroyed by sims+2 · 2016-05-19 04:36 · Score: 1
  
  IIRC yes it does. Hopefully I'm wrong on that but I don't think I am.
  
  --
  Minimum threshold fixed. Thanks!
5. Re:Except that evidence can and has been destroyed by stoborrobots · 2016-05-19 04:54 · Score: 2
  
  I was under the impression that it stops saving new pages, and stops *displaying* old pages, but does not nuke the old pages from storage. If your robots.txt goes away in the future, the old pages come back.... At least, that was my understanding from long ago...
  
  --
  "Go to CNN [for a] spell-checked, fact-checked summary" -- CmdrTaco
6. Re:Except that evidence can and has been destroyed by ShaunC · 2016-05-19 04:56 · Score: 2
  
  It will respect that retroactively?!
  Yes, permanently; once a site is excluded there's apparently no way to get it back in the archive. I let a domain lapse a few years ago and someone else parked it for a year. I've had it back for several years with a permissive robots.txt but Wayback still says the site is excluded.
  
  --
  Thanks to the War on Drugs, it's easier to buy meth than it is to buy cold medicine!
7. Re:Except that evidence can and has been destroyed by sims+2 · 2016-05-19 05:10 · Score: 1
  
  I think you're right on that. But I still don't like it, as it means I can't trust that I will be able to find things in the archive at a later date if the domain changes hands.
  And domains that get parked tend to stay that way so access may be lost permanently. So it may be a distinction without a difference.
  
  --
  Minimum threshold fixed. Thanks!
8. Re:Except that evidence can and has been destroyed by Dragonslicer · 2016-05-19 05:22 · Score: 1
  
  Just submit a DMCA request. Poof!
  You can use a DMCA request to plant fake evidence?
  
  Remember, absence of evidence is not evidence of absence. This ruling says that historic pages found on archive.org are considered reliable enough to be admitted in civil litigation. I would be pretty surprised if a federal judge ever ruled that a lack of material in the Wayback Machine should be considered by a judge or jury.
9. Re:Except that evidence can and has been destroyed by hankwang · 2016-05-19 05:36 · Score: 1
  
  "I let a domain lapse a few years ago and someone else parked it for a year. I've had it back for several years with a permissive robots.txt but Wayback still says the site is excluded."
  Unless they changed the rules in the past two years, that is not their normal policy. Robots.txt is only supposed to affect the availability as long as robots.txt is up. It would suck if a temporary syntax error in robots.txt would purge a site forever. There is a case of a dispute where one party refused to remove robots.txt in order to prevent the counterparty from gathering incriminating evidence. The judge had to force removal of robots.txt.
  Archive.org has a special process for permanent purging of a site, but I doubt that a domain squatter would have bothered. If you had an obscure website, chances are that it was never archived to start with.
  
  --
  Avantslash: low-bandwidth mobile slashdot.
10. Re:Except that evidence can and has been destroyed by Jim+Hall · 2016-05-19 07:51 · Score: 1
  
  I was under the impression that it stops saving new pages, and stops *displaying* old pages, but does not nuke the old pages from storage. If your robots.txt goes away in the future, the old pages come back.... At least, that was my understanding from long ago...
  I requested a site be deleted from Wayback a number of years ago. It was a test site, and I stupidly didn't put a "Disallow" robots.txt file on it. I recall that the overview you describe is correct: adding a "Disallow" robots.txt file removes the site from display. But to remove the site from their storage, I had to contact an admin. They asked me to demonstrate that I was the owner of the site (by copying my email message to them as a comment on the website's front page), then they deleted my website from their Wayback archive. However, that was when everyone used spinning disk to store data, and before write-once media became popular in the data center. Facebook stores photos on Blu-rays these days .. maybe Wayback does now too. If Wayback does something like this, it would be impossible to completely delete the data, although they would (theoretically) be able to remove references from their database.
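
For anyone wondering what the two-line robots.txt quoted in this thread actually tells a crawler, here is a minimal sketch using Python's standard `urllib.robotparser`. It only shows how a well-behaved crawler would interpret the rules. It says nothing about how archive.org applies them to already-archived pages, which is the part people here disagree on. The `is_blocked` helper, the crawler name and the example.com URL are placeholders made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# The blanket rule quoted earlier in the thread.
BLANKET_ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_blocked(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt forbids user_agent from fetching url.

    Hypothetical helper for illustration; it parses the text locally
    instead of downloading /robots.txt from a live site.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # "example-archiver" is a made-up crawler name.
    print(is_blocked(BLANKET_ROBOTS_TXT, "example-archiver",
                     "http://example.com/old-page.html"))
```

With the blanket rule every path is off limits to every crawler, so the script prints `True`. A permissive file such as `User-agent: *` followed by an empty `Disallow:` makes the same call return `False`.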
Which Serves as Further Evidence... by Anonymous Coward · 2016-05-19 04:15 · Score: 0

that once something hits the Web, it's there forever.
1. Re:Which Serves as Further Evidence... by sims+2 · 2016-05-19 04:34 · Score: 1
  
  It's a nice thought but many things do vanish. Servers crash, and then it turns out they had archive.org blocked and nobody happened to have any backups.
  Smaller and niche stuff is especially vulnerable to this.
  Someone out there may have a copy but without any way to contact them it pretty much doesn't exist anymore.
  
  --
  Minimum threshold fixed. Thanks!
2. Re:Which Serves as Further Evidence... by Anonymous Coward · 2016-05-19 05:05 · Score: 0
  
  that once something hits the Web, it's there forever.
  It's a stupid myth people keep parroting. I'm an early netizen, and I can assure you that many, many things I said and did in the forums of the past [or even as recently as 2005] have vanished beyond recovery. And I must say that I am an especially cunning searcher.
Big Butt by Anonymous Coward · 2016-05-19 04:25 · Score: 1

Does this 1990 archive of clownsex.org make my ass look big???
1. Re:Big Butt by Salgak1 · 2016-05-19 04:30 · Score: 1
  
  No. **VISION** does. (evil grin)
2. Re:Big Butt by Anonymous Coward · 2016-05-19 04:52 · Score: 0
  
  I always knew there was a reason why I hated that Avenger... Dang him for making our asses look fat!
3. Re: Big Butt by Anonymous Coward · 2016-05-19 05:43 · Score: 0
  
  "Becky look at her butt. It's so big!" -- SirMixALot
Pragmatic judge, but... by Anonymous Coward · 2016-05-19 04:30 · Score: 0

What if the Wayback Machine starts altering pages? Who will independently verify that they were not tinkered with? Can we trust a private entity with collecting information?
The same applies to companies which collect torrent infringement info. They can put anything they want in there, even inflate the numbers (e.g. this is why Switzerland banned such collected data from being used as evidence).
1. Re:Pragmatic judge, but... by Anonymous Coward · 2016-05-19 04:40 · Score: 0
  
  If you cryptographically sign each of your webpages, then the wayback machine won't be able to alter them.
2. Re:Pragmatic judge, but... by pseudorand · 2016-05-19 06:20 · Score: 1
  
  ...won't be able to alter them without being detected, that is.
  Seriously though, this makes the Wayback Machine a huge target for hackers, doesn't it? Imagine advertising on darknet the ability to plant evidence in the Wayback Machine. I expect someone would pay a pretty penny for that.
3. Re:Pragmatic judge, but... by JackieBrown · 2016-05-19 08:46 · Score: 1
  
  Can we trust a private entity with collecting information?
  What difference does it make? Corruption is present on every level. In fact, I think a private organization is less likely than FBI/CIA/NSA/etc to frame evidence to get you transferred to Guantanamo Bay
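
On the "sign your pages so changes can be detected" idea from the replies above, here is a rough sketch of the detect-but-not-prevent property. Python's standard library has no public-key signatures, so this uses HMAC-SHA256 with a secret key as a stand-in. A real deployment would presumably use a public-key scheme so that third parties could verify without holding the secret. The function names, the key and the sample HTML are all invented for the example.

```python
import hashlib
import hmac


def sign_page(secret_key: bytes, html: bytes) -> str:
    """Return a hex HMAC-SHA256 tag over the page bytes.

    Stand-in for a real digital signature (assumption, see lead-in).
    """
    return hmac.new(secret_key, html, hashlib.sha256).hexdigest()


def verify_page(secret_key: bytes, html: bytes, tag: str) -> bool:
    """Return True only if html still matches the tag made at publish time."""
    expected = sign_page(secret_key, html)
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(expected, tag)


if __name__ == "__main__":
    key = b"site-owner-secret"  # hypothetical key
    original = b"<html><body>Our logo, 2014 edition</body></html>"
    tag = sign_page(key, original)

    altered = original.replace(b"2014", b"2016")
    print(verify_page(key, original, tag))  # untouched copy
    print(verify_page(key, altered, tag))   # copy changed after signing
```

The untouched copy verifies and the altered one does not, which is the point made above: an archive could still change the bytes, but not without the mismatch showing up. As noted earlier in the thread, all of this is only as good as the protection of the key.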
Oh dear by Anonymous Coward · 2016-05-19 04:33 · Score: 1

The internet archive is, frankly, quite very crappy. I'm going to ignore the problems they have retaining good and knowledgeable employees. Their internal data structures are lossy: much of what they ought to have cannot be accessed because the software is doing really stupid stuff in the background. This is visible from the outside, and I had people who were employees at the time confirm that to me. Short version: Their framing sucks big large hair balls through small tubes. There is also the fact that blanket robots.txt files set up by domain squatters are allowed to retroactively alter the visible record.
So, while what they have is occasionally useful (though more often the stuff I need is simply not accessible, so they are only useful as a source of last resort), and their own current employees will naturally insist that archive.org is not terminally broken somehow, using them as evidence is iffy at best. The defendant has the right of it: archive.org can very well distort the evidence it coughs up in dangerous ways, if not so much by altering the record (though it might do that too, since you do not get a full clean original back, not by a long shot), then at the very least by omission. And that too can be quite damaging and distortive.
1. Re:Oh dear by LeadSongDog · 2016-05-19 04:46 · Score: 1
  
  There is also the fact that blanket robots.txt files set up by domain squatters are allowed to retroactively alter the visible record.
  Key term there is "visible". So long as the archive is preserved, the court can access it.
  
  --
  Oh, I'm sorry sir, I thought you were referring to me, Mr. Wensleydale.
2. Re:Oh dear by Anonymous Coward · 2016-05-19 05:19 · Score: 0
  
  > The internet archive is, frankly, quite very crappy.
  Please stop posting your drunken opinions that nobody cares about.
3. Re:Oh dear by Anonymous Coward · 2016-05-19 07:06 · Score: 0
  
  Then they'd still have to pick apart the broken pseudo-xml-in-xml, provided they can get the archive guys to dig up the right files, match the right data to the meta-data, and be sure they coughed up everything they have. The robots.txt idiocy is really but the tip of the iceberg.
Internet Archive has a DMCA Exemption by JcMorin · 2016-05-19 04:40 · Score: 5, Informative

Internet Archive has a DMCA Exemption http://archive.org/about/dmca....
Amazing such a thing would be trusted by Anonymous Coward · 2016-05-19 04:51 · Score: 1

It's amazing what is trusted these days. For example, archive.org is not regulated, controlled, managed, or ANYTHING WHATSOEVER that could be considered legally binding yet here they are trusting it for legal decisions. Do they not understand how easy it would be to put fake data up there, remove data, alter data, etc? This is equivalent to asking a random private citizen that has nothing to do with a case to testify as a witness in said case. It's ridiculous.
If they do want to make legal decisions then the source should be a legally liable source bound by strict legal guidelines and control.
1. Re:Amazing such a thing would be trusted by stoborrobots · 2016-05-19 04:58 · Score: 3, Informative
  
  A random private citizen who is known for pointing a video camera at the relevant section of street every day. Like, say, some business that operates a surveillance security camera where the field of view includes the crime scene. Evidence like that is routinely gathered and used in court.
  Archive.org operates a similar video camera pointing at many web servers.
  
  --
  "Go to CNN [for a] spell-checked, fact-checked summary" -- CmdrTaco
2. Re:Amazing such a thing would be trusted by Dragonslicer · 2016-05-19 05:25 · Score: 3, Insightful
  
  This is equivalent to asking a random private citizen that has nothing to do with a case to testify as a witness in said case.
  Er, what do you think an eyewitness is? Other than "random", but archive.org isn't randomly selected either.
3. Re:Amazing such a thing would be trusted by coldsalmon · 2016-05-19 06:22 · Score: 3, Insightful
  
  Admitting the evidence is not the same as trusting it. The general rule is that any relevant evidence is admissible, and any evidence is relevant if "it has any tendency to make a fact more or less probable than it would be without the evidence." The Wayback Machine easily passes this test. The trier of fact has to look at all of the relevant evidence and make decisions about the quality of all of the items; he or she may decide that the data from the Wayback Machine is not of high quality. However, excluding the evidence means that the trier of fact cannot consider that evidence at all. It seems plain that the Wayback Machine is relevant evidence in an IP trial, as TFA says.
ArchiveTheWeb Chrome Extension by Anonymous Coward · 2016-05-19 04:53 · Score: 1

Please consider installing the "ArchiveTheWeb" Chrome extension then: https://chrome.google.com/webstore/detail/archivetheweb/jgpbjlabbfodbjecclkddfnanflgkjfe?hl=en-US
It automatically saves the web pages you surf and browse to the Internet Archive's Wayback Machine.
Well if Federal Courts say it's valid by Virtucon · 2016-05-19 04:57 · Score: 2, Insightful

Where's the federal funding to make sure that it's a maintained repository? it's a charitable organization but I would think some sort of royalty arrangement should be provided. I mean if the copyright/trademark/patent system is making use of it or the plaintiffs/defendants then it should have some direct funding stream in terms of its value as a provider of information. I could also see litigants subpoenaing witnesses to ascertain how information is collected etc. That doesn't come for free, not by a long shot.

--
Harrison's Postulate - "For every action there is an equal and opposite criticism"
1. Re:Well if Federal Courts say it's valid by thegarbz · 2016-05-19 06:30 · Score: 2
  
  Where's the federal funding to make sure that it's a maintained repository? it's a charitable organization but I would think some sort of royalty arrangement should be provided.
  How did you get that logic? If all evidence submitted in court were combined with a requirement for continued funding for future litigation then the USA could likely add another zero to its national debt.
2. Re:Well if Federal Courts say it's valid by SlaveToTheGrind · 2016-05-19 10:30 · Score: 1
  
  That doesn't come for free, not by a long shot.
  It isn't. The Internet Archive has a well-established process and payment schedule for requesting an affidavit on the authenticity of a given archived page.
does nothing to alter the purity of the preserved by Anonymous Coward · 2016-05-19 05:22 · Score: 0

it does nothing to alter the purity of the preserved evidence.
Good, but that shouldn't be enough. What does it do to prevent that purity from being altered?
Don't have a leg to stand on by Anonymous Coward · 2016-05-19 05:37 · Score: 0

The case appears to be about a trucker jobs website using a trucking company's "Trademark" (a logo maybe?) without permission. The "trademark" apparently was removed from their website before the court case but was still available on the Wayback Machine. Maybe if there was some time frame that was necessary to the case they might have a point, but the case is about the defendant EVER using the logo. So unless they can prove that the Wayback Machine ADDS random content to websites it archives, they don't have a leg to stand on.
Questions arise by Anonymous Coward · 2016-05-19 06:24 · Score: 0

Several questions arise.
1) How do I know the data in the Wayback machine has not been tampered with?
2) How is the chain of custody of the information verified?
The rules of evidence are that when evidence is recovered from the crime scene, everything must be accounted for, including how the evidence was obtained and handled.
The wayback machine supposedly scrapes the web and saves the data. But, what happens to the data between the hosting server and the Wayback machine's client? How do we know what route the data took? How do we know it was not tampered with along the way? How do we know it was not tampered with once it arrived at Wayback's server? How do we know it was not tampered with while in Wayback's custody?
These are important questions that need to be resolved before the legitimacy of the Wayback machine can be trusted. Otherwise, what's to stop me intercepting a connection between the wayback machine and, say, a church pastor's personal website, and inserting a bunch of child porn that could later be used to prosecute the pastor?
1. Re:Questions arise by Asgard · 2016-05-19 10:33 · Score: 1
  
  I think the veracity of Wayback would be an issue at trial, and both sides would present their theories / subpoena the admins of Wayback, and the jury would have to decide if the content was reliable or not.
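
The chain-of-custody questions above can be made concrete with a small sketch of a hash-chained custody log. Each entry records a SHA-256 fingerprint of the captured bytes plus the hash of the previous entry, so editing an earlier entry breaks every later link. This is purely illustrative: nothing in the story says the Wayback Machine keeps such a log, and the function names, field names and event labels are invented for the example.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _entry_hash(body: dict) -> str:
    """Hash an entry's fields in a stable (sorted-key) JSON encoding."""
    encoded = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def add_entry(log: list, event: str, payload: bytes) -> None:
    """Append a custody event that commits to the payload and the prior entry."""
    body = {
        "event": event,
        "payload_sha256": hashlib.sha256(payload).hexdigest(),
        "prev_hash": log[-1]["entry_hash"] if log else GENESIS,
    }
    log.append({**body, "entry_hash": _entry_hash(body)})


def verify_log(log: list) -> bool:
    """Return True if every entry is internally consistent and correctly linked."""
    prev = GENESIS
    for entry in log:
        body = {k: entry[k] for k in ("event", "payload_sha256", "prev_hash")}
        if entry["prev_hash"] != prev or _entry_hash(body) != entry["entry_hash"]:
            return False
        prev = entry["entry_hash"]
    return True


if __name__ == "__main__":
    log = []
    page = b"<html>captured page</html>"
    add_entry(log, "fetched from origin server", page)
    add_entry(log, "written to archive storage", page)
    print(verify_log(log))

    # Simulate someone swapping the recorded fingerprint of the first capture.
    log[0]["payload_sha256"] = hashlib.sha256(b"planted content").hexdigest()
    print(verify_log(log))
```

A log like this only shows that records were not changed after they were written. It cannot show that the bytes were not tampered with in transit before the first entry, which is the interception scenario raised above and would still be a question for testimony at trial.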
does it leave what we browse incognito out?? by Anonymous Coward · 2016-05-19 07:10 · Score: 0

does it leave what we browse incognito out??
1. Re:does it leave what we browse incognito out?? by HornWumpus · 2016-05-19 11:50 · Score: 1
  
  It attaches your identification and passwords to the file, of course. What did you think incognito mode meant?
  
  --
  John McAfee 'It was like that time I hired that Bangkok prostitute; to do my taxes, while I fucked my accountant'
That means it could be used for Prior Art by Leslie43 · 2016-05-19 08:08 · Score: 1

This means it could be used to fight patent trolls who abuse open source and creative commons items. Makerbot comes to mind on this.
Quick! by dlenmn · 2016-05-19 10:46 · Score: 2

We need to make an internet archive archive!
1. Re:Quick! by Anonymous Coward · 2016-05-23 04:52 · Score: 0
  
  Put it all on a blockchain!
Using websites as evidence of patent invalidity by Anonymous Coward · 2016-05-19 11:23 · Score: 0

Was addressed in Voter Verified v. Premier Election Solutions, 698 F.3d 1374 (Fed. Cir. 2012). And that was some random forum.
See pages 8-9: http://www.cafc.uscourts.gov/sites/default/files/opinions-orders/11-1553.pdf
"Who controls archive.org controls the future" by Baldrson · 2016-05-19 12:11 · Score: 1

Well, the actual quote is:

"Who controls the past controls the future; who controls the present controls the past." -- Ingsoc

Of course, in the case of archive.org the equivalent of 1984's "Memory Hole" is the way they treat domain hijackers that put a robots.txt block on prior content of that domain name:
It's gone. Poof.. not even a bright flash of plasma, let alone smoke.

--
Seastead this.
robots.txt and preservation by illtud · 2016-05-19 14:04 · Score: 1

An interesting link on robots.txt and preservation:
http://www.netpreserve.org/web...
(SPOILER: no answers)
unintended consequences by Anonymous Coward · 2016-05-19 15:03 · Score: 0

This will just result in more businesses opting out of the Wayback machine as a matter of course, limiting its usefulness.