When the Internet Archive Forgets (gizmodo.com)

Posted by msmash on Thursday November 29, 2018 @09:30AM from the PSA dept.

A reminder that the Internet Archive's Wayback Machine, which many people assume keeps a permanent record of the trail and origin of web content, has little feasible choice but to comply with DMCA takedown notices. As a result, a portion of what people submit to the archive continues to quietly fade away. Gizmodo: Over the last few years, there has been a change in how the Wayback Machine is viewed, one inspired by the general political mood. What had long been a useful tool when you came across broken links online is now, more than ever before, seen as an arbiter of the truth and a bulwark against erasing history. That archive sites are trusted to show the digital trail and origin of content is not just a must-use tool for journalists, but effective for just about anyone trying to track down vanishing web pages. With that in mind, that the Internet Archive doesn't really fight takedown requests becomes a problem. That's not the only recourse: When a site admin elects to block the Wayback crawler using a robots.txt file, the crawling doesn't just stop. Instead, the Wayback Machine's entire history of a given site is removed from public view.

In other words, if you deal in a certain bottom-dwelling brand of controversial content and want to avoid accountability, there are at least two different, standardized ways of erasing it from the most reliable third-party web archive on the public internet. For the Internet Archive, like with quickly complying with takedown notices challenging their seemingly fair use archive copies of old websites, the robots.txt strategy, in practice, does little more than mitigating their risk while going against the spirit of the protocol. And if someone were to sue over non-compliance with a DMCA takedown request, even with a ready-made, valid defense in the Archive's pocket, copyright litigation is still incredibly expensive. It doesn't matter that the use is not really a violation by any metric. If a rightsholder makes the effort, you still have to defend the lawsuit.
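
To make the robots.txt mechanism described above concrete, here is a minimal sketch of how a crawler might check a site's robots.txt before fetching a page, using Python's standard `urllib.robotparser`. The `ia_archiver` user-agent token, the example URL, and the `is_allowed` helper are assumptions for illustration, not details taken from the article. The sketch covers only the crawl-permission check; hiding already-archived history is a policy decision on the archive's side and is not shown.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt a site admin might publish. The "ia_archiver"
# token is an assumption used here to stand in for the archive's crawler.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "http://example.com/old-page.html"  # placeholder URL
    print(is_allowed(ROBOTS_TXT, "ia_archiver", page))
    print(is_allowed(ROBOTS_TXT, "SomeOtherBot", page))
```

Because the file names a single user agent, other crawlers are unaffected, which is why a rule like this can target one archive specifically.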


Move to Canada by JMJimmy · 2018-11-29 09:32 · Score: 4, Interesting

They should move to Canada as we have an exemption for archives which would allow the content to remain.
Library of Congress by JBMcB · 2018-11-29 09:40 · Score: 5, Interesting

Get a charter from the Library of Congress, which can essentially bypass DMCA restrictions by fiat. The LoC usually seems pretty progressive about these things.

--
My Other Computer Is A Data General Nova III.
Anyways. Remember to Donate by martiniturbide · 2018-11-29 09:50 · Score: 3, Informative

Remember to donate to the Internet Archive: https://archive.org/donate/
DOOM ? : https://archive.org/details/do...
Apple II : https://archive.org/details/ap...
Arcade: https://archive.org/details/in...
DOS Games: https://archive.org/details/so...
Like political office holders? by the_skywise · 2018-11-29 09:51 · Score: 4, Insightful

if you deal in a certain bottom-dwelling brand of controversial content
I like how this insinuates that it's the "dark web" trying not to be blocked when it's political leaders, actors and other public personae (people very much out front and wanting to be seen) that go out of their way to delete their internet history when it contradicts whatever they're pushing today so they can say "this has always been who I am!"
Archiving and to a greater point JOURNALISM (not "reporting" but actually chronicling and journaling the day's notable events in an objective manner) is an indispensable requirement for any person to become educated on a topic and to make an informed decision.
Eventually these things become history and are lost to current thought until somebody digs through the archives to rediscover the truth. Except now we can make it go away with a keypress and, poof, we've always been at war with Eurasia.
The fact of removal can still be shown by mi · 2018-11-29 10:01 · Score: 3, Interesting

So, someone requests that you remove a page, and you decide to comply by replacing it with something like "Content removed by [name] on [date] on request from such and such."
Requesting removals of evidence suddenly becomes less effective: an explicit record of removal may appear even more sinister than whatever was there before...

--
In Soviet Washington the swamp drains you.
Re:Spirit of the protocol by darkain · 2018-11-29 10:33 · Score: 4, Informative

The big issue came about in that some domains lapsed, years later someone else registered said domains, put up robots.txt, and as such the entire history from the previous owners was inadvertently deleted.
holding news media accountable by eaglesrule · 2018-11-29 11:08 · Score: 3, Insightful

Eventually these things become history and are lost to current thought until somebody digs through the archives to rediscover the truth. Except now we can make it go away with a keypress and, poof, we've always been at war with Eurasia.
There is more of an immediate need, since the ability to stealth edit a story after publishing it is too great a temptation to resist. There have been too many examples of 'reputable' news sources getting caught red-handed doing this.
Anyway, an archive source that is subject to the hideously malformed DMCA is hardly an archive source at all.