Slashdot Mirror


Fixing Broken Links With the Internet Archive

eggboard writes "The Internet Archive has copies of Web pages corresponding to 378 billion URLs. It's working on several efforts, some of them quite recent, to help prevent or repair link rot, the problem of links going bad. Through an API for developers, WordPress integration, a Chrome plug-in, and a JavaScript lookup, the Archive hopes to help people find at least the most recent copy of a missing or deleted page. More ambitiously, they instantly cache any link added to Wikipedia, and want to become integrated into browsers as a fallback rather than showing a 404 page."
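As a rough illustration of the "API for developers" mentioned in the summary, here is a minimal Python sketch of looking up the most recent archived copy of a page. It assumes the Wayback Machine availability endpoint at https://archive.org/wayback/available and a JSON reply of the form {"archived_snapshots": {"closest": {...}}}; neither detail comes from the summary itself, and the function names are made up for this example.

```python
import json
from typing import Optional
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint of the Wayback Machine availability API (not named in the summary).
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def build_lookup_url(page_url: str) -> str:
    """Build the query URL asking the Archive about page_url."""
    return AVAILABILITY_ENDPOINT + "?" + urlencode({"url": page_url})


def closest_snapshot(response: dict) -> Optional[str]:
    """Pull the snapshot URL out of a decoded reply, or None if nothing is archived.

    The reply layout used here is an assumption about the API's JSON format.
    """
    closest = response.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def find_archived_copy(page_url: str, timeout: float = 10.0) -> Optional[str]:
    """Ask the Archive over the network for an archived copy of page_url."""
    with urlopen(build_lookup_url(page_url), timeout=timeout) as resp:
        return closest_snapshot(json.load(resp))
```

A site's 404 handler could call find_archived_copy() with the missing URL and offer the result as a link when it is not None, rather than redirecting silently.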

1 of 79 comments

  1. Re:No. 404 is important! by SunTzuWarmaster · Score: 4, Interesting

    So let's say that my company has three lines of products on three different webpages. We decide to discontinue two of the product lines for being unprofitable, and remove their pages. Google search results still show the pages, and archive.org still shows them to users. These discontinued products are still shown to my potential customers, who get frustrated when they try to order them.

    Alternately, I create a temporary webpage for displaying some demo content to a potential client. It is a demo page, riddled with bugs, holes, and other areas that need improvement. Should archive.org still show this page as part of search results? What will potential clients think of my company, given that it put up a buggy/terrible page?

    Alternately, let's just say that I rename a longstanding webpage (technology.slashdot.org to tech.slashdot.org) and delete the old URL. Should archive.org redirect to false content?

    Or, let's say that my restaurant decides to take down its 2013menu.html page, and doesn't wish customers to be able to compare its new and old menu side by side to see where prices inflated.

    Error messages have a purpose. While the most common case is that the page/server went offline, there are many times when a page URL changes as a result of regular website updates, and you don't want users to obtain old content.

    Sometimes things are deleted for a reason.