404-No-More Project Seeks To Rid the Web of '404 Not Found' Pages
First time accepted submitter blottsie (3618811) writes "A new project proposes to do away with dead 404 errors by implementing a new HTML attribute that will help access prior versions of hyperlinked content. With any luck, that means that you'll never have to run into a dead link again. ... The new feature would come in the form of introducing the mset attribute to the <a> element, which would allow users of the code to specify multiple dates and copies of content as an external resource."
The mset attribute would specify a "reference candidate": either a temporal reference (to ease finding the cited version on, e.g., the Wayback Machine) or the URL of a static copy of the linked document.
As someone who deals with SEO on a daily basis, I find 404 errors quite annoying. But there is always a reason why there is a 404, and a missing or deleted page is not always that reason. The cause could be something as simple as a misspelled file name.
Furthermore, linking to expired, cached, or archived versions of a page could be just as problematic, since those versions may contain outdated or incorrect information, which might infuriate the user even more.
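To make the "temporal reference" case concrete, here is a minimal Python sketch of how a client might turn a link plus a citation date into a Wayback Machine lookup. It assumes the archive's usual `web/<timestamp>/<url>` address pattern; the function name `wayback_url` is made up for illustration and is not part of the proposal, and the syntax of the mset attribute itself is not modeled here.

```python
from datetime import date

# Assumption: the Wayback Machine's usual address pattern, web/<timestamp>/<url>.
WAYBACK_PREFIX = "https://web.archive.org/web"


def wayback_url(original_url: str, cited_on: date) -> str:
    """Build an archive URL for the capture nearest to the citation date.

    Hypothetical helper: the timestamp is just the date as YYYYMMDD.
    """
    timestamp = cited_on.strftime("%Y%m%d")
    return f"{WAYBACK_PREFIX}/{timestamp}/{original_url}"


if __name__ == "__main__":
    print(wayback_url("http://example.com/page", date(2014, 4, 21)))
```

A browser or script could try the original link first and fall back to a URL like this only when the original fails.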
Individual websites should get their 404s under control themselves.
...someone types http : //tech.slashdot.org/story/14/04/21/2218253/404-no-more-project-seeks-to-rid-the-web-of-404-not-found-pages-but-really-is-it-going-to-work-with-this-amazing-new-link?
Smells like a sneaky way to bring back Clippy: "It looks like the page is missing. Would you like me to run a Bing search for you?"
Table-ized A.I.
Given the choice to either display out-of-date information (potentially causing liability or other miscommunication) or simply put up a catch-all branded error page with a link back to the site's home, I'm not sure what sort of organization would choose the former.
We already have redirects. They work just fine.
Great, so now instead of getting a 404 that tells me I am accessing old or removed content, I will get out-of-date and potentially wrong content without being informed of the error.
Basically wouldn't this become a way to hijack requests to drive ad revenue for whoever? :( It seriously bugs me when Comcast pulls stuff like this -- though perhaps processing this HTML tag could be something disabled via the browser?
It seems to me that they are reinventing the <a> element, badly. Semantically, what they are trying to express is a series of related links. What they should be doing is relaxing the restrictions on nested <a> elements and defining the meaning of this, then defining a suitable URN for dated copies of documents. That way they don't need to replicate perfectly fine attributes such as rel in a DSL that isn't used anywhere else and the semantics of the relationship are more accurately described.
Bogtha Bogtha Bogtha
The proposal doesn't say a whole lot about why one would want to do it. So I can attach a date to a link. How does this guarantee that _those_ links won't die?
404 no more! Make it 600 instead!
When you use the mset attribute, you would be saying where the content is hosted, yes? What happens when sites like the Wayback Machine cease to exist?
I think Ted Nelson et al. would love to say "I told you so."
sorts of new security exploits. Do we really need it?
I always thought that URIs were supposed to handle precisely this - that they were supposed to be unique, universally accessible identifiers for contents and resources - identifiers that, once assigned, wouldn't need to be changed to access the same contents or resources in the future.
That's the intent: cool URIs don't change. But in the real world, URIs disappear for political reasons. One is the change in organizational affiliation of an author. This happens fairly often to documents hosted "for free" on something like Tripod/Geocities, a home ISP's included web space, or a university's web space. Another is the sale of exclusive rights in a work, invention, or name to a third party. A third is the discovery of a third party's exclusive rights in a work, invention, or name that make it no longer possible to continue to offer a work at a given URI.
There aren't many 404s left anyway. Domain dealers are quick to put their hands on every dead link. Which is a shame, because a 404 would be more informative.
They go so well with value scarves.
Seriously, 3 typos in the first sentence? "A new project proposes _an_ do away with dead 404 errors by implementing (missing: "a") new HTML attribute _hat_ will help access prior versions of hyperlinked content."
Where are the editors? Oh, right. Carry on.
Dear Slashdot: next time you want to mess with the site, add a rich-text editor for comments.
What happens if the original site goes out of business and a new company buys the domain name?
Now imagine the hilarity if a G-rated website is bought by an X-rated website (or more amusingly, vice versa). I can see some idiot representative trying to pass a law saying that websites are required to figure out which defunct version of the website the 404 URL refers to and redirect the user to content appropriate to the website that the user intended. :D!
Let's have somebody add a heartbeat mechanism while we are at it, 'cause you know God forbid we expose a user to technology on something as simple as a computer.
All I have to do is update the dead links with new links that won't go to the dead links I never bothered updating in the first place.
Could the tags instead be used for scammy redirect tricks (like "Open"DNS "search results")?
...a utility that would go through my gazillion saved bookmarks from forever, and see if each still has something of value there. :\
I sometimes in an odd moment sift through the oldest, and easily 8/10 are dead links.
-Styopa
{
bHideBehindABrokenUrl = true;
bObtainOutOfDateContent = true;
bProvideContentThatCouldBeUsedToSueUs = true;
bDontBotherDoingYourJob = true;
}
else
{
SomeoneWhoCanDoTheirJob();
}
Honestly, it's projects like this, the ones that promote bad practice, which really fu** me off.
Welcome to one of the many pointless projects of 2014, sponsored by some kid who just left high school.
You know, you could always FIX THE BROKEN LINK! :P
There are good reasons for 404! So who decides what YOU get served up when this happens? Can you say "Just Plain STUPID"?
If not, try "NSA"?
There goes my plan for writing a script that finds every possible three-letter permutation and tries it along with .com .org and .net and tells me if it gets a 404.
Similar to how imperative programming doesn't answer all our needs, this solution doesn't either. It would be better to either just accept that sometimes a page doesn't exist or do what I think should be done. Functional HTML: HTML pages that look up the link on demand and, if it's not available, throw you out into the 404 with embedded links to all search engines.
Sounds like what they're trying to fix is scenarios where the server SHOULD be returning 410, not 404. This has nothing to do with PROPER 404 status codes.
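As a rough illustration of the 404/410 distinction, here is a small Python sketch of a server-side decision. It assumes the site keeps a record of paths it has deliberately removed; the function name `status_for` and both sets are hypothetical.

```python
from http import HTTPStatus
from typing import Set


def status_for(path: str, live_paths: Set[str], removed_paths: Set[str]) -> HTTPStatus:
    """Pick a status code for a requested path.

    Hypothetical helper: live_paths and removed_paths stand in for
    whatever bookkeeping a real site keeps.
    """
    if path in live_paths:
        return HTTPStatus.OK
    if path in removed_paths:
        # The resource existed and was intentionally taken down.
        return HTTPStatus.GONE
    # The server has no record of the resource (typo, bad link, probe).
    return HTTPStatus.NOT_FOUND


if __name__ == "__main__":
    live = {"/", "/about"}
    removed = {"/old-press-release"}
    for p in ("/about", "/old-press-release", "/abuot"):
        print(p, int(status_for(p, live, removed)))
```

Only the 410 case says anything about an earlier version having existed, which is the case an archived copy would be relevant to.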
Anyway, if something is so important, mirror it yourself and be done with it. No need for HTML tags. BTW, it seems to blur the line between HTML and HTTP too much.
HTML was meant to be easy and accessible to all. At least HTML 2 and 3 and maybe even 4. But stick a fork in the HTML dream, it died a long time ago.
And everyone at least knew what a 404 was.
Now we need to eliminate those to monetize things or redirect to a page full of Facebook javascript or other ass-raping javascript to destroy privacy or some page with at least some ads and hopefully 3rd party cookies and a hidden tracker image.
Or something?
We live in the age of the "EVIL INTERNET" --- the internet isn't some awesome thing here for us to explore, the internet is an evil corporate device meant to screw the user over by any means possible, whether or not that involves lying "Hey it isn't a 404! Here, go to this evil page you aren't looking for so we get ad displays and our javascript can privacy-rape you!"
Welcome to the 2014 Internet --- the *cannibal internet* that lies to you, tracks you and eats all your rights away so some companies can make a 3/100s of a penny! And if you don't support this, you are part of the "problem".
Priest: "Universe from nothing, no laws of physics, sped up time"+ huge discrepancies. Creationism? No. Big Bang Theory
As a user of both BitTorrent and Git and a creator of many "toy" operating systems which have such BT+Git features built in, I would like to inform you that I live in the future that you will someday share, and unfortunately you are wrong. From my vantage I can see that link rot was not ever, and is not now, acceptable. The architects of the Internet knew what they were doing, but the architects of the web were simply not up to the task of leveraging the Internet to its fullest. They were not fools, but they just didn't know then what we know now: Data silos are for dummies. Deduplication of resources is possible if we use info hashes to reference resources instead of URLs. Any number of directories AKA tag trees AKA human-readable "hierarchical aliases" can be used for organization, but the data should always be stored and fetched by its unique content ID hash. This even solves hard drive journaling problems, and allows cached content to be pulled from any peer in the DHT having the resource. Such info hash links allow all your devices to always be synchronized. I can look back and see the early pressure pushing towards what the web will one day become -- Just look at ETags! Silly humans, you were so close...
Old resources shouldn't even need to be deleted if a distributed approach is taken. There is no reason to delete things; is there not already a sense that the web never forgets? With decentralized web storage everyone gets free co-location, essentially, and there are no more huge traffic bottlenecks on the way to information silos. Many online games have built-in downloader clients that already rely on decentralization. The latest cute cat video your neighbor notified you of will be pulled in from your neighbor's copy, or if they're offline, then the other peer that they got it from or shared it with, and so on up the DHT cache hierarchy all the way to the source if need be, thus greatly reducing ISP peering traffic. Combining an HMAC with the info hash of a resource allows secured pages to link to unsecured resources without worrying about their content being tampered with: Security that's cache friendly.
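The "store by content hash, organize by alias" idea can be sketched in a few lines of Python. This is a toy in-memory stand-in, not a DHT; the class `ContentStore` and its method names are invented for illustration, and SHA-512 is used only because it appears in the markup example further down.

```python
import hashlib


class ContentStore:
    """Toy content-addressed store: blobs keyed by hash, names as aliases."""

    def __init__(self):
        self._blobs = {}  # infohash (hex) -> bytes
        self._names = {}  # human-readable alias -> infohash

    def put(self, data: bytes) -> str:
        """Store data under its SHA-512 hash; identical data is stored once."""
        infohash = hashlib.sha512(data).hexdigest()
        self._blobs.setdefault(infohash, data)
        return infohash

    def link(self, name: str, infohash: str) -> None:
        """Point a human-readable name at an existing blob."""
        if infohash not in self._blobs:
            raise KeyError(f"unknown infohash: {infohash[:16]}...")
        self._names[name] = infohash

    def resolve(self, name: str) -> bytes:
        """Fetch the data a name currently points at."""
        return self._blobs[self._names[name]]

    def blob_count(self) -> int:
        return len(self._blobs)


if __name__ == "__main__":
    store = ContentStore()
    h = store.put(b"cat video")
    store.link("site-a/cat.mp4", h)
    store.link("site-b/funny.mp4", store.put(b"cat video"))
    print(store.blob_count())  # two names, one stored blob
```

Two sites can use different names for the same bytes, and the bytes are stored once, which is the deduplication being claimed.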
<img infohash="SHA-512:B64;2igK...42e==" hmac="SHA-512:SeSsiOn-ToKen, B64;X0o84...aP=="> <!-- Look ma, no mixed content warnings! -->
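The comment does not spell out how the hash and the HMAC would be combined, so the following Python sketch makes an assumption: the secured page signs the Base64 infohash with a per-session key, and the client checks both the hash of the fetched bytes and the signature. All function names are hypothetical.

```python
import base64
import hashlib
import hmac


def infohash_b64(data: bytes) -> str:
    """Base64-encoded SHA-512 digest of the resource bytes."""
    return base64.b64encode(hashlib.sha512(data).digest()).decode("ascii")


def sign_infohash(infohash: str, session_key: bytes) -> str:
    """HMAC-SHA-512 over the infohash string (assumed construction)."""
    mac = hmac.new(session_key, infohash.encode("utf-8"), hashlib.sha512).digest()
    return base64.b64encode(mac).decode("ascii")


def verify_resource(data: bytes, expected_infohash: str,
                    expected_mac: str, session_key: bytes) -> bool:
    """Accept data from any cache only if the hash and the MAC both match."""
    actual = infohash_b64(data).encode("utf-8")
    if not hmac.compare_digest(actual, expected_infohash.encode("utf-8")):
        return False  # bytes were altered somewhere along the way
    recomputed = sign_infohash(expected_infohash, session_key).encode("utf-8")
    return hmac.compare_digest(recomputed, expected_mac.encode("utf-8"))


if __name__ == "__main__":
    key = b"session-token"
    image = b"...image bytes..."
    h = infohash_b64(image)
    m = sign_infohash(h, key)
    print(verify_resource(image, h, m, key))
    print(verify_resource(b"tampered", h, m, key))
```

Under this assumption the resource itself can travel over an untrusted channel or come from a peer's cache, since only the attributes in the secured page need protecting.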
Instead of a file containing data, consider the names merely human readable pointers into a distributed data repository. For dynamism and updates to work, simply update the named link's source data infohash. This way multiple sites can be using the same data with different names (no hot linking exists), and they can point to different points in a resource's timeline. For better deduplication and to facilitate chat / status features some payloads can contain an infohash that it is a delta against. This way, changes to a large document or other resource can be delta compressed - Instead of downloading the whole asset again, users just get a diff and use their cached copy. Periodic "squashing" or "rebasing" of the resource can keep a change set from becoming too lengthy.
Unlike Git and other distributed version controls, each individual asset can belong to multiple disparate histories. Optional per-site directories can have a time component. They can be more than a snapshot of a set of info-hashes mapped to names in a tree: Each name can have multiple info-hashes corresponding to a list of changes in time. Reverting a resource is simply adding a previous hashID to the top of the name's hash list. This way a user can rewind in time, and folks can create and share different views into the Distributed Hash Table File-system. Including a directory resource with a hypertext document can allow users to view the page with the newest assets they have available while newer assets are downloaded. Hypertext documents could then use the file system itself to provide multiple directory views, tagged for different device resolutions, paper vs eink vs screen, light vs dark, etc. CSS provides something similar, but why limit th
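The per-name hash list and the "revert by pushing an older hash on top" rule described above might look like this in Python. The class `NameHistory` and its methods are invented for illustration, and the hashes are placeholder strings.

```python
class NameHistory:
    """Toy directory where each name maps to a list of infohashes, newest first."""

    def __init__(self):
        self._history = {}  # name -> [newest infohash, ..., oldest infohash]

    def update(self, name: str, infohash: str) -> None:
        """Record a new version of the named resource."""
        self._history.setdefault(name, []).insert(0, infohash)

    def current(self, name: str) -> str:
        return self._history[name][0]

    def versions(self, name: str) -> list:
        return list(self._history[name])

    def revert(self, name: str, steps_back: int = 1) -> str:
        """Re-add an earlier hash at the top; nothing is deleted."""
        previous = self._history[name][steps_back]
        self.update(name, previous)
        return previous


if __name__ == "__main__":
    d = NameHistory()
    d.update("index.html", "hash-v1")
    d.update("index.html", "hash-v2")
    d.revert("index.html")
    print(d.current("index.html"), d.versions("index.html"))
```

Because a revert is itself a new entry, the full timeline stays available for anyone who wants to rewind further.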
http://404kids.org/
I have a cool 404 set up on my webspace. I know it hardly ever gets a hit, but this project would prevent some random person from getting a smile in their day. There are whole sites dedicated to cool 404's. Again, we are victim to some committee of young framework-trained certified and inexperienced people.
#getoffmylawn
About 400,000 people live in 404 area!
My websites generate 404s all the time for script kiddies trying out exploits. Why would I change that?
You know, you could always FIX THE BROKEN LINK! :P
...or just not break it in the first place.
The problem stems from the URL being a machine address necessary to acquire content, but one that is also human-readable, which inspires people to treat it as if it were the page's title or something, and so they edit it as freely as they edit the page's content.
Just name all of your web pages by UUID. Then, since one random number is just as good as any other, you'll never again have the urge to change your URLs.
Similarly, just use random UUIDs as domain names, and you'll never again be bothered by having your first choice be unavailable. An added advantage of this is that you can name your site whatever you want regardless of what domain names are available since you've decided that your domain name doesn't have to match your web site's name. Suddenly the fact that virtually every domain name is being squatted upon no longer matters.
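Taken at face value, the UUID-per-page suggestion is a one-liner in Python. The helper name `new_page_path` and the `/pages` prefix are made up for illustration.

```python
import uuid


def new_page_path(prefix: str = "/pages") -> str:
    """Return a path ending in a random (version 4) UUID.

    Hypothetical helper: the path carries no meaning, so there is
    nothing in it to rename later.
    """
    return f"{prefix}/{uuid.uuid4()}"


if __name__ == "__main__":
    print(new_page_path())
```

The human-readable title would then live only in the page content or in whatever index maps titles to these paths.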