Robust Hyperlinks: The End of 404s?

Re:404 Gallery by Anonymous Coward · 2000-03-02 10:16 · Score: 0

(please, sir, may I have another?) http://www.brunching.com/nothing.html

Re:Sounds very iffy to me by Anonymous Coward · 2000-03-03 00:33 · Score: 0

I haven't examined robust.jar, but I believe Netscape has reused Sun's jar authentication scheme (make a file with hashes of some of the files in the zip archive, and sign that file) to support signed JavaScript. I don't see the point, though, as Navigator allows unsigned JavaScript to take any number of unfortunate actions.

Re:Won't work with Linux, sorry. by Anonymous Coward · 2000-03-03 00:37 · Score: 0

Nobody publishes their ActiveX source, so don't forget the PE loader and x86 machine code emulator!

Re:Not unlike Freenet by Anonymous Coward · 2000-03-03 00:47 · Score: 0

This sounds pretty similar to an Eternity Service, and I suggest that would be a less confusing name.

Re:The real solution ... by Anonymous Coward · 2000-03-03 01:00 · Score: 0

This is what responses like 301 Moved Permanently and 410 Gone are for. By definition, any server using 404 Not Found has no idea what happened to the resource in question and couldn't have notified anyone else.

Re:Oops, I'm redundant.(OFF-Topic) by Anonymous Coward · 2000-03-01 21:15 · Score: 0

Possibly this is a sign that your thoughts weren't that interesting in the first place?

If you find yourself consistently wanting to delete your own comments, maybe you should just wait about 5 minutes before posting and see if anyone else has posted the same thing.

If so, post something more original, or just troll.

NATALIE PORTMAN NAKED PERTRIFIED GRITS DOWN MY/HER PANTS

Re:Try ftp'ing instead by Anonymous Coward · 2000-03-01 21:40 · Score: 0

Moderate this Informative. Thanks man.

Would random sig be better? by Anonymous Coward · 2000-03-02 01:45 · Score: 0

Seems like it would be better to just pick a large random number encoded for readability and use that as a signature. Almost zero chance of two web pages having the same signature. Say 6-bit encoding to 40 printable characters. 2**240 should be OK.
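A rough Python sketch of that, assuming a 64-symbol alphabet (so 6 bits per character) and the standard `secrets` module as the random source. The alphabet and the name `random_signature` are made up for illustration:

```python
import secrets
import string

# 64 printable symbols, so each character carries 6 bits.
ALPHABET = string.ascii_letters + string.digits + "-_"

def random_signature(length=40):
    """Return a random signature of `length` characters (6 bits each)."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

# 40 characters * 6 bits = 240 bits, i.e. 2**240 possible signatures.
signature = random_signature()
```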

BTW, I wish people would stop referring to a web page as a "Paper". "Treatise" would be more appropriate.

I thought of this... by Anonymous Coward · 2000-03-02 02:01 · Score: 0

I thought of this years ago. If somebody patents this and makes money off it I'll be so pissed...

New Hampshire 4 Damn Sure by Anonymous Coward · 2000-03-02 03:07 · Score: 0

You must be from massachusetts (massho ...)

I'll say it again "Live Free or Die". Don't like it? Mind your own business.

Most people in the world will return a 404 when asked about New Hampshire. And we like it that way.

Mirror It Please Somebody by Anonymous Coward · 2000-03-01 20:41 · Score: 0

Could someone please mirror the page if they get to read it or provide a short blurb.

Re:Mirror It Please Somebody by Hammer · 2000-03-01 21:06 · Score: 1

So much for robust hyperlinks ;-)

Re:Mirror It Please Somebody by SEWilco · 2000-03-01 23:26 · Score: 1

Summary:
The document is analyzed and a few unusual words are selected. These are used in a signature which is either put in links (within the anchor tag) or in a URL like this:
http://www.cs.berkeley.edu/~daf/?lexical-signature=bregler+interreflections+zisserman+cvpr+iccv

The advantage of putting it in the URL is that bookmarks may work. Implementation can be in server or client and there are advantages to both methods. If it's in a noninformative client then you might not be aware of redirection (unless the wrong page is retrieved and it is obvious).
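A minimal sketch of the scheme as summarized here, in Python. The `doc_freq` table (word to number of documents containing it) is a hypothetical stand-in for whatever rarity measure the real system uses, and both function names are invented; only the `lexical-signature` parameter name comes from the URL above:

```python
import re
from urllib.parse import urlencode

def lexical_signature(text, doc_freq, size=5):
    """Pick the `size` rarest words in `text`, judged by `doc_freq`.

    `doc_freq` is a hypothetical table mapping a word to the number of
    documents that contain it.
    """
    words = set(re.findall(r"[a-z]+", text.lower()))
    # Words missing from the table count as frequency 0 (maximally unusual).
    # Ties are broken alphabetically so the result is deterministic.
    ranked = sorted(words, key=lambda w: (doc_freq.get(w, 0), w))
    return ranked[:size]

def robust_url(base_url, signature):
    """Append the signature to the URL as a query string."""
    return base_url + "?" + urlencode({"lexical-signature": " ".join(signature)})
```

Treating unknown words as the rarest is a design choice of this sketch, not something the summary specifies.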

what if I really enjoy 404 errors? by Anonymous Coward · 2000-03-01 20:48 · Score: 0

Well first off, I can't read the web page (since it was slashdotted) yet I will still comment.

the signature is fed into any Web search engine to find the new site of the page

How long would this take? I'd imagine this would clutter web search engines even more...what if altavista had three links to a page with three different URLs (two of them being old links, one being the recent location)? It sounds like a great idea and even somewhat practical, but I'm still not sure if it would be generally accepted.

I actually read through my error logs to let webmasters or even myself know where bad links are located on their site.

Here's a fun web page, 404 Not Found Homepage

Won't work with Linux, sorry. by Anonymous Coward · 2000-03-01 20:50 · Score: 0

Read the page. It's based on ActiveX and requires IE 4 or better.

This is just *another* case of Linux falling behind due to its lack of support for common Internet standards. Where is our ActiveX? COM? Granted, I can occasionally watch as the Java ads on Slashdot cause Netscape for Linux to crash, but that seems to be the extent of Linux's so-called internet connectivity.

And you wonder why people are forced to use windows+IE? If they want to make use of the latest technologies, for example 'Robust URLs' (though maybe they should have invested in a Robust Server), then Linux, sadly, can't keep up. We as a community are being left behind in the Internet arms race. Fortunately, I have a few ideas:

Get a task force composed of Richard Stallman, Bruce Perens, and ESR to develop and debug ActiveX support for Linux. Estimated time: 2 months.

Form an Open Source Browser Committee to create a new, Open Source web browser that supports all the latest standards (CSS, DOM, DNA) Estimated time: 3 months.

Push for Perl to be embedded in all new web browsers so that CGI programs can be run on the user's machine, which will reduce server loads. Estimated time: 1 month.

Design a new, Internet-ready desktop for Linux, Give it a web browser, probably the new one I described above, and embed it in everything: file manager, word processor, start button, etc. Estimated time: 4 months.

I think that with these items accomplished, Linux will truly begin to shine as a web platform, even for the newest users.

Re:Won't work with Linux, sorry. by LocalH · 2000-03-01 21:30 · Score: 1

Read the page. It's based on ActiveX and requires IE 4 or better.

Yet another small attempt to make Windows the 'better' OS for the Internet...

This is just *another* case of Linux falling behind due to its lack of support for common Internet standards. Where is our ActiveX? COM?

Falling behind? I'm grateful that Linux doesn't have ActiveX (read: a huge security hole).

Granted, I can occasionally watch as the Java ads on Slashdot cause Netscape for Linux to crash, but that seems to be the extent of Linux's so-called internet connectivity.

What? The extent of Linux's 'internet connectivity'? What crack have you been smoking lately? Linux is more intimately tied to the net than any other OS (except for other Unixen) due to the fact that TCP/IP is an integral part of Linux/*BSD/etc. Just because Linux doesn't support a Microsoft-developed technology, it's all of a sudden not suitable for the Internet?

And you wonder why people are forced to use windows+IE?

It has much more to do with the fact that there are no 'major' apps available for Linux (by major, I mean the industry standard - Photoshop, Illustrator, most M$-crap) than it does about ActiveX. Before anybody jumps at me and says 'What about The GIMP?', Adobe Photoshop is the industry standard for pixel-based graphics design and photo editing. Most professionals (including myself) are experienced with Photoshop. To retrain oneself for a different program is harder than learning it from scratch.

If they want to make use of the latest technologies, for example 'Robust URLs' (though maybe they should have invested in a Robust Server), then Linux, sadly, can't keep up. We as a community are being left behind in the Internet arms race.

Why? I'm sure that someone will develop a Linux/*BSD implementation of Robust URLs and the incompatibility is solved. The Linux community is not being left behind at all, just because they can't use a few CraptiveX controls.

Fortunately, I have a few ideas:
Get a task force composed of Richard Stallman, Bruce Perens, and ESR to develop and debug ActiveX support for Linux. Estimated time: 2 months.

Bad idea! Supporting ActiveX on Linux is (in my eyes, FWIW) tantamount to giving out your root password. Anything that allows automatically downloaded/embedded code to have FULL ACCESS to my hardware is inherently evil and should be destroyed. And Authenticode? Give me a break...that only tells you who to blame if you get a trojan and not whether the control is safe or not...

Form an Open Source Browser Committee to create a new, Open Source web browser that supports all the latest standards (CSS, DOM, DNA) Estimated time: 3 months.

Well, we do have Mozilla, even though it is not GPL'ed, it's Open Source.

Push for Perl to be embedded in all new web browsers so that CGI programs can be run on the user's machine, which will reduce server loads. Estimated time: 1 month.

Should be quicker than that - just provide an interface to the existing Perl implementation.

Design a new, Internet-ready desktop for Linux, Give it a web browser, probably the new one I described above, and embed it in everything: file manager, word processor, start button, etc. Estimated time: 4 months.

This is a great idea, which will (if implemented correctly) make the barrier-to-entry much lower than it currently is. Graphical configuration tools are also needed (but don't change the underlying architecture, let those who want to use the console).

I think that with these items accomplished, Linux will truly begin to shine as a web platform, even for the newest users.

I fully agree, except for the ActiveX support. Just because Microsoft develops it doesn't mean that Linux should strive to be compatible (else we will eventually have another Windows).

Disclaimer - My comments do not represent the views of ABC19 WKPT and are my own.
_______
Scott Jones
Newscast Director / ABC19 WKPT
Commodore 64 Democoder

--
FC Closer

Re:Won't work with Linux, sorry. by shub · 2000-03-01 21:01 · Score: 1

My understanding is that ActiveX is actual binary code, so you'd basically have to incorporate Windows95/98/NT/2000 into your OS in order to succeed at doing ActiveX. In fact, I believe that Linux already has something along these lines -- it's called "Wine".

Besides, even if you could succeed at making this happen, Micro$oft would be sure to change the code slightly so as to break your version, and if it hurts some of their customers in the process, well what do they care?

No, this is a fundamentally unworkable plan.

--
Brad Knowles
http://daily.daemonnews.org/ -- if you're not

Re:Won't work with Linux, sorry. by AntiNorm · 2000-03-02 00:54 · Score: 1

>Granted, I can occasionally watch as the Java ads on Slashdot cause Netscape for Linux to crash, but that seems to be the extent of Linux's so-called internet connectivity.

That problem is with Netscape, not Linux. Yes, I often have problems with Java crashing Netscape, but that happens regardless of whether I am using the Windows version or the Linux version. Point is, Linux is great, Netscape is okay, but Netscape's implementation of Java leaves a lot to be desired in the way of stability.

=================================

--

I pledge allegiance to the flag...
of the Corporate States of America...

HELLO MODER-IDIOTS!!! by Anonymous Coward · 2000-03-01 22:46 · Score: 0

Did any of you morons who moderated this up actually notice that the link is to www.scoobydoo.com? Geez! At least two of you didn't!

The parent message is a fucking troll!!!

Re:HELLO MODER-IDIOTS!!! by Anonymous Coward · 2000-03-02 00:06 · Score: 0

Proves my point: Moderators are idiots half the time. Looks like my score is 8 so far. Not bad, i've only been doing this for a week.

No it wouldn't by Anonymous Coward · 2000-03-02 04:28 · Score: 0

Nope, and it wouldn't work very reliably even without people actively trying to subvert it. Their paper on the subject notes several major problems (what if the page changes? can't find it. what if the keywords are unique now, but not in the near future? can't find it. what if the new location hasn't been indexed yet by the search engine you use? can't find it).

Re:finally! no more lost porn! by Anonymous Coward · 2000-03-02 06:35 · Score: 0

to eliminate pop-up windows, just turn off java and javascript.

Re:It's down already by Anonymous Coward · 2000-03-02 08:00 · Score: 0

It's not down, it just looks like it. It's a little joke. Read the page carefully, it's not a real 404 error, but merely a statement. They're making a point that you won't have to see it in the future. Don't be so quick to click back next time. Muerte

Re:Wasn't this what URI's were supposed to address by Anonymous Coward · 2000-03-01 23:20 · Score: 0

URNs are still being hammered out. In the meantime, oclc.org has already implemented most of URNs' probable features in what they call Persistent URLs (PURLs). They allow access to the source code for PURL servers (which basically do an HTTP-Redirect from your unchanging PURL to your mobile URL). Don't know about licensing off-hand.

finally! no more lost porn! by norelidd · 2000-03-01 20:42 · Score: 0

maybe now i can finally find all of those porn sites that keep on goin missin

Re:finally! no more lost porn! by GoofyBoy · 2000-03-01 21:11 · Score: 1

Maybe that was their motivation :)

I just wish I could stop all of these pop up windows.

--
The surprise isn't how often we make bad choices; the surprise is how seldom they defeat us.

The effect by vanza · 2000-03-01 20:45 · Score: 0

Now one has to wonder if the UC at Berkeley is the new player in the Slashdot effect game, or if the use of Napster in university campi really can use all that bandwidth... =)

--
Marcelo Vanzin

Re:Either that... by Ranger+Rick · 2000-03-01 21:57 · Score: 1

Not to mention that I'm on a Dvorak keyboard, and that makes his e-mail __y.'abaw@nmu.blw.'__ :)

Oh, and holding down left-shift on my keyboard didn't seem to help any.

--

WWJD? JWRTFM!!!

Robust until... by axolotl · 2000-03-01 21:55 · Score: 1

... the porn servers start embedding shitloads of common 5-word phrases in their pages so every 404 takes you straight to "101 pussies for today" or wherever.

Re:Wasn't this what URI's were supposed to address by Mithrandir · 2000-03-02 07:15 · Score: 1

Already done. Most of the URN work is already hammered out, and a few of the older RFC's need to be updated a bit.

From what I'm reading here, the form of those URLS this guy is generating is actually illegal syntax. That is, with the '?' character, that is intended as a query and any proper web server would attempt to run a CGI type script with it.

If you want to know more about URNs, and my implementation of them in Java (replaces most of java.net) go to http://www.vlc.com.au/~justin/java/urn/

--
Life is complete only for brief intervals in between toys or projects -- John Dalton

Re:Hijacking redirectors ?? by cpt+kangarooski · 2000-03-02 00:22 · Score: 1

Sadly NH still hasn't gotten around to changing it to "Live Free or Die, Punk"

--
-- This and all my posts are in the public domain. I am a lawyer. I am not your lawyer, and this is not legal advice.

Some details (and a complaint) about how it works by artdodge · 2000-03-02 00:58 · Score: 1

Actually, a quick (15-second) perusal of the actual materials shows that their approach is more like:

<A HREF="http://my.outdatedsite.com/page?robusturlkeywords=farts+sandler+zippo+methane+boom">

So the "robust keywords" are just an HTTP query string attached to the usual URL. When the server goes to produce a 404, it presumably calls a CGI (the distribution's jar file probably contains a 404 servlet or some such beastie) which re-directs (301 or whichever) to google.com with an appropriate query string based on the keywords in "robusturlkeywords".
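Roughly, the handler would do something like the Python sketch below. This is a guess at the behaviour, not the code in their jar file: the `/search?q=` endpoint and the function name are assumptions, and only the `robusturlkeywords` parameter comes from the example link.

```python
from urllib.parse import urlsplit, parse_qs, urlencode

# Assumed search endpoint; the description above only says "google.com".
SEARCH_URL = "http://www.google.com/search"

def search_redirect(requested_url, param="robusturlkeywords"):
    """Return the search URL to redirect to, or None if no keywords were sent."""
    query = parse_qs(urlsplit(requested_url).query)
    keywords = query.get(param)
    if not keywords:
        return None  # nothing to search for; serve an ordinary 404 instead
    # parse_qs has already turned '+' into spaces; urlencode turns them back.
    return SEARCH_URL + "?" + urlencode({"q": keywords[0]})
```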

As an HTTP junkie, I have to say I'm not too fond of it; you're ruining the whole point of 404 semantics. (Kinda like sites that redirect you to their homepage when you give them a bogus URL - it irks me to no end.) It would be much more straightforward (and less prone to attacks and the general unreliability of search engines) for server administrators to start maintaining proper 301-Moved Permanently databases and perform lookups in those whenever the server hits a 404 condition.
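The alternative is small enough to sketch in Python. The `MOVED` table, its sample entry, and the `respond` function are hypothetical; a real server would keep the mapping wherever its configuration lives:

```python
# Hypothetical table of old path -> new location, kept by the server admin.
MOVED = {
    "/old/page.html": "http://www.example.com/new/page.html",
}

def respond(path, existing_paths):
    """Return (status, headers) for a requested path."""
    if path in existing_paths:
        return 200, {}
    if path in MOVED:
        # The server knows where the resource went, so say so.
        return 301, {"Location": MOVED[path]}
    # Only now does the server truly have no idea: a real 404.
    return 404, {}
```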

Just MHO.

Either that... by flanker · 2000-03-01 20:47 · Score: 1

or use relative addressing.

--
Left shift 1 for e-mail...

Re:Either that... by flanker · 2000-03-01 22:38 · Score: 1

Yes, the drunk/misaligned typist school of cryptography. A little-known branch that died out soon after it was proposed in a Berlin beerhall.

--
Left shift 1 for e-mail...

Re:Either that... by jallen02 · 2000-03-01 21:26 · Score: 1

Your email thing is just too hard to figure out!

Kidding. And here I was doing a left bitwise shift, then I looked at my keyboard for a sec. heehee

Now if only web sites/servers can be robust by funkman · 2000-03-01 20:44 · Score: 1

Robust links may be great but I can't connect to the site to learn more about it.

Oops, I'm redundant.(OFF-Topic) by funkman · 2000-03-01 20:48 · Score: 1

After posting this, I notice the comment submitted before me said the same thing I just wrote. (The comment after me also says the same thing) It would be nice if we were allowed to delete our own comments to save some moderators some redundant moderating.

I think the reply is by kaisyain · 2000-03-01 21:10 · Score: 1

"ActiveX and COM aren't common Internet standards they are just the work of a proprietary company!"

Of course, that doesn't stop many of those same people from complaining about lack of Java on Linux ;-). And personally, I don't see a whole lot of difference between ActiveX and Perl (or TeX or Python or TCL or...), neither are standardized in any even vaguely meaningful sense of the word.

Apparently de facto "standards" only count when they come from the Good Guys.

Irony by maroberts · 2000-03-01 20:49 · Score: 1

I got a 404 Not Found error when I tried the link!

[OK, it was a server down or unreachable error, but it was funnier the other way]

Has it been Slashdotted already?

--

Donte Alistair Anderson Roberts - hi son!
Karma: Chameleon

Too good to be true ;-) by Bartmoss · 2000-03-01 20:42 · Score: 1

If something is too good to be true, it probably is. Besides, chances are someone'll patent it sooner or later. ;-(

the end of 404s, but what about /.s by option8 · 2000-03-01 21:14 · Score: 1

love the idea of no more 404s, but, seeing as the server is appropriately and thoroughly slashdotted (or was when i looked), what about a url that's robust enough to survive the 'effect'?

--
- Entertaining Bits from the Ancient Kernel Tree

Re:There's always a "but(t)." by Briareos · 2000-03-02 01:29 · Score: 1

I can just see it... pr0n sites will no longer need all those senseless keywords in their meta-tags to show up on innocent-looking keywords you feed a search engine...

No - now all they have to do is to stuff the "Robust Redirector" with some makeshift-keywords they extracted by spidering over a load of webpages, and presto! --- You've Got PR0N!1!

That's kind of like they do now, with sitenames that are popular "speling" errors of other sites...

Also, who's going to prevent people using the same keywords for their page, and how is the process of choosing between n possible redirections going to be handled, as it should be "transparent" to the user?

I guess there's a lot of thought-work left before this reasonably can go live... and still, how many of you have Smart Browsing enabled in Netscape, and how does this differ, privacy-wise?

(Mmmmh... Portscan... ARGH)

np: Boards Of Canada - Unknown Track 2.mp3 (Live)

As always under permanent deconstruction.

--

"I'm not anti-anything, I'm anti-everything, it fits better." - Sole

Oh, the irony by Webmonger · 2000-03-01 20:45 · Score: 1

So the link to Robust Hyperlinks doesn't work. Sigh.

Interesting, for unique keywords by Quack1701 · 2000-03-01 21:41 · Score: 1

This sounds interesting if you're using unique keywords for something like a family web site and you have a unique surname. However, what will happen when I have a site that needs to use "Hot", "Sex", "Babes", "XXX", "Nude" for my keywords? How many other sites are going to have the exact same keywords? Or more seriously, how about "Smith", "Family", "Web", "Page"?

How will it help me if my URL changes?

quack

Re:Sounds very iffy to me by Voivod · 2000-03-02 07:56 · Score: 1

On the other hand, it might be that the method uses Javascript, but at which point this nulls and voids any statement on "working on all existing browsers".

From Freshmeat you can see that the appropriate file for it is called Robust.jar, so I think you're probably correct there :)

JavaScript has nothing to do with Java. The fact that the file ends in .jar implies that the system was implemented in Java. So, it probably uses Servlets which can quite easily produce content which will work on all browsers.

Re:Damn! by SEWilco · 2000-03-01 23:15 · Score: 1

Look more carefully at that 404 message. It's a joke.

Irony patrol by Sebbo · 2000-03-01 22:11 · Score: 1

Hm. The site seems to be slashdotted.

read before you write! by rp · 2000-03-02 15:26 · Score: 1

Please take the trouble to read the first paragraph of their article before making such comments. What they want to do is append a signature, something like an MD5 hash that depends only on the document content.

With Harvest, indexing software that is several years old, an indexing engine that identifies documents by their MD5 signature is easy to build; I've done this. So what these people are proposing isn't exactly rocket science.
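To show how little is involved, here is a toy Python version using the standard `hashlib` module. It is not how Harvest does it; the in-memory `index` dictionary and the function names are made up for illustration:

```python
import hashlib

def content_signature(document: bytes) -> str:
    """MD5 digest of the document content, as 32 hex characters."""
    return hashlib.md5(document).hexdigest()

# Toy index: signature -> current URL of the document.
index = {}

def add_to_index(url, document):
    index[content_signature(document)] = url

def locate(signature):
    """Return the URL last indexed for this signature, or None."""
    return index.get(signature)
```

Re-indexing the same bytes at a new URL simply overwrites the entry, so the signature follows the document when it moves. Changing even one byte of the content produces a different signature.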

distribute the redirectors by rp · 2000-03-02 15:32 · Score: 1

If documents are identified by their digital signatures, the indexing space (of possible signatures) can be divided up among a whole network of redirectors, each responsible for a small subspace of signatures. Each redirector would have to be replicated, of course.
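One way to carve up the space, sketched in Python: use the first byte of a hex signature to pick a contiguous slice. The redirector host names are invented, and splitting on the first byte is just one possible choice:

```python
def redirector_for(signature_hex, redirectors):
    """Pick the redirector responsible for this signature.

    The first byte of the signature (0..255) is mapped onto equal,
    contiguous slices of the signature space, one slice per redirector.
    """
    first_byte = int(signature_hex[:2], 16)
    return redirectors[first_byte * len(redirectors) // 256]

# Hypothetical redirector hosts.
REDIRECTORS = ["r0.example.org", "r1.example.org", "r2.example.org", "r3.example.org"]
```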

All of the required technology is present in Harvest, it just never became popular. My guess is that cool ideas have to be reinvented in Berkeley before the world gets to see them applied at large, see Yahoo! for another example.

Needed: database of cross-links to make this work by AngusSF · 2000-03-01 23:18 · Score: 1

Seems to me what is needed to make this (Robust 404s) work is a database whereby all the URLs that refer to your pages go first, to be redirected to your page. When you change a page, you notify the database and off-site redirections follow the moved page. If you're careful on your own site, you never kill off a URL but instead have it refer forward .... doesn't work, of course, if your main URL moves, though. Of course, this may be what the original article talks about, but it's either 404 or /.ed so I can't read it.

--
"A gun is a tool, Marian. No better, no worse than any other tool. An axe, a shovel, or anything." Shane (1953)

Not done alot of web programming I see by shinji · 2000-03-01 22:02 · Score: 1

Well I would say something like ActiveX is bad is not common at all. But I prefer to address the embedded perl statement, since I program in Perl. Why would I want the perl to run on the client machine? The problem with Javascript, JScript or any other client-side technology is the client. Major vendors refuse to follow any sort of standard, forcing me to write 4 different versions of the code to do the same thing and detect what platform and browser it's running on. Cross-platform does not mean to me writing it for each platform then choosing the correct one to run. Server-Side technology such as PERL, PHP, C++ etc... allow me to access databases, generate dynamic code, but still spit out plain ole HTML.

If you have an idea don't pass the buck and say all these "famous" OpenSourcers need to do this. Go do it yourself...then maybe you won't be so quick to say how easy and quick it would be.

--
Remove the spam reference to email

May be answered.. by Junta · 2000-03-01 20:46 · Score: 1

I can't get through to the site to see if they address the most common 404 problem I have. The problem is that I do a search, find the page, never been there before, but now it has moved. How am I supposed to get this extended data about the page if the page moved before I ever saw it without web search engines storing this information too... Sure, Google can do it because of caching, but the others would be out of luck. In any case, 404 can never go away, things come up, things go down, things move. It may be possible to fix moving problems, but once a page goes down, it goes down :) Maybe forcing everyone to chmod directories so we get 403s instead, then 404s wouldn't be around so much :)

--
XML is like violence. If it doesn't solve the problem, use more.

Re:Wasn't this what URI's were supposed to address by yandros · 2000-03-02 00:59 · Score: 1

URI's are the generic term; you mean `URN'.

There are several different proposed URN systems being worked on right now (the document even mentions some, such as PURLs and handles). The big problem with these new specs is that there are a large number of conflicting requirements depending on what you really want to do, so they're unlikely to be able to settle on just one proposal (they've been trying for several years).

Still, after looking through the `robust Hyperlink' documents, basically all of the old URN specs that I've seen are better than this, so I hope it doesn't distract people too much.

Why can't we eliminate 404's already? by Pentagram · 2000-03-01 21:09 · Score: 1

IANA 404 research scientist, but... Why can't my browser just open a connection to the web page, and if the heading starts with "404", not load the page and simply flash a warning that the page is not available?
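The check itself is trivial. In HTTP the status code is the second field of the first line of the response, so a client-side test could look like this Python sketch (the function name is invented):

```python
def is_not_found(status_line):
    """True if an HTTP status line such as 'HTTP/1.0 404 Not Found' reports a 404."""
    parts = status_line.split(None, 2)  # version, status code, reason phrase
    return len(parts) >= 2 and parts[1] == "404"
```

A browser that wanted to could skip rendering the body and show its own warning whenever this returns True.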

May rely on typos--broken by spell checkers? by cphoenix · 2000-03-02 10:55 · Score: 1

The software seems to pick out the most unusual words in a page. Typos can get quite unusual. One of their papers gives an example that uses "peroperties" as an index word. On the target page, it's clearly a typo for "properties". If the authors of that page ever bothered to spell-check it, that word would go away, and the paper would be that much harder to find.

(I've already sent them an email about this.)

Chris

--
Ask me about Nanotechnology, Dyslexia Correction. Tell me about A.I., robotics, infrastructure.

Re:Hijacking redirectors ?? by theaphila · 2000-03-01 22:48 · Score: 1

actually, the welcome signs are even better

live free or die
pay toll ahead

Two Different Webs... by Delusion_ · 2000-03-01 23:51 · Score: 1

The web is great for the sorts of things that lots of people (particularly fellow geeks) are interested in: software, OS issues, MP3s, goat pornography, and Mahir Cagri.

But what if I'm looking for something specific? The web has been nearly useless to me when I wanted to find information on ancient illuminated Arabic text, or pictures of Microsoft Bob in action (for a parody).

So do "robust hyperlinks" help me or hurt me? Say I get a dog who has certain unsavory habits with regards to my cats, and I want to look up links about "interspecific coprophagia". Also assume for a moment that the next Korn clone band names themselves "coprophagia". Good search engines allow me to exclude entries that have certain words, but what happens when "robust hyperlinks"-based software assures me that http://www.coprophagiaonline.com/new_releases/ive_got_the_word_yo.asp is a document on canine interspecific coprophagia based on the presence of several uncommon words...

...are we just using new technology to make search engines even more frustratingly inaccurate?

lexical-signature="sex+mp3+porn+alissa%20milano+beanie%20baby+jesus%20christ+coprophagia+free%20pics+online%20investing

404 Error will never die... by ruhk · 2000-03-01 23:08 · Score: 1

... Not so long as I own the domain name! *muahahahahahahahahhaaaaaaa*

Then again, the domain name just won't be funny anymore if 404 Errors go away. *sigh*

--Ruhk

--

404 Error: .sig not found.

More Porn. by Chilles · 2000-03-01 20:54 · Score: 1

I can see it now.
Porn sites start copying the five words of large portal and news sites and in the event of a 404 for one of those sites you automatically get redirected to the site you really "wanted" to visit anyway.

Anybody know if this is going to be an actual standard or just something useful until a new truly robust addressing system gets adopted? It might be on the site but that's sort of unreachable right now.

Great idea: leave forwarding message by kimihia · 2000-03-02 05:41 · Score: 1

I've just had a really great idea!

When you desert one host or modify your site, why don't you leave forwarding messages (or 302 responses) to tell people where to find your new content?

How's that for a great idea?

Berkeley? Pick up the cloo phone, it's for you! by gblues · 2000-03-02 02:10 · Score: 1

Oh GREAT. Just what we need--so much for the whole "you can't 'accidentally' find porn on the Internet" argument. This just throws that out the window, because all a porn site needs to do is hijack the right search keywords and wait for cnn.com to have a broken link.. *poof* millions of users get sent to porn.

Not only that, but it makes site debugging a pain in the ass.

Thanks Berkeley!

we can only pray... by geeKing · 2000-03-02 05:24 · Score: 1

...just what it says... We can only pray. I hate 404's and i am assuming you do too. I hope with all my heart and soul this works...

--
"As many of you know, I was very instrumental in the founding of the Internet" --Al Gore to Katie Couric 3/99

Re:404 Gallery by dingbat_hp · 2000-03-01 21:31 · Score: 1

You like 404s ? Try this one: http://www.g-wizz.net/wibblewibblewibble.swf.

Yes, that file extension is a hint...

This is old news... by athmanb · 2000-03-02 05:45 · Score: 1

I think it was back in 1995 when I saw a warez page (on Geocities) which used a feature like that.
If a visitor couldn't reach the site because Geocities had taken it down, he just needed to feed "paer9udtzk6gn8modfi" (paraphrased, of course) into Altavista to be pointed to the new location.

Re:Not unlike Freenet by stiefvater · 2000-03-03 00:49 · Score: 1

WOW.

This is a fantastically great idea.

How long before we get URLS like freenet://contraband_information.html ?

-k

Re:Not unlike Freenet by RickHunter · 2000-03-02 04:03 · Score: 1

I took a look at this, and it looks quite neat. If Freenet manages to get this right, I hope it really takes off. I especially like the idea of not having to dole out tons of cash or make do with a free web service in order to get something published.

-RickHunter
--"We are gray. We stand between the candle and the star."
--Gray council, Babylon 5.

the true dead links by yuval · 2000-03-01 21:58 · Score: 1

by far the largest number of problems i have with chasing information down (information that was not removed, but simply moved to another location) is because it has been moved OFF the world wide web and into the INVISIBLE WEB, meaning that it is accessible through a query to some database. the thing is, that the final location of these content pieces is generally known in advance to the site that is hosting them - and then the easiest way for users to relocate content would be to attach to it tags that define its location as a function of time.

Re:Off topic, but interesting. by GossG · 2000-03-02 02:42 · Score: 1

"Live free or Die" - Ironically, seen on a license plate.

It's worse than that. That state tried to penalize someone for covering the slogan. When someone tried to exercise his freedom of (non)speech by putting electrical tape over the slogan, the state took him to court. I seem to recall the case going on for a long while through several appeal processes where the state tried to force people to spout slogans about freedom. The irony was apparently completely lost on the bureaucrats enforcing the slogan.

Re:Not unlike Freenet by GossG · 2000-03-02 04:01 · Score: 1

I'll re-iterate what the AC said, only without the flamebait.

FREENET is already a widespread term, referring to MANY local public-access community supported ISPs. A quick lookup gives 16 countries with 233 separate groups.

It is unfortunate that nobody told you of the name overlap before this, but using "freenet" for your web will only generate anger among people already familiar with the community free ISP usage.

Hmmmm - Is it possible that the socialists (free public access to whatever) and the libertarians (Where were you when they took our freedoms?) have really never heard of each other's Freenet until now? I'm only familiar with the ISP usage, where it is

pr0n by DrEldarion · 2000-03-01 23:16 · Score: 1

Does anyone else think that this will just be another way for people to trap you into looking at their shitty pr0n sites?

-- Dr. Eldarion --

Maybe I'm missing something but... by aunitt · 2000-03-01 21:53 · Score: 1

If you are relying on a search engine to "reconnect" the link you are going to have problems.

Even the best search engines only index a small percentage of the entire web and then they are hideously out of date.

Not to mention the problems of someone hijacking your unique id by stuffing the search engine with bogus words.

(Disclaimer - I haven't read the actual article due to it being /.ed so I probably am missing the point entirely!)

Cool by code0 · 2000-03-01 20:41 · Score: 1

This would be cool. No more 404s! That's the problem with the web. I also like the idea that this is being put in open source so that we can all benefit. At least it isn't Microsoft...

--
---------- I laugh at a dumb SysAdmin.

Unique tokens, or fetch by MD5 by shalunov · 2000-03-02 00:25 · Score: 1

Well, it would be much easier to include a token somewhere (e.g., in a comment) that would be unique to this page. A randomly generated string of 20 ASCII characters would do the job.

But this is prone to the same hijacking attack as the original scheme.

A much better solution would be to fetch by MD5: teach search engines to compute the MD5 sum of every document they index, then include the MD5 sum somewhere in the URL.

That would also allow for better caching!
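For what it's worth, here is a minimal sketch of the fetch-by-MD5 idea in Python. The `ContentIndex` class is a hypothetical in-memory stand-in for a search engine, and the method names and example URLs are assumptions made up for illustration, not any real search engine's API.

```python
import hashlib
from typing import Dict, List, Set


def md5_hex(document: bytes) -> str:
    """Return the hex MD5 digest of a document's bytes."""
    return hashlib.md5(document).hexdigest()


class ContentIndex:
    """Hypothetical stand-in for a search engine that indexes pages by MD5."""

    def __init__(self) -> None:
        # digest -> every URL where that exact content was seen
        self._locations: Dict[str, Set[str]] = {}

    def crawl(self, url: str, document: bytes) -> str:
        """Record that `url` serves `document`; return the digest."""
        digest = md5_hex(document)
        self._locations.setdefault(digest, set()).add(url)
        return digest

    def locate(self, digest: str) -> List[str]:
        """Return all known URLs for the content with this digest."""
        return sorted(self._locations.get(digest, set()))


def is_expected_content(digest: str, document: bytes) -> bool:
    """Client-side check: does the fetched page match the digest in the link?"""
    return md5_hex(document) == digest
```

A link would carry the digest, and a client that hits a 404 would call `locate(digest)` and then confirm the result with `is_expected_content`. That re-check is what blunts keyword-style hijacking. The catch is that any edit to the page, however small, changes the digest, so this only finds byte-identical copies.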

--

-- Stanislav Shalunov

robust link apache mod by joltrushsoon · 2000-03-01 20:52 · Score: 1

how about an apache mod that automatically checks the urls as they are sent and changes them. then there'd be no need for any browser modifications.
in fact, it wouldn't have to be an apache mod - any kind of executable that could be cron'd to check links every so often could have the same effect.
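
a rough sketch of the cron'd executable idea, in Python. the `resolve` callback is a hypothetical hook: a real one would presumably issue HTTP requests and follow 301 responses, but here it is just any function that maps an old URL to its new location (or None if nothing is known). the regex only handles double-quoted href attributes, which is an assumption, not a full HTML parser.

```python
import re
from typing import Callable, Optional

# Matches href="..." attributes with double quotes only (a simplification).
HREF_RE = re.compile(r'href="([^"]+)"')


def rewrite_links(html: str, resolve: Callable[[str], Optional[str]]) -> str:
    """Return `html` with every moved link target replaced.

    `resolve(url)` returns the new URL for a moved page, or None when the
    link should be left alone.
    """

    def fix(match):
        old_url = match.group(1)
        new_url = resolve(old_url)
        return 'href="{}"'.format(new_url if new_url else old_url)

    return HREF_RE.sub(fix, html)
```

run over a site's files from cron with a suitable `resolve`, this would keep outgoing links current without touching the browser.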

i'm not sure how this would fit in with the whole signature thing. i suppose we could just pgp sign our web pages and put the signature in comments.

but as with most of my ideas, someone's probably already coded this.

Re:Won't work with Linux -- WRONG!!! by Weberik · 2000-03-02 00:59 · Score: 1

I went through the entire site, including the white papers. I looked at the actual Java code. Not a lick of ActiveX anywhere. Whoever posted this anonymously is either smoking crack, working for M$, or both. Robust Hyperlinks is pure Java.

This is very important for a few reasons. by dougman · 2000-03-01 20:58 · Score: 2

I'll explain the 2 that come to mind right away:

1) Growing sites that may change servers, or domain names (add/on to dedicated URL, change domain name for legal/incorporation/buyout reasons), will see the massive traffic bleed they suffer until everyone realizes their site has changed virtually disappear. Yes, putting a redirect page on your "old home" may help, but for things like RSS file addresses, and other external connectors, which may have an effect on your site, this is a problem.

Ultimately, of course, for this to TRULY work there needs to be technology like this built into not only browsers, but virtually any software that uses HTTP communication (XML parsers, bots, spiders, etc).

2) I want to start offering streaming video on my site, and the single biggest obstacle for doing that is COST. Bandwidth, unless you OWN the pipe, is NOT cheap. I can (albeit in a somewhat underhanded fashion) set up a script to register, say, 24 different "free site" pages with the content to be the "correct" version of my page once an hour, and, unless the content is in VERY heavy demand, essentially have a free method of streaming video on my site.

Egads, I'm already feeling dirty about what I just said. Okay, maybe that's a little TOO unethical. But I guarantee someone will do it.

Sounds very iffy to me by Masem · 2000-03-01 20:47 · Score: 2

First, I did try to access the link in the article, but the berkeley server appears to be down or slow.

That said, the concept seems iffy. Based on the above, the fact that it works in all existing browsers, suggests to me that the form of the URL is the following:

<a href="http://robusturl.server.com?http://my.outdatedsite.com&keyword1=whatever">

Namely, that anchors that use this URL will be sent to this server (apparently fixed in place), then redirected either to the working page, or to the appropriate search engine results. This means that the robust server will be running scripts. While I don't believe that the intent as described here would be to catalog all matches, all you need is one unscrupulous company that uses this and can now trace where you are and where you are going to quite easily with a bit of modification. I really don't like this potential, and personally I'll take a 404 any day over potential privacy problems.

On the other hand, it might be that the method uses Javascript, but at which point this nulls and voids any statement on "working on all existing browsers".

--
"Pinky, you've left the lens cap of your mind on again." - P&TB
"I can see my house from here!" - ST:

Re:Sounds very iffy to me by ehiggins · 2000-03-01 22:53 · Score: 2

Ummm, .jar files do *NOT* indicate JavaScript.

Java != JavaScript, people!

--Earl

Re:Sounds very iffy to me by spiralx · 2000-03-01 20:52 · Score: 2

On the other hand, it might be that the method uses Javascript, but at which point this nulls and voids any statement on "working on all existing browsers".

From freshmeat you can see that the appropriate file for it is called Robust.jar, so I think you're probably correct there :)

Wasn't this what URI's were supposed to address? by X · 2000-03-01 21:07 · Score: 2

I'm pretty sure URLs were just a makeshift URI and some day the IETF was going to figure out how to do URIs right. Am I wrong?

--
sigs are a waste of space

Re:Not unlike Freenet by Sanity · 2000-03-02 08:24 · Score: 2

This has been discussed to death on our mailing lists. Basically our view is that if Freenet is as popular as we hope it will be, then "Freenet" is the perfect term for it, it is possibly more deserving of the term than the other projects which currently use it. If, on the other hand, Freenet is not a success, then this won't affect anyone and it won't matter.

--

404 Gallery by dattaway · 2000-03-01 20:52 · Score: 2

Some 404's are just a way to pass time. Sometimes I go from site to site looking for pages that don't exist just to see what happens.

Re:404 Gallery by kwsNI · 2000-03-01 20:59 · Score: 2

Yeah, like Userfriendly. I love their 404.
You're in the midst of nowhere
a droplet in a mist,
you musta typed in something weird
this URL, it don't exist.

kwsNI

reinventing the wheel... by cabbey · 2000-03-02 04:47 · Score: 2

...poorly.

anyone who's looked at the http spec for more than a millisecond will see that it already handles this case quite gracefully with the 3xx series of responses, including:

301 Moved permanently
302 Moved Temporarily

I think /. even uses these once a story has been archived.
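
To make the point concrete, here is a small Python sketch of how a server could choose between those responses. The `respond` function and its lookup tables are hypothetical names for illustration; only the status codes come from the HTTP spec (HTTP/1.1 renamed 302 from "Moved Temporarily" to "Found", which is why the constant below is `HTTPStatus.FOUND`).

```python
from http import HTTPStatus
from typing import Dict, Optional, Set, Tuple


def respond(
    path: str,
    pages: Set[str],
    moved_permanently: Dict[str, str],
    moved_temporarily: Dict[str, str],
) -> Tuple[HTTPStatus, Optional[str]]:
    """Pick a status code and optional Location header value for `path`."""
    if path in pages:
        return HTTPStatus.OK, None
    if path in moved_permanently:
        # 301: clients and search engines should update their links.
        return HTTPStatus.MOVED_PERMANENTLY, moved_permanently[path]
    if path in moved_temporarily:
        # 302: keep using the old URL; it is only elsewhere for now.
        return HTTPStatus.FOUND, moved_temporarily[path]
    # 404: the server has no record of what happened to the resource.
    return HTTPStatus.NOT_FOUND, None
```

The only thing the server needs is a record of where things went; the protocol side is already there.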

Re:Hijacking redirectors ?? by UncleRoger · 2000-03-02 13:19 · Score: 2

A very valid point:

Will this still work even if someone tries to add lots of context words to the search engines so it comes to their page instead?

Perhaps one of the keywords should be the previous URL? In fact, perhaps a better solution would be a new Meta tag of "Prev-URL" (or something similar) that search engines could look at and use to update their databases?

On an anecdotal note (or is that redundant?), I remember searching once for the web site of a Land Rover owners club (I think it was Ottawa Valley Land Rovers in Canada) and was directed to an auto parts store in Australia -- turned out that the web pages had the names of lots of auto clubs in meta tags. The idea was to get people searching for the clubs to go to the store's site.

--
Stupid people will be persecuted to the fullest extent allowed by law.

smart and dumb by josepha48 · 2000-03-02 00:43 · Score: 2

I guess for those of us who don't want to make that move just yet, we can have our 404 document, say, "Sorry I am just a dumb server and don't know where the page has gone." Come back later when I get smarter.

send flames > /dev/null

--

Only 'flamers' flame!

Good Idea but 90% of 404's are deleted pages by bug_hunter · 2000-03-01 22:24 · Score: 2

This sounds like a good idea but you'll still see plenty of 404s if this gets into action.
Why? Because 90% of 404's are a result of the page being taken down completely (especially if it's on geocities or xoom or some free provider).

A program that you could install for your browser like NetAccelerate (loads links off current page into cache when the bandwidth isn't being used) but simply loads the links far enough to detect a broken link or not would be very handy. Although it wouldn't solve any problems it would at least stop you from getting your hopes up when you've finally found a link to a page that claims to be what you've been searching for for an hour.

--
It's turtles all the way down.

Nice idea, shame about the... by WhyteRabbyt · 2000-03-01 22:52 · Score: 2

<ASSUMPTION>The 'word description' is going to be capable of describing a page adequately, and uniquely, per page, like an MD5 digest, rather than a simple text descriptor. The latter would just be silly.</ASSUMPTION>

I can see some value to this if the page is static and likely to be relocated, rather than rewritten or deleted, but how is this going to work if the page is dynamically generated from a database, and the whole site is prone to reorganisation (like what Microsoft's seems to be)?

It might help more if there was a way to uniquely identify snippets of content within a page, and provide a universal look-up scheme based on unique fingerprints of these 'snippets'. Although I'm sure that puts it straight into XPointers territory, isn't it...?

And an 'opt-out' system is necessary. There are lots of reasons one might want particular content to be transient.

--
free experimental electronic music netlabel at www.viablehybrid.com

Re:The real solution ... by HalJohnson · 2000-03-03 16:55 · Score: 2

Yes, but that's only one side of it, the pull side. Eventually systems will evolve to the point where a push model exists alongside the pull model for robustness. Unfortunately data structures change, companies reorganize, and no type of pointer will really ever suffice. It will have to change at some point. The robustness of a push model will facilitate these scenarios. It's not a question of if, it will happen, eventually.

The real solution ... by HalJohnson · 2000-03-02 03:15 · Score: 2

And the logical next step is inter-server communication. At some point we'll end up with a defined way for servers to communicate with each other, so that when an object is moved or removed, the server that "owns" that object can notify other servers that own objects with links to it. The worst case would be better than what we have now: if the object has been removed, the other server could mark it as unavailable and notify the site owner that it needs to be updated. Some site management utils already have a process for checking broken links (pull model); we need a push model.
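
A toy sketch of what such a push model might look like, in Python. Everything here is an assumption for illustration: `OwnerServer`, `register_link`, and the callback signature are made-up names, and the "network" is just in-process function calls rather than any defined inter-server protocol.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List, Optional

# Called with (old_url, new_url); new_url is None when the object was removed.
LinkCallback = Callable[[str, Optional[str]], None]


class OwnerServer:
    """Hypothetical server that notifies linkers when its objects move."""

    def __init__(self) -> None:
        self._linkers: DefaultDict[str, List[LinkCallback]] = defaultdict(list)

    def register_link(self, target_url: str, callback: LinkCallback) -> None:
        """A linking server asks to be told if `target_url` changes."""
        self._linkers[target_url].append(callback)

    def move(self, old_url: str, new_url: Optional[str]) -> None:
        """Move (or, with new_url=None, remove) an object and push the news."""
        callbacks = self._linkers.pop(old_url, [])
        for notify in callbacks:
            notify(old_url, new_url)
        if new_url is not None:
            # Linkers keep following the object at its new location.
            self._linkers[new_url].extend(callbacks)
```

A server that links without registering simply gets no updates, which matches the "link in private" case.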

This will also allow site owners to see who's linking to them, but obviously it should be utterly transparent (so that you can still link in private, but then you wouldn't get updates).

At some point we'll get there, it's just a matter of time. Questionable schemes such as the topic of this story are just a kludge, and probably not worth the effort.

Damn! by PacketOfCrisps · 2000-03-01 20:46 · Score: 2

I am getting a 404 not found on the site's homepage.

PoC

It's down already by spiralx · 2000-03-01 20:43 · Score: 2

Well it sounds like an interesting concept but unfortunately I can't get to the site already. Surely it's too soon for the /. effect?

We need URLs first by dingbat_hp · 2000-03-01 21:26 · Score: 2

This sounds great - practical solutions to a real problem.

OTOH, there are already far too many sites where there just isn't an accessible URL anyway. Some are frame-based, some are dynamically generated. They all have the problem of not being bookmarkable (from within the browser's normal "Bookmark Here" function). Some do try to solve this though, by separately publishing a bookmark that will take you back to the same content.

If this idea is to really work, then it needs to be supported by dynamic sites publishing their Robust Hyperlinks, even for pages that don't have a "traditional" URL to begin with.

Re:Wasn't this what URI's were supposed to address by Shimbo · 2000-03-02 01:29 · Score: 2

There is a good paper by the man himself on the problem of URL persistence.

Definitely a heads-up for anyone looking for a quick technical fix to the problem.

Here's another way to do it by hoss10 · 2000-03-01 20:54 · Score: 2

Simply having a search string included seems a bit of a kludge to me.
What about if the link tag in the html also contained the date/time it was created? This way the browser would know how old it was. If the browser sent this to the server as a header, then if the server couldn't find the page it could check some database or whatever to see what the directory structure was like at that time and work out what redirect to use. If bookmarks also contained this date/time then surely the server could tell the browser to update the bookmark (after warning the user, of course).

This would be pretty cool on an interactive site where the server could rearrange query strings or whatever if the serverside scripting had been given a big overhaul/re-organization.

Basically, surely the server itself, and not some search engine, would best know how to fix a broken link, and it would only require a couple of new headers and should be easy to implement, at least on the client side.
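
A quick Python sketch of the server-side lookup, assuming the server keeps a dated log of every move. The `resolve_dated_link` function and the `(date, old_path, new_path)` log format are hypothetical; no such header or log exists in HTTP today.

```python
from datetime import date
from typing import List, Tuple

# One entry per reorganisation: (when it happened, old path, new path).
Move = Tuple[date, str, str]


def resolve_dated_link(path: str, link_created: date, history: List[Move]) -> str:
    """Follow every move made after the link was created.

    Returns the path where the content lives now, or `path` unchanged if
    no later move applies to it.
    """
    current = path
    for moved_on, old_path, new_path in sorted(history):
        if moved_on > link_created and old_path == current:
            current = new_path
    return current
```

The link's creation date matters because the same path can mean different things over time: a link made after a reorganisation should not be pushed through moves that happened before it existed.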

------------------------------------------------ -
"If I can shoot rabbits then I can shoot fascists" -

thoughts by bons · 2000-03-02 01:28 · Score: 2

The situation:

My page has been moved for some reason or another.
The old page no longer exists at all, i.e. I don't have a redirect on it. (side note, surprisingly enough, many providers will be happy to keep your redirects around for an almost infinite length of time. It's not like they take up a lot of space or bandwidth.)
I built the first page with a specific set of keywords and I kept those keywords on the new page
The search engines FINALLY got around to spidering/accepting my site. (Note that it can currently take up to 6 months to be spidered and Yahoo may not reaccept your site.)

And this allows us what?

Well, it means we have to make sure we register with all the possible search engines, including the ones we usually don't care about.
It means someone will come up with a "find that 404" search engine that you'll have to submit to as well.
Meanwhile, people will notice that you've moved and will create redirect porn pages with your keywords and register them with the 404 search engine.
Microsoft will add something to Front page to create default keywords that send your 404 to microsoft.com
The new standards are not part of the official Web Standards so Mozilla will not support it and w3.org will barf errors out about your HTML code.
Someone will figure out how to use this technology so that they can set up emergency /. effect mirror sites.
Someone will get smart and figure that trick out really quick and take advantage of it. "I'm sorry, the page you want has been slashdotted, welcome to geocities."

-----

--

No Zen is good zen

Alexa's solution to "404 errors" by Animats · 2000-03-02 01:41 · Score: 2

Alexa has had a solution to 404 errors for years. They have a large archive of the web, and will give you a copy of a deleted page. Unfortunately, the Alexa client has ballooned into a combination advertising delivery system and portal. They're just now adding Amazon's shopping system. It's turning into a piece of bloatware.

Alexa also collects detailed information about what you look at with your browser, although they of course claim to use it only in the aggregate.

I see a problem with this... by Megane · 2000-03-01 22:43 · Score: 2

This makes one big whopper of an assumption: that the web page has moved and still exists somewhere. Well, the major cause of 404s that I know of is web sites simply going away.

So you get a 404 and you want to use a search site to find where it went? That's fine if it's been long enough since the move to give the web crawlers time to find it... there's a lot of web space out there to search!

But here's the good one: what if someone decides to hijack your web site by simple keyword spamming? All they have to do is set up their own page with the right keywords, get it indexed, and anyone who uses an "old" link will get redirected to them instead! And if web pages can be defaced, they can be removed, too, thus forcing the 404 and the search!

Better yet, use wholesale keyword spamming to get all those "dead" web pages pointing to your e-commerce site!

--
#naabhaprzrag, #sverubfr-000, #agi-fcbafberq, negvpyr[pynff*=' negvpyr-ary-'] { qvfcynl: abar !vzcbegnag; }

There's always a "but." by UncleOzzy · 2000-03-01 20:56 · Score: 2

... as in, "It's a good idea, but!" As has been pointed out, there are potential privacy issues. For the "average" user, though, I don't think this is a terribly big deal. What becomes a problem, then, is access to the Robust URL redirector (as I understand it from posts, the site seems to either be simply down, or a victim of the /. effect). Since all Robust URLs have to pass through the redirector, what happens if the redirector is down? What happens if the redirector is unreachable?

Furthermore, simply feeding keywords to a search engine doesn't guarantee finding your page quickly, or even finding it at all. Designers would have to include unique keywords - words that might not even apply to their page - so that a Robust URL search would turn up only their page. Not only does this bloat HTML code, but it also confuses people using search engines in the usual way.

Certainly a good idea, as many people hate 404s (bah, they're just a fact of life), but it seems like it's got more than a few bugs left in it.

Not unlike Freenet by Sanity · 2000-03-01 20:58 · Score: 3

I am working on a project that will do something like this - and a whole lot more. The primary intention is to create an information publication system similar to the world wide web, but where censorship is much more difficult or impossible. However there is more to the system than that, it incorporates intelligent decentralised caching making it much more efficient than the world wide web, and also intelligent mirroring meaning that information on the system will never be slashdotted as this site appears to be! The homepage may be found at http://freenet.sourceforge.net/. We are looking for testers and developers right now in preparation for our first release which will happen in the next few weeks.

--

Re:Wasn't this what URI's were supposed to address by SimonK · 2000-03-01 21:26 · Score: 3

You're not wrong. There is in fact a proposal about the form and resolution of URNs (which are location independent) from the IETF. I don't know its status.

Dynamic content by Hard_Code · 2000-03-02 00:37 · Score: 3

As far as I can tell this scheme relies on checksums of the static content of web pages to find the correct web page. So what does this do to dynamically generated content?

Also, somebody else mentioned that they had a project on SourceForge which was basically like the Web, but in a completely distributed manner. This makes a lot more sense to me. The notion that my bits must cross a continent to retrieve data on a certain TOPIC seems a bit archaic. I shouldn't know or care where the data of the topic is stored...I just want it. Also, having a distributed web like this, as the person suggests, will make it a lot harder to invade privacy or censor material.

--

It's 10 PM. Do you know if you're un-American?

Hijacking redirectors ?? by UnknownSoldier · 2000-03-01 20:46 · Score: 3

Will this still work even if someone tries to add lots of context words to the search engines so it comes to their page instead?

Don't mean to be the Devil's Advocate, it is just my game programming / design skills kicking in. Whenever someone adds a useful feature, you must look at the ways people will try to exploit this.

"Live free or Die" - Ironically, seen on a license plate.

Replacing a broken link with a Google search? No. by rambone · 2000-03-01 21:32 · Score: 3

Like any search, the search that tries to reunite your 404 error with the correct address is going to be wrong quite often.

Frankly, I'd rather just get the 404 than waste time digging through erroneous links.

By the way, there are hypertext systems that address this issue in ways that actually solve the problem - the now defunct HyperG system was very intelligent about redirecting requests.

Try ftp'ing instead by EricWright · 2000-03-01 20:46 · Score: 5

From the freshmeat announcement, you can ftp it from here. I was able to connect just fine...

Eric

Slashdot Mirror

Robust Hyperlinks: The End of 404s?

105 comments