Can rev="canonical" Replace URL-Shortening Services?
Chris Shiflett writes "There's a new proposal ('URL shortening that doesn't hurt the Internet') floating around for using rev="canonical" to help put a stop to the URL-shortening madness. In order to avoid the great linkrot apocalypse, we can opt to specify short URLs for our own pages, so that compliant services (adoption is still low, because the idea is pretty fresh) will use our short URLs instead of TinyURL.com (or some other third-party alternative) replacements."
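For readers who have not seen the markup: under the proposal, a page advertises its own short URL with a link element such as `<link rev="canonical" href="...">`. Below is a minimal sketch of how a compliant service might discover that URL, using only the Python standard library. The names `RevCanonicalFinder` and `find_short_url` and the example.org URLs are hypothetical, and only `<link>` elements are checked here, which is a simplifying assumption.

```python
from html.parser import HTMLParser


class RevCanonicalFinder(HTMLParser):
    """Collects href values of <link> elements whose rev attribute contains "canonical"."""

    def __init__(self):
        super().__init__()
        self.short_urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr_map = dict(attrs)
        # rev may hold several space-separated tokens, so split before comparing.
        rev_tokens = (attr_map.get("rev") or "").lower().split()
        href = attr_map.get("href")
        if "canonical" in rev_tokens and href:
            self.short_urls.append(href)


def find_short_url(html_text):
    """Return the first advertised short URL, or None if the page offers none."""
    finder = RevCanonicalFinder()
    finder.feed(html_text)
    return finder.short_urls[0] if finder.short_urls else None


# Hypothetical page that advertises its own short URL.
page = '<html><head><link rev="canonical" href="http://example.org/s/526"></head></html>'
print(find_short_url(page))
```

A service that finds nothing would presumably fall back to its usual third-party shortener.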
I read the first link, and it sounds like complete and total batshit paranoia. I can't be alone in this opinion. Really, tinyurl has been around the entire 11+ years I've been on the internet, and somehow the internet's survived just fine.
tag:slownewsday anyone?
What value are these new URLs if they aren't cute?!?
Seth
$5 / month hosted VPS on linux = awesome!
I didn't understand a single word of the submission, and I used to teach Web design. Is it too much to ask submitters to define terms they use?
how about we just kill all twitter users instead?
http://developers.slashdot.org/comments.pl?threshold=2&mode=thread&commentsort=3&sid=1196477
Best Slashdot Co
There's also hiding links to shock sites.
For anything that isn't electronic, a shortened URL leads to fewer mistakes. For example, example.com/typeskjd583 is going to be typed more accurately than somesite.org/wiki/index/cool_tips/code/perl/hello_world.php . A lot of people, when they see a site in print, mentally change it around, so somesite.org/wiki/index/cool_tips/code/perl/hello_world.php might become somesite.com/wiki/index/cool_tips/code/perl/hello_world.php . The shortened URL protects against this because people aren't trying to convert it to words and then type it. For example, something written as "Gray" may be mentally changed by someone to "Grey" because when they say the word "Gray" in their heads, they see it written as "Grey".
It's like typing in software serial numbers compared to cheat codes in old-school video games. The serial numbers are abstract, so the letters in them are simply letters, whereas a cheat code may spell part of some word; if someone frequently misspells that word (or the code is itself a misspelling of a word), it may be harder to enter.
Taxation is legalized theft, no more, no less.
Isn't the limit mainly there because of its use of SMS?
Absolute power corrupts absolutely. indymedia
Yes, TinyURL hasn't killed anyone. BUT... any attempt to fix this is entirely missing the point anyway. From the article:
If they fix twitter to support links with proper labels or tag contents --- Oh, I don't know, like HTML has supported from the very beginning --- then there wouldn't be a problem.
Don't work around the bugs, fix the bugs. Links are designed for machines, the higher-level marked up text is for people.
Twitter is essentially an SMS aggregation and redistribution tool. An SMS payload is limited to 140 bytes (160 seven-bit characters), which is where Twitter's limit comes from. I do not think you understand the meaning of the word "arbitrary".
Linkrot happens when a URL shortening site (such as tinyurl) is pulled offline. Billions of dead links is not good.
On the Twitter /. feed, this of course shows as:
slashdot Can rev="canonical" Replace URL-Shortening Services? http://tinyurl.com/c3j4n8
P.S. Now if you want a really short URL, try http://tinyarro.ws/ (no affiliation; just impressed by the idea)
somesite.org/wiki/index/cool_tips/code/perl/hello_world.php
That's just wrong.
512 MB RAM, 20 GB disk, 200 GB transfer, five datacenters. $19.95/month.
This is a phone-related problem. The basic problem is that URLs are being sent to devices that don't cut, paste, and bookmark. This is only an issue if you have to type the URL manually.
Maybe what's needed are smarter Twitter clients.
How about Twitter just stops arbitrarily limiting characters. Go by word count, perhaps?
I know some avid twitter users, and the majority of them apparently use the idiotic SMS message system to 'tweet' each other all throughout the day on their phones. Twitter can't abandon the 140-character limit for this reason.
For the record, I am against anything that keeps the SMS system relevant in this day and age. It should have been abandoned long ago in favor of standard data packets on the internet, rather than control packets on a proprietary wireless system. There's no good reason to keep this system alive when it either forces you to pay $X per month for it, or pay $.15 per 140 characters when one of your idiot friends 'texts' you. There's no way (that I know of) to force incoming SMS to route through GPRS, so you are hit with SMS fees even when you already pay for unlimited data. It also invites spam that you actually DO pay for, quite literally, and from which the wireless carrier profits as well. It should be illegal for the carrier to charge you for incoming SMS messages. Anyone who agrees with me should call their congressperson to protest this policy and call their wireless carrier to block all SMS messages.
idontthinkthatwillworkverywell.
Life is like a web application. Sometimes you need cookies just to get by.
Instead of using a plethora of different URL shortening services, any of which might disappear at some point in the future, Twitter should implement its own URL shortening service (using, say, the domain http://tw.it/ or similar) and thereby shorten any URLs that Twitter users post. Assuming the Twitter team can manage this (given their track record with things like message queues, however...) then there would be no possibility of linkrot.
Unless you're using shortened URLs somewhere besides Twitter, of course. But why on Earth would you do that?
There's all this talk of URL shortening services - whether third-party, or in-house implementation.
The question here is this: Why are the URLs so long to begin with?
Why does it have to be:
http://shiflett.org/blog/2009/apr/save-the-internet-with-rev-canonical
A full title in the URL is, IMHO, a very inefficient idea. The excuses I've heard are:
Search Engine Optimizations (better performance when keywords are in the URL)
Okay, I can't argue that some search engines do stuff like that. But shouldn't the TITLE or META tags have more bearing on this than how ridiculously long the URL is?
"The URL has meaning, so you know what you're clicking", Context, etc.
I suppose that when I see a URL like
http://shiflett.org/blog/2009/apr/save-the-internet-with-rev-canonical
as opposed to something like
http://example.org/blog/526
I would have a slightly better idea of the article's content before clicking on it. But then again, I can't really say that I've decided against clicking on a link just because of the link URL. I would, instead, decide whether I'd want to visit the link by its link text/description.
So <a href="http://example.org/blog/526">blog on link shortening</a> would still have the same effect on me as a long URL IMO. If it were bookmarked, the same rules would apply.
Hell, if I were handed an obfuscated shortened URL without context, I'd know even less of what I was getting myself into.
I think the proper solution is to just stop making ridiculously long URLs to begin with, so we don't have to rely on obfuscation/hashing/shortening to accommodate services that have character limit restrictions. And we'd save bandwidth too, apparently. Win-win?
Direct link to the revcanonical website. It really is "rev" rather than "rel"; evidently this attribute is an HTML 5 proposal which hasn't been accepted, or so it says at http://benramsey.com/archives/a-revcanonical-rebuttal/
Here's the thing: it's not just the path that is the problem, it's also the domain name. You can shorten "/blog/2009/apr/save-the-internet-with-rev-canonical" to "/abc123", but if your domain name is something plus-sized like "rickosborne.org" or worse ... how much have you really gained?
It's a little helpful, but not really. What you've done is remove the little bit of semantic meaning from the link, all in the name of being able to ego surf easier. Huzzah.
LOL! Only in America, the free market bastion of the world, do you have to pay for incoming texts.
Free Manning, jail Obama.
"Because bigger is better, right?" http://www.hugeurl.com/
1999 called, it wants its charges back.
People pay for SMS in your country? Here even pay and go plans have unlimited SMS bundles.
And I can't even parse this statement.. "or pay $.15 per 140 characters when one of your idiot friends 'texts' you"
How can your friends make you pay for SMS? Do you have some way of sending bills over it or something?
All this short URL stuff sounds like some phishing scam if you ask me. Short cryptic URLs obviously exist to make me transpose a couple of letters or numbers and end up at some fake bank site. No, give me large detailed URLs so I can see those dead giveaways like pid=poor_sucker&sid=steal_credit_card_info !
Short URLs indeed... no thank you Nigerian scammers... I won't be transferring any large sums today!
On a serious note, why is this news exactly?
In the US, if someone sends you a text message, you have to pay for it, and if you don't have a plan each text typically costs ~$0.15
Unfortunately, it's not yet an integral part of web frameworks that I have seen. So I am adding it in a new web site I'm building. It means I have to add the feature to the web server.
It works like this. Every part of the web site code that builds URLs for the same site passes them first through the mapping logic. This basically builds an SHA1 checksum of the canonicalized URL string. Then it looks up the string in a fast database (I'll be using Berkeley DB for this). If it's already there, and is the same URL, it generates a new URL that references the checksum. If it was a different URL, it notifies me that it found an SHA1 collision. If not already there, it adds it. The original URL is thus replaced with the mapping URL.
Code added to the web server will be designed to detect checksum URLs. If it looks like one, it looks it up in the database to get the original URL, and proceeds with the request using that URL. Original URLs would still be processed as usual, in case they leak out, or are intentionally made to bypass the mapping for special purposes. Basically it's like a tiny URL service, but integrated without the need to do a redirect.
One thing I am looking at doing is shortening even these URLs, even though they should be short enough already. But this raises the chance for a collision to the point I'll need to add logic to deal with it. How I would do that is similar to a hash data structure collision, but by expanding on the SHA1 checksum by adding back digits that were removed to shorten it.
External URLs to other sites can be done the same way, though this does add the extra redirection. I could limit this to long external links only; since this is a web interface, it should handle long external links OK anyway. It could be an option.
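The mapping described above can be sketched roughly as follows. This is a minimal sketch, not the poster's code: a plain dict stands in for Berkeley DB, the canonicalization is a crude "lowercase the scheme and host, drop the fragment" step, and the names `canonicalize`, `UrlMapper`, `shorten`, `resolve`, and `min_len` are all made up for illustration. Collisions on the shortened key are handled the way the post suggests, by adding back digits of the SHA-1 checksum.

```python
import hashlib
from urllib.parse import urlsplit, urlunsplit


def canonicalize(url):
    # Crude canonical form (an assumption): lowercase scheme and host, drop the fragment.
    parts = urlsplit(url)
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), parts.path, parts.query, ""))


class UrlMapper:
    def __init__(self, min_len=8):
        self.min_len = min_len  # number of hex digits kept from the checksum
        self.table = {}         # short key -> canonical URL; stands in for Berkeley DB

    def shorten(self, url):
        canonical = canonicalize(url)
        digest = hashlib.sha1(canonical.encode("utf-8")).hexdigest()
        # Start with a short prefix; on a collision, add back one more digit and retry.
        for length in range(self.min_len, len(digest) + 1):
            key = digest[:length]
            stored = self.table.setdefault(key, canonical)
            if stored == canonical:
                return key
        # Only reachable if two different URLs share the full SHA-1 checksum.
        raise RuntimeError("full SHA-1 collision for " + canonical)

    def resolve(self, key):
        # The web server would call this when a request looks like a checksum URL.
        return self.table.get(key)


mapper = UrlMapper()
key = mapper.shorten("http://Example.org/wiki/index/cool_tips/code/perl/hello_world.php")
print(key, "->", mapper.resolve(key))
```

Because the key is derived from the URL itself, shortening the same URL twice yields the same key, so no counter or sequence needs to be stored.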
now we need to go OSS in diesel cars
Because of this:
http://www.google.com/search?hl=en&lr=&c2coff=1&rls=GGLG%2CGGLG%3A2005-26%2CGGLG%3Aen&q=http%3A%2F%2Fwww.google.com%2Fsearch%3Fhl%3Den%26lr%3D%26c2coff%3D1%26rls%3DGGLG%252CGGLG%253A2005-26%252CGGLG%253Aen%26q%3Dhttp%253A%252F%252Fwww.google.com%252Fsearch%253Fhl%253Den%2526lr%253D%2526c2coff%253D1%2526rls%253DGGLG%25252CGGLG%25253A2005-26%25252CGGLG%25253Aen%2526q%253Dhttp%25253A%25252F%25252Fwww.google.com%25252Fsearch%25253Fsourceid%25253Dnavclient%252526ie%25253DUTF-8%252526rls%25253DGGLG%25252CGGLG%25253A2005-26%25252CGGLG%25253Aen%252526q%25253Dhttp%2525253A%2525252F%2525252Fwww%2525252Egoogle%2525252Ecom%2525252Fsearch%2525253Fsourceid%2525253Dnavclient%25252526ie%2525253DUTF%2525252D8%25252526rls%2525253DGGLG%2525252CGGLG%2525253A2005%2525252D26%2525252CGGLG%2525253Aen%25252526q%2525253Dhttp%252525253A%252525252F%252525252Fuk2%252525252Emultimap%252525252Ecom%252525252Fmap%252525252Fbrowse%252525252Ecgi%252525253Fclient%252525253Dpublic%2525252526GridE%252525253D%252525252D0%252525252E12640%2525252526GridN%252525253D51%252525252E50860%2525252526lon%252525253D%252525252D0%252525252E12640%2525252526lat%252525253D51%252525252E50860%2525252526search%252525255Fresult%252525253DLondon%25252525252CGreater%252525252520London%2525252526db%252525253Dfreegaz%2525252526cidr%252525255Fclient%252525253Dnone%2525252526lang%252525253D%2525252526place%252525253DLondon%252525252CGreater%252525252BLondon%2525252526pc%252525253D%2525252526advanced%252525253D%2525252526client%252525253Dpublic%2525252526addr2%252525253D%2525252526quicksearch%252525253DLondon%2525252526addr3%252525253D%2525252526scale%252525253D100000%2525252526addr1%252525253D%2526btnG%253DSearch%26btnG%3DSearch&btnG=Search
My blog
A couple of good questions I have seen, and my best attempt to answer them:
1. Don't you mean rel? No, I mean rev. It indicates a reverse link.
2. Why not make your URLs short in the first place? I happen to like my URLs and have made them as short as I want them. They're only too long in some very specific use cases, like Twitter. I could just complain about Twitter, or I could support an idea that makes URL shortening suck less. I chose the latter.
Thanks for reading, and please do feel free to criticize whatever you think is wrong with this idea. I'd like a way to indicate a preferred short URL for my own stuff, and this seems like a pretty good way to do it that makes sense semantically and is easy to implement. For an ongoing discussion about adding an HTTP header to do the same thing (so that only a HEAD request is required), read here:
http://shiflett.org/blog/2009/apr/a-rev-canonical-http-header
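If the short URL were carried in an HTTP response header, a client would only need the headers, not the body. Here is a hedged sketch in Python of pulling it out of a header value. It assumes the header uses the standard `Link` header form with a `rev` parameter, which is an assumption made here for illustration and not necessarily what the linked discussion settled on. The function name and the example.org URLs are hypothetical, and the parsing is deliberately naive.

```python
import re


def short_url_from_link_header(header_value):
    """Return the target of the first link-value whose rev includes "canonical", else None."""
    # Naive split: assumes no commas inside URLs or quoted parameter values.
    for link_value in header_value.split(","):
        match = re.match(r'\s*<([^>]*)>(.*)', link_value)
        if not match:
            continue
        target, params = match.groups()
        # Look for a rev parameter, quoted or unquoted.
        rev = re.search(r';\s*rev\s*=\s*"?([^";]*)"?', params, re.IGNORECASE)
        if rev and "canonical" in rev.group(1).lower().split():
            return target
    return None


# Hypothetical header value, as it might come back from a HEAD request.
header = '<http://example.org/next>; rel="next", <http://example.org/s/526>; rev=canonical'
print(short_url_from_link_header(header))
```

The header value itself would come from whatever HTTP client is in use; fetching it is left out so the sketch stays free of network access.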
Or in print. Most people can manage xkcd.com/84 without writing it down.
US wireless carriers charge on both ends -- both the receiver AND the sender will pay the 15 cents per message, assuming neither one of them has an unlimited plan. I think this charge used to be 10 cents, but was raised to 15 cents last year. Or maybe it was 15 cents and was raised to 20 cents. I have no idea, but either way it is terrible. I think plans are typically $5/month for 200 'texts' or $15/month for unlimited.
And don't even get me started on MMS messages. I received my first MMS spam the other day. My first thought was "ooh, nice tits", but my second thought was "$#%&, I probably just got charged $3.00 for this spam!"
Yes, it looks hideously long. It also works fine, it's clickable, I really don't get the big deal.
Don't thank God, thank a doctor!
Surely the author of that rant knows about DNS caching ... your PC will only consult the name server for tinyurl, etc. once per day (if at all), depending on how many of those you click on.
...
... you still would have the 140 char limit.
And if you click on them rarely, the delay would be negligible, cos you only use them rarely
Plus this, interesting as it may be, still does not solve how to get a long url into a Tweet... it does not matter if Twitter can go look up the small URL on its own
Why do so many URLs look like RDBMS queries? Has someone been sold a bill of goods?
As for shorter URLs, they become much shorter minus the DB cruft. And then all it takes is a modicum of logic to form some durable system.
Some people cannot avoid flavor-of-the-month. Those people should not be making decisions with any sort of permanence or continuity.
digg
It wasn't even the DiggBar exactly. Gruber didn't like it for the obvious reasons (breaks bookmarks, history, hides the site, etc.) but mainly because the DiggBar was turned on by default for all users. Other sites have things like the DiggBar, but no one really complained about them because users had to turn them on themselves.
If he alone had not liked it you would not have seen the rush to block it from all quarters. I as a user despised it myself, and am happy to see all framing mechanisms die a horrible death.
Shortening services that use a redirect, he and others have no issue with.
"There is more worth loving than we have strength to love." - Brian Jay Stanley
No, rev was in previous versions of HTML, but was apparently dropped in HTML 5, probably because people didn't understand the difference between rev and rel.
rel="canonical" and rev="canonical" are different things
To get something done, a committee should consist of no more than three persons, two of them absent.