Broken Links No More?
johndoejersey writes "Students in England have developed a tool which could bring an end to broken links. Peridot, developed by UK intern students at IBM, scans company weblinks and replaces outdated information with other relevant documents and links. IBM have already filed two patents for the project. The students said Peridot could protect companies by spotting links to sites that have been removed, or which point to wholly unsuitable content. 'Peridot could lead to a world where there are no more broken links,' James Bell, computer science student at the University of Warwick, told BBC News Online. Here is another story on it." See also the BBC story.
Hang on. On similar lines, I've a great idea. Suppose I type a nonexistent hostname into my browser. Wouldn't it be good if the DNS server just gave me its best guess instead of an error message? Or some kind of Site Finding search engine. That'd be even better than
Athletic Scholarships to universities make as much sense as academic scholarships to sports teams.
Wouldn't this idea work a lot better with semantic web markup attached to links and also to intranet pages?
Agile Artisans
My biggest problem is when I follow a link to a website that's no longer there. Yeah, moved pages happen, but I don't think they happen as often as deleted pages, expired domains, deleted websites, etc.
Do not meddle in the affairs of sysadmins, for they are subtle, and quick to anger.
This sounds a little like SiteFinder from Verisign. Click a broken link and instead of a helpful error message you get whatever content IBM thinks is appropriate. Certainly this could be useful, but it could also end up as just another vehicle for advertising.
The "related" search isn't what you should be looking at.
Try this.
I've had enough abrasive sigs. Kittens are cute and fuzzy.
"Peridot could lead to a world where there are no more broken links". Yes, it could. Peridot could also lead to a world where broken links are not manually and intelligently spotted and repaired, but automatically repaired. Automatic resolution of what a link "ought" to point to is never going to be accurate (look at search engines), and could make a company website a minefield of confusion and frustration for the user.
Only time will tell, I suppose.
Some algorithm cruising through my website, rearranging files as it sees fit?
Sounds like a recipe for utter disaster in the worst case, and a source of mildly embarrassing incidents at best.
How about this algorithm just report dead links to a human instead of trying too hard to be clever?
This sounds like someone had to come up with a final project, and settled on this one.
Maybe I'm being overly naive, but checking for broken links doesn't seem all that spectacular to me. It wouldn't take long to write a script to find all the broken links on a page.
The only parts that seemed worthwhile are replacing the links automatically, and testing if links are relevant.
I'm not so sure I'd trust a computer to do those things though. I'd much rather have the links flagged and checked by a human.
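For what it's worth, here is a rough sketch of the "flag it for a human" version in Python, using only the standard library. The names (`LinkCollector`, `http_status`, `find_broken_links`) are made up for illustration, and it assumes the page HTML has already been fetched. It only reports links; it never rewrites them.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
import urllib.error
import urllib.request


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def http_status(url, timeout=10):
    """Return the HTTP status for url, or None if the host is unreachable."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # 404, 410, 500, ...
    except (urllib.error.URLError, OSError):
        return None  # DNS failure, refused connection, timeout


def find_broken_links(page_url, html, status_of=http_status):
    """Return (url, status) pairs that a human should review."""
    collector = LinkCollector()
    collector.feed(html)
    broken = []
    for href in collector.hrefs:
        target = urljoin(page_url, href)  # resolve relative links
        if not target.startswith(("http://", "https://")):
            continue  # skip mailto:, javascript:, etc.
        status = status_of(target)
        if status is None or status >= 400:
            broken.append((target, status))
    return broken
```

The `status_of` parameter is there so the checker can be swapped out (for a cache, or a fake in tests). Some servers reject HEAD requests, so a real tool would probably fall back to GET.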
Slashdot Syndrome: the sudden, extreme urge to correct someone in order to validate one's self.
Any good Content Management System should already take care of any internal broken links automatically, or notify the webmaster so he'll be able to take care of it manually (in the case of page deletion, etc).
The only kind of people who'd go out of their way to use this software probably already use some sort of CMS.
A link points to document X.
If document X moves, and the link is invalid, a search for the link might actually find document X, and therefore, you have your benefit, and you would have saved a 404.
However - if a document becomes deprecated and deleted, then how can you assume the link is valid?
Or indeed, if the document has no relevant substitute.
A genealogy providing a link to another William Wallace wouldn't be good news if the original page went missing.
A better system is automated 404 alerting to the web server's administrator.
A bad link gets hit, bam, what document, from where. You can work things out intelligently, not automatically.
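A minimal sketch of that kind of 404 report, in Python. It assumes the server writes the common "combined" access log format (request line, status, size, then referrer in quotes); the function name `summarize_404s` is hypothetical. It only tallies which missing document was requested and from where, and leaves the fixing to a person.

```python
import re
from collections import Counter

# Matches the request, status, size and referrer fields of a combined-format line.
LOG_LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referrer>[^"]*)"'
)


def summarize_404s(log_lines):
    """Count 404 hits per (requested path, referrer), most frequent first."""
    hits = Counter()
    for line in log_lines:
        match = LOG_LINE.search(line)
        if match and match.group("status") == "404":
            hits[(match.group("path"), match.group("referrer"))] += 1
    return hits.most_common()
```

Feeding it the day's log and mailing the result to the administrator would be enough to see which referring page carries the bad link.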
I think this is silly, perhaps grasping at straws. I see no reason why we would replace all our links with Google "I'm Feeling Lucky" searches, so why do something like this?
This is the essence of what they have, and all they have done is cloud the search IP field (which is important) with two more patents, again increasing costs and endangering open source innovation, the truly innovative playing field.
Of course, I could be wrong.
#hostfile
0.0.0.0 primidi.com
0.0.0.0 www.primidi.com
0.0.0.0 radio.weblogs.com
this isn't about replacing links on the internet as a whole... it's about replacing links on your company website, or at least reviewing those links.
not everything that happens in the world is an attempt by big brother to steer internet traffic to verisign or microsoft.
ErrorDocument 404 /script.pl
Where script.pl parses the requested URL and asks an indexing engine to find the most relevant page associated with the query...
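A rough sketch of what such a handler might do, written in Python rather than Perl. Everything here is an assumption for illustration: the function names, the stop-word list, and especially the "index", which is just a dict mapping pages to keywords standing in for a real indexing engine. Under Apache, a local ErrorDocument handler should be able to read the original request from the `REDIRECT_URL` environment variable.

```python
import re
from urllib.parse import unquote, urlsplit

# Path fragments that say nothing about the topic (hypothetical list).
NOISE = {"html", "htm", "php", "index"}


def terms_from_url(url):
    """Split the path of the missing URL into lowercase search terms."""
    path = unquote(urlsplit(url).path)
    words = re.split(r"[^A-Za-z0-9]+", path.lower())
    return {w for w in words if w and w not in NOISE}


def suggest(url, index, limit=3):
    """Rank pages in a toy {page: keywords} index by keyword overlap."""
    wanted = terms_from_url(url)
    scored = []
    for page, keywords in index.items():
        overlap = len(wanted & set(keywords))
        if overlap:
            scored.append((overlap, page))
    # Best overlap first; ties broken alphabetically for stable output.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [page for _, page in scored[:limit]]
```

The output is best shown as a list of suggestions on the 404 page itself rather than as a silent redirect, so the visitor still knows the original document is gone.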
Trolling using another account since 2005.
I'd prefer a more helpful 404 page, maybe with some links to the homepage or main sections of the site on it.
Sort of a "cannot find hello.jpg, click here to go back to the main page".
My point being, if the document I'm looking for is not there, I want to know it's not there. I don't want to read something else, thinking it's what I meant to read.
Usually when I'm googling around and clicking stuff I'm looking for the answer to some coding or computer related problem. I don't want to click on a link for "configuring Samba 3.0 with AD support", and wind up on a "Configuring Samba 2.2 with LDAP" and waste my time following bad advice.
I don't need no instructions to know how to rock!!!!
After RTFA, it doesn't seem like a fair comparison to say it's like Google's "related" or Verisign's "product". This looks like a technology a webmaster would use on their own site. It also gives them the option to accept the suggestion or not. This could be really good for corporations with large intranet sites, as webmasters leave, documents constantly get moved, etc.
I think had the original poster read the article, they wouldn't have gone off half-cocked. IBM must also be somewhat confident that this is new technology, or else they wouldn't have filed two patents for it.
500 dollar reward for tip(s) leading to the arrest of the person(s) who stole my sig.
On a slightly related note, a Firefox extension that searched links ahead and removed the link rendering for those that return a 404 might be handy (albeit fairly evil).
On a less related note, I've long been disappointed that some 300 series status codes in HTTP are so under-exploited, both by clients (e.g. automated bookmark management) and people running web sites.
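As a sketch of the bookmark-management case, in Python: a permanent redirect (301 or 308) is the server saying the document has moved for good, so the stored URL can be rewritten, while temporary redirects should leave the bookmark alone. The function name `bookmark_action` and the action labels are invented for this example.

```python
from urllib.parse import urljoin


def bookmark_action(url, status, location=None):
    """Decide what a bookmark manager might do with one HTTP response."""
    if status in (301, 308) and location:
        # Permanent move: store the new address (Location may be relative).
        return ("update", urljoin(url, location))
    if status in (302, 303, 307):
        # Temporary redirect: the original URL is still the right one to keep.
        return ("keep", url)
    if status in (404, 410):
        # Missing or gone: flag for the user instead of guessing a replacement.
        return ("flag", url)
    return ("keep", url)
```

For example, `bookmark_action("http://example.com/a", 301, "/b")` gives `("update", "http://example.com/b")`.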
Soon the target network would be back up, but all your links would be lost and randomly changed to something less useful. Good Invention!