Broken Links No More?
johndoejersey writes "Students in England have developed a tool which could bring the end to broken links. Peridot, developed by UK intern students at IBM, scans company weblinks and replaces outdated information with other relevant documents and links. IBM have already filed 2 patents for the project. The students said Peridot could protect companies by spotting links to sites that have been removed, or which point to wholly unsuitable content. 'Peridot could lead to a world where there are no more broken links,' James Bell, computer science student at the University of Warwick, told BBC News Online. Here is another story on it." See also the BBC story.
There are two parts to this tool, one of which is quite bad and one of which is quite good.
First, replacing links. This is a rather bad idea. Here's why, with an example.
In general, we can all agree that the technology behind Google is pretty impressive. It has its own "More Pages Like This" feature, which we can assume is at least somewhat similar to this one. It relies on complex content analysis among billions of pages to determine which are similar and which are different.
So, suppose we had a link to Major League Baseball, www.mlb.com, on our page. And suppose, for whatever reason, that their site went away (perhaps a few more players' strikes?).
Well, what does Google suggest as a replacement? Check it out here.
First the National Football League (NFL), then the National Basketball Association (NBA), and then the National Hockey League (NHL). Followed by the ESPN sports network, and NASCAR racing.
Obviously, if we wanted to link to a site about baseball, all of those (other than ESPN) are entirely irrelevant.
But if we wanted to link to a site about professional sports organizations, all of those (other than ESPN) are QUITE relevant.
Can this software know our intent?
Hardly.
You really have to question the ability of machines to select relevant links.
The situation is this: If someone goes to the trouble to manually create links in the first place, those should not be automatically changed to other sites that some computer program thinks may be related. Links shouldn't be inserted automatically; if someone needs more information on something you haven't linked to, they can use a search engine. And then your company isn't liable to look idiotic by linking to irrelevant sites.
Now, the other aspect of this product.
Removing dead or changed links is quite another matter. Automated removal of links is a great idea and quite useful. For example, consider when someone's domain name expires and it is taken over by a porn site. It'd be great to have a program that automatically removes links to it from your site. Like this tool, this could be based on a percentage of changed content: if the content changes significantly, remove the link quickly and automatically. If the content changes by some intermediate amount, flag the link as needing review by the webmaster.
But in both cases, the software should present the webmaster with a list of such questionable links, including those it has temporarily removed from the site, and then allow the webmaster to select replacement links.
Manually. With relevance.
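To make the threshold idea concrete, here's a minimal sketch of how it might work. This is not how Peridot does it (the article doesn't say); the threshold values, the function names and the word-level diff are all my own assumptions, picked purely for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical thresholds, not taken from Peridot; tune to taste.
REMOVE_THRESHOLD = 0.60  # this much change or more: pull the link right away
REVIEW_THRESHOLD = 0.20  # this much change or more: flag it for the webmaster


def changed_fraction(old_text, new_text):
    """Return the fraction (0.0 to 1.0) of the page that changed, by words."""
    matcher = SequenceMatcher(None, old_text.split(), new_text.split())
    return 1.0 - matcher.ratio()


def classify_link(old_text, new_text):
    """Decide what to do with a link whose target page has changed."""
    change = changed_fraction(old_text, new_text)
    if change >= REMOVE_THRESHOLD:
        return "remove"
    if change >= REVIEW_THRESHOLD:
        return "review"
    return "keep"


def review_queue(snapshots):
    """List every URL the webmaster should look at, removed or flagged.

    snapshots maps each URL to an (old_text, new_text) pair.
    """
    return sorted(
        url
        for url, (old_text, new_text) in snapshots.items()
        if classify_link(old_text, new_text) != "keep"
    )
```

Note that nothing here picks a replacement link. The queue just goes to a human, which is the whole point.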
I decided it'd be too hard for software to decide whether a change was significant. I wonder how this software does it - presumably, you can change the threshold?
Some overfunded, jumped-up interns have developed a high-tech method, software and system to stop the Slashdot effect.
Each webserver will return a redirect to a Google cache lookup for itself if the server load gets too high.
1: Stupid idea
2: Patent
3: Wait 'til someone nudges at your generously worded patent
4: Happily license this unrelated technology to keep their VC peeps in the green.
#hostfile
0.0.0.0 primidi.com
0.0.0.0 www.primidi.com
0.0.0.0 radio.weblogs.com
You could create your links using Google's I'm Feeling Lucky feature, assuming it was just a generic link site looking for interesting sites rather than specific articles.
e.g.:
http://www.google.com/search?hl=en&ie=UTF-8&q=News+For+Nerds&btnI=Google+Search
And voila, your site will take you to the most popular site related to "news for nerds", automagically. If Slashdot died one day, another site would take its place in the Google rankings. FF.
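If you wanted to generate links like that instead of typing them by hand, something along these lines would do it. It's only a sketch: the function name is made up, and it assumes Google keeps honouring the q and btnI parameters seen in the example URL.

```python
from urllib.parse import urlencode


def lucky_url(query):
    """Build an I'm Feeling Lucky link for the given search phrase.

    Hypothetical helper: the parameters are copied from the example URL,
    not from any documented Google API.
    """
    params = {
        "hl": "en",
        "ie": "UTF-8",
        "q": query,
        "btnI": "Google Search",
    }
    # urlencode turns spaces into '+', matching the hand-written link.
    return "http://www.google.com/search?" + urlencode(params)


print(lucky_url("News For Nerds"))
```

The catch is the same as before: you're linking to whatever ranks first for the phrase, not to a page you actually picked.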
This is quite often true with respect to sites/companies with large webpages and hence lots of links. One company where I used to work, in the internet/intranet division, kept links to several partners' webpages. When one of those partners let their domain expire, it was bought out by a pr0n company.
You can imagine how much the staff enjoyed the content on the new page... and the IT Security folks especially, as the proxy was suddenly giving them lots of nice warnings about workers viewing inappropriate content (probably due to the nasty popups, etc).
I, personally, hate dead links with a passion. And, usually, I can devise a Google search that will give me the new home of the old link--often nothing's changed other than the server. A tool that does this for me is useful. Sure, there's plenty of issues that need to be looked into, but that's what we used to call "the Next Version".
It's easy to nitpick this. Seeing the technology and then seeing how it can be improved in its next iteration is what separates a visionary from a Slashdot howler monkey.
Potato chips are a by-yourself food.