Using Google to Calculate Web Decay
scottennis writes: "Google has yet another application: measuring the rate of decay of information on the web.
By plotting the number of results at 3, 6, and 12 months for a series of phrases, this study claims to have uncovered a corresponding 60-70-80 percent decay rate.
Essentially, 60% of the web changes every 3 months." You may be amused by some of the phrases he notes as exceptional, too.
This kind of thing can be a good application of Google's SOAP interface!
"Why did they cancel my favorite Sci-Fi show? I downloaded ALL the episodes!"
Are Google claiming that they can check through the entire internet inside a timescale of 3 months, ready to check through again at the start of the next quarter?
Surely this can't be true. Check Google's cached pages - see the dates on there?
Google is turning into another history book.
Roadkill is yummy.
It seems to me that in a way, the web is like an organism, whose smaller constituents are constantly (or not so constantly, depending on the webmaster) renewing themselves. It's a truly adaptive medium, and thus drastic change in short times like this as interest shifts should be quite expected.
That said, this is one of the many ways in which Google is an invaluable tool for research. Not just finding information, but generating it. Thanks Google!
For once, that is on topic. I'm glad to see that the phrase 'bill gates sucks' had the lowest decay rate of the phrases that the guy tested for.
Boy, that looks like some detailed analysis he's done there.
I actually always wondered about this. Really interesting, although I guessed that there would be a rapid rate of decay due to the nature of "information." Things get old and pass with time. An interesting application of this would be to keep records over a number of decades and figure out the average life/revival span of certain trends.
How long until all the cheesemakers have fully decayed and are no longer blessed?
I don't look forward to that day.
Long live cheese and cheese makers!
Saying your OS is the best because more people use it is like saying MacDonalds make the best food
It would also be interesting to see how much of the web no longer exists... like at what rate the web is dying. God knows there's enough dead links out there...
Once upon a time...
For a few moments, I thought that the phrase "base" (for baseline) on his graph was a reference to "all your base are belong to us." It would have been neat to see how quickly that phrase appeared, then decayed!
--All your stolen base are belong to Rickey Henderson
After reading the article, I found a few things to be disturbing...
:)
First of all, he showed very little of his actual data. This makes it difficult to tell if his interpretation is correct.
Thirdly, what the heck was this guy smoking when he came up with his search phrases? Most of these phrases seem to be tangential to the main purpose of most web sites on the internet.
Finally, Timothy, why didn't you put the foot icon by the story?
But, it is interesting to see his results. I can only imagine that if Archive.org did a study like this, they would be able to make a more legitimate conclusion. Perhaps some collaboration is in order?
I only do this since I know an angelfire page will get /. and reach bandwidth limits fast! However, there is a pretty excel chart on there so bookmark and come back much later.
Web Decay
by Scott Ennis
4/26/2002
Knowing how anxious most companies are to keep their web content "fresh," I was curious how "fresh" the web itself was.
In order to come up with a freshness rating for the web you need to sample a very large number of pages. Not wanting to do this, I opted to use the Google search engine as a method for reviewing the web as a whole.
My hypothesis is this: By searching Google using some common English phrases and returning results at various time points, a baseline can be reached for the common rate of freshness of overall web content.
I took the total number of pages found for each given phrase at 3, 6, and 12 months. I calculated a percentage for each of these points based on the total number of results found with no date specified.
For example:
Phrase               3 mos.   6 mos.   12 mos.   Total
buy low sell high    4700     5470     6200      7830
                     60%      70%      79%       100%
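For anyone who wants to check the arithmetic, here is a minimal sketch of the percentage step described above, using only the one row the study prints. The function name `window_shares` is made up for illustration; the study does not say how the numbers were actually computed.

```python
def window_shares(counts_by_window, total):
    """Express each date-restricted result count as a rounded percentage
    of the count returned with no date restriction."""
    return {
        window: round(100 * count / total)
        for window, count in counts_by_window.items()
    }


# The "buy low sell high" row from the study.
row = {"3 mos.": 4700, "6 mos.": 5470, "12 mos.": 6200}
shares = window_shares(row, total=7830)
print(shares)
```

The rounded shares come out at 60, 70 and 79 percent, matching the row in the study.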
Note:
This method excludes any pages which are not text and more specifically, not English text.
This method relies on a random sampling of phrases.
Using this methodology I determined that the average rate of decay of the web follows a 60-70-80 percent decline at 3, 6, and 12 months.
Therefore, if a company wants to maintain a freshness rate on par with the web as a whole, their site content should be updated at the inverse rate. In other words:
60% of the site should change every 3 months
70% of the site should change every 6 months
80% of the site should change every 12 months
The only way to do this effectively is to either have a very small site, or have a site with dynamically generated information.
The following graph shows the decay rate for a few phrases. I selected these phrases to display because of their unique characteristics.
bill gates sucks--This phrase had the lowest decay rate of any phrases I searched.
life's short play hard--This phrase had the greatest decay rate of any I searched (note: this search was also very small).
blessed are the cheesemakers--This phrase was relatively small, but demonstrates that quantity of pages may not be important in determining decay rate.
late at night--This phrase returned the highest number of results of any I searched and yet it also adheres closely to the 60-70-80 rule.
Conclusion:
Web content decays at a uniform, determinable rate. Sites wanting to optimize their content freshness need to maintain a rate of freshness that corresponds to the rate of web decay.
The ultimate network admin tool needs HELP!
Digital libraries and World Wide Web sites and page persistence
Sig: What Happened To The Censorware Project (censorware.org)
It makes a ton of sense to conclude that information on the web 'decays' at a specified average rate based on the observations of 5 phrases.
Good job, goober. Here's your PhD
To arrest that decay rate, here's my contribution.
:)
Bill Gates SUCKS
Bill Gates SUCKS
Bill Gates SUCKS !!
BASE BASE BASE
BASE BASE BASE
BASE BASE BASE
Late at Night
Late at Night
Late at Night
life's short play hard
life's short play hard
life's short play hard
blessed are the cheese makers
blessed are the cheese makers
blessed are the cheese makers
I request all members of the forum to link this post in all the websites you could access, and post this message too
Anything you put in the freezer won't decay fast. They didn't listen, and look how fast the web is decaying.
From the evidence, he searched for very few phrases. The sample size is way too low to be representative of the web - which some estimates put at several billion more pages than there are people on the planet! There are no signs of more than about 5 different phrases being searched for here.
Can a few simple searches on Google really generate a large enough sample to draw such large conclusions?
The report is one page long, hosted on Angelfire. There is no substantial data to back up his claims. Is this report reliable in any way?
I'm amazed this got posted on the front page of Slashdot..
This makes the job of Archive.org-like sites damn tough.
P.S. Are we losing information at a comparable rate to generation....?
This is news?
Or a joke?
Must be a joke - anyone basing 'research' or a 'survey' on 'bill gates sucks' and 'blessed are the cheesemakers' is either really bored, or trying to see how deep in the barrel /. will dig for 'stories'.
A slashdotting - you get the stick first and then the carrot !
He creates a problem for himself by not providing us with his raw data, making any subsequent verification of the trend difficult. In fact, the one data set he gives us:
Phrase               3 mos.   6 mos.   12 mos.   Total
buy low sell high    4700     5470     6200      7830
                     60%      70%      79%       100%
seems to demonstrate the opposite of the trend that he describes. Indeed, a current search on Google shows about 1,270,000 results (which makes you wonder when he ran his searches, given that the current count is orders of magnitude larger). The methodology also fails to take into account any growth in the size of the web, which could mask the effects of decay.
if there is any part of the web that has been around from the begining and not changed at all.
http://b.150m.com
Why was this modded as Troll?
Check out his credentials.
bill gates sucks
He used to suck, sucks and will suck forever. This phrase is eternal and will be there forever. Googles will come and go, the net will decay and do radioactivity, but the eternal truth will remain forever --My Aurora : http://www.youtube.com/watch?v=o91ZsGwJYyg
FB : https://www.facebook.com/TanveersPhotography
Yet another crippling bombshell hit the beleaguered web community when recently IDC confirmed that the web accounts for less than a fraction of 1 percent of all server usage. Coming on the heels of the latest Netcraft survey which plainly states that the web has lost more market share, this news serves to reinforce what we've known all along. The web is collapsing in complete disarray, as further exemplified by failing dead last in the recent Sys Admin comprehensive networking usage test.
You don't need to be a Kreskin to predict the web's future. The hand writing is on the wall: the web faces a bleak future. In fact there won't be any future at all for the web because the web is decaying. Things are looking very bad for the web. As many of us are already aware, the web continues to lose market share. Red ink flows like a river of blood. Dot-coms are the most endangered of them all, having lost 93% of their core developers.
Let's keep to the facts and look at the numbers.
The web leader Theo states that there are 7000 users of the web. How many users of other protocols are there? Let's see. The number of the web versus other protocols posts on Usenet is roughly in ratio of 5 to 1. Therefore there are about 7000/5 = 1400 other protocols users. Web posts on Usenet are about half of the volume of other protocols posts. Therefore there are about 700 users of the web. A recent article put the web at about 80 percent of the HTTP market. Therefore there are (7000+1400+700)*4 = 36400 web users. This is consistent with the number of Usenet posts about the web.
Due to the troubles of Walnut Creek, abysmal sales and so on, the web went out of business and was taken over by Slashdot who sell another troubled web service. Now Slashdot is also dead, its corpse turned over to yet another charnel house.
All major surveys show that the web has steadily declined in market share. The web is very sick and its long term survival prospects are very dim. If the web is to survive at all it will be among hobbyist dabblers. The web continues to decay. Nothing short of a miracle could save it at this point in time. For all practical purposes, the web is dead.
Fact: the web is dead.
testing out my trending skills
I'm not impressed. The article does not define what he means by decay, or how he measured it, except in the vaguest of terms. The analysis of the data is poor; anyone interested in decay would suspect some kind of exponential decay. They would therefore plot the data logarithmically, and perhaps calculate a half life. Piss poor.
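A rough sketch of what that half-life check could look like, using only the 60-70-80 summary figures. It rests on an assumption the study never makes: that the share of pages NOT dated within t months falls off as a single exponential, 0.5 ** (t / half_life). The function name `implied_half_life` is hypothetical.

```python
import math

# Share of results dated within each window (months), from the 60-70-80 summary.
changed_within = {3: 0.60, 6: 0.70, 12: 0.80}


def implied_half_life(months, fraction_changed):
    """Half-life implied by a single data point, ASSUMING the unchanged
    share follows unchanged = 0.5 ** (months / half_life)."""
    unchanged = 1.0 - fraction_changed
    return months * math.log(2) / math.log(1.0 / unchanged)


half_lives = {m: implied_half_life(m, f) for m, f in changed_within.items()}
for months, value in half_lives.items():
    print(f"{months:>2} months: implied half-life {value:.1f} months")
```

If one exponential described the data, all three points would imply the same half-life. They come out at roughly 2.3, 3.5 and 5.2 months, so under this assumption the numbers do not look like simple exponential decay, or the result counts are mixing in something else such as growth of the web.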
Ne mæg werig mod wyrde wiðstondan, ne se hreo hyge helpe gefremman.
The "Study" does not take into account new web pages that have replaced the old.
But then again it is an interesting piece of trivia
Tim Berners-Lee wrote: "There are no reasons at all in theory for people to change URIs (or stop maintaining documents), but millions of reasons in practice."
http://www.w3.org/Provider/Style/URI
and advocated creating a web where documents could last, say, 20 years and more
The Open Directory Project [www.dmoz.org]
The nature of information is decidedly ephemeral compared to the static nature of much of the web. Perhaps the surge in Weblogging has altered this dynamic even more than the hypercommercialization, but I'll dispute the 60% figure if it is based only on those four phrases. Much of the early Web was fairly static research and information hosted on .edu domains from what I gather. Since the tide shifted away to .commercialization and tripe, the nature of "information" has little to do with the state of the web, and more to do with tidiness. How much of the Web is long abandoned fan sites and dusty old means abandoned from the "information superhighway"?
In fact, Information Superhighway would be a great data point for this subject. Another consideration, which would be difficult to accommodate, is the reality of mirrors and shuffling pages to different URLs.
Most importantly, I strongly hope that your "interesting application" never gets implemented, because I can see no application of the resulting data that doesn't make my blood run cold. Psychological Warfare and hostile advertising are the bane of the Post-WWII US, and (likely) the world. Propaganda is a pernicious technology, and I fear further development in this area.
Okay, I'll admit that was a touch trollish. Because the Psych. Warfare genie was already released from its Nazi bottle and invited into the US (along with other valuable sciences), it's a little late to advocate repression of this technology. Yet I still reel from my country's increasingly malevolent commercialism aspects, which have spun off from Capitalism without any of Capitalism's redeeming social aspects. I almost want to become a socialist, until I consider that this state of affairs sprung from the National Socialist state.
In any case, while the WWW may be evolving, it certainly isn't in the Darwinian sense that was likely intended. Vestigial Geocities homepages long abandoned are plentiful, and are less temporary, giving search engines a better shot at crawling than dynamic, or "living" news portals. This sickly "creature" is more of a construction than the product of evolution (unless you consider pre-Charles Darwin senses of the word). If you want to research the nature of information and survivability/mutability, the Freenet Project would provide a much more fruitful environment, if it ever reached widespread usage. I would have less strenuous objections to classifying the Freenet an "ever-evolving creature".
Doesn't Google keep improving its search algorithm so that only relevant sites are provided in the hits? Did this "researcher" hit the link that includes the filtered out near duplicates?
Temporarily Unavailable
The Angelfire site you are trying to reach has been temporarily suspended due to excessive bandwidth consumption.
The site will be available again in approximately 2 hours!
Wow! What a wonderful, in-depth, study! Is there any link to a scientific paper on that page that I am missing or is that everything? I mean, how can someone claim something just showing us a few numbers and an excel graph.
:)
I appreciate the topic very much, but some more material on it is needed. This study wouldn't be complete enough even for high-school homework...
And look at his homepage (just remove the last part of the URL). Most of the pages are more than two years old... that's decay!
Seriously speaking, just look for a few more sources before you accept a story.
this study claims to have uncovered a corresponding 60-70-80 percent decay rate. Essentially, 60% of the web changes every 3 months."
The guy that submitted this story is the guy that did the study.
On a similar note, I was curious to see what the CowboyNeal content of the web is. As luck would have it, a precise answer can be found easily.
:)
:P
Google gives us the following interesting results:
3,840,000 sites contain the word Cheese.
1,640 sites contain the words CowboyNeal and Cheese.
Therefore, 4.27083333333333333333333333333e-2% of cheese related sites contain a reference to CowboyNeal.
As cheese is a randomly chosen word with no special connection to CowboyNeal it is reasonable to assume that 4.27083333333333333333333333333e-2% of all sites contain a reference to The Cowboy (Assuming the number of sites dedicated to CowboyNeal equals the number dedicated to ignoring him).
So there we have it. The web is 99.957291666666666666666666666667% CowboyNeal free.
I said the results were "precise", not "accurate".
I am a Karma Library.
bored geeks mercilessly devouring the download limit of free sites...I can't help but find it amusing that this guy's decay information has just decayed.
Yeah no doubt.
I can't even find my page on google anymore. I don't know if it's just because my site's unpopular, or because it has the same name as an online retailer. In any case, it's not searchable anymore, and my guess is that it was removed as "dead".
Once you have put a page on the Web, you need to keep it there indefinitely. Read more. Slow news day, eh?
I don't claim this is the authoritative answer, or an in-depth study, but the raw data comes from Bill's very own MSN search: bill gates sucks, check it out...
Google SOAP thing for compare-stuff is in the pipeline...
Essentially everything gets older by the minute!!!
Actually this post is getting old!!
Enig? Det alt for hot det smor!
"Temporarily Unavailable
The Angelfire site you are trying to reach has been temporarily suspended due to excessive bandwidth consumption."
Imagine that you were renting a building and running a business - a retail store. One day, the owner of the building comes in and padlocks the doors and says "Sorry, you can't re-open till the first of the month - too many people have come into your store".
What stupidity.
Our weblogs show that Google visits our site (www.up.org.nz) at least monthly, and it is by no means a huge traffic-drawing site in the global sense. Its last visit was on 13th April, drawing 1888 hits...
In his story submission, scottennis spoke very impersonally of the study he authored himself:
"By plotting the number of results at 3, 6, and 12 months for a series of phrases, this study claims to have uncovered a corresponding 60-70-80 percent decay rate."
Was that just scientific detachment or was it someone pretending that he and a few clueless Slashdot editors aren't the only ones who would take any serious interest in this numerology?
"bit-rot-quantified" department eh...how bout the "but-not-qualified" department.
late at night--This phrase returned the highest number of results of any I searched and yet it also adheres closely to the 60-70-80 rule.
If he really wanted a large search he should have tried "porn".....
"Freedom of speech has always been the abstract red-headed stepchild of the Constitution"
-Suck
Looks like 100% of the link mentioned in this article decayed in a little under 5 minutes!
Cheers,
Bowie J. Poag
Yeah, in bytes. I wonder how many digits that would be?
Cool! Amazing Toys.
Are Google claiming that they can check through the entire internet inside a timescale of 3 months, ready to check through again at the start of the next quarter?
I don't know if that's all that far-fetched. I know Googlebot last hit my site on April 7th, crawled every page in my domain over the course of 12 hours, and current searches of their cache show content I'd updated at that time. They seem to visit every month or so.
Perhaps it's based on the traffic they detect to a given site through their CGI redirects... but I'm not a large site, my primary webserver is a Pentium 90. :)
crawl4.googlebot.com - - [07/Apr/2002:13:36:32 -0400] "GET /broken_microsoft_products/ HTTP/1.0" 200 128854 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"
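If you want to check how often the crawler comes by on your own server, here is a small sketch that pulls Googlebot visits out of an access log. It assumes the log is in the Apache "combined" format, like the line above; the function name `googlebot_visits` is hypothetical.

```python
import re
from datetime import datetime

# One entry in Apache "combined" log format (assumed; adjust for other formats).
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def googlebot_visits(lines):
    """Return (timestamp, path) for every request whose user agent mentions Googlebot."""
    visits = []
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match and "Googlebot" in match.group("agent"):
            when = datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z")
            visits.append((when, match.group("path")))
    return visits


sample = [
    'crawl4.googlebot.com - - [07/Apr/2002:13:36:32 -0400] '
    '"GET /broken_microsoft_products/ HTTP/1.0" 200 128854 "-" '
    '"Googlebot/2.1 (+http://www.googlebot.com/bot.html)"'
]
visits = googlebot_visits(sample)
print(visits)
```

Sorting the timestamps and looking at the gaps between crawls would give a rough revisit interval for a single site, which says nothing about how often the whole web gets re-crawled.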
Fire and Meat. Yummy.
While the numbers clearly aren't totally random, they are very fragile indeed. Some people have had a change of two orders of magnitude, within a week. And in these cases, there have usually been no real world events that could explain such a change. I guess the google page hits numbers depend as much on the internal google structure, as on the number of actual pages on the web.
So I doubt google page hits statistics is a useful research tool. Nonetheless, it can be fun. Here are some google hall of fame lists:
- A list of the most famous Danes according to google.
- A list of free software celebrities according to google.
- A list of Emacs contributors sorted according to google hits.
- A list of sequential artists sorted according to google hits.
- A list of OS (Kernel) Mindshare sorted according to google hits.
PS: Mail me to suggest new entries to the lists...
I noticed that a paper I wrote a LOT of years ago can still be found online somewhere.. so I suppose that although -in the average- web pages do disappear, if those pages contain documents, they will survive the death of their original webpage.
not that it was an interesting document - just a little paper about nothing important. But still, it's out there.
My thoughts? I think that as long as a website can be "saved" in some form, its content will be available in other forms for a long amount of time.
this should make people think, especially those who put copyrights on their webpages, or don't want some information to spread around.
could we say that information wants to be free as long as it's downloadable?
hmm..
-- There are two kind of sysadmins: Paranoids and Losers. (adapted from D. Bach)
By thoroughly researching the following phrases on www.yahoo.com :-
Sex
Warez
mp3
I have discovered that amazingly, my results differ substantially !
In conclusion, then, it seems that content is ultimately always fresh and there is no indication of decay !
A slashdotting - you get the stick first and then the carrot !
What's so special about the cheese makers?
It's not meant to be taken literally. It refers to any manufacturers of dairy products.
-- Monty Python, Life of Brian
What scares me here is the conclusion that web sites need to change their content 60% every 3 months. This is not freshness, this is reorganizing to re-organize. If you are considering doing this, you had better seriously re-consider your future. It's an interesting study but a good meme doesn't die simply because the catch-phrases are tired.
At faculty meetings at our school I sit with a bingo card. On it are a series of catch-phrases. We listen for the catch-phrases and shout out when we have finished our cards. B***SH*T is the game and to reduce your content to a series of reorganized catch-phrases is like having a marketing guy develop foreign policy.
Anyone willing to write the perl module that searches for the latest catch-phrases and inserts them randomly into your web content? Yeesh!
Using Google to calculate Tooth Decay.
Anyone have any mirrors? (by the time I'm done posting this, there'll probably have been a dozen "First Mirror Posts" but oh well.)
Slay a dragon... over lunch!
For example, most web pages linked to in slashdot articles.
Ironically, this site on decay adds to the decay.
They failed to include one statistic: The decay rate when the Slashdot Effect is applied to a website: 99.998%
:)
Indeed life is short
Gone are the cheesemakers, but
Bill gates always sucks
The study I posted on Angelfire appears to have reached a bandwidth threshold. I've made the same study available here:
http://helen.lifeseller.com/webdecay.html
I've also included a link to the raw data I used.
Read any good sonnets lately?
For years, whenever I've found an article that I've liked, or data that I thought would be useful later on, I've always either saved the .html file or text off to my hard drive, or (lately) used Adobe Acrobat to get the whole page (preserving graphics and layout in one binary file, rather than 100 extra .gif/.jpg images in a directory somewhere).
I have never liked the smell of bit-rot, so I like to keep them close by my desk where I can keep them well-watered and pruned. ;)
Ryan
Don't steal. The government hates competition.
This gal may see an inordinate amount of traffic soon (http://www.web-decay.net).
The website pointed to by the article seems to have been taken down due to excess bandwidth usage ... text-book example of being slashdotted :-).
The key to making links that don't rot is to design a URI schema that's both independent of any redesigns of your site and independent of any particular way of doing things.
Let's look at a few examples.
The URI to this page is http://slashdot.org/comments.pl?sid=31884&op=Reply&threshold=3&commentsort=3&tid=95&mode=nested&pid=3434535 - what is it telling you that it doesn't need to?
Well, for a start, that .pl is a bad idea. What happens in 4 years time when SlashDot is running on PHP, or Java, or Perl 7, or a Perl Server Page, or ASP? Then there's the difficult-to-decode query string that tells you nothing about the link other than "this is the information the server needs to locate your page at the moment", and doesn't give you much faith in it living forever.
Now let's look at an equivalent Kuro5hin URI.
http://www.kuro5hin.org/comments/2002/4/29/22137/6511/51/post#here is a URI to reply to a random comment on k5.
For a start, you can't tell what application or script is serving you the page, and you can't see what type of file it's linking to; both these things can and will change over time.
Second, there's a date embedded in there; you can see the developers, if they ever decide to change the meaning of '/comments', using that date as a reference; if the URI is before the change, they can map it onto the new schema or pass it onto legacy code.
Having the date in the URI is good because it allows you to determine when the link was issued, and map it onto any changes or pass it off to a legacy system as required.
Now let's take an apparently good link on my now horribly out of date site, aagh.net.
http://www.aagh.net/php/style/ links to an article on PHP coding style.
Certainly, hiding the fact that I'm using PHP to serve this document is good, and shortening the URI to remove the useless querystring is good (you can't see one? Good, that's the point), however, this URI may well stop working in a few weeks; I'm planning a redesign and the old schema may well not fit in well with it.
A short yyyymm in there could have made all the difference; a simple if check on the URI's issue date would keep it working.
The moral of the story: Think about your URIs when you're designing a site. Try to remove as much data as you can without painting yourself into a corner.
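To make the "simple if check on the URI's issue date" concrete, here is a minimal sketch of a router that reads the date out of a Kuro5hin-style path and hands older links to legacy code. The cutover date, the handler names, and the `route` function are all made up for illustration; this is not how any of the sites mentioned actually work.

```python
import re
from datetime import date

# Hypothetical date on which the meaning of '/comments' changed.
SCHEMA_CHANGE = date(2003, 1, 1)

# /comments/<year>/<month>/<day>/<rest of the path>
DATED_PATH = re.compile(r"^/comments/(\d{4})/(\d{1,2})/(\d{1,2})/(.+)$")


def route(path):
    """Pick a handler from the issue date embedded in the URI.

    Returns a (handler_name, remaining_path) tuple."""
    match = DATED_PATH.match(path)
    if match is None:
        return ("not_found", path)
    year, month, day, rest = match.groups()
    try:
        issued = date(int(year), int(month), int(day))
    except ValueError:
        # Something like month 13: not a date this schema ever issued.
        return ("not_found", path)
    handler = "legacy" if issued < SCHEMA_CHANGE else "current"
    return (handler, rest)


print(route("/comments/2002/4/29/22137/"))
```

Links issued before the cutover keep resolving through the old code path, so a redesign does not have to break them.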
This seems so totally "if everyone else is jumping off the Brooklyn Bridge, then we should too" by itself that it discredits what sliver of credibility the article had. Using a web-wide average as a guideline for what a particular web site "should do" is meaningless. Web sites should present timely, appropriate information that is useful to those who visit. Some sites deal with material that changes frequently (stock quotes and sports sites should presumably be updated regularly) and some sites deal with material that does not change frequently (no need to redo your tech support documents for long-out-of-production products every week). This notion of `freshness' is ill-defined, poorly measured and of dubious value.
It's psychosomatic. You need a lobotomy. I'll get a saw.
1. Here is tonight's top 10 list
2. Critical Updates Package (138 MB)
3. Hey Ho Let's Go
4. Nobody's perfect
and, of course
5. News for nerds, stuff that matters
The key to making links that don't rot is to design a URI schema that's both independent of any redesigns of your site and independent of any particular way of doing things.
You can't mod_rewrite a domain name that you have lost control over. If you have a popular site hosted on a university's server, and then you graduate, what do you do? If you put up a site, some Yakkestonian trademark holder takes it from you in WIPO court, and you're forced to go to Gandi.net to get a new domain, what do you do?
Will I retire or break 10K?
I hope that this study is a joke; if not, the horse shit that people churn out is getting worse and worse. I can't believe this made the front of slashdot.
http://helen.lifeseller.com/webdecay.html
Read any good sonnets lately?
The analysis of the data is poor; anyone interested in decay would suspect some kind of exponential decay. They would therefore plot the data logarithmically, and perhaps calculate a half life. Piss poor.
So when can we expect to see your rigorous analysis? Or were you just bitching?
Nope, no sig
The Open Directory is the biggest load of crap I have ever seen in my life. I could not find a fucking thing I was looking for on that load of shit.
Nootch, SilentBob
Did you find yourself laughing hysterically?
Read any good sonnets lately?
Once you have put a page on the Web, you need to keep it there indefinitely.
How is this possible if you happen to lose control of the domain? I wrote a letter to Tim Berners-Lee about this issue.
Will I retire or break 10K?
yep.
It may be a valid thesis that you are putting forth. It does occur to me, however, that you seem more interested in having someone else prove it for you. Your rather cursory investigation and lack of basis neither lends credence to your theory nor compels one to take it seriously. If you desire the respect of the scientific community then I suggest you put a little more work and effort into it.
Why do so many people use crap like Angelfire, Tripod, Homestead with all their bandwidth limits, restrictions, ads and blocking of remote image loads?
;) Of course, then it isn't worth looking at, so who cares if it is even hosted.
Not to mention that well over 50% of the time any search engine result that points to Angelfire in particular points to a 404 Not Found. This is much more than what I experience with other sites. Do their users get kicked off often, or just go away, or what? I don't even bother clicking on those results unless it looks like the content is truly compelling. And thank God for Google's cache.
I can understand if some truly can't afford hosting, but even for these people, even Geocities is much better!
Somehow I doubt the majority of those people using Angelfire, Tripod, etc can't afford hosting.
Well, after the dot-com world gets a little more squeezed, those sites may no longer exist. Too bad that many people won't bother rehosting their content and will just drop off the web.
olm.net offers Linux based hosting for under $9/month. No I don't work for them, but I am a (satisfied) customer.
$9 a month - and you won't piss off your users.
(Yes I know their other packages are more - but the $9 a month package is better than any of the free services)
Don't EVEN get me started on organizations and commercial BUSINESSES (ack!) that use free hosting - that is so unprofessional. I don't think I'd want to do business with a company (even a local store) that wouldn't/couldn't pay $9 a month to have a less annoying and more reliable website.
Of course, some of the content out on the Web isn't even worth $9/month, heck some of it has NEGATIVE worth.
Just because it CAN be done, doesn't mean it should!
It's a ringer for a typical adequacy.org story :)
(Link omitted deliberately.)
i love it. who would have thought that the "dying" troll would live so long?
sulli
RTFJ.
See above.
Getting diabetes AND salmonella would be a bad weekend.
Okay - this article is very entertaining but "Decay" is the wrong word for this - the author must have been studying capacitors and the decay rate of charge on a capacitor. I would suggest substituting "decay" for "charge"
I can't believe this made Slashdot!
This seems to be a complete misinterpretation of the data. Is he saying that 1 year ago there were more hits than 6 months ago, and 6 months ago more hits than 3 months ago?
The Google search is returning all valid pages within the past 3, 6, and 12 months. So, all of the current pages are listed in the 12 month search also.
From the data he has provided, it is possible to interpret either that the number of pages is in constant decrease, or that the number is in constant increase (with old pages being removed or relocated).
Using his data for "home run king", 7070, 7520, 7920, 8900:
You could say that more than a year ago, there were only 980 hits (anytime - 12 months)
Then, 6 months to a year ago, there were 400 added (12 months - 6 months)
Then 3 months to 6 months ago, 450 were added (6 months - 3 months)
And, in the last 3 months, 7070 pages were added (3 month value)
This shows a constant increase! Sure, this is highly unlikely, but it is a possible way these hits were gathered. His data collection gives no way to tell how many pages have been removed between periods, or how many were relocated.
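The subtraction above, as a quick sketch. It just turns the cumulative "within N months" counts into per-interval counts; the function name is hypothetical, and it reads the data the way this comment does, not necessarily the way the study intended.

```python
def hits_per_interval(within_3, within_6, within_12, total):
    """Split cumulative date-restricted counts into separate age buckets."""
    return {
        "older than 12 months": total - within_12,
        "6 to 12 months ago": within_12 - within_6,
        "3 to 6 months ago": within_6 - within_3,
        "last 3 months": within_3,
    }


# The "home run king" figures quoted above.
buckets = hits_per_interval(7070, 7520, 7920, 8900)
for label, count in buckets.items():
    print(f"{label}: {count}")
```

The buckets add back up to the undated total, which is the point: the same four numbers fit a story of steady growth just as well as a story of decay.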
Why did this get posted?
vk
"Decay" would be more along the lines of X% of links become dead after 3 months. You'd have to collect a bunch of live links from various search terms and check ALL of them 3,6,12 months down the road and see if they're still there. 60% is more a measure of changed/new content in the last three/whatever months. At least the web isn't stagnating.
What about archives? They should not care about being 'fresh' beyond adding stuff to the archive. I want to be able to bookmark something in an archive for future reference and be able to come back to it in three years and still find it there, just like a library.
The argument that web sites should change 60% of their content in order to keep up with the average is like saying we should all be wearing puke-green colored clothes because that's the average color of the universe - the reason has nothing to do with reality. Web content should be as 'fresh' as the information being provided demands of it. Weather forecasts should change daily, stockmarkets - hourly, slow pitch standings - monthly, and so on.
W9x:Thanks for the make-work project Bill.
Mod it -1 Troll, then post in the discussion, for the coveted and richly deserved "5, Troll" score.
I love how this very page seems to have died... The web is a massive irony generator.
"Your superior intellect is no match for our puny weapons!"
...Bill Gates is the Devil! the Devil! Thank you for your attention.
Has anyone written a script to try to figure out if a text message is a positive one or a flame? Shouldn't be too hard (you'd toss out A LOT of 'unknowns').
If someone has, you could graph the ratio of positive/negative posts to USENET for a set of keywords over time.
One could also graph the total volume. That would be much easier.
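A very naive sketch of what such a script might look like: count words from two small hand-picked lists and call anything without a clear signal "unknown". The word lists and the `classify` function are illustrative assumptions only, and a real classifier would need far more than this.

```python
import re

# Tiny hand-picked word lists, purely illustrative.
POSITIVE = {"thanks", "great", "interesting", "love", "good", "useful"}
NEGATIVE = {"sucks", "crap", "stupid", "poor", "joke", "worse"}


def classify(text):
    """Label a message 'positive', 'flame', or 'unknown' by counting listed words."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "flame"
    return "unknown"


for message in ["Thanks Google, really interesting!", "bill gates sucks", "late at night"]:
    print(message, "->", classify(message))
```

As predicted, most messages would land in "unknown" with lists this short, and sarcasm would be misread entirely.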
-twb
Down with Michelle's fascist regime of censorship! Long live the Widener!
This is a difficult question to answer, but the answer is full of totally unrelated semi-googlewhacks and curious links.
Enoc
Try this out! It's a PHP script using the Google API. Now you can discover if the world likes dogs better than cats, and sex better than love (duh).