When Your Site Ceases To Exist
El Lobo writes with a sobering account of how Javalobby dropped off the face of Google last month. The site had been attacked by forum spammers and Google indexed some of their spew before the Javalobby guys could remove it. According to a post in Rich Skrenta's blog, Google is now the de-facto front page for the Internet, accounting for anywhere from 70% to 78% of the search market. The power this conveys is hard to overstate. From the Javalobby saga: "We had completely disappeared from Google's main index! If you run a website, then you know how serious a problem this is. On any given day over 10,000 visitors arrive at Javalobby as a result of Google searches, and suddenly they stopped coming! ... Suddenly we no longer existed in the eyes of Google."
Javalobby? Another slashvertisement ...
I just typed in "Javalobby" in the Google search and their link came up on top. If there was a problem, it looks like it's fixed.
The CB App. What's your 20?
1. Move all forums to Javalobbyforums.com or equivalent
2. ???
3. Hire 'little people' in multicoloured pointy hats to help generate traffic for your site not that it is now google acceptable
4. Profit!
Even those who arrange and design shrubberies are under considerable economic stress at this period in history.
Remember the days when one got word out on a web site based on sharing word of mouth, etc? Back to them.
Of course, the anti-spam crowd will say it is a good thing this disappeared because they weren't fast enough to do something about it. Kinda like Google's Real Time Black Hole.
(I don't share that opinion.)
They're on the Slashdot front page, I don't think they'll mind being off Google for a little while.
Maybe you should stop relying on a single source for your advertising.
... wait, you did that.
Maybe you should actually monitor your forums. You know, in case your customers need your help or a SPAM-bot goes on a rampage.
Maybe you should actually have a site that people care about so they'll keep coming back.
Maybe you should slashvertise and
If your site is worthwhile, dropping off Google for a week won't affect it that much, and you'll actually have control over your forums.
"If you make people think they're thinking, they'll love you; But if you really make them think, they'll hate you." - DM
5. Have midgets properly proofread all posts
Even those who arrange and design shrubberies are under considerable economic stress at this period in history.
If they could have implemented one layer of security or verification to prevent spambots from registering (similar to phpBB or vBulletin), they would have prevented all this. But they didn't. There is no image verification on their forum registration page. All it takes is a spammer with a source of disposable e-mails such as dodgeit.com to spam your page to hell.
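One cheap layer of the kind described above can be sketched as a registration check that rejects disposable e-mail domains. This is only an illustration: the function name and the blocklist are hypothetical (only dodgeit.com comes from the comment), and it would sit alongside image verification, not replace it.

```python
# Hypothetical blocklist; only dodgeit.com is taken from the comment above.
DISPOSABLE_DOMAINS = {"dodgeit.com"}


def is_disposable(email: str) -> bool:
    """Return True if the address's domain, or any parent domain, is blocklisted."""
    _, _, domain = email.rpartition("@")
    domain = domain.strip().lower().rstrip(".")
    parts = domain.split(".")
    # Check the full domain and each parent, so mail.dodgeit.com is caught too.
    return any(".".join(parts[i:]) in DISPOSABLE_DOMAINS for i in range(len(parts)))
```

A registration handler would call `is_disposable(address)` before creating the account and refuse or flag the sign-up when it returns True.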
Did you miss the memo? Google owns your ass now.
This is why people don't like monopolies much.
- Adam L. Beberg - The Cosm Project - http://www.mithral.com/
Maybe it's time for the DOJ to start building an anti-trust case against Google...?
The problem is indeed deeper than just a headache for a webmaster or two. Let's face it: just as the desktop software market depends on MS Windows, and a lot of software companies would vanish overnight if Microsoft introduced a new trick [like signed (for a price) executables only, or a backwards-incompatible API, etc.], so the web now depends on Google. Should the whole Google system administration team take a week off - and voila, you get no new customers, because they don't know where to go, and you're lucky if one of your old clients returns using his browser's history. Of course, there's Yahoo, MSN, Nigma, and a hundred startups, but all of them combined hardly have the same significance that Google enjoys alone. So let's either keep our fingers crossed and hope that Google will not do anything more evil than it does now, or... heh, I don't even know what else we could do.
It's as bad as concentration of wealth. I know a bunch of geeks that think Google is all sweetness and light (probably because EA returns their resumes unscanned.) Maybe they'll wake up from their narcosis now; they're worse than Apple fanboys.
Pinwheel of death!
Nah... Is there anything worse than Apple fanboys?
It's time to realise that Abble's products are the biggest abomination these days. Just say NO to the dumb iAbble way!!
From the title, I thought this was going to be about finding a mirrored copy of your website after you stop maintaining it and your host drops you. But being no longer indexed?? That doesn't make your site disappear - what a drama queen. Until Google becomes the only search engine, or becomes a government institution, people need to stop being so dramatic. Websites existed before search engines as far as I understand.
"Thanks for all the money you paid to us. We've used it to buy off ISO among other things" -Microsoft
Whilst some people may have a point about the *cough* slashvertisement, this article has made me think about Google and monopolies. Should I now change my search engine of choice because having many players in any market is better, or is a monopoly acceptable when they are (pretty much) the best... even if they do sometimes change where, and if, they list sites?
*''I can't believe it's not a hyperlink.''
Maybe this is where Google needs to provide multiple indexing algorithms. The idea is that giving different result types (most linked, closeness to keywords, flashiness, highest rated, totally random, etc.) would make it harder for site spammers to know which algorithm to target.
Jumpstart the tartan drive.
I refuse to even click the link. This site, based on what I see here, deserves anything bad that happens to it. Millions of sites see their traffic rise and fall every day. And none of them take up our valuable time to post a sniveling bitch about it to the front page of Slashdot.
That'll teach 'em.
If the forum isn't particularly time sensitive, how about just not serving recent forum posts (< 1 week) to the search engine spiders, which advertise themselves as being such, no?
That gives you some elbow room.
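The suggestion above could be sketched roughly as follows. The names, the crawler token list and the one-week window are assumptions for illustration only. Note that replies elsewhere in this thread warn that serving spiders different content from what human visitors see can itself get a site penalized, so treat this as a picture of the idea and not a recommendation.

```python
from datetime import datetime, timedelta

# Hypothetical values: substrings that identify a crawler, and the hold-back window.
CRAWLER_TOKENS = ("googlebot",)
QUARANTINE = timedelta(days=7)


def visible_posts(posts, user_agent, now):
    """Return the posts to serve for this request.

    posts is a list of (posted_at, body) tuples. Self-identified crawlers
    only receive posts older than the quarantine window; everyone else
    receives everything.
    """
    is_crawler = any(token in user_agent.lower() for token in CRAWLER_TOKENS)
    if not is_crawler:
        return list(posts)
    cutoff = now - QUARANTINE
    return [post for post in posts if post[0] <= cutoff]
```

The window only buys time if moderators actually remove spam within it; anything still on the site after a week gets indexed as before.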
Such cynicism; but you do have a low user ID, so I'll give it a pass as perhaps the voice of a soul beaten down by actual slashvertisements. Perhaps you should read the article and give the content a chance? Yes?
Dude. Slashdot is the last place I'd want to advertise. Their site will be down in minutes (what with being on the front page, and the article unabbreviated).
For a site about things like basic rights, Slashdot users sure do like to censor "dissent".
Is there anything worse than Apple fanboys?
Definitely. Linux fanboys.
Could they have prevented this? I think so. But let's face it: Google results are a great way to advertise on the internet. Do you find the products you're looking for on banner ads from other sites? I always use google to search for products and services that I want to find online.
Good news somebody, it's back on google.
OP might have a point that this is slashvertisement. javalobby is on the top for the 4 primary search engines.
http://search.live.com/results.aspx?q=javalobby&s
http://www.ask.com/web?q=javalobby&qsrc=0&o=333&l
http://search.yahoo.com/search?p=javalobby&ei=utf
http://www.google.com/search?q=javalobby&rls=com.
I don't care f'r Google for personal reasons undisclosed, so I don't use their products.
They're not MY de facto site, nor do I consider TFA any more than fanboy buzz. Just like other search engines we've used over the years of 'net usage, they're just the one on top right NOW. Give it 10 years. They might be the next big monopoly, or the next Webcrawler.
Personally, I prefer the meta-search engines; more baskets means more eggs.
Don't tell me to get a life. I'm a gamer; I have LOTS of lives!
Rick Ross, the guy that wrote the article, is this crazy marketing guy. He emails my company at least once a month asking how much he can pay us to drive traffic to his stupid sites. (My company is not in that business, which is why it's so strange.)
Zacky -- at least Apple fans have reasons to be proud.
http://www.google.com/search?q=%22Denny+Fish+Jr.%22
Some hacker compromised a lot of websites so they could get Google to associate this name with "JMP analyst" and then leveraged that into a press release that said "Upgraded SCOX from Market perform to Market outperform" as if they had some sort of hope of avoiding bankruptcy.
Some flack from a major news outlet quoted him, probably after being cued where to look by an interested party. He's probably a bot and nobody will ever be able to prove he existed at all, but the flack will be able to wipe his hands on the first amendment and point to google's cache.
<sigh> I wish it were harder to game the google.
See the irony?
I'm going to go out on a limb here and say that your lame site is getting more traffic than it's ever received in a single day.
Which means that you've just been depending on Google too heavily for too little in return.
Digg it. Sig it. Promote the hell out of it.
I'd say this is a non-story, but the irony is that it was ultimately a wonderful short term solution to the author's issue.
Google does *not* own the Internet unless you depend solely on Google.
------ The best brain training is now totally free : )
I just visited your site just so I could joke around about being your single weekly hit.
Joke's on me and my poor eyes; I can't believe that you are ranked so high up at 50.
The surprise isn't how often we make bad choices; the surprise is how seldom they defeat us.
In the comments are some strings that one of their writers expected to find on their site when searching google, but didn't. I just searched for "jgoodies data binding" and their site comes up as the 7th top-level listing on the first results page.
It seems to me that google worked perfectly here. When 50,000 spam and phishing messages were posted to that site, the ranking of it went way down. When they cleaned them up, the site ranking came back.
What, would the site owners have google preserve their site ranking even though the content on the site went in the toilet? As a google user, I'm quite happy that google de-listed these folks for a bit, because otherwise these and other searches would have been severely polluted.
Sean
Who would have ever know! Not something an editor would check, is it?
I'll clusty before I'll google. Clusty needs some tuning, but it picks up the far-web that google doesn't even parse.
Don't think that a small group of dedicated individuals can't change the world. It's the only thing that ever has.
Well.........yeah. If you search for the exact term javalobby, there's a good chance that their website would come out on top. More interesting would be some ambiguous search terms that would put it on top.
That which does not kill me only postpones the inevitable.
Try typing any mis-spelling of javalobby. Anything. Google offers you the alternative of 'javalobby'. They *so* do not recognise this website... so much so that they dare to *suggest* it as an alternative to a common mis-spelling of the forbidden site. Bastards! How deep does their vitriol run?
yeah, and if you search for KillerBob on Google, my site comes up at the front. If you type my real name, my personal website isn't even on the front page. On the second page, there's a couple of scripts I wrote over 10 years ago, and a story I submitted to BBSpot years ago, but my personal website still doesn't show up. Selection of keywords. If you type the name of any specific site, you'll get that site first. If you type what the site does, you may find that it's much lower on the page ranking. They probably aren't worried about traffic from people who search for the word "javalobby", because those people probably already know about their site.
They're worried about the people who search for terms like "java help", which is what somebody who *doesn't* already know about their site would be searching for. In my case, it's quite deliberate. I'm using robots.txt to tell GoogleBot to ignore my personal website. It's *personal*. All it is is an e-mail gateway, anyway; the blog is restricted access. There's no point in having it in Google, so the robots.txt reduces my daily traffic.
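For readers who haven't used it, the robots.txt approach mentioned above can be checked with Python's standard `urllib.robotparser`. The rules and the example.org URL below are illustrative assumptions, not the poster's actual file.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt: keep Googlebot out of the whole site.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Ask whether a given crawler may fetch a given URL under these rules.
googlebot_allowed = parser.can_fetch("Googlebot", "http://example.org/blog/")
other_allowed = parser.can_fetch("SomeOtherBot", "http://example.org/blog/")
```

Here `googlebot_allowed` is False and `other_allowed` is True, since the file names only Googlebot. Compliance is voluntary: well-behaved spiders honor the file, spambots generally do not.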
If you believe everything you read, you'd better not read. - Japanese proverb
Build a search engine which reads my mind better than google does and brings me results which are more relevant. Perhaps something which learns what I want and what I don't.
Deleted
Yes, downtime will come and go, but the page rank effects will be everlasting!
"You should always go to other people's funerals; otherwise, they won't come to yours." -- Yogi Berra
If you had tried doing even a little research, you would have found out that Google penalizes hacked sites and even makes an attempt to contact the webmaster to alert them to the problem. Not only that, they'll relist you if you remove the spam.
1. Fail to follow even basic internet precautions standard since 1998
2. Whine loudly on Slashdot when search engine behaves as advertised
3. Get lots of new traffic
4. Profit
Dekker Dreyer
I had a similar, but opposite experience. I started setting up Yet Another Job Site, but I never got around to making it useful (see Click. Hired!). Google decided that it sort of liked it for a while, sending some traffic my way. I went from making nothing on my google ads to a few bucks a day. It wasn't much money, but it was fun seeing the traffic come in. Then google decided it was the crappy site that it was and my traffic went back to its deserved trickle. I wrote an article about it with pretty graphs:
:)
What Google Giveth, Google Can Taketh Away
I should have submitted it for a slashvertisement.
I see your informative link, and raise you a pithy comment.
This is part of my hosts file:
127.0.0.1 google.com
127.0.0.1 adwords.google.com
127.0.0.1 pagead.googlesyndication.com
127.0.0.1 pagead2.googlesyndication.com
127.0.0.1 adservices.google.com
127.0.0.1 ssl.google-analytics.com
Go figure!
Ah, the FreeBSD fanboys have arrived.
They're there affecting their effect.
I went on the site and went down the page to something posted a few days ago and grabbed some random text: "which allows native libraries" and googled. There were only three hits but they were one of them.
Not long ago there was a story on /. about captchas and I wondered what the big deal was. Well, after this story, I've been educated.
They haven't disappeared. Google indexes them. Their problem may be that they no longer come out on the first page of the google results.
This has occurred with Snowboarding2.com as well. It used to offer a subdomain feature where snowboarders could create their own website. A spammer used a few subdomains and had cialis and other drug links pointed at them from all over the net. The subdomain service was ended a year ago and all of those subdomains have timed out for over a year as a result, yet the site continues to be sandboxed by Google. A site that was on the first page of Google results since '99 is nowhere to be found. There is a difference between showing up on page 10 and being sandboxed completely. You can type snowboarding2.com itself into Google and the website does not even show up. Google has been contacted several times regarding this and nothing has been done. A link campaign was also performed to outweigh the bad links with good links for that search term, to no avail. With the recent Google update it is now a PR0 website when it was a PR5 for a very long time.
30% off web hosting. Coupon code "SLASHDOT".
Slashdot (and digg for that matter) only hurt the small personal and hobby sites run on the $19.99 hosted solutions. Traffic from slashdot to real sites running real businesses isn't all that much to write home about. Now a mention on Yahoo, that is serious traffic.
I think that the article is overdramatic, and maybe a bit of self-promoting.
According to Alexa (look at Reach), they dropped by roughly a factor of 2 to 3, from 100 to 150 per million, depending on the base period chosen, to about 50 per million. A factor of three variation in site traffic over a few weeks is large, but it's not the end of the world.
This happened days ago, and they're already back in the google searches. If I had to guess I'd say they were only out of searches while Google had the cached copy of their site that contained all the trash. Why direct people to porn spam when they ask for Java?
I'm glad that Google watches the Web so that we don't have to fear anything. See my small cartoon. Bye, Oliver
Join the club, Alex Chiu has been blacklisted by Google for years.
http://www.alexchiu.com/spread.htm
A choice quote:
"Google controls 50% of the world's searches. This famous website is so controversial that it has been banned by the most popular search engine in the world 'Google'. That's right. You cannot find alexchiu.com in Google system. Some very important people don't want you to know about Alex Chiu. Alex Chiu is on more than 30 TV interviews, 250 radio interviews, and in business ever since 1996. Yet AlexChiu.com cannot show up on Google?"
I made a proposal in the W3C AC forum a week ago that would kill linkspam. So far I have not managed to follow up with Google.
The key observation here is that linkspam is not aimed at the reader of the blog, it's aimed at the search engines, in particular Google. So all we need to do is to define some RDFa type markup that allows a blog to mark regions of the page as coming from a third party source.
There is also a proposal to extend the norobots scheme to allow marking of regions but I don't like that as it breaches a core principle of HTML: declarative coding. Norobots is an imperative command, 'this is external content' is declarative.
I should have a note ready sometime next week.
Looking for an Information Security student project suggestion?
Try http://dotcrimeManifesto.com/
Should have linked this the first time. For more details on this scheme, see my personal blog.
Looking for an Information Security student project suggestion?
Try http://dotcrimeManifesto.com/
How Google handles hacked sites
As it turns out, Google is very professional on this issue, notifying webmasters, putting timeouts on the "sandboxing", etc ..
echo '[q]sa[ln0=aln80~Psnlbx]16isb572CCB9AE9DB03273snlbxq' |dc
"Suddenly we no longer existed in the eyes of Google."
Then you should have gotten your shit together and been more proactive on the spam front.
Even then Google doesn't own the Internet, it just owns you.
All your cookies are belong to us!
Seven puppies were harmed during the making of this post.
It's extremely easy to get reincluded in the Google Index. Just follow the steps on their help: http://www.google.com/support/webmasters/bin/answer.py?answer=35843
Making your whole business reliant on a single vendor is just stupid.
Especially a vendor that you don't even have a contract with.
People act like Google is a public service, Google is a business and as a business there is no reason why they have to index your site.
...and that is all I have to say about that.
http://jessta.id.au
Just because they show up when you enter the name of the site doesn't mean they haven't lost lots of PageRank.
They probably mean that they used to show up when you searched for "Java", but because the spambots created so many outgoing links they lost their PageRank and now you have to search for "JavaLobby" to get them.
// MD_Update(&m,buf,j);
Here
Let this be a lesson to all of us who have websites. Your sacrifice will not be forgiven. RIP
Unfair - Javalobby is a community forum that has existed for ages. Don't bash it if you don't know what it is.
How many days after a site has been transformed by hijackers/forum spammers/whoever into a pile of crap should it come off the top of googles search results? A day? A week?
If they'd maintained their site properly, it wouldn't have happened.
-1 Uncomfortable Truth
So I need to get my site in a Slashdot article?
Javalobby dropped off the face of Google last month.
Good riddance! I dared criticise Java and OOP there and it started a long involved discussion. When the discussion ranked too popular on their traffic ranking system, the editors yanked it. They couldn't handle Java criticism so they pulled a "China".
It feels good when censorship aholes get what they deserve. Cheers!
Table-ized A.I.
I can't say for sure, but maybe he's connected with some moron that phoned me the day after Christmas !!! about a domain I registered for someone else (obviously looked up my contact info via whois), and got pissed off when I told them that they don't need help gaming the search engines.
Is his phone number 425-882-8838?
BTW - if anyone else has been phone-spammed by these c*ck-gobblers at internetadvancement, please get in touch with me so we can file complaints about abuse of the information in the whois database.
Um, what, exactly, are you trying to accomplish? We already have rel="nofollow".
Now, if you're arguing that there should be a way to mark sections of code as user-contributed, and, thus, not to be indexed, that a) doesn't make any sense, and b) is already possible.
It doesn't make any sense because having random areas of gibberish or non-related content doesn't actually hurt you in search engines, except to the extent people will stop linking to you.
And it's already possible via trivial javascript to hide that part of the page from search engines, if you actually had a reason to do that.
I'm not sure the idea makes a lot of sense, anyway. Google doesn't want to link to pages that are full of spam, so coming up with some magical tag that says 'This could be crap' isn't really going to work. They want to link to useful things, regardless of who created the content.
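For context, the rel="nofollow" attribute mentioned at the top of this comment is applied per link. A minimal sketch of how forum software might render a user-submitted link with it is shown here; the function name is hypothetical and real forum packages will differ.

```python
from html import escape


def render_user_link(url: str, text: str) -> str:
    """Render a user-submitted link with rel="nofollow" and escaped content."""
    # escape() neutralizes quotes and angle brackets so that user input
    # cannot break out of the attribute or inject markup.
    return f'<a href="{escape(url, quote=True)}" rel="nofollow">{escape(text)}</a>'
```

The attribute only asks search engines not to count the link as an endorsement. It does not hide the surrounding spam text from the index, which is the part the two posters are arguing about.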
If corporations are people, aren't stockholders guilty of slavery?
giving different content to the spiders as opposed to 'real peoples' browsers is a surefire way to kill your pagerank.
If only google already had a standard for it, respected by multiple search engines....
Maybe Javalobby should watch for the googlebot and not deliver new posts until they're known to be attractive to Google? If they do indeed clean up the posts "quickly," then why not set the policy on their web servers to only deliver posts that are 2 X "quickly" old to clients with the googlebot user agent? just a thought.
And let's not forget the infamous interview :) http://interviews.slashdot.org/article.pl?sid=01/06/07/1421238
As a "community forum" it should be able to weather not being listed by google ... right?
After all, how many people ended up on slashdot via google as opposed to word-of-mouth?
Google isn't the be-all and end-all. There's no "live by Google, die by Google." If you've got something good, people WILL tell others.
How many days after a site has been transformed by hijackers/forum spammers/whoever into a pile of crap should it come off the top of googles search results? A day? A week?
60 days but you can request reinclusion sooner with Google Webmaster tools
I'm pretty sure that presenting pages to Google that are different from what regular people would see is already a breach of the terms of being listed by Google, and it's already resulted in sites being de-listed. (ie. If Google can't see what other people would see, how is it supposed to index and rank it appropriately for its users?)
This might not be the case if it could be done using something like robots.txt, but as someone else pointed out, not letting Google see recent content is a likely way of reducing your pagerank. Google tends to promote sites that have recent content.
What they say is law these days. It is NEVER that good to give one entity so much power.
Most of the time I ended up on it when searching for hints on a particular problem. People just go to hang out on slashdot, but I imagine many people, like me, end up reading and posting on javalobby only in a context of a specific question.
They actively censor political and religious sites with which they disagree.
Fair enough ... if it has the answers, people will go to it no matter where google places it.
What gets me is how self-feeding all this infatuation with being on the first page of google has become. I'm not dissing google - I use it dozens of times a day at work - but the internet is much more than pagerank.
I agree. I don't see how anyone can complain about not being the top hit for some general term (even if it's java.sun.com for "java"). If you don't think the results are relevant - submit a more specific query. And I guess for general terms one can show a few hand-curated links on the top ... and I think google already does something like that.
I'm actually experimenting with the google adsense myself this week on a new website. I tried to make a site based on what I wanted to see, and the info I wanted to be able to find. After week #1, I've had 250 hits and a total of 5 ad clicks. It's been a lot of fun while I've had some spare time, but I feel I may already have hit my max.
Posting on slashdot gets me a few hits. Google refuses to show all of my site: only the front page, and the forums despite submitting a sitemap per the google specs.
Any good advice for someone like me?
Fix Your Own TV - RiddledTV.com Avoid the Landfill
How many days after a site has been transformed by hijackers/forum spammers/whoever into a pile of crap should it come off the top of googles search results? A day? A week?
If they'd maintained their site properly, it wouldn't have happened.
Why is this stuff being modded Insightful? Some spam posts residing briefly in an open forum does not make a site a "pile of crap", and removing the posts is "maintaining the site properly".
It was actually just a bad guess as to why the site dropped out of Google for a few days.
rd
You're misunderstanding who the user of Google is. Don't worry. Most slashdotters make this mistake.
*You* are not the user of Google - You're the *product* sold by Google. The real users are the websites that are advertised by Google.
I don't know what % of the *on-line advertising market* Google controls, but if an anti-trust case were to be made (ie: advertisers have to play by Google's unfair rules in order to have an on-line presence), it'd be through that angle, not by allegedly controlling the "on-line search" market.
Google refuses to show all of my site: only the front page, and the forums despite submitting a sitemap per the google specs. Any good advice for someone like me?
What it seems to me they are doing is not showing hits in searches for areas of a site that the algorithm determines are not core to the site premise. This I think is an attempt to make search results "relevant", or to put it more bluntly, to keep sites from fooling the search engine, and to keep people using it from clicking on a site based on unrelated content.
Sort of the same premise as spam mail containing random bits of text to fool spam blockers.
Also links on the home page to those other pages will encourage search engine bots to find them.
rd
Oh.. yeah, this is reality... Google actually really screwed the pooch when they gave this site a rank to begin with... do any of you have a decent search engine?.. I thought that google was worth something... I guess that they sold their souls to youtube.
I thought that /. was for people that knew spam from crap... please give me some real info instead of total garbage.
Seriously, how did this make it on slashdot? Did we lose all moderators?
Visitors from Google are not your real customers, they are more like guests. You should service them well, of course, and they do contribute a lot to your profitability (if you are commercial) or your popularity as a site. But your real customers are those who remember your URL by heart and visit you again and again, posting to your forum and buying from you repeatedly. You should focus on them. Have an opt-in mailing list where they can learn about your news and make sure they are interested to know what new things you have to offer.
I've got a dedicated server running a few over a hundred domains. Some very well maintained and others not. The general consensus from the SEO bigwigs is that the burst of traffic you'll see at first is from GoogleBot's spiders picking up your keywords. If your site hits on some good keywords with low PKI you'll see good traffic within a couple of months. Once that traffic starts rolling in someone at Google may actually view your site. If you happen to get decent traffic from low PKI keywords you'll see your traffic boost or diminish sooner than with less traffic on higher PKI keywords. This is likely because spammers intentionally pick those lower PKI keywords to get those quick traffic bursts where there's little keyword competition, and therefore Google pays closer attention to them. If it's determined that the site is garbage, incomplete, or just reposts from other sites, your pagerank will be manually turned down. This is not permanent unless you're obviously abusing the system, however.
Correct me if I am wrong, but all there is on google's home page is the Google logo, the search "box", and a couple of google internal links. Being showcased in google's search results is not as important as having a site with enough content to generate enough traffic and use google ad sense to get a shit load (a ton) of money.
So... thou shalt not desire to be ranked #1 unless you can do something useful with it.
Is it just me, or do you find the first page returned isn't that good anymore on Google due to people exploiting their algorithm? I've started going to the 3rd or 10th page returned to start seeing results less "gamed."
It's turtles all the way down!
There is worse. Windows fanboys. They DO exist, really!
Great Intellect...
I'm pretty sure NineNine knows that already.
I have a website that got around 600 visitors a day with a certain domain name (.fr).
When my host had to renew the subscription for the domain name, they didn't, even though I paid. Then someone "stole" the domain name as soon as it was free.
Now I had to buy another domain name (.com) and some a**hole put ads on my former domain.
Does anyone know what I can do to have google index the new one and give it the position my former domain name had ?
And it's already possible via trivial javascript to hide that part of the page from search engines, if you actually had a reason to do that. ...which harms accessibility... Not a bright idea really, since not only are you preventing some of your customers from seeing the site, but in many parts of the world this is actually illegal.
http://blog.nexusuk.org
This happened last year. There've been several follow ups to the original blogpost on how the situation was resolved. There's even a guy from google who showed up in the forum and offered his help to fix things. By now the damage has been undone of course but for some time Google stopped returning any javalobby results.
Btw. it's javalobby.org, they're not for profit and sort of pay the bills with the advertisements. Just like slashdot. I've been a member of their site since 1998 and they're good guys.
Jilles
Sure, but search engine accuracy should be besides the point here.
What the guy in the story claims is that his site ceased to exist, and dropped off Google's indexes.
I'm not going to waste time looking up various search terms and see where they place his site, but obviously it's in their index, so his main argument falls right there.
Beware: In C++, your friends can see your privates!
The positive sides of the story would in this case be twofold:
1. That a Java site not having as bad spam problems has likely gained notability to Google at the cost of this one.
2. That his site should be back in case he fixes his problems at the next Google spidering, at least if Google is consistent here, and I don't see why they shouldn't for the best of their search index.
Beware: In C++, your friends can see your privates!
Now on to losing PageRank... even if it has happened (proof please), who cares? Sites lose PageRank every day, why is this one special?
Preventing the crap content in the first place is not in Google's remit, rather it is the webmaster's job.
It is entirely within Google's rights to alter their indexing at will. It is unfortunate if your business depends on a free service, but that makes it all the more in your interest to secure your site against abuse.
The same shit happened to me more than two years ago. I not only removed the offending pages, but actually return error code 410 Gone, which should mean that all references to the resource get removed. Despite that, I still see robot queries for these pages; as late as December Google tried to fetch the offending pages. I also see real user requests to the offending pages, so links must still exist somewhere on the Internet, even though these pages only existed for a month. But I can't find them, and they don't show up when searching for links to my page.
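For anyone wondering what "return 410" looks like in practice, here is a minimal sketch using Python's standard http.server. The paths in REMOVED_PATHS and the names status_for and GoneHandler are made up for illustration; the poster did not say what server or URLs were involved, so treat this as one possible setup, not theirs.

```python
from http import HTTPStatus
from http.server import BaseHTTPRequestHandler

# Hypothetical URLs of the removed spam pages (assumption, not from the post).
REMOVED_PATHS = {"/forum/thread-123", "/forum/thread-456"}


def status_for(path: str) -> HTTPStatus:
    """Return 410 Gone for permanently removed pages, 200 OK otherwise."""
    clean = path.split("?", 1)[0]  # ignore any query string
    return HTTPStatus.GONE if clean in REMOVED_PATHS else HTTPStatus.OK


class GoneHandler(BaseHTTPRequestHandler):
    """Toy handler: answers removed pages with 410 instead of 404."""

    def do_GET(self):
        status = status_for(self.path)
        if status is HTTPStatus.GONE:
            body = b"This page has been permanently removed.\n"
        else:
            body = b"OK\n"
        self.send_response(status)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
```

To try it locally, something like HTTPServer(("127.0.0.1", 8000), GoneHandler).serve_forever() from http.server would serve it. Note that 410 only tells a crawler the page is gone for good; as the post shows, it does not make third-party links to it disappear.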
My site is indexed now, but I have never regained the ranking.
What Google should offer - knowing problems with forum and blog spam - is to temporarily derank the site. If a site is deranked because of such spam a flag should be made in the index, such that next time they come around they will check if the page is still offending.
Google does offer in their webmaster tools both the opportunity to report spam on web pages, to get these pages removed, and to request reinclusion.
Google's indiscriminate indexing techniques? Shouldn't Google just be proactive and exclude areas prone to spam invasions such as site forums? (After all, Google is supposed to be the origin of all innovation on the internet).
Well, yeah, the whole thing is a bit silly.
You want to keep engines away from content, keep them away from the whole page, which is doable in two different ways using the robot exclusion standards.
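Presumably the two ways meant here are a robots.txt rule and a robots meta tag in the page itself. A rough sketch of both, with the robots.txt rule checked using Python's standard urllib.robotparser; the /forums/ path, the example.org URLs and the allowed helper are assumptions for illustration only.

```python
from urllib.robotparser import RobotFileParser

# Way 1: a robots.txt at the site root that keeps all crawlers out of
# a hypothetical /forums/ area.
ROBOTS_TXT = """\
User-agent: *
Disallow: /forums/
"""

# Way 2: a per-page meta tag placed in the <head> of each page that
# should stay out of the index.
META_NOINDEX = '<meta name="robots" content="noindex, nofollow">'


def allowed(robots_txt: str, agent: str, url: str) -> bool:
    """Check whether a crawler honoring robots.txt may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    print(allowed(ROBOTS_TXT, "Googlebot", "http://example.org/forums/thread1"))
    print(allowed(ROBOTS_TXT, "Googlebot", "http://example.org/articles/1"))
```

Either way the exclusion applies to whole pages, never to a fragment of one, and it only works for crawlers that choose to honor it.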
The idea that search engines would support a tag to exclude parts of pages as 'non-official content', and yet index other parts of the same page, is a bit silly. Why would they want to do that? When they send users to pages, they send them to whole pages, so they want to know about the whole page. If a ton of spam is screwing up Google's relevancy for a page, then, duh, Google's right, the page is full of crap and not as relevant as another page that isn't full of crap.
The person I originally replied to wants to have his cake and eat it too, where he can tell Google and other search engines to pay attention to certain parts of a page, and not other parts. If Google was willing to do that, why would they spend so much time fighting 'cloaking', where different pages are sent to search engines vs. users?
If corporations are people, aren't stockholders guilty of slavery?
Maybe so but the point is still valid. G can censor if it wants to kill your site but they still wear the white hat.
Google is the next Microsoft and will probably come to be hated like M by the yet unborn or just newly born crowd, because it will represent the establishment like M does today. IBM was the previous "Big Blue" and they were just as hated in the 80's and 90's when Bill's star was rising and he could do no wrong. Yes, just like Google is idolized today.
G has no competition (a monopoly, sort of) so is not threatened. When a threat arrives, and it will, those business fangs will come out like Bill's.
But for now enjoy.
Post an article about it on Slashdot.
"To be is to do." --Socrates
"To do is to be." -- Aristotle
"Do-Be-Do-Be-Do..." --Sinatra
Yeah, pretty much ditto. I had my webpage drop from Google and it is exactly like you don't exist anymore. Also, searching for my real name turns up nothing for about a page and a half before it puts up a script I wrote under my real name a number of years back. Looking a lot longer, I finally find that article I submitted to BBspot, Half-Life 2 Physics Engine Contains Grand Unified Theory. Really, it is all pretty much Google. At least my username is original enough that almost every last Google link is really me.
It is no longer uncommon to be uncommon.
I don't know. The article will drop off the /. home page after 24 hours. The comments page will last longer, but PR for comment pages is low.
Search RapidShare and MegaUpload!
Well, since I joined before Google existed, and your uid is less than half mine, I have to say that you may not have the most relevant experience when it comes to judging the number of people who find /. via Google.
Not that I'm saying you're wrong, necessarily...just that I'd have to see some numbers to back it up.
Reality has a conservative bias: it conserves mass, energy, momentum...