Slashdot Mirror


Google De-indexes Talk.Origins, Won't Say Why UPDATED

J. J. Ramsey writes "Talk.Origins is an archive with thousands of pages exposing creationist pseudoscience. Rather mysteriously, Google pulled the plug on its search engine, giving only the vague reason: 'No pages from your site are currently included in Google's index due to violations of the webmaster guidelines.' This was apparently triggered by a recent cracking of the site that added 'hidden links to non-topical sites,' but Google won't say just what the violations were. Talk.Origins webmaster Wesley R. Elsberry believes that this Google policy harms honest webmasters." From the article: "My mission, whether I liked it or not, was to find and fix whatever problem the [Talk.Origins Archive] might have, with no guidance as to what the problem was and nothing at all about where to start looking... I was extremely lucky. The damage to my site was limited and in the first place that I happened to look. Other honest webmasters might not be so lucky. They may have to undertake an arduous process of vetting pages, essentially having to second-guess the mind of the cracker in trying to locate a problem that Google knows the exact location of." Thanks to an alert reader who sent in Matt's blog posting about how Google handles hacked sites.
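The "arduous process of vetting pages" described above can be partly automated. What follows is only a minimal sketch, assuming the injected links were hidden with an inline `display:none` or `visibility:hidden` style; the article does not say how the links on the Talk.Origins Archive were actually hidden. The names `HiddenLinkFinder` and `find_hidden_links` are made up for this illustration.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urlparse

# Assumption: the cracker hid links with an inline style. Links hidden by an
# external stylesheet or by script would not be caught by this check.
HIDDEN_STYLE = re.compile(r"display\s*:\s*none|visibility\s*:\s*hidden", re.I)
# Elements that never get a closing tag, so they are not tracked as "open".
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class HiddenLinkFinder(HTMLParser):
    """Collects off-site links that sit inside an inline-hidden element."""

    def __init__(self, site_host):
        super().__init__()
        self.site_host = site_host.lower()
        self.stack = []   # (tag, is_hidden) for every currently open element
        self.found = []   # hrefs of hidden links pointing at other hosts

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        hidden = bool(HIDDEN_STYLE.search(attrs.get("style") or ""))
        inside_hidden = hidden or any(h for _, h in self.stack)
        if tag == "a" and inside_hidden:
            host = (urlparse(attrs.get("href") or "").hostname or "").lower()
            # Relative links have no host and are treated as on-site.
            if host and host != self.site_host:
                self.found.append(attrs["href"])
        if tag not in VOID_TAGS:
            self.stack.append((tag, hidden))

    def handle_endtag(self, tag):
        # Pop back to the nearest matching open tag, tolerating sloppy HTML.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i][0] == tag:
                del self.stack[i:]
                break


def find_hidden_links(html, site_host):
    """Return hrefs of hidden links in `html` that leave `site_host`."""
    finder = HiddenLinkFinder(site_host)
    finder.feed(html)
    finder.close()
    return finder.found
```

Run over every page of a site, a check like this narrows the search, but an empty result does not prove a page is clean.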

33 of 575 comments

  1. Re:Words are Meaningless by Baricom · · Score: 5, Insightful

    Nobody was evil here. The guy's site got hacked and spam links added, Google rightfully de-listed him, and then the webmaster found the problem, fixed it, and asked Google to re-list. Am I missing something?

  2. Google censoring Usenet? Not! by BorgCopyeditor · · Score: 4, Insightful

    The writeup sucks. It implies that Google is censoring Usenet.

    --
    Shop as usual. And avoid panic buying.
  3. Re:huh? by Daniel+Dvorkin · · Score: 4, Informative

    That's the Google Groups archive of the talk.origins newsgroup, which is a different animal (an ancestral form, one might say) from the Talk.Origins Archive web site. It was the site that was delisted.

    And indeed, as of right now (10:35 PM CST) a Google search for "talk.origins" doesn't show any links at all to the Talk.Origins Archive. In fact, the first link that comes up is to a young-Earth creationist site which claims to offer "intellectually honest responses to the claims of evolutionism's proponents, including--but not limited to--the 'Talk.Origins' newsgroup and the 'Talk.Origins Archive' website."

    Conclusions about species competing in crowded niches are left as an exercise to the reader.

    --
    The correlation between ignorance of statistics and using "correlation is not causation" as an argument is close to 1.
  4. The problem by Aexia · · Score: 5, Insightful

    was that he had no idea why he was delisted so he could fix the problem.

    1. Re:The problem by wfWebber · · Score: 4, Insightful

      And with good reason I'd say. If I add a couple of ways to "fool" a search engine to my web pages, I can't seriously expect that same search engine to tell me which of the tricks they discovered?

      --
      Never underestimate the bandwidth of a station wagon full of tapes hurtling down the highway. -- Andrew S. Tanenbaum
  5. Re:ahhh i love it by scowling · · Score: 4, Informative

    Except, of course, that "creationist" does not equal "Christian". Talk.origins exposes *all* creationist pseudoscience, from *all* sources.

    --
    www.kitchengeek.com -- Nosh for
  6. You love to whine, don't you? by BorgCopyeditor · · Score: 4, Insightful

    "exposing creationist pseudoscience"...

    Slashdot is so biased I don't know why I even bother anymore. Bashing Christians is so fashionable these days.

    "Creationist" != "Christian", but don't let that stand in the way of your pretending to feel victimized.

    --
    Shop as usual. And avoid panic buying.
  7. Re:ahhh i love it by One+Louder · · Score: 5, Insightful
    "exposing creationist pseudoscience"... Slashdot is so biased I don't know why I even bother anymore. Bashing Christians is so fashionable these days.
    Wait a second - I thought that creationism was a "valid alternative scientific explanation for the origin of the species", and not religion. Are you saying that it's really religion, specifically Christianity, wrapped in deceptive packaging?

    Sounds like you blew the cover there, dude.
  8. Whine, Whine, Whine by MDMurphy · · Score: 4, Insightful

    So many people refer to Google as if it were a human looking at web sites and giving it the big thumbs up or down. As part of the indexing if the spider finds "violations" such as presenting a different page to spiders than to humans, it risks being dropped from the index. To expect a human response to why each site triggered the de-indexing is not reasonable.
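The "different page to spiders than to humans" violation mentioned above can be checked from the outside by requesting the same URL twice and comparing the results. This is a rough sketch, not how Google does it: the `fetch(url, user_agent)` callable is a hypothetical stand-in for whatever HTTP client you use, and it assumes the site keys on the User-Agent header, whereas a real cloaker might key on the crawler's IP address instead.

```python
from html.parser import HTMLParser

BROWSER_UA = "Mozilla/5.0"
CRAWLER_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"


class _LinkCollector(HTMLParser):
    """Gathers the href of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = set()

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.add(href)


def extract_links(html):
    collector = _LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links


def crawler_only_links(fetch, url):
    """Links served to the crawler user agent but not to the browser one.

    `fetch` is a caller-supplied function (url, user_agent) -> HTML string.
    """
    as_crawler = extract_links(fetch(url, CRAWLER_UA))
    as_browser = extract_links(fetch(url, BROWSER_UA))
    return as_crawler - as_browser
```

A non-empty result is a hint worth investigating rather than proof, since pages with rotating content can differ between two requests for innocent reasons.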

    In the webmaster's whining about Google, he complains about the request to be re-indexed containing:

                        * I believe this site has violated Google's quality guidelines in the past.

                        * This site no longer violates Google's quality guidelines.

    He thinks these are "an admission of guilt", but they don't say "I violated"; they say "the site violated". So, if the site were hacked and did violate their indexing policy, fix it, say you've fixed it and move on. How many hits has he had over the years that came directly from Google? And did they come from Google due to all those people choosing Google to search for his site or its topics? But now he whines about being delisted for the time it takes him to fix a site he should have kept unhacked in the first place.

  9. Re:Words are Meaningless - Public Utility by TubeSteak · · Score: 5, Insightful
    Like it or not, Google has essentially become a Public Utility.
    I'm going to have to go ahead and disagree with you on that.

    People may be treating Google as a public utility, but Google (a private company) has absolutely no obligations to any website.

    To just pull the plug because you somehow -- maybe not even your fault -- ran afoul of a constantly changing set of rules is not aboveboard behavior for a $157B company.
    Ultimately, Google* has the right to change the rules when & if they please, in an arbitrary fashion, without consulting anyone.

    *When I say "Google" I mean "the guys who own a majority stake in the company and cannot be overruled"
    --
    [Fuck Beta]
    o0t!
  10. Re:Words are Meaningless - Public Utility by vixen337 · · Score: 5, Insightful

    I was under the impression that they told the webmaster the reason they were delisted, they just didn't tell the webmaster the specific pages that the reason pertained to. Like "Your site has been delisted for hidden links to non-topical sites" instead of "Your site has been delisted for hidden links to non-topical sites on pages index.html, intro.html." etc. To me, that's the webmaster's job. Google did their job on their end. What if the site had hundreds of pages of non-topical links? What if Google spiders just stopped at the first one they indexed (as they should)? Should Google be in charge of going through this guy's site and telling him exactly where the problems are? They are a search engine, not a website security firm. People are getting lazier every day and everyone expects someone else to do their dirty work for them. People need to take some responsibility and stop whining.

  11. Re:Words are Meaningless - Public Utility by telbij · · Score: 4, Insightful

    Unfortunately you're missing something too.

    Google is in an arms race with spammers and blackhat seo firms. How are they supposed to know whether someone is honest or just mining them for information for their scam?

  12. Synopsis by operagost · · Score: 4, Insightful

    "Talk.Origins is an archive with thousands of pages exposing creationist pseudoscience"
    This article is a submission containing a biased summary which has little to do with the actual topic, which is the enigmatic status of Google's search algorithms.
    --

    Gamingmuseum.com: Give your 3D accelerator a rest.
    1. Re:Synopsis by Tyreth · · Score: 4, Interesting
      I think that the person you are replying to is referring to the big bang. If you look far back enough into the history of the universe, you get to a point where everything began to exist. At the singularity of the big bang, we find that both time and space began. There is no "before" the big bang, as time did not exist. This is a central part of the cosmological argument for God's existence:
      1. Everything that begins to exist has a cause
      2. The universe began to exist
      3. Therefore, the universe has a cause
      What came before the big bang? That question is meaningless, as time did not exist. So you have a few options, only one of them feasible. The first is that the universe is infinitely old and had no beginning. Once a view of atheists, this is no longer scientifically plausible. The second answer is that the universe came into existence from nothing - absolutely nothing. The third, and most reasonable, is that something else caused the universe to be created. This cause must itself be timeless, and spaceless, as time and space began to exist with the big bang.

      So the atheist must either claim the absurdity that the universe came from nothing, or he(/she) must acknowledge that there was something that created it. And that *something* is inaccessible from scientific analysis. It is not, however, too far from the reach of philosophy and logic. We can draw reasonable conclusions about this entity.

  13. Re:Words are Meaningless - Public Utility by eclectro · · Score: 4, Insightful

    What you're missing is that Google gave him no clue/hint/guide/comment/help on why he was delisted.

    I'm not for censoring any information, and I am not trying to defend Google. But there may be one very good reason why this is happening this way.

    Google is at war with search engine spammers. When google de-lists somebody for spamming their search engine, if they gave a specific reason why then all the spammers would do is tweak their spam farm and be up and running in a couple of hours.

    If they told this guy what was wrong, they would have to spend a huge amount of time and resources telling why everyone is wrong, all the while helping out the spammers.

    Google is a good search engine, but you may notice that if you go beyond the first couple of pages of search results, you will often find nothing but useless "link farms." Unfortunately, spam is no longer limited to email inboxes; it's everywhere.

    --
    Take the cheese to sickbay, the doctor should see it as soon as possible - B'Elanna Torres, "Learning Curve"
  14. Understandably confused that some is not all by Nevyn · · Score: 4, Insightful
    They may have to undertake an arduous process of vetting pages, essentially having to second-guess the mind of the cracker in trying to locate a problem that Google knows the exact location of.

    Bzzt. The website admin needs to locate one or more problems (== however many the cracker planted), and Google knows the exact location of at least one. "one or more" >= "at least one". If google tells people where their problems are, google will be playing whack a mole for eternity. There are contractors/services that should be able to help them/anyone, google is not one of them.

    --
    ustr: Managed string API with ave. 44% overhead over strdup(), for 0-20B
  15. Caped Hacker by derubergeek · · Score: 4, Funny

    This was quite obviously the work of the Flying Spaghetti Monster.

    --
    Trust me. This is an inactive account. Regardless of what the /. bean counters might report.
  16. probably just bad algorithms by martin-boundary · · Score: 4, Insightful
    While it's natural to sympathise with the victimized website, it doesn't follow that Google is doing something Evil(TM) in this instance, rather it's most likely that their current algorithms are badly tuned.

    With the index sizes that are being collected by search engines these days (on the order of 10 billion entries), it's completely naive to think that some humans are sitting at a terminal choosing to delist websites for some policy reason or other. It's also completely naive to think that a human email monkey can do any sort of digging to find out the exact reason that Google's automated algorithm has censored this particular site.

    Instead, Google's engineers have automated algorithms which do all the censorship, and the policy is just there as a thin cover for whatever the algorithm happens to be doing today. It's worse of course, because 1) algorithms change every few months and 2) there's simply no comprehensive way to test the quality of the implementation.

    Anyone who's programmed a nontrivial algorithm knows that obscure edge cases are a bitch, and with 10 billion websites, any algorithm will have plenty of obscure edge cases which nobody has ever tested, nor ever will. The most likely explanation is that the website in TFA is a false positive of some subsystem, but fixing it will require changes to the algorithms, and Google don't want to risk that, would you? The problem will probably go away in a few months when the algorithms are scheduled to be updated.
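To put rough numbers on the scale argument above: the index size is the "10 billion" figure from this comment, while the error rates below are purely hypothetical, chosen only to show how a tiny rate still adds up. Nothing here reflects Google's actual false-positive rate, which is not public.

```python
def expected_false_positives(index_size, one_in):
    """Expected number of wrongly flagged entries if 1 in `one_in` is misjudged."""
    return index_size / one_in


INDEX_SIZE = 10_000_000_000  # "on the order of 10 billion entries", per the comment

# Hypothetical error rates, for illustration only.
for one_in in (1_000_000, 100_000_000):
    wrongly_flagged = expected_false_positives(INDEX_SIZE, one_in)
    print(f"1 in {one_in:,}: about {wrongly_flagged:,.0f} wrongly flagged entries")
```

Even an algorithm that is wrong only once in a million cases would, at that scale, misjudge on the order of ten thousand entries.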

  17. Re:ahhh i love it by thefirelane · · Score: 4, Insightful

    ID does not propose that the creator must be a diety

    ha ha ha ha. Yes, in ID the creator must only be someone eternally existing with the ability to manipulate all matter in the universe at will.

    But diety [sic].... no!

    In case you missed it, in ID it must be a deity, or else who created the creator? If life can not come from non-life, then there must be some eternally existing intelligence to kick things off (aka God). So either you don't understand the theory, or you are lying.

    You have to love when a theory tries to sound more sane by saying "but... it could be space aliens too!"

    Is there anything I'm missing there about ID?

  18. Re:Words are Meaningless - Public Utility by TubeSteak · · Score: 5, Insightful
    Google is a Public company, not a private company.
    Google is publicly traded, but for all intents and purposes, privately owned by 3 people (who control 66% of the shareholder votes).

    So if they know where the problem is, it would be *good* for them to help out and point the site admin at the problem area. Right?
    It might be good, but my point is that Google doesn't have to... and maybe shouldn't.

    To some extent, part of Google's ability to foil bad website behavior relies on security through obscurity. If Google doesn't tell or hint to anyone how the cheat-detecting algorithms work... well, isn't that good for Google?

    I could make the argument that since (as you argued) Google is a public company, they have to do what's best for the shareholders by doing what's best for Google. But that is an irrelevant argument, since there's really only three people whose opinions on the subject matter.

    If Google ever did do something along the lines of what you're proposing, they'd have to put a lot of time & effort into setting up a system that can't be easily abused by link spammers, is easy to use for idiots, etc etc etc.

    That may be more trouble than it is worth, compared to saying "not our problem, deal with it yourself."
    --
    [Fuck Beta]
    o0t!
  19. Re:Words are Meaningless - Public Utility by jrockway · · Score: 4, Insightful

    > Google is at war with search engine spammers. When google de-lists somebody for spamming their search engine, if they gave a specific reason why then all the spammers would do is tweak their spam farm and be up and running in a couple of hours.

    Security through obscurity is no security at all. The spammers already know Google's weaknesses -- that's why there's so much spam everywhere.

    --
    My other car is first.
  20. Re:Words are Meaningless - Public Utility by coaxial · · Score: 4, Insightful

    People may be treating Google as a public utility, but Google (a private company) has absolutely no obligations to any website.

    PG&E is a public company. ComEd is a public company. Verizon is a public company. AT&T is a public company. They're all public utilities. Simply being a publicly traded for profit corporation doesn't mean that you're not a public utility.

    Ultimately, Google* has the right to change the rules when & if they please, in an arbitrary fashion, without consulting anyone.

    Yes, but there is something called ethics. Google is held to a higher standard than the Ackbar and Jeff's Falafel and Oil Change Hut because of their unique position of being depended on by hundreds of millions of people worldwide. Also, Google said they should be held to a higher standard with their "Don't be Evil" slogan.

    Did Google act wrong in this case? No. But that doesn't mean that your larger point, that corporations are beholden to no one, is valid.

  21. Google Webmaster Tools by RockoW · · Score: 5, Informative

    Google has a set of tools for webmasters at http://www.google.com/webmasters/tools/ that essentially gives out every diagnostic needed to fix your site for Google. Additionally, you get statistics for searches and for how GoogleBot sees your site. So you shouldn't blame until you've googled for the answer! Searching for "Google index tool" shows up "Google Webmaster Central"...

  22. Talkorigins hacked by porn spammers by Mouth+of+Sauron · · Score: 4, Informative
    The site www.talkorigins.org is not the only site to have been de-indexed by Google.

    This is a Google cache of talkorigins.org showing the porn spam links.

    However, I checked on deepx.com and it is *not* a porn site.

    From DeepX.com's about page:

    XML provides an open and flexible language for the creation, management and exchange of electronic content. Founded in 2000, deepX has an experienced team of consultants and developers, who specialise in the design and development of solutions using XML and the emerging technologies related to XML.

    Also, another link shows www.theoi.com and it is *not* a porn site, either:

    Here's how THEOI used to look via the Wayback machine.

    Theoi.com has been banned by Google (no reason given) and forced to close down as a result. There are no plans to re-establish this site in the future.

    wu.edu.gh is Valley View University, a Seventh-day Adventist college in Ghana.

    Both deepx.com and wu.edu.gh redirect to porn sites.

    Unsurprisingly, wu.edu.gh, theoi.com and deepx.com have been de-indexed by Google.

    I speculate that all these sites that have been de-indexed were tagged by automated processes.

  23. Re:ahhh i love it by denoir · · Score: 5, Insightful
    Why should it be the responsibility of ID to explain who created the creator?
    Because it otherwise fails to explain anything. If irreducibly complex things require a designer then the designer who designed them will be even more complex. Since the designer theory can't tell us, well, anything, the only way to investigate is to go up the ladder: who designed the designer?

    If you say that that's a metaphysical question that cannot be answered, why not just skip the whole designer/creator bit and say that you are not interested in physical modeling of the world. Invoking an extremely improbable super-being to explain the world is very unhelpful. That's what earlier civilizations did: thunder was Thor riding in his carriage in the sky etc

    What the ID followers want is a return to that using the logic "I don't understand it so it must be God's work."

  24. Re:Backups? by grimJester · · Score: 4, Funny

    That presupposes the site was intelligently designed. Starting with that kind of assumption is completely unscientific.

  25. Re:Words are Meaningless - Public Utility by nacturation · · Score: 5, Funny

    What you're missing is that Google gave him no clue/hint/guide/comment/help on why he was delisted.
    login: root
    password: ******

    Incorrect login for user "root". You got the first and fourth characters correct, and one other character was correct but in the wrong place. Please try again and/or make use of one of the following clues/hints.

    You can also try one of the following non-root accounts:
    1. admin (8 character password)
    2. backup (6 character password, all lowercase letters)
    3. johndoe (5 character password)
    4. maryjane (7 character password)

    Failing that, if you can't remember any passwords this server is located at 1234 Main Street, Anywhere, USA. The server rack key is located in the desk drawer on the second floor in the manager's office. You can boot with a Knoppix CD (inside the rack) and reset the password after mounting the hard drive.


    Often, helpfulness is at odds with security.
    --
    Want to improve your Karma? Instead of "Post Anonymously", try the "Post Humously" option.
  26. Ethics: The users are our customers by MDMurphy · · Score: 4, Insightful

    Google has been up front with where their loyalties lie in the search engine business: With the user. They got big and continue to be big because they give results that the search users are looking for. In general, this means the links they present are on the topic queried for and on the basis of links from other sites the content has been "rated" useful.

    If a site is designed ( or screwed up ) such that it shows as a result to a query when inappropriate, delivers spam, or ranks higher than the content would warrant, and Google still presents it as a search result, then Google has failed their customer.

    Webmasters are not their customers, individuals who are searching are. Ethics says that you give your customers what you promised them. Ethics says you live up to what your stockholders expect by doing what you told them you do: Delivering search results that keep your customers coming back ( and serving them up ads each time ).

  27. Re:Words are Meaningless - Public Utility by HUADPE · · Score: 4, Informative
    PG&E is a public company. ComEd is a public company. Verizon is a public company. AT&T is a public company. They're all public utilities. Simply being a publicly traded for profit corporation doesn't mean that you're not a public utility.

    These companies were all given special monopoly privileges by the force of government. They can run wires, pipes, and other items through your property without your consent, by law. They are required to provide service to all persons in their scope of operation by law. No such law exists regarding Google Inc. and they are not a utility.

    --
    This sig has not been evaluated by the FDA. It is not designed to diagnose, treat, prevent, or cure any disease.
  28. Re:Words are Meaningless - Public Utility by Columcille · · Score: 4, Insightful

    Going on the market puts some control of the company into the hands of shareholders, not the general public. Become a shareholder, then you can have a say and ask for a nice, friendly email.

    --
    I love my sig.
  29. Google emailed this site by GoogleGuy · · Score: 5, Informative

    If you dig deeper, it turns out that Google emailed talkorigins.org to alert the site that it had been hacked and was stuffed with rape and animal porn spam. Google's head of webspam has posted a full write-up.

  30. Re:Words are Meaningless - Public Utility by edumacator · · Score: 4, Insightful
    they dont have to tell anyone how they found the problem, just where. if the webmaster of a site is deliberately trying to cheat google, they already know what pages are in offence anyway.

    But if they tell the webmaster, who might be cheating, (remember, a lot of the exploits out there are actually used by the webmaster) where the problem is, then the cheating webmaster only has to get rid of one exploit and gains insight into the detection methods employed by Google. Then he can leave all the others in place. Wouldn't it be fair to say that the people doing evil are, well, the exploitive webmasters?

    Don't hit reply yet...I know this guy was honest, but how in the hell could Google possibly tell who is legit and who is not? Google can't hope to be "fair," only just.

  31. Re:Words are Meaningless - Public Utility by Gulthek · · Score: 4, Insightful

    It would be the same as Microsoft stopping an application from running under windows.

    Which would be an entirely appropriate response if said application was a virus.