Slashdot Mirror


Google Cache Makes Murdoch's K-12 Site Look Obscene

theodp writes "Rupert Murdoch's Amplify Education site is all about the kids, so it's understandable that the site's Terms of Use bans abusive, pornographic, obscene, and vulgar content. But if you use Google to do a site search of Amplify.com (e.g., site:amplify.com donkey), you may get quite an unexpected eye-opener (redacted, but still NSFW). So, does someone at Amplify really want to "@&^$" your "a**"? Of course not. But this does serve as a cautionary tale of the perils of buying a second-hand domain name when pages of the shuttered site may live on in cache-land. Prior to its conversion to a site for kids' education, Amplify.com was a social sharing product that allowed users to clip favorite sites from the web and add their own commentary. Google does note that removed content may still show up in Google's search results in certain situations (removal requests can be made)."

Update: 04/08 17:04 GMT by T : Stephanie Chang writes (in a comment below): "Hi, I’m the editor of Amplify.com. We purchased our domain name in February 2012 and took ownership of the site in July 2012 for use as our company's home page. Prior to that, the domain was used by its previous owners as a social-sharing site. As a result, some old content dating back to the previous domain ownership still shows up as cached on certain search engines. Amplify Education, Inc. did not produce the cached content in question nor do we in any way endorse it. We’re working with Google and other search providers to make sure caches of our site are up to date. In the meantime, we apologize to anyone whose attempts to locate information on amplifying donkeys resulted in a negative browsing experience."

21 of 101 comments

  1. Other good porn sites? by Anonymous Coward · · Score: 2, Funny

    Thanks! I'm looking for suggestions for other good porn sites - preferably free, although I'm not opposed to paying a little if the content is particularly high quality. Good job /.!

    1. Re:Other good porn sites? by SuricouRaven · · Score: 2

      fchan or e621 :>

    2. Re:Other good porn sites? by RobbieThe1st · · Score: 4, Funny

      Frigging Furries, always ruin everything!

  2. This is why you robots.txt after a purchase by Anonymous Coward · · Score: 5, Informative

    If you want to establish your own site after taking over a domain, always put up a deny-all robots.txt for a couple of weeks to clear out its Google cache and archive.org entries.
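For reference, a minimal sketch of what a "deny-all" robots.txt tells crawlers, checked with Python's standard `urllib.robotparser`. The helper name `is_blocked`, the example URL, and the two user-agent strings are illustrative assumptions, not anything from the post above. It only shows what a compliant crawler is asked to do; whether that actually clears a cache or an archive is up to each service.

```python
from urllib.robotparser import RobotFileParser

# A deny-all robots.txt: applies to every user agent and every path.
DENY_ALL = """\
User-agent: *
Disallow: /
"""


def is_blocked(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt asks user_agent not to fetch url.

    Hypothetical helper for illustration; it parses the text locally
    instead of fetching a live robots.txt.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # User-agent names are examples only; the wildcard rule covers any name.
    for agent in ("Googlebot", "ia_archiver"):
        blocked = is_blocked(DENY_ALL, agent, "http://example.com/old-page")
        print(agent, "blocked" if blocked else "allowed")
```

An empty robots.txt has no rules, so the same helper reports nothing as blocked for it.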

    1. Re:This is why you robots.txt after a purchase by c · · Score: 2

      We're talking about Rupert Murdoch; it's a pretty solid bet that robots.txt won't be part of his solution.

      --
      Log in or piss off.
    2. Re:This is why you robots.txt after a purchase by mjwalshe · · Score: 2

      This is a traditional dead-tree publisher; it's like training cats to push jelly uphill to get things done properly at this sort of organization.

    3. Re:This is why you robots.txt after a purchase by Anonymous Coward · · Score: 2, Insightful

      If that actually works, that's really scary. That would mean that the Internet Archive's copy of a whole website could be removed entirely just because the domain name changed ownership. There are quite a few scenarios where this is clearly unwanted, the most obvious ones being the operator of the site running out of cash and selling it, or a site that contains dirt on e.g. the political process that gets confiscated or pressured into removing the pages.

  3. Re:What's the news here? by xclr8r · · Score: 2

    Google's webmaster tools can limit issues like this.

    --
    Beware of those who profit off the docile and persecute the unbelievers.
  4. Re:What's the news here? by alphatel · · Score: 3, Insightful

    Google's webmaster tools can limit issues like this.

    As can wary domain buyers who know to look at a domain's history as part of the valuation.

    --
    When the foot seeks the place of the head, the line is crossed. Know your place. Keep your place. Be a shoe.
  5. Re:Obviously the cached content was not current by Frosty Piss · · Score: 4, Insightful

    What's with the "f_ck" and "a_s"? If you thought the word and probably say the word, why not type the word?

    Fuck and Ass. There, no one died.

    --
    If you want news from today, you have to come back tomorrow.
  6. And Google Street View makes me look bad... by truedfx · · Score: 3, Insightful

    ...if the previous residents of my house liked to decorate the windows with pentagrams? Or do people understand that different people live at the same address at different times?

    1. Re:And Google Street View makes me look bad... by mounthood · · Score: 5, Insightful

      ...if the previous residents of my house liked to decorate the windows with pentagrams? Or do people understand that different people live at the same address at different times?

      No, not when it comes to the internet. If hotmail.com was sold and became a p0rn site, it'd be a media apocalypse. Eventually people would understand the difference but they don't today.

      What should be done, relative to the popular ignorance on this subject, is simple: the buyers of used domains should be careful to guard their reputations, allowing caches to expire, 404'ing inbound links from old affiliates, etc... A more interesting discussion would be, What technical steps should be taken when buying a used domain?

      --
      tomorrow who's gonna fuss
  7. Re:Obviously the cached content was not current by OzPeter · · Score: 2, Funny

    What's with the "f_ck" and "a_s"? If you thought the word and probably say the word, why not type the word?

    Fuck and Ass. There, no one died.

    As far as I am concerned .. an Ass is a four legged animal that is a subgenus of Equus. Now if you were talking Arse .. you'd be heading into a much darker void.

    --
    I am Slashdot. Are you Slashdot as well?
  8. Re:"Cache-land" by retchdog · · Score: 4, Informative

    It's basically equivalent to quoting a portion of a work for a book review, which is fair use. Google's profitability is irrelevant.

    People should be thankful for having an opt-out robots.txt at all. It would in most cases not violate copyright for Google to ignore it completely (it might violate the site's TOS, but it is questionable whether even this holds any weight); they are just courteous enough to honor it. robots.txt is there mostly to prevent servers from being overloaded, or to keep content private, not to enforce copyright on publicly-facing content.

    No one cares what you "find weak" or "struggle understanding." I'm sure there's quite a bit.

    --
    "They were pure niggers." – Noam Chomsky
  9. Re:Obviously the cached content was not current by ShaunC · · Score: 2

    <boondock-saints>FUCK! ASS!</boondock-saints>

    --
    Thanks to the War on Drugs, it's easier to buy meth than it is to buy cold medicine!
  10. Re:This seems like a Google issue by AK Marc · · Score: 3, Interesting

    That's why it's amusing. And how long should a cache hold an old site? You want to be able to pull it up years later, but don't want to pull it up years later. Maybe clear on transfer of ownership? But then, how can I refer to the "old" site if the cache is wiped on transfer?

    So, if this is a "Google issue" then what's the solution? You'll piss off someone somewhere, no matter what you do. That's not a Google issue, that's an issue with all opinions.

  11. Re:Obviously the cached content was not current by isorox · · Score: 2

    What's with the "f_ck" and "a_s"? If you thought the word and probably say the word, why not type the word?

    Fuck and Ass. There, no one died.

    Between you hitting submit and Slashdot entering it in the database (500ms), someone probably did die.

  12. Re:This seems like a Google issue by flimflammer · · Score: 2

    I don't think Google's cache should necessarily be used for looking into a website's history, especially beyond an ownership change when the site is completely different. That's something for the Internet Archive project. I think people should be able to request that any previously cached pages be removed (can they already? The notion of pages being removed on request was vague enough that I don't know if it's on a per-page basis or can be per site) and updated with modern content. It doesn't need to be an immediate process, just a queue that Google goes through.

  13. Re:Obviously the cached content was not current by fustakrakich · · Score: 2

    There, no one died.

    False! I bet you that more than 5,000 people died in the time it took you to write that post. The good news is that as a result of people committing 'obscenities', around 7,000 more were born.

    --
    “He’s not deformed, he’s just drunk!”
  14. Fake outrage. by TapeCutter · · Score: 3, Insightful

    It's like the linking bullshit: we all know that if Murdoch wants to stop Google indexing his propaganda, all he need do is fix his robots.txt. Same deal here, the process/facts are irrelevant when you are trying to paint the enemy as an irresponsible pornographer, a brazen thief, a despicable leech, or whatever bad news story he can dream up where Google are trampling all over his delicate sense of entitlement.

    --
    And did you exchange a walk on part in the war for a lead role in a cage? - Pink Floyd.
  15. Wayback Machine Exclusion is Permanent by chazchaz101 · · Score: 2

    If you add a deny-all robots.txt, the domain will be permanently excluded from the archive.org Wayback Machine, regardless of whether you later change it to allow crawling.

    Currently there is no way to exclude only a portion of a site, or to exclude archiving a site for a particular time period only.
    When a URL has been excluded at direct owner request from being archived, that exclusion is retroactive and permanent.

    http://archive.org/about/faqs.php#14