Slashdot Mirror


Published Google Docs To Appear In Search Engines

dotancohen writes "Google plans to make all published documents from Google Docs users crawlable, if the documents are linked from a public Web site. No official announcement appears to have been made, just a short blog post on the subject by a Google employee in a help forum. (One comment on the ghacks.net post linked above says that email was sent to the admins of Google Apps accounts.) There does not seem to be any way to make an individual document not crawlable; you can only un-publish it, at which point Web links to it will not work any more." The move makes sense from one point of view — Google is just making crawlable a document linked from another crawlable document — but it's likely to catch a lot of people by surprise.

19 of 62 comments

  1. Summary is wrong by sopssa · · Score: 5, Informative

    Neither the summary nor the article covers every aspect of this. For a better article, see The Register. The claim "Google plans to make all published documents from Google Docs users crawlable, if the documents are linked from a public Web site." is wrong.

    This only applies to files explicitly published using the suite's "publish as web page" or "publish/embed" options and linked to from a public webpage. This does not apply to files shared via the "Allow anyone with the link to view (no sign-in required)" option, which provides for document sharing without links to the public web.

    So it's not really as bad as it sounds. You have to explicitly publish them as a web page, which at least to me suggests that they might get indexed as well, even more so if they are linked to from other websites.

    The good thing Google could do here is add an explicit warning or small text under the publish option saying that content you publish as a web page might be indexed by search engines as well. Other than that I don't see a problem with this, as the users are explicitly publishing them.

    1. Re:Summary is wrong by AvitarX · · Score: 5, Insightful

      I'm actually shocked they weren't already.

      I mean, that's what google does, it indexes things.

      Why would I expect my google doc I link to would be treated any differently than say, a PDF doc I link to?

      I really just took it for granted that it was searchable.

      --
      Wow, sent an e-mail as suggested when clicking on "use classic" banner, and got a fast response that addressed my msg
    2. Re:Summary is wrong by stocke2 · · Score: 5, Insightful

      It seems to me that if you are purposely publishing it as a web page and linking to it from a public website, you would want it to be crawled.

      I got the email from google since I admin two google apps domains, and have no problem with it. We don't normally publish docs like this, but if I did it would be because I wanted them found.

      I am sure a lot of people on here are going to go overboard like they always do because it is google, but it is not going to expose all of your private docs.

      --
      A Smith & Wesson beats four aces -- Murphy's Law of Poker
    3. Re:Summary is wrong by jcdill · · Score: 2, Funny

      I'm shocked too. "D'oh, Google indexes publicly linked files? Who would have thought of such a thing?"

      This won't take a "lot of people" by surprise but it might take a "lot of stupid people" by surprise. Which is not surprising.

      --
      "I'd much rather be mistaken as a lesbian by a bigot than be mistaken as a bigot by a lesbian."
  2. Google notified users by email by Anonymous Coward · · Score: 5, Informative

    At least for apps administrators, the following email was sent out with instructions on how to prevent this:

    *****

    Hello Google Apps admin,

    We wanted to let you know about some important changes around published documents, spreadsheets, and presentations.

    In a few weeks, documents, spreadsheets and presentations that have been explicitly published outside your organization and are linked to from a public website will be crawled and indexed, which means they can appear in search results you see on Google.com and other search engines. There is no change for documents published inside your organization or shared privately.

    If you wish to prevent users from publishing documents to the public internet, we now offer an admin control in the Google Apps Control Panel that allows users to continue to 'share documents outside the domain' without allowing them to publish the files to the public Internet. To change this setting, follow these steps:

    - Login to your admin control panel
    - Select Service Settings > Docs
    - Un-check the option 'Users can publish documents to the public internet'

    If a user does not want their published Docs to be crawled, then the user must unpublish them by doing the following:

    - Go to the 'Share tab'
    - For documents and spreadsheets, choose 'Publish as web page'. For presentations choose 'Publish/embed'
    - Click on the button that says 'Stop publishing'

    For more details, please see this Help Center article: http://www.google.com/support/a/bin/answer.py?hl=en&answer=60781

    This is a very exciting change as your published docs linked to from public websites will reach a much wider audience of people!

    Sincerely,

    The Google Apps Team

    Email preferences: You have received this mandatory email service announcement to update you about important changes to your Google Enterprise product or account.

    Google Inc.
    1600 Amphitheatre Parkway
    Mountain View, CA 94043

  3. No way! by bmetzler · · Score: 5, Funny

    You mean things available on the internet might be indexed by Google? Holy Cow! I wonder if other search engines also do this "indexing" thing. Mysterious and curious activities for sure, I say.

    1. Re:No way! by Anonymusing · · Score: 5, Funny

      I know! So much for Google's motto of "Don't be evil". Obviously they mean "except when indexing publicly accessible web links"! Those hypocrites!

      --
      Liberal? Conservative? Compare perspectives at Left-Right
  4. Wait.. by R2.0 · · Score: 5, Funny

    Are you saying that I can't publish a document on the Web but limit who sees it?

    That's an invasion of my privacy! Next thing you know you'll be saying I can't stop people watching me bang my wife in my front yard!

    --
    "As God is my witness, I thought turkeys could fly." A. Carlson
    1. Re:Wait.. by Anonymous Coward · · Score: 3, Funny

      Next thing you know you'll be saying I can't stop people watching me bang my wife in my front yard!

      We will gladly help you to keep perverts at a distance, but you must give us the address of the front yard.

    2. Re:Wait.. by selven · · Score: 3, Funny

      1600 Pennsylvania Ave, Washington DC 20500

  5. Re:So? by CastrTroy · · Score: 2, Funny

    Not as bad as Scott Hanselman saying "I'm going to Google that on Bing." Can't remember what the date was, but he said it on his podcast.

    --
    Anthropic principle: We see the universe the way it is because if it were different we would not be here to see it.
  6. I'm aghast! by natehoy · · Score: 5, Insightful

    I'm actually surprised that, so far, no one has misinterpreted this as "all your Google Docs are belong to our search engine" along with a few jihadist vows to delete all data from Google immediately. Instead, everyone seems to have read the article and understood that these documents already should have been indexed, because the users published them on a web site the public has access to.

    Who are all of you people, and what have you done with my Slashdot????

    --
    "This post contains words, known to the State of California to cause thought. Wash brain thoroughly after reading."
    1. Re:I'm aghast! by mewsenews · · Score: 2, Funny

      Heh.. umm.. heh.. *pushes glasses up nose* I'm supposed to .. umm ... the jihad is alive and well, heh .. but

      *portly neckbeard appears*

      what?

      *whispering*

      ok

      *portly man exits*

      ok anyway, we're taking it easy on Google this time and if you want to disagree with us you are required to accuse us of having rose-coloured glasses and being on a "honeymoon" with them.

      If you accuse us of reading the article again, we're not.. umm.. there will be.. problems waiting for you, in fact I might not be surprised if you woke up and found a snarky reply attached to one of your comments. You do not mess with us.

      *wipes cheeto crumbs from shirt*

      Yeah.

      *exits*

  7. Re:So? by natehoy · · Score: 4, Funny

    If you don't remember the date, you can always Bing it on Yahoo!

    --
    "This post contains words, known to the State of California to cause thought. Wash brain thoroughly after reading."
  8. Good by Tsiangkun · · Score: 4, Interesting

    This does nothing to the docs I've shared with other gmail users or people with google accounts, while enhancing the web presence of the documents I've explicitly published as web pages.

    I expect web pages to be crawled, indexed, and searchable.
    I see this as a good thing.

  9. Preposterous! by not+already+in+use · · Score: 5, Funny

    I for one am filled with feigned outrage, because the way Slashdot presents this article dictates that I be!

    --
    Similes are like metaphors
  10. Necessary people Notified !!! by Lordy2001 · · Score: 4, Informative

    As an admin of multiple Google Apps sites, I was sent that email for each Apps site I administer. I don't see why the summary implies that there were no notifications, but this being Slashdot I am not surprised.

  11. Re:Viruses by Kalriath · · Score: 3, Informative

    What do you think are the odds that exploited documents will be published to these documents too?

    Zero, because this is about Google Docs, not Google Groups.

    --
    For a site about things like basic rights, Slashdot users sure do like to censor "dissent".
  12. Re:So? by selven · · Score: 2, Informative

    Besides, what is the past tense of "bing"? Is it "bang", as in "I don't know much about your mother so I bang her"?