Slashdot Mirror


Speculating About Gmail

rjelks writes "The Register is running an article about Google's new email service that was mentioned earlier, here. The story details the new privacy concerns about Gmail's privacy policy and Google's tracking habits. The policy states that Google will not guarantee the deletion of emails that are archived even if you cancel your account. 'The contents of your Gmail account also are stored and maintained on Google servers in order to provide the service. Indeed, residual copies of email may remain on our systems, even after you have deleted them from your mailbox or after the termination of your account.'" Reader cpfeifer writes "Rich Skrenta (founder of ODP, and Topix) speculates in his blog that the real product Google is creating isn't web search or email, but a massively scalable, distributed computing platform. 'It's a distributed computing platform that can manage web-scale datasets on 100,000 node server clusters. It includes a petabyte, distributed, fault tolerant filesystem, distributed RPC code, probably network shared memory and process migration. And a datacenter management system which lets a handful of ops engineers effectively run 100,000 servers.' If he's right, the question isn't what product will Google announce next, but what product will they not be able to announce?"

35 of 612 comments

  1. Only one? by zackeller · · Score: 5, Interesting

    Here's my question: how are they going to make sure people only have one account each? What's to prevent people from getting dozens and backing up their hard drive?

    1. Re:Only one? by radionotme · · Score: 5, Interesting

      Probably in a similar way to other email services, there will be a maximum size to attachments. Even if it was set at double the size of competitors, that would still only be about 10MB - how many people are seriously going to back up their hard drives in 10MB chunks?

    2. Re:Only one? by silentbozo · · Score: 5, Interesting

      Amazing. At some point, Google could have copies of every new document or content produced, all for the cost of hosting. They would, by default, become the next Library of Congress.

      So, who's the lucky supplier that has the contract to provide all the drives and computer assemblies? Any RFP's available for wiring all this stuff up and maintaining it?

  2. It just isnt private email by Moonpie+Madness · · Score: 5, Insightful

    it's a different sort of tool, with the advantage of tracking etc. and the disadvantage of not being private. just keep that in mind and there aren't many problems. i love the idea, and i'll use it if i can. i won't say anything extreme or criminal, and really, it is their property, so they can offer it for my use with whatever terms they like. IP rights and plagiarism ideas are rapidly changing in our shrinking world, so keep that in mind

  3. Privacy by Anonymous Coward · · Score: 5, Interesting

    I presume I probably wasn't the only person who put their email address into the 'interested in an account?' section on the gmail website before remembering that it could be linked to all my previous searches on this machine... http://www.google-watch.org/email.html suggests deleting the google.com cookie before and afterwards, but might be too late for that...

    -jermy

    1. Re:Privacy by Anonymous Coward · · Score: 5, Interesting

      Yeah, but don't forget to read Google-Watch-Watch - that Daniel Brandt is, to put it politely, completely bananas. A fruit-loop. One badger short of a sett. A total lampshade.

      If Google are tracking everyone for targeting advertising, etc, why does everyone get near-identical search results for the same search queries? And why are the adverts quite obviously keyword-based? (Search for 'digital camera drivers linux', for instance, and get adverts for digital cameras).

  4. disk space is cheap. by ron_ivi · · Score: 5, Insightful

    I've seen Fry's have 200GB drives on sale for $79 before; and I'm sure if you're buying them in units of 10,000 they're even cheaper than that.

    What amazes me are the services that offer ... I'm acting as a mini-ISP to friends, and with a $50/month dedicated server we're renting, $10/month gets us 10GB of email+web storage.

    Hard drive capacity has gone up a lot since the time of HotMail - I'm amazed no free email service started offering reasonable disk space earlier.

    1. Re:disk space is cheap. by pla · · Score: 5, Insightful

      I've seen Fry's have 200GB drives on sale for $79 before; and I'm sure if you're buying them in units of 10,000 they're even cheaper than that.

      True. However, 1PB would require over 5200 of them. Which would in turn require over 650 machines to stick them in (at 8 drives per node, itself probably a tad high since the bus would grind to a crawl in such a machine). All that adds up to at least half of a million dollars.

      And for what - Something that amounts to a community service project? Hey, I'll give Google full credit for their current image in the geek community, but this seems a tad ridiculous.

      So, I'd say they must have some sort of ulterior motive behind this. Either using huge numbers of people as guinea pigs to test their new infrastructure (as the topic poster suggests), or something we haven't thought of yet. But just for the hell of it? Probably not.
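
The drive and node counts in the comment above can be checked with a few lines of arithmetic. This is only a rough sketch: it assumes 1PB is counted as 1024 x 1024 GB and uses the $79 retail price quoted in the thread, so it covers the drives alone and not the machines, power, or networking. The function name is made up for illustration.

```python
import math

def storage_estimate(target_gb, drive_gb, drive_price_usd, drives_per_node):
    """Return (drives, nodes, drive_cost_usd) needed to reach target_gb of raw disk."""
    drives = math.ceil(target_gb / drive_gb)       # whole drives only
    nodes = math.ceil(drives / drives_per_node)    # whole machines only
    drive_cost_usd = drives * drive_price_usd      # drives only, no chassis
    return drives, nodes, drive_cost_usd

# Assumption: 1 PB counted in binary units (1024 * 1024 GB).
ONE_PB_IN_GB = 1024 * 1024

drives, nodes, drive_cost_usd = storage_estimate(
    target_gb=ONE_PB_IN_GB,
    drive_gb=200,          # drive size quoted in the thread
    drive_price_usd=79,    # sale price quoted in the thread
    drives_per_node=8,     # packing density assumed in the comment
)
print(drives, nodes, drive_cost_usd)
```

Under those assumptions the drives come to a little over $400,000 before a single machine is bought, which is consistent with "at least half of a million dollars" once the 650-odd nodes are added.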

    2. Re:disk space is cheap. by ron_ivi · · Score: 5, Insightful

      I'm curious what the cost in disk-space of a hotmail account was back when hotmail launched. I wouldn't be surprised if it's comparable to what Google's offering now.

    3. Re:disk space is cheap. by AlecC · · Score: 5, Interesting

      Which would in turn require over 650 machines to stick them in (at 8 drives per node, itself probably a tad high since the bus would grind to a crawl in such a machine). All that adds up to at least half of a million dollars.

      In that kind of quantity I could do you a RAID controller driving, say, 128 drives, for about the cost of one machine. You need to RAID it anyway - you couldn't say "sorry, we lost all your emails when one drive went down". I would bet that Google have some kind of economy RAID controller in the works even if not yet deployed.

      Bandwidth isn't the problem. How much bandwidth do you spend reading email? Most of that data will sit there unread for months.

      --
      Consciousness is an illusion caused by an excess of self consciousness.

    4. Re:disk space is cheap. by untermensch · · Score: 5, Insightful

      So, I'd say they must have some sort of ulterior motive behind this

      Don't forget that Google has ads too. They may not be big and flashy but companies will pay a _lot_ of money to have their ad come out on top for certain search keywords.

      The same will be true for Gmail. Remember that they admit that machines will be crawling through our mail to allow them to bring us targeted ads. And if any internet activity is more popular than a google web search, it's email. The sheer volume of email flying around on something with the scope that Google is aiming for will produce a whole lot of ads.

    5. Re:disk space is cheap. by God!+Awful+2 · · Score: 5, Interesting

      E-mail? Who needs another free e-mail account? Thank you Google for giving me an unlimited supply of network attached storage!

      -a

    6. Re:disk space is cheap. by eln · · Score: 5, Interesting

      Hotmail was started for the same reason every other web-based free email system was started, and in fact why every other Internet-based business (with the exception of Amazon) was started way back when...because people still thought the advertisement-driven model of Internet-based businesses was tenable.

      Now, of course, all of these businesses have extra, fee-based "premium" services on top of their base free packages, because they've figured out that advertising revenue alone won't keep your head above water on the Internet.

      The Passport system may have been a reason Microsoft purchased Hotmail (although I think the Passport system probably came well after the purchase of Hotmail), but it's not why Hotmail was created in the first place.

    7. Re:disk space is cheap. by Louis+Guerin · · Score: 5, Insightful

      a community service project?

      You're kidding, right? Gmail is four things that I can see, and none of them are community service:
      • AdSense fulfilling its destiny, by (eventually) gaining an extra several hundred million pairs of eyes every day
      • A massive experiment in distributed computing and data management, the fruits of which will be phenomenally valuable
      • The ability to simultaneously put every other free email provider (and by force of ubiquity, every competing search engine) out of business, just in time for an IPO. Yes, Microsoft, Yahoo, that means YOU.
      Nope, nothing charitable about it.

      L

    8. Re:disk space is cheap. by utexaspunk · · Score: 5, Interesting

      true, but google seems to be the one company that has managed to really make money with advertising on the internet. consider their constant creativity and innovation in what they provide the users and realize that they do the same thing for their customers- the advertisers, as well as themselves. I'm sure they've worked out how to make advertising profit them as much as possible, just like they've figured out how to do it without pissing users off. The reason gmail will be wildly profitable for them is that they'll have the same non-intrusive AdWords/AdSense ads based on a scan of the words in your e-mail. I'll take that- they'll probably be extremely successful at blocking spam.

      I imagine the client interface will also be as fast and powerful as google, too. A lot of the reason why I've hated web-based e-mail in the past is that (at least with a lot of the larger services like yahoo and microsoft) they're f'in SLOW. Google has the server infrastructure to make it fast, and because they'll be using text-based ads and probably a Google-esque lightweight interface it may just be faster than using Outlook on my desktop.

      I'm sure their other incentive is that this would give them a lot more information to work with. Consider their creation of Orkut- they want more info to tie together. Having your e-mail means having who you e-mail. Sort of an auto-social-networking tool... I'm sure they'll figure out more cool stuff to do with the information they get from your e-mail.

      The only question is- can they be trusted?

    9. Re:disk space is cheap. by joshuaobrien · · Score: 5, Funny

      For example, when mail (example: spam) is sent to 100 people, keep 1 copy of the message

      Better still, when spam is sent to 100 people, keep 0 copies of the message...
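
The "keep 1 copy" idea quoted above is usually called single-instance storage. A minimal sketch of it follows, assuming messages are keyed by a SHA-256 digest of their bytes; the class and method names are invented for illustration and nothing here is known to match what Google actually does.

```python
import hashlib

class SingleInstanceStore:
    """Toy mail store that keeps one copy of each distinct message body."""

    def __init__(self):
        self.blobs = {}      # digest -> message bytes, stored once
        self.mailboxes = {}  # recipient -> list of digests (cheap references)

    def deliver(self, recipients, body):
        digest = hashlib.sha256(body).hexdigest()
        # Store the body only if this exact content has not been seen before.
        self.blobs.setdefault(digest, body)
        for recipient in recipients:
            self.mailboxes.setdefault(recipient, []).append(digest)
        return digest

    def read(self, recipient):
        # Resolve each reference back to the shared copy.
        return [self.blobs[d] for d in self.mailboxes.get(recipient, [])]

store = SingleInstanceStore()
recipients = [f"user{i}@example.com" for i in range(100)]
store.deliver(recipients, b"Buy cheap pills now")
print(len(store.blobs), len(store.mailboxes))
```

Getting from 1 copy down to 0 would mean running a spam check before `deliver` is called at all, which this sketch leaves out.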

    10. Re:disk space is cheap. by LiquidCoooled · · Score: 5, Interesting

      It will simply index your entire mailbox, incoming or outgoing.

      I don't see a problem with this - PROVIDING - it is secure enough and private enough that only I get to see the results of that.

      I can quite honestly see it replacing bookmarks in my regular work.

      Currently, whenever I find something interesting at work, I mail the link to my home account.

      Now, if while google is searching the web, it started using MY personal preferences and keywords to build up a much more tuned result list, things could start to get very interesting.

      Without the wealth of information that your emails provide, it cannot even begin to store YOUR profile properly.

      A cookie can only do so much; a 1GB gMail folder could be just what google needs.

      --
      liqbase :: faster than paper

  5. Of course they won't delete mail... by Ben+Jackson · · Score: 5, Insightful

    They're going to have mirrors, snapshots, backups, offsite backups, remote replication... Expecting them to purge your email when you delete your account is crazy.

  6. Privacy isn't such a huge concern by Biotech9 · · Score: 5, Insightful

    'The contents of your Gmail account also are stored and maintained on Google servers in order to provide the service. Indeed, residual copies of email may remain on our systems, even after you have deleted them from your mailbox or after the termination of your account.'

    If I can get a free account, myname@google.com, with 1 GB of storage, and with IMAP or POP3, I don't give a damn if they use my mail for marketing research, or if they keep it long after I'm dead. The reason is I don't work for MI6, the KGB or the CIA, I only break little laws and I don't dig child porno. So basically who cares if a few of my mails get left on a server somewhere.

    Privacy is a real concern, but worrying about this is like worrying about the fact that postmen can read your postcard when you send it. The truth is they can, but they don't give a shit.

    1. Re:Privacy isn't such a huge concern by John+Starks · · Score: 5, Insightful

      Ha! You don't care about Google being able to read your mail now, but what about when you get into a position of power that someone doesn't like? All they have to do is pay off someone at Google to go through your old email and find something a bit questionable in your past. Had an illicit affair over email? Had physical or emotional problems and discussed it with someone? Used drugs and let people know? Bought enhancing prescription drugs or other "adult" products online and had the bill sent via email? Heck, have you ever expressed an opinion over email that might not make you look good in the public eye? With the kind of storage we're talking about, it'll be in Google's computers as long as they want. And with enough money, people can pay to have it dug up.

      Remember, privacy is NOT just for people breaking the law. Privacy is for anyone and everyone that lives in our society. In fact, by posting messages like the one you've posted here, you are doing everyone a disservice. We always must fight for our right to have private lives. Encryption for everyone.

  7. A useful server would be... by Albert+Sandberg · · Score: 5, Insightful

    torrents.google.com ... it doesn't have to be illegal contents.

  8. Re:Skynet by TiMac · · Score: 5, Funny

    No, Microsoft will be the ones to build Sky.NET, their crappy coders rushing to market without the checks needed to ensure Asimov's Three Laws of Robotics

  9. Distributed system by Anonymous Coward · · Score: 5, Insightful

    A distributed system is something truly worthy of the doctorate pedigree of Google's staff. They have an incredible concentration of brain power and I have always found it hard to believe they need all that to add a few more boxes to run a simple page weight algorithm and a web crawler.

    Finally, it all makes sense. They're trying to put all (but a few of) the sysadmins out of work! A noble enterprise, indeed. We hate them, they hate themselves.

    But seriously, this has been a dream of admins for a long time. 'Bout time somebody sat down and did it. Why can't a single box manage 100,000 others? If one man can do 100 with the right tools he could do them all. The difficulty of transparency is incredible, but even small teams in universities utilizing a few phd's and transient graduate students are making headway in the area. No reason a well funded lab of hundreds of phds working full time can't achieve it.

    Wow... I guess the BIG question is what they'll do with it. I mean... are they just doing it for their existing products? Are they going to license it out for astronomical sums to places like Lockheed and Sandia? Will they (gasp) open source it? Or, most frightening, they will run the world's largest, most efficient super computer and charge pennies for utility based computing and put Sun and IBM out of business in the process of creating a mainframe monopoly out of whiteboxes. Heck... they could probably buy out Sun to get that sweet Solaris technology for themselves. IBM has all kinds of retarded patents for toilet seats and ways to dance on an office chair. I guess they're worth getting for a laugh.

  10. Thats easy... by Biotech9 · · Score: 5, Insightful

    Here's my question: how are they going to make sure people only have one account each? What's to prevent people from getting dozens and backing up their hard drive?

    They don't limit the number of accounts, they just limit attachment size and keep an eye out for abuses, like hundreds of downloads from one account, or a scripted mailing of hundreds of 10 meg attachments to any one account.

  11. Very Real by irokitt · · Score: 5, Informative

    Yes, it's real. The 1000 MB storage limit is listed at the GMail homepage here.

    If you are interested in an account, you can give them your current e-mail here and they will send information once GMail goes gold.
    Also note that Firefox and Mozilla support is explicitly mentioned!

    --
    If my answers frighten you, stop asking scary questions.

  12. Screenshots! by rffmna · · Score: 5, Informative

    Dear hungry world, here are some Gmail screenshots...

    http://fury.com/article/1990.php

    --
    -------
    FM Clan

  13. Your mail isn't your mail anymore by gunga · · Score: 5, Insightful

    Wow! Google always get a free pass on Slashdot, it seems.

    "Privacy isn't a concern because, after all, *you* choose to give it up by using the service"? I think it's wrong. I think the fact that Gmail reads your incoming mail to choose which text ads it will show you is a very bad precedent. Isn't it the first time someone offers a communication service and they tell you that they will know the content of every message you get?

    The fascination with the power of technology blinds the Google team it seems (like it blinds people on Slashdot), I wonder what Norvig thinks of this issue...

  14. they don't need that much disk space by Anonymous Coward · · Score: 5, Interesting

    Hi,

    Don't forget that while people will be allowed to have up to 1GB of emails in their mailbox, it doesn't mean Google will have users x 1GB of disk space. Most people won't use the 1GB of mailbox space.

    I worked on the mail system of the largest provider in my country. We had 700,000 customers with 15 MB mailboxes and we had something like 1/10 of the disk space required if all the mailboxes were full. And this worked just fine.

    Not only will Google not need all that disk space, but they will probably purchase additional disk space as it becomes necessary. It's smarter to buy new hard disks later than all the disk space immediately, they'll be cheaper.
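
To put numbers on the oversubscription described above, here is a small sketch using only the figures from the comment (700,000 customers, 15 MB quotas, roughly one tenth actually provisioned). The function name and the 1 TB = 1,000,000 MB conversion are assumptions for illustration.

```python
def oversubscribed_storage(users, quota_mb, provisioned_fraction_denominator):
    """Return (promised_mb, provisioned_mb) for a mail system that only
    buys 1/denominator of the space its quotas add up to."""
    promised_mb = users * quota_mb
    provisioned_mb = promised_mb // provisioned_fraction_denominator
    return promised_mb, provisioned_mb

promised_mb, provisioned_mb = oversubscribed_storage(
    users=700_000,                        # from the comment
    quota_mb=15,                          # from the comment
    provisioned_fraction_denominator=10,  # "something like 1/10"
)

# Assumption: decimal units, 1 TB = 1,000,000 MB.
print(promised_mb / 1_000_000, "TB promised")
print(provisioned_mb / 1_000_000, "TB actually on disk")
```

The same function can be pointed at a 1000 MB quota, but the number of Gmail users and the fraction of quota they will really fill are unknowns, so any result there would be a guess.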

  15. Re:Perfect sense by vrai · · Score: 5, Interesting

    What's needed is a browser extension (Enigmail style) that can silently handle PGP {de,en}cryption in text fields.

    For encryption it could pick up the 'to' address from the relevant field and use it to encrypt the main text box. For decryption it could pickup the 'from' address and the encrypted text from the HTML, then replace the encrypted message with the clear text.

    A USB key-drive with a copy of Firebird (+ extension), GnuPG and your keys would allow you to access your mail from pretty much any computer. Though it would be reliant on Google not changing their page format too often.

  16. Prohibited Actions by Anonymous Coward · · Score: 5, Interesting

    Have any of you read these Prohibited Actions?

    A few good ones:

    Transmit content that may be harmful to minors

    Illegally transmit another's intellectual property or other proprietary information without such owner's or licensor's permission

    Promote or encourage illegal activity

    Who decides what's harmful to minors? Google? Will they ban my account for sending my friends offensive images/jokes?
    If I email an mp3 will they use their compute power to check if I own the copyright? Could the RIAA force them to report me?
    Since they're scanning the mail anyway, would they have to report users if words like 'civil disobedience' are in their messages? Could the government give them watch words?

  17. Privacy doesn't matter by shic · · Score: 5, Interesting

    I hope that GMail is real - because it would solve a significant problem for me - though I'd really need GMail to support IMAP4 for my purposes...

    I've three types of email I need to manage:

    1) Secret, private emails - always with known contacts - encrypted.
    2) Confidential email - again, known contacts only - stored only on my intranet - not sensitive - doesn't need encryption.
    3) Public contact - frequently new or unknown contacts. Enquiries; replies from Usenet/mailing lists etc.

    Types 1 and 2 are low volume and can be easily managed with current infrastructure. Tailored email addresses and white lists can virtually eliminate spam. Type 3, however, is a much bigger problem... because I can not easily control who contacts me. I think Gmail offers the hope of a solution here. For my purposes (at least) - given that Gmail would be used for initial contact only - I couldn't care less about the less than private nature of these communications. I don't really care if Google, law enforcement or even the government gets to see these messages - their content would be considered public. Provided that Gmail can be integrated into my current email system - such a service would offer an interesting and convenient alternative for "Type-3" email.

  18. Re:What? Are we treating this seriously now? by Finuvir · · Score: 5, Insightful

    Gmail was a fantastic April Fools Day joke. They convinced a lot of people that it wasn't for real by making the press release on April 1st, but then it turned out to be true. Genius. This was the only good April Fool I saw this year.

    --
    Why is anything anything?

  19. How would Google manage a 100K-node cluster? by Debian+Troll's+Best · · Score: 5, Interesting

    The Gmail project sounds like a fascinating experiment in massively distributed computing, and if anyone can pull it off, Google can. Obviously, a lot of custom software will need to be developed by Google's engineers to make a 100,000 node cluster fly. As mentioned in the article, distributed filesystem, RPC and network tracking software will be essential, and high priority projects. But what about the 'boring' nuts'n'bolts of keeping those cluster nodes in good shape? What about day to day administration tasks like adding new users, or checking disk usage? And what about keeping packages up to date?

    When you stop to think about it, package management could be a key factor in the smooth running of the Google Gmail cluster. What software would be used to make sure each one of those 100,000 mail-handling nodes was running the latest, most secure version of sendmail, qmail or postfix? We know Google uses Linux extensively. It is fairly safe to assume that they are using apt-get to sling packages. But what do the Slashdot community think about apt-get's long term suitability for these types of projects? Can the open-source, Free Software package management poster child scale to meet the 100K-node challenge? I look forward to hearing the community's response!

  20. Google don't use RAID... by blorg · · Score: 5, Informative

    ...but rather (all this according to the article) their own distributed, fault-tolerant Google Filesystem (GFS) [PDF]. Apparently each of their 1/2 depth 1U servers has only one or two drives. If a server fails (which happens routinely with 100k servers) then it's simply left in place and the data is automatically replicated onto another server from one of the redundant copies.
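
The "leave the dead server in place and re-replicate" behaviour described above can be sketched as a toy model. Everything here is an assumption made for illustration: three copies per chunk, a least-loaded placement rule, and the class and method names. It is not the actual GFS design or API.

```python
REPLICAS = 3  # assumed number of copies per chunk for this toy model

class ToyCluster:
    """Toy model of chunk placement with automatic re-replication."""

    def __init__(self, servers):
        self.live = set(servers)
        self.placement = {}  # chunk id -> set of servers holding a copy

    def _least_loaded(self, exclude):
        # Count copies on each live server that is not excluded, then pick
        # the emptiest one (ties broken by name so the result is repeatable).
        load = {s: 0 for s in self.live if s not in exclude}
        for holders in self.placement.values():
            for s in holders:
                if s in load:
                    load[s] += 1
        return min(sorted(load), key=load.get)

    def store(self, chunk):
        holders = set()
        self.placement[chunk] = holders
        while len(holders) < REPLICAS:
            holders.add(self._least_loaded(exclude=holders))

    def fail(self, server):
        # The dead machine is never repaired or removed, just ignored.
        self.live.discard(server)
        for holders in self.placement.values():
            if server in holders:
                holders.discard(server)
                # Copy from a surviving replica onto another live server.
                holders.add(self._least_loaded(exclude=holders))

cluster = ToyCluster(["s1", "s2", "s3", "s4", "s5"])
for chunk in ["c1", "c2", "c3", "c4"]:
    cluster.store(chunk)
cluster.fail("s1")
print(cluster.placement)
```

The sketch raises an error once fewer live servers remain than there are copies to place, which is the point at which a real system would have to start buying hardware again.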

    1. Re:Google don't use RAID... by ron_ivi · · Score: 5, Informative

      Thanks for the links. I was going to mention the same thing, but didn't find the article as fast as you.

      As the parent pointed out (mod him up), Google's GFS is better than a large RAID system in many ways. While a RAID system tolerates the failures of individual disks (which then need to be replaced), Google's GFS _expects_ the failure of most components, including CPUs, memory, disks, systems, etc -- and in Google's case nothing has to be replaced.

      Their system is so fault tolerant, Cringely writes: "Now here is the part that sticks in my mind: the fault tolerant nature of the cluster is such that if a machine fails, the other machines simply take over its functions. As a result, whenever a server fails at Google, THEY DO NOTHING. They don't replace the broken machine. They don't remove the broken machine. They don't even turn it off. In an army of drones, it isn't worth the cost of labor to locate and replace the bad machines. Hundreds, maybe thousands of machines lie dead, uncounted among the 10,000 plus."

      This is far cooler than any RAID from a fault-tolerance point of view.

      (apparently since then google went to rack-based systems so it probably detects dead ones so they can replace them easily)