Slashdot Mirror


Google Nearline Delivers Some Serious Competition To Amazon Glacier

SpzToid writes Google is offering a new kind of data storage service – and revealing its cloud computing strategy against Amazon Web Services and Microsoft Azure. The company said on Wednesday that it would offer a service called Nearline, for non-essential data. Like an AWS product called Glacier, this storage costs just a penny a month per gigabyte. Microsoft's cheapest listed online storage is about 2.4 cents a gigabyte. While Glacier storage has a retrieval time of several hours, Google said Nearline data will be available in about three seconds. From the announcement: "Today, we're excited to introduce Google Cloud Storage Nearline, a simple, low-cost, fast-response storage service with quick data backup, retrieval and access. Many of you operate a tiered data storage and archival process, in which data moves from expensive online storage to offline cold storage. We know the value of having access to all of your data on demand, so Nearline enables you to easily backup and store limitless amounts of data at a very low cost and access it at any time in a matter of seconds."

18 of 71 comments

  1. Re:TFS just has marketing by Kadin2048 · · Score: 2, Interesting

    Yeah, I'd like some more meat to the story as well. Amazon Glacier achieves its pricing by using low-RPM consumer drives plugged into some sort of high-density backplane; supposedly they are so densely packed that you can only spin up a few drives at once due to power and heat issues. Hence the delay.

    I assume Google is doing something similar, maybe with somewhat better power or cooling since they're offering faster retrieval times which implies that perhaps they can spin up a higher percentage of drives at a time.

    --
    "Ladies and gentlemen, my killbot features Lotus Notes and a machine gun. It is the finest available."
  2. I don't get the pricing? by Pausanias · · Score: 4, Interesting

    A penny a month per gigabyte... that's $10/month per terabyte... that is already what Dropbox charges for "fast" storage. So what gives? Why would I pay $10/month for a terabyte of slow storage when I can get the same amount of storage for the same price in a regular, fast format with Dropbox?

    1. Re:I don't get the pricing? by wh1pp3t · · Score: 3, Interesting

      A penny a month per gigabyte... that's $10/month per terabyte... that is already what Dropbox charges for "fast" storage. So what gives? Why would I pay $10/month for a terabyte of slow storage when I can get the same amount of storage for the same price in a regular, fast format with Dropbox?

      Why pay for a terabyte of storage when you are not using it to capacity?

    2. Re:I don't get the pricing? by Pausanias · · Score: 2

      It is just the pure storage... bandwidth is extra, which makes it even worse compared to Dropbox, where bandwidth is included.

    3. Re:I don't get the pricing? by SpzToid · · Score: 3, Informative

      Interesting point, so I read up a bit. This only applies to Office365 customers. What about users of Linux (etc.) who can't fully utilize Office365? This really seems almost like a consumer option, and there are certainly business use-cases where this just ain't gonna fly. There's a 20,000 file limit, *period*, and the maximum file size is 10GB, which is limiting for some (especially those folks who roll their own encryption and compression).

      For those reasons, Microsoft Office365/OneDrive doesn't seem like a serious competitor to Google Nearline, Amazon Glacier, or Microsoft Azure services.

      http://www.techrepublic.com/ar...

      --
      You can't be ahead of the curve, if you're stuck in a loop.
    4. Re:I don't get the pricing? by stephanruby · · Score: 3, Interesting

      A penny a month per gigabyte... that's $10/month per terabyte... that is already what Dropbox charges for "fast" storage. So what gives? Why would I pay $10/month for a terabyte of slow storage when I can get the same amount of storage for the same price in a regular, fast format with Dropbox?

      Here is an answer from someone on Quora.

      Dropbox offers no Service Level Agreement. Actually they specifically provide no warranties whatsoever about their service (http://www.dropbox.com/terms). This is a non-starter for many CIOs.

      Beyond that, the fact that Dropbox doesn't "own" the underlying cloud storage architecture -- Amazon S3 -- could be an issue, although they advertise it as secure via in-transit and on-disk encryption (https://www.dropbox.com/help/27).

      If it is still the case that Dropbox uses S3 itself, then it wouldn't make business sense for them to pay more for storage than they're charging their own customers (even if they've decided not to offer a Service Level Agreement).

      So my guess is that this has to do with the way they count the storage for customers. Assuming that their customers do not encrypt their data before they place it on Dropbox (which would make sense because Dropbox customers are rarely CIOs themselves), then Dropbox is most likely hashing the content and only storing a single copy of a file even if there are a thousand virtual instances of that same file throughout their system.

      Also note the special case where a company is footing the bill. There Dropbox can't count the same file multiple times within that same company (otherwise the customer company would complain), and it actually advertises a rate of $15 per 5 terabytes per month per user (with no Service Level Agreement of any kind, even for business users).

    5. Re:I don't get the pricing? by swb · · Score: 2

      Assuming that their customers do not encrypt their data before they place it on Dropbox (which would make sense because Dropbox customers are rarely CIOs themselves), then Dropbox is most likely hashing the content and only storing a single copy of a file even if there are a thousand virtual instances of that same file throughout their system.

      Wouldn't it make more sense for them to dedupe at some kind of variable-block-size level rather than the file level? If someone uploads v1 of some 25 MB PowerPoint file and then version 2 with just a single page changed, you can't dedupe anything at the file level. If you did it at the block level you'd be able to dedupe much more.

      And I would wager that kind of syndrome is common, with someone who works on project X and has tons of files with identical content embedded in them. Plus it would work with encrypted data as well -- you don't care what the data is, just that some chunks of it happened to be identical.
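
      The difference between file-level and block-level dedupe described above can be sketched as follows. This is only an illustration, not how Dropbox is known to work: it uses fixed-size blocks rather than the variable-size blocks the comment suggests, and `BLOCK_SIZE`, the function names, and the synthetic "v1"/"v2" data are all hypothetical.

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical block size, chosen only for illustration


def file_level_stored_bytes(files):
    """Bytes stored when only whole identical files are deduplicated."""
    unique = {hashlib.sha256(data).hexdigest(): len(data) for data in files}
    return sum(unique.values())


def block_level_stored_bytes(files, block_size=BLOCK_SIZE):
    """Bytes stored when identical fixed-size blocks are deduplicated."""
    unique = {}
    for data in files:
        for offset in range(0, len(data), block_size):
            block = data[offset:offset + block_size]
            unique[hashlib.sha256(block).hexdigest()] = len(block)
    return sum(unique.values())


def make_versions(num_blocks=100, block_size=BLOCK_SIZE):
    """Build a synthetic v1 and a v2 that differs from v1 in exactly one block."""
    # Each block is a 32-byte digest repeated to fill the block, so all
    # blocks in v1 are distinct from one another.
    blocks = [hashlib.sha256(str(i).encode()).digest() * (block_size // 32)
              for i in range(num_blocks)]
    v1 = b"".join(blocks)
    changed = list(blocks)
    changed[50] = b"\xff" * block_size  # the "single page changed"
    v2 = b"".join(changed)
    return v1, v2


if __name__ == "__main__":
    v1, v2 = make_versions()
    print("file-level bytes stored: ", file_level_stored_bytes([v1, v2]))
    print("block-level bytes stored:", block_level_stored_bytes([v1, v2]))
```

      With fixed-size blocks, an edit that inserts or removes bytes shifts every later block and defeats the matching, which is one reason to prefer variable-size blocks as the comment proposes.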

  3. Re:TFS just has marketing by ZipK · · Score: 3, Funny

    I thought Glacier is a tape storage system.

    Mercury delay lines.

  4. Re:TFS just has marketing by ShanghaiBill · · Score: 2

    How do they do it?

    They use idle HDDs. The three seconds is the time it takes for them to spin up. Google pays less than $30/TB for HDDs ($120 for a 4TB HDD). If they charge $0.01/GB/Month that is $120/TB/year. If a drive lasts three years, they make $360/TB off a $30/TB investment.
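
    The arithmetic above can be checked with a short sketch. It uses only the figures quoted in this comment ($120 for a 4TB drive, $0.01/GB/month, a three-year drive life) and assumes a decimal terabyte of 1000 GB; the function names are made up for illustration, and prices are kept in whole cents to avoid floating-point rounding.

```python
GB_PER_TB = 1000  # decimal terabyte, matching the comment's arithmetic


def revenue_dollars_per_tb(cents_per_gb_month, months):
    """Storage revenue in dollars for one terabyte held for `months` months."""
    return cents_per_gb_month * GB_PER_TB * months / 100


def drive_cost_dollars_per_tb(drive_price, drive_tb):
    """Purchase price of raw disk capacity per terabyte."""
    return drive_price / drive_tb


if __name__ == "__main__":
    cost = drive_cost_dollars_per_tb(120, 4)
    yearly = revenue_dollars_per_tb(1, 12)
    lifetime = revenue_dollars_per_tb(1, 36)
    print(f"drive cost: ${cost:.0f}/TB")
    print(f"revenue: ${yearly:.0f}/TB/year, ${lifetime:.0f}/TB over three years")
```

    Like the comment itself, this counts only the purchase price of the drives and leaves out power, bandwidth, and staff.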

  5. Seems expensive by gweihir · · Score: 2

    1 cent per month per GB is $40 for 4TB per month, or 1/5 of what an external 4TB USB3.0 disk costs. As this is "nonessential" data, backup is optional. Sure, the external disk somehow needs to be connected to your server, and there are other factors, but doing this yourself seems to be a lot cheaper.

    Yes, I know if you do it yourself, there is cost for the person doing it as well, but you need to manage the cloud-storage also, and over a worse interface and you get less control in the cloud and cannot put anything confidential there (unless you are not bothered by various TLAs searching through it).
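
    The cost comparison at the start of this comment can be sketched as a break-even calculation. The $200 disk price is an assumption implied by the "1/5" figure ($40 is one fifth of $200), a terabyte is taken as 1000 GB, and the function names are hypothetical.

```python
GB_PER_TB = 1000  # decimal terabyte, matching the comment's arithmetic


def cloud_monthly_dollars(tb, cents_per_gb_month=1):
    """Monthly storage charge in dollars for `tb` terabytes."""
    return tb * GB_PER_TB * cents_per_gb_month / 100


def break_even_months(disk_price, tb, cents_per_gb_month=1):
    """Months after which cumulative cloud charges equal the disk's price."""
    return disk_price / cloud_monthly_dollars(tb, cents_per_gb_month)


if __name__ == "__main__":
    assumed_disk_price = 200  # assumption: implied by "1/5", not stated outright
    print(f"cloud cost for 4TB: ${cloud_monthly_dollars(4):.0f}/month")
    print(f"break-even after {break_even_months(assumed_disk_price, 4):.0f} months")
```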

    --
    Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
    1. Re:Seems expensive by DaHat · · Score: 4, Informative

      but doing this yourself seems to be a lot cheaper.

      Oh? Have you factored in the cost of ensuring that you always have an offsite and fully up to date copy, not to mention secondary and tertiary copies for transit time in case your primary datacenter/server happens to kick the bucket/get stolen/evaporate?

      It's easy to compare the cost of an offered service to what you can pick up seemingly similar equipment for from Amazon or Newegg... the realities though are far more complex.

      and cannot put anything confidential there (unless you are not bothered by various TLAs searching through it).

      There are ways to manage even that, see this brief bit of Wikipedia for a start.

      I don't know if there are any other commercial or enterprise products out there that do it, but I know this one stores all of its data in the cloud (with a local cache) but does all of the encryption on site. Only if you choose does the encryption key leave your site, and then only in a way you choose, making it rather problematic for a TLA or Microsoft to get to your data.

      It is an interesting world when you are dealing with data you cannot legally delete for a period of time and simply want to rid yourself of the burden of having to store it locally. Suing Google or Amazon because their cold storage failed is a far better option than having your IT guy tell you that the HD they stored the crucial data to doesn't spin up anymore... and that the backup disk ended up in the secretary's desktop.

    2. Re:Seems expensive by paulhar · · Score: 3, Interesting

      > Oh? Have you factored in the cost of ensuring that you always have an offsite and fully up to date copy, not to mention secondary and tertiary copies for transit time in case your primary datacenter/server happens to kick the bucket/get stolen/evaporate?

      Assumption: They guarantee that your backups/archives are safe.
      Reality: "You are responsible for properly configuring and using the Service Offerings and taking your own steps to maintain appropriate security, protection and backup of Your Content, " Notice the words "and backup". If they lose your data it's your problem, not theirs. http://aws.amazon.com/agreemen...

      > It's easy to compare the cost of an offered service to what you can pick up seeming similar equipment for from Amazon or Newegg... the realities though are far more complex.

      Not to those who are 'skilled in the art'. For example: a copy of CrashPlan, two 3TB drives locally, one 3TB drive at a parent's/friend's house. For the paranoid, two 3TB drives at two people's houses. Assumption: network bandwidth is sufficient and/or not much data change rate and/or happy to shuttle drives backward and forward.

      Or, if you don't want to use CrashPlan, use rsync or another such replication technique. Set up md5sum scanning to run every few weeks at each location; it takes a day or so to run, and you're 100% certain that bitrot hasn't set in.

      Advantages:
      * I can touch each physical box.
      * It's massively cheaper.
      * Recovery is much quicker since I can just grab the physical copy.
      * I know how the backup infrastructure is designed. If something goes wrong it's my fault, I can't rail uselessly against the sky gods if suddenly all my data goes away.

      Disadvantages:
      * You have to maintain it. You can't trust the sky gods to maintain it for you - a drive fails, you have to buy&replace. Forget to configure something/validate something is done correctly then it's your own fault.
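
      The periodic md5sum scan mentioned above could look roughly like this in Python. It is a minimal sketch under assumptions, not the poster's actual setup: the function names and the idea of keeping the manifest as an in-memory dictionary are made up for illustration, and a real job would save the manifest to disk between runs.

```python
import hashlib
import tempfile
from pathlib import Path


def file_md5(path, chunk_size=1 << 20):
    """MD5 of a file, read in chunks so large archives fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its MD5 digest."""
    root = Path(root)
    return {str(p.relative_to(root)): file_md5(p)
            for p in sorted(root.rglob("*")) if p.is_file()}


def changed_files(root, manifest):
    """Names from the manifest whose content changed or which are now missing."""
    current = build_manifest(root)
    return sorted(name for name, digest in manifest.items()
                  if current.get(name) != digest)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        (Path(root) / "photo.raw").write_bytes(b"original bytes")
        manifest = build_manifest(root)
        # Simulate silent corruption of the stored copy.
        (Path(root) / "photo.raw").write_bytes(b"0riginal bytes")
        print("files to restore from another copy:", changed_files(root, manifest))
```

      A mismatch only says the bytes differ from the recorded digest; telling bitrot apart from a deliberate edit still needs the other copies.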

  6. Re:TFS just has marketing by Dutch+Gun · · Score: 2

    I don't think they are spun down. That would kill their reliability. The 3 seconds is more likely to be network delays if the data are scattered to far flung locations.

    Keeping them spun up for rarely accessed data doesn't make a lot of sense to me. This is the sort of data you're probably not constantly accessing, like backups or other data archives. Keeping the hard drives spinning would also increase their power draw and heat.

    Besides which, I don't know of any network that would cause three second lag. You can typically send data halfway around the world in under a few hundred milliseconds. So, I'm not sure what else that delay would be.

    --
    Irony: Agile development has too much inertia to be abandoned now.
  7. Re: TFS just has marketing by Anonymous Coward · · Score: 2, Informative

    This service uses Google's normal hard drive architecture, but makes use of the fact that most of their drives have free capacity in terms of gigabytes, but are running out of IO bandwidth (i.e., there are so many users trying to read and write the data that if they filled the drives to capacity, not everyone would get good read and write performance).

    This product is basically filling the drive to capacity, but giving you the lowest priority for reads and writes. Hence why it takes 3 seconds to read data whereas a normal hard drive typically takes 20 milliseconds.
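
    The lowest-priority idea described above can be illustrated with a toy queue model. This is only a sketch of the commenter's theory, not a description of Google's scheduler: one drive, every request already waiting at time zero, 20 ms per request as quoted above, and a queue depth of 149 picked purely so that the low-priority read lands at three seconds. All names are hypothetical.

```python
import heapq

SERVICE_MS = 20  # per-request service time quoted in the comment


def completion_times(requests):
    """Serve queued (priority, name) requests on one drive.

    Lower priority numbers are served first; ties keep arrival order.
    Returns a dict mapping each request name to its completion time in ms.
    """
    queue = [(priority, order, name)
             for order, (priority, name) in enumerate(requests)]
    heapq.heapify(queue)
    clock = 0
    done = {}
    while queue:
        _, _, name = heapq.heappop(queue)
        clock += SERVICE_MS
        done[name] = clock
    return done


if __name__ == "__main__":
    # One low-priority read arrives first, but 149 normal-priority
    # requests are also waiting (an illustrative number, not a measurement).
    requests = [(1, "nearline-read")]
    requests += [(0, f"online-{i}") for i in range(149)]
    done = completion_times(requests)
    print("first online request done at", done["online-0"], "ms")
    print("nearline read done at", done["nearline-read"], "ms")
```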

  8. Re:Oh yeah, this is just what I want by sithlord2 · · Score: 2, Insightful

    Overreacting much? You still have a choice on HOW you store your data in the cloud. I keep a backup of my personal data in Amazon S3, but I'm using Duplicity, which encrypts my data with GPG.

    And solved for 30 years? Really? I don't recall having a backup service like this 30 years ago with such uptime, and certainly not in my own home.

    --
    ...You are over-qualified and under-paid. If we give you a raise, we will break the cosmic balance of the universe.
  9. Re:Oh yeah, this is just what I want by Skater · · Score: 3, Informative

    Me either. I have ~250 gigabytes of pictures to back up, and I wanted to do it offsite (they're our family memories). Before Glacier came along, I was looking at building NAS machines for my brother and me, so that we would host each other's backup data. It would've worked, but what a PITA, and a lot of up-front expense. Glacier is easy, and cheap - my AWS bill last month was $2.50. For that kind of money, it's hard to justify the time and expense of rolling my own remote NAS solution. (I know over the long run I might be able to build the remote NAS solution for less money, but figure in electricity costs and potential drive replacements, and I'm not sure that solution would be that much cheaper. It would all depend on how long the drives last.)

  10. Re:Backup software? by SpzToid · · Score: 2

    git-annex and Amazon Glacier might serve you well. Encrypting your git-annex/Glacier archive using your PGP key is a one-click-and-save option. With Google's recent announcement of Nearline, I imagine over time it will be supported also. git-annex came about through a Kickstarter campaign, and you're welcome to support the project.

    Here are some links to help you:

    http://git-annex.branchable.co...

    Specifically for Glacier:
    http://git-annex.branchable.co...

    --
    You can't be ahead of the curve, if you're stuck in a loop.
  11. what about redundancy? by SethJohnson · · Score: 2

    Good analysis here, Shanghai.

    In terms of the prediction of "$360/TB off a $30/TB investment", does that take into account redundancy to protect their liability for drive failure? I'm thinking they have at least two copies of everything a customer uploads. Maybe three. It's still great money, but I think the numbers are more like $360/TB off a $60/TB investment.
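
    The effect of redundancy on those numbers can be sketched as below. The $30/TB drive cost and $360/TB three-year revenue come from the parent comment; the number of copies is the unknown, so the sketch simply tries one, two, and three. Function and constant names are made up for illustration.

```python
DRIVE_COST_PER_TB = 30        # dollars, from the parent comment ($120 for 4TB)
REVENUE_PER_TB_3_YEARS = 360  # dollars, from the parent comment


def hardware_cost_per_stored_tb(copies):
    """Drive cost to hold one customer terabyte in `copies` replicas."""
    return DRIVE_COST_PER_TB * copies


def revenue_to_hardware_ratio(copies):
    """Three-year revenue divided by drive cost for one customer terabyte."""
    return REVENUE_PER_TB_3_YEARS / hardware_cost_per_stored_tb(copies)


if __name__ == "__main__":
    for copies in (1, 2, 3):
        cost = hardware_cost_per_stored_tb(copies)
        ratio = revenue_to_hardware_ratio(copies)
        print(f"{copies} copies: ${REVENUE_PER_TB_3_YEARS}/TB off a "
              f"${cost}/TB investment ({ratio:.0f}x)")
```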