Slashdot Mirror


Google Nearline Delivers Some Serious Competition To Amazon Glacier

SpzToid writes Google is offering a new kind of data storage service – and revealing its cloud computing strategy against Amazon Web Services and Microsoft Azure. The company said on Wednesday that it would offer a service called Nearline, for non-essential data. Like an AWS product called Glacier, this storage costs just a penny a month per gigabyte. Microsoft's cheapest listed online storage is about 2.4 cents a gigabyte. While Glacier storage has a retrieval time of several hours, Google said Nearline data will be available in about three seconds. From the announcement: "Today, we're excited to introduce Google Cloud Storage Nearline, a simple, low-cost, fast-response storage service with quick data backup, retrieval and access. Many of you operate a tiered data storage and archival process, in which data moves from expensive online storage to offline cold storage. We know the value of having access to all of your data on demand, so Nearline enables you to easily backup and store limitless amounts of data at a very low cost and access it at any time in a matter of seconds."

43 of 71 comments

  1. It's the webframe, Captain! by msobkow · · Score: 1

    It's the webframe, Captain! She kenna take any more data...

    --
    I do not fail; I succeed at finding out what does not work.
  2. TFS just has marketing by Anonymous Coward · · Score: 1

    How do they do it?

    1. Re:TFS just has marketing by Kadin2048 · · Score: 2, Interesting

      Yeah I'd like some more meat to the story as well. Amazon Glacier achieves its pricing by using low-RPM consumer drives plugged into some sort of high-density backplanes; supposedly they are so densely packed that you can only spin up a few drives at once due to power and heat issues. Hence the delay.

      I assume Google is doing something similar, maybe with somewhat better power or cooling since they're offering faster retrieval times which implies that perhaps they can spin up a higher percentage of drives at a time.

      --
      "Ladies and gentlemen, my killbot features Lotus Notes and a machine gun. It is the finest available."
    2. Re:TFS just has marketing by Anonymous Coward · · Score: 1

      I thought Glacier was a tape storage system.

    3. Re:TFS just has marketing by Anonymous Coward · · Score: 1

      They marketed glacier with the pretense that it was tape storage, but it's actually idle S3 storage.

    4. Re:TFS just has marketing by ZipK · · Score: 3, Funny

      I thought Glacier was a tape storage system.

      Mercury delay lines.

    5. Re:TFS just has marketing by ShanghaiBill · · Score: 2

      How do they do it?

      They use idle HDDs. The three seconds is the time it takes for them to spin up. Google pays less than $30/TB for HDDs ($120 for a 4TB HDD). If they charge $0.01/GB/Month that is $120/TB/year. If a drive lasts three years, they make $360/TB off a $30/TB investment.
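
The arithmetic in the comment above can be checked with a short sketch. All figures are the commenter's estimates, not published Google numbers, and the decimal terabyte (1 TB = 1000 GB) is an assumption.

```python
# Figures quoted in the comment above; none are official Google numbers.
PRICE_PER_GB_MONTH = 0.01   # USD per GB per month
DRIVE_COST_PER_TB = 30.0    # USD, i.e. $120 for a 4 TB drive
DRIVE_LIFETIME_YEARS = 3
GB_PER_TB = 1000            # assumption: decimal terabytes


def revenue_per_tb(years: int) -> float:
    """Gross storage revenue from one fully sold terabyte over `years`."""
    return PRICE_PER_GB_MONTH * GB_PER_TB * 12 * years


yearly = revenue_per_tb(1)
lifetime = revenue_per_tb(DRIVE_LIFETIME_YEARS)
print(f"${yearly:.0f}/TB/year, ${lifetime:.0f}/TB over "
      f"{DRIVE_LIFETIME_YEARS} years against ${DRIVE_COST_PER_TB:.0f}/TB of disk")
```

This only counts raw disk cost; power, servers, networking, and redundant copies are left out.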

    6. Re:TFS just has marketing by Anonymous Coward · · Score: 1, Interesting

      FWIW, there have never been any studies or hard data backing up the myth that HDDs spinning up/down have reliability concerns - this all came about as it's a S.M.A.R.T. metric (which also, is an estimate, and they also fail to provide citations or proof of its validity).

      It's very possible they simply are letting the HDDs idle and spinning them up as needed, I would expect Google of all people to have some pretty decent metrics on HDD failure rates/reasons.

      They've been researching this stuff for over half a decade now: http://static.googleusercontent.com/media/research.google.com/en/archive/disk_failures.pdf

    7. Re:TFS just has marketing by snowgirl · · Score: 1, Interesting

      https://what-if.xkcd.com/63/

      It's common knowledge that Google has already been using consumer-grade drives for all of their servers. Because if a drive fails, "so what, we have another one over there holding the data..."

      This is pretty much similar to what happened with GMail. They came out and said, "here, have a gigabyte for free!" and everyone was like, "yeah, right..."

      Google has storage leaking out of its ears, and generates massive amounts of new data every day... sticking other people's data into the pile wouldn't even be a straw that breaks the camel's back... The camel is already hauling a hojillion ton rock, a straw isn't going to do shit.

      DISCLAIMER: I worked for Google, this does not in any way reflect Google's official word on this news.

      --
      WARNING! This girl exceeds the MAXIMUM SAFE standards established by the FDA for BRATTINESS
    8. Re:TFS just has marketing by Dutch+Gun · · Score: 2

      I don't think they are spun down. That would kill their reliability. The 3 seconds is more likely to be network delays if the data are scattered to far flung locations.

      Keeping them spun up for rarely accessed data doesn't make a lot of sense to me. This is the sort of data you're probably not constantly accessing, like backups or other data archives. Keeping the hard drives spinning would also increase their power draw and heat.

      Besides which, I don't know of any network that would cause three second lag. You can typically send data halfway around the world in under a few hundred milliseconds. So, I'm not sure what else that delay would be.

      --
      Irony: Agile development has too much inertia to be abandoned now.
    9. Re:TFS just has marketing by drinkypoo · · Score: 1

      I don't think they are spun down. That would kill their reliability.

      Would it? Would it really? I've been spinning down rotating discs since I discovered hd-idle, and haven't lost one yet. The only ones I've lost in that time have been discs I've been keeping running all the time. Why would spinning up a disc cause it to wear unduly? Auto-spindown is the very best feature of MyBooks and GoFlexes, I tell you what. It gets so nice and quiet when both my external discs spin down (I have a backup volume for each) now that all my system volumes are on SSD.

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
    10. Re: TFS just has marketing by Anonymous Coward · · Score: 2, Informative

      This service uses Google's normal hard drive architecture, but makes use of the fact that most of their drives have free capacity in terms of gigabytes, but are running out of IO bandwidth (i.e., there are so many users trying to read and write the data that if they filled the drives to capacity, not everyone would get good read and write performance).

      This product is basically filling the drive to capacity, but giving you the lowest priority for reads and writes. Hence why it takes 3 seconds to read data whereas a normal hard drive typically takes 20 milliseconds.
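
The "lowest priority for reads and writes" idea described above can be illustrated with a toy priority queue. This is only a sketch of the commenter's speculation, not a confirmed description of Google's scheduler; the class name, priority levels, and request names are all hypothetical.

```python
import heapq
from itertools import count

# Hypothetical priority levels: the lower number is served first.
INTERACTIVE = 0
NEARLINE = 1


class DriveQueue:
    """Pending reads for one drive, ordered by priority, then by arrival."""

    def __init__(self):
        self._heap = []
        self._arrival = count()  # tie-breaker keeps FIFO order within a priority

    def submit(self, priority, name):
        heapq.heappush(self._heap, (priority, next(self._arrival), name))

    def pop(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)


queue = DriveQueue()
queue.submit(NEARLINE, "backup-chunk")        # arrives first
queue.submit(INTERACTIVE, "search-index")
queue.submit(INTERACTIVE, "mail-attachment")
order = [queue.pop() for _ in range(len(queue))]
print(order)
```

The Nearline request is submitted first but served last, which is the kind of queueing delay the comment suggests could add up to seconds on a busy drive.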

    11. Re:TFS just has marketing by ShanghaiBill · · Score: 1

      I don't think they are spun down. That would kill their reliability.

      There is no reason to believe that idling a drive shortens its life. Reliability studies from Google and Backblaze found no failure increase with spin up/down cycles. Total spinning time is a much bigger predictor of failure, so spinning down when not in use likely extends the life.

      Biggest factors affecting reliability:
      1. Manufacturer: Hitachi is best, Seagate is worst. WD is in the middle.
      2. Total spinning time
      3. Temperature: Hotter is better (the opposite of what most people believe).

    12. Re: TFS just has marketing by jtgd · · Score: 1

      and I'll guess that their data is in the outer cylinders and your data is on the inner cylinders.

      --
      J
    13. Re:TFS just has marketing by jtgd · · Score: 1

      Worst case queueing, behind the important data on the drive.

      --
      J
  3. I don't get the pricing? by Pausanias · · Score: 4, Interesting

    A penny a month per gigabyte... that's $10/month per terabyte... that is already what Dropbox charges for "fast" storage. So what gives? Why would I pay $10/month for a terabyte of slow storage when I can get the same amount of storage for the same price in a regular, fast format with Dropbox?

    1. Re:I don't get the pricing? by Anonymous Coward · · Score: 1

      Don't ask me, I don't even know if it means storage or bandwidth usage. Don't both need to be mentioned?

    2. Re:I don't get the pricing? by wh1pp3t · · Score: 3, Interesting

      A penny a month per gigabyte... that's $10/month per terabyte... that is already what Dropbox charges for "fast" storage. So what gives? Why would I pay $10/month for a terabyte of slow storage when I can get the same amount of storage for the same price in a regular, fast format with Dropbox?

      Why pay for a terabyte of storage when you are not using it to capacity?
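
A rough sketch of that point: metered storage only beats a flat plan while you store less than the plan covers, or more than it can hold. Prices are the ones quoted in this thread, bandwidth charges are ignored, and 1 TB = 1000 GB is an assumption.

```python
PRICE_CENTS_PER_GB = 1    # Nearline/Glacier storage price quoted in the summary
FLAT_PLAN_CENTS = 1000    # $10/month flat plan, as quoted in the thread
FLAT_PLAN_GB = 1000       # assumption: the flat plan holds 1 TB = 1000 GB


def metered_cents(gb_stored: int) -> int:
    """Monthly storage bill in cents when paying per gigabyte stored."""
    return gb_stored * PRICE_CENTS_PER_GB


def cheaper_option(gb_stored: int) -> str:
    """Return 'metered' or 'flat'; ties go to the flat plan."""
    if gb_stored > FLAT_PLAN_GB:
        return "metered"  # the flat plan cannot hold this much at all
    return "metered" if metered_cents(gb_stored) < FLAT_PLAN_CENTS else "flat"


for gb in (250, 1000, 4000):
    print(gb, "GB:", metered_cents(gb) / 100, "USD metered ->", cheaper_option(gb))
```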

    3. Re:I don't get the pricing? by Pausanias · · Score: 2

      It is just the pure storage... bandwidth is extra, which makes it even worse compared to Dropbox, where bandwidth is included.

    4. Re:I don't get the pricing? by SpzToid · · Score: 1

      What if you had more than just 1 TB? If you had more than 1 TB, how is Dropbox going to help you at all? Oh right, now you must purchase DropBox for Business, and your price just went way up. https://www.dropbox.com/busine...

      --
      You can't be ahead of the curve, if you're stuck in a loop.
    5. Re:I don't get the pricing? by Barlo_Mung_42 · · Score: 1

      Heck, for less than $10 a month you get infinite OneDrive storage.

    6. Re:I don't get the pricing? by SpzToid · · Score: 3, Informative

      Interesting point, so I read up a bit. This only applies to Office365 customers. What about Linux (etc.) users that can't fully utilize Office365? This really seems almost like a consumer option, and there are certainly business use-cases where this just ain't gonna fly. There's a 20,000 file limit, *period*, and the maximum file size is 10 GB, which is limiting for some (especially those folks who roll their own encryption and compression).

      For those reasons, Microsoft Office365/OneDrive doesn't seem like a serious competitor to Google Nearline, Amazon Glacier, or Microsoft Azure services.

      http://www.techrepublic.com/ar...

      --
      You can't be ahead of the curve, if you're stuck in a loop.
    7. Re:I don't get the pricing? by stephanruby · · Score: 3, Interesting

      A penny a month per gigabyte... that's $10/month per terabyte... that is already what Dropbox charges for "fast" storage. So what gives? Why would I pay $10/month for a terabyte of slow storage when I can get the same amount of storage for the same price in a regular, fast format with Dropbox?

      Here is an answer from someone on Quora.

      Dropbox offers no Service Level Agreement. Actually they specifically provide no warranties whatsoever about their service (http://www.dropbox.com/terms). This is a non-starter for many CIOs.

      Beyond that, the fact that Dropbox doesn't "own" the underlying cloud storage architecture -- Amazon S3 -- could be an issue, although they advertise it as secure via in-transit and on-disk encryption (https://www.dropbox.com/help/27).

      If it still is the case that Dropbox uses S3 itself, then that wouldn't make business sense for them to pay more for storage than they're charging their own customers (even if they've decided not to offer a Service Level Agreement).

      So my guess is that this has to do with the way they count the storage for customers. Assuming that their customers do not encrypt their data before they place it on DropBox (which would make sense because DropBox customers are rarely CIOs themselves), then DropBox is most likely hashing the content and only storing a single copy of a file even if there are thousand virtual instances of that same file throughout their system.

      Also note that in the special case where a company is footing the bill and DropBox can't count the same file multiple times within that same company, otherwise the customer company would complain, then DropBox actually advertises a rate of $15 per 5 terabytes per month per user (with no Service Level Agreement of any kind even for business users).
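
The file-level deduplication guessed at above can be sketched as a content-addressed store. This is a toy illustration of the general technique, not Dropbox's actual implementation; the class and method names are hypothetical.

```python
import hashlib


class DedupStore:
    """Toy content-addressed store: identical files are kept only once."""

    def __init__(self):
        self.blobs = {}   # content hash -> bytes, stored once
        self.files = {}   # (user, filename) -> content hash

    def put(self, user: str, filename: str, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        self.blobs.setdefault(digest, data)   # no second copy if already present
        self.files[(user, filename)] = digest
        return digest

    def get(self, user: str, filename: str) -> bytes:
        return self.blobs[self.files[(user, filename)]]

    def logical_bytes(self) -> int:
        """What customers are billed for: every file counted in full."""
        return sum(len(self.blobs[d]) for d in self.files.values())

    def physical_bytes(self) -> int:
        """What actually sits on disk: each distinct content once."""
        return sum(len(b) for b in self.blobs.values())


store = DedupStore()
movie = b"x" * 1000
store.put("alice", "movie.mkv", movie)
store.put("bob", "same-movie.mkv", movie)
print(store.logical_bytes(), store.physical_bytes())
```

Two users are each charged for the file, but only one copy is stored. If customers encrypted the file with their own keys before uploading, the hashes would differ and nothing would dedupe.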

    8. Re:I don't get the pricing? by heypete · · Score: 1

      One reason I'm about to start using Amazon Glacier for personal backups is specifically because you can't delete files. I want to put up all of my family photos and videos, and know that they will be there even if my kid installs ransomware, our house gets robbed and burns down, and I'm in a coma for six months and can't deal with trying to retrieve deleted files (along with determining the real ones vs ransom ones) in a timely manner from Dropbox or Crashplan.

      You can absolutely delete files in Amazon Glacier if the access key you're using has that permission enabled. I imagine there's a surprising number of people who use their AWS root account credentials to access Glacier even though this is strongly discouraged. Even if one creates a new IAM user with access only to Glacier (so a bad guy who compromised your computer can't spin up EC2 instances), the default is for all permissions to be enabled.

      Of course, you can disable the permissions to delete files: I've done that, and it works well, but it's not the default. I have a separate IAM user with list-and-delete privileges, but that is a separate user in FastGlacier and requires a password to use -- that keeps me from inadvertently fat-fingering the delete key.
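
A no-delete setup like the one described above might look roughly like the policy below, built here as a Python dictionary. This is a sketch under assumptions: the vault ARN is a placeholder, the set of allowed actions is just one plausible choice, and the policy should be checked against the current AWS documentation before use.

```python
import json


def no_delete_policy(vault_arn: str) -> dict:
    """Build an IAM policy document that allows uploads and retrievals
    on one Glacier vault but explicitly denies deletion."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "glacier:UploadArchive",
                    "glacier:InitiateJob",
                    "glacier:GetJobOutput",
                    "glacier:DescribeVault",
                ],
                "Resource": vault_arn,
            },
            {
                # An explicit Deny overrides any Allow attached elsewhere.
                "Effect": "Deny",
                "Action": ["glacier:DeleteArchive", "glacier:DeleteVault"],
                "Resource": vault_arn,
            },
        ],
    }


# Placeholder ARN: the region, account ID, and vault name are made up.
policy = no_delete_policy("arn:aws:glacier:us-east-1:123456789012:vaults/family-photos")
print(json.dumps(policy, indent=2))
```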

    9. Re:I don't get the pricing? by SpzToid · · Score: 1

      You might be interested to know about git-annex then. Here's some links to help you

      http://git-annex.branchable.co...

      Specifically for Glacier:
      http://git-annex.branchable.co...

      --
      You can't be ahead of the curve, if you're stuck in a loop.
    10. Re:I don't get the pricing? by zennling · · Score: 1

      Can you get fast private connections to Dropbox? Google? AWS has directconnect - you can get Gbps connections to move backups to Glacier

    11. Re:I don't get the pricing? by Alomex · · Score: 1

      , then DropBox is most likely hashing the content and only storing a single copy of a file even if there are thousand virtual instances of that same file throughout their system.

      I'm pretty sure they have said as much themselves.

    12. Re:I don't get the pricing? by swb · · Score: 2

      Assuming that their customers do not encrypt their data before they place it on DropBox (which would make sense because DropBox customers are rarely CIOs themselves), then DropBox is most likely hashing the content and only storing a single copy of a file even if there are thousand virtual instances of that same file throughout their system.

      Wouldn't it make more sense for them to dedupe on some kind of variable block size level than the file level? If someone uploads v1 of some 25 meg powerpoint file and then version 2 with just a single page changed, you can't dedupe anything. If you did it at the block level you'd be able to dedupe much more.

      And I would wager that kind of syndrome is common, with someone who works on project X and has tons of files with identical content embedded in them. Plus it would work with encrypted data as well -- you don't care what the data is, just that some chunks of it happened to be identical.
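
A minimal sketch of block-level deduplication follows. It uses fixed-size blocks for simplicity, which is a swapped-in simplification: the comment above suggests variable block sizes, which real systems usually implement with content-defined chunking. The 4 KiB block size is an assumption.

```python
import hashlib

BLOCK_SIZE = 4096  # assumption: 4 KiB fixed-size blocks


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> list:
    """Hash each fixed-size block of `data` (the last block may be shorter)."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def unique_block_count(*files: bytes) -> int:
    """Number of distinct blocks that must be stored for all given files."""
    seen = set()
    for data in files:
        seen.update(block_hashes(data))
    return len(seen)


# v1 has ten distinct blocks; v2 differs from v1 in exactly one block.
v1 = b"".join(bytes([i]) * BLOCK_SIZE for i in range(10))
v2 = v1[:3 * BLOCK_SIZE] + bytes([99]) * BLOCK_SIZE + v1[4 * BLOCK_SIZE:]
print(unique_block_count(v1), unique_block_count(v1, v2))
```

Storing both versions costs eleven blocks instead of twenty. Fixed-size blocks stop matching as soon as a byte is inserted near the start of a file, since every later block shifts; that is the case variable-size chunking is meant to handle.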

  4. Seems expensive by gweihir · · Score: 2

    1 cent per month per GB is $40 for 4TB per month, or 1/5 of what an external 4TB USB3.0 disk costs. As this is "nonessential" data, backup is optional. Sure, the external disk somehow needs to be connected to your server, and there are other factors, but doing this yourself seems to be a lot cheaper.

    Yes, I know if you do it yourself, there is cost for the person doing it as well, but you need to manage the cloud-storage also, and over a worse interface and you get less control in the cloud and cannot put anything confidential there (unless you are not bothered by various TLAs searching through it).
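
A quick break-even sketch for the comparison above. The $200 disk price is only implied by the "1/5" figure in the comment, and the calculation ignores power, labour, and any second copy beyond what is passed in.

```python
import math

MONTHLY_CLOUD_COST = 40   # USD: 4 TB at $0.01/GB/month (4000 GB)
DISK_PRICE = 200          # USD: implied by "$40 is 1/5 of the disk price"


def months_to_break_even(disk_price: float, monthly_cloud_cost: float,
                         disks: int = 1) -> int:
    """Months of cloud fees needed to equal the up-front cost of `disks` drives."""
    return math.ceil(disk_price * disks / monthly_cloud_cost)


print(months_to_break_even(DISK_PRICE, MONTHLY_CLOUD_COST))           # one disk
print(months_to_break_even(DISK_PRICE, MONTHLY_CLOUD_COST, disks=2))  # with a spare copy
```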

    --
    Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
    1. Re:Seems expensive by DaHat · · Score: 4, Informative

      but doing this yourself seems to be a lot cheaper.

      Oh? Have you factored in the cost of ensuring that you always have an offsite and fully up to date copy, not to mention secondary and tertiary copies for transit time in case your primary datacenter/server happens to kick the bucket/get stolen/evaporate?

      It's easy to compare the cost of an offered service to what you can pick up seemingly similar equipment for from Amazon or Newegg... the realities though are far more complex.

      and cannot put anything confidential there (unless you are not bothered by various TLAs searching through it).

      There are ways to manage even that, see this brief bit of Wikipedia for a start.

      I don't know if there are any other commercial or enterprise products out there that do it, but I know this one stores all of its data in the cloud (with a local cache) but does all of the encryption on site. Only if you choose does the encryption key leave your site and then only in a way you choose making it rather problematic for a TLA or Microsoft to get to your data.

      It is an interesting world when you are dealing with data you cannot legally delete for a period of time and simply want to rid yourself of the burden of having to store it locally. Suing Google or Amazon because their cold storage failed is a far better option than having your IT guy tell you that the HD they stored the crucial data to doesn't spin up anymore... and that the backup disk ended up in the secretary's desktop.

    2. Re:Seems expensive by paulhar · · Score: 3, Interesting

      > Oh? Have you factored in the cost of ensuring that you always have an offsite and fully up to date copy, not to mention secondary and tertiary copies for transit time in case your primary datacenter/server happens to kick the bucket/get stolen/evaporate?

      Assumption: They guarantee that your backups/archives are safe.
      Reality: "You are responsible for properly configuring and using the Service Offerings and taking your own steps to maintain appropriate security, protection and backup of Your Content, " Notice the words "and backup". If they lose your data it's your problem, not theirs. http://aws.amazon.com/agreemen...

      > It's easy to compare the cost of an offered service to what you can pick up seemingly similar equipment for from Amazon or Newegg... the realities though are far more complex.

      Not to those who are 'skilled in the art'. For example, a copy of CrashPlan, two 3TB drives locally, one 3TB drive at a parent's/friend's house. For the paranoid, two 3TB drives at two people's houses. Assumption: network bandwidth is sufficient and/or not much data change rate and/or happy to shuttle drives backward and forward.

      Or, if you don't want to use crashplan, use rsync or other such replication technique. Set up md5sum scanning to run every few weeks at each location, takes a day or so to run and you're 100% certain that bitrot hasn't set in.

      Advantages:
      * I can touch each physical box.
      * It's massively cheaper.
      * Recovery is much quicker since I can just grab the physical copy.
      * I know how the backup infrastructure is designed. If something goes wrong it's my fault, I can't rail uselessly against the sky gods if suddenly all my data goes away.

      Disadvantages:
      * You have to maintain it. You can't trust the sky gods to maintain it for you - a drive fails, you have to buy&replace. Forget to configure something/validate something is done correctly then it's your own fault.
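
The periodic md5sum scan mentioned in this comment could be sketched as below: record a checksum for every file once, then re-scan later and report anything whose checksum no longer matches. Function names are hypothetical, and a mismatch only shows that a file changed or vanished; it cannot tell bitrot from a deliberate edit.

```python
import hashlib
from pathlib import Path


def file_md5(path, chunk_size=1 << 20):
    """MD5 of a file, read in chunks so large files do not fill memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to `root`) to its checksum."""
    root = Path(root)
    return {
        str(p.relative_to(root)): file_md5(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(root, manifest):
    """Files from the manifest that are now missing or have a different checksum."""
    current = build_manifest(root)
    return sorted(name for name in manifest if current.get(name) != manifest[name])
```

Running `build_manifest` once per location and `changed_files` every few weeks gives the kind of scan described; a flagged file would then be restored from one of the other copies.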

    3. Re:Seems expensive by AmiMoJo · · Score: 1

      Spideroak. Encryption client side, reasonable SLA. Not the cheapest by far but you get what you pay for.

      --
      const int one = 65536; (Silvermoon, Texture.cs)
      SJW, n: "Someone I don't like, and by the way I'm a fuckwit" - AC
    4. Re:Seems expensive by chihowa · · Score: 1

      Spideroak. Still closed source five years after they said that they would open it. Never independently audited. All expectations of security and privacy are derived solely from marketing claims. Even the "zero knowledge" claim is more marketing speak than truth. Caveat emptor.

      Closed source all-in-one crypto and cloud storage is almost never the right answer.

      --
      If you want a vision of the future, imagine a youtube comments section scrolling - forever.
    5. Re:Seems expensive by gweihir · · Score: 1

      and cannot put anything confidential there (unless you are not bothered by various TLAs searching through it).

      There are ways to manage even that, see this brief bit of Wikipedia for a start.

      That only works if you do not process the data in the same cloud. But then the price goes through the roof as you have to pay transfer fees and the offer becomes very expensive.

      --
      Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
  5. Re:Oh yeah, this is just what I want by sithlord2 · · Score: 2, Insightful

    Overreacting much? You still have a choice on HOW you store your data in the cloud. I keep a backup of my personal data in Amazon S3, but I'm using Duplicity, which encrypts my data with GPG.

    And solved for 30 years? Really? I don't recall having a backup service like this 30 years ago with such uptime, and certainly not in my own home.

    --
    ...You are over-qualified and under-paid. If we give you a raise, we will break the cosmic balance of the universe.
  6. Re:Oh yeah, this is just what I want by Skater · · Score: 3, Informative

    Me either. I have ~250 gigabytes of pictures to back up, and I wanted to do it offsite (they're our family memories). Before Glacier came along, I was looking at building NAS machines for my brother and I that we would host each other's backup data. It would've worked, but what a PITA, and a lot of up-front expense. Glacier is easy, and cheap - my AWS bill last month was $2.50. For that kind of money, it's hard to justify the time and expense of rolling my own remote NAS solution. (I know over the long run I might be able to build the remote NAS solution for less money, but figure in electricity costs and potential drive replacements, and I'm not sure that solution would be that much cheaper. It would all depend on how long the drives last.)

  7. Backup software? by aaaaaaargh! · · Score: 1

    Is there already some personal backup software for GNU/Linux that encrypts all data and can use this as storage?

    I'm looking for large offline storage but strong client-side encryption is a must.

    1. Re:Backup software? by SpzToid · · Score: 2

      git-annex and Amazon glacier might serve you well. Encrypting your GIT/Glacier archive using your PGP key is a one-click-and-save option. With Google's recent announcement of Nearline I imagine over time it will be supported also. GIT annex came about through a kick-starter campaign, and you're welcome to support the project.

      Here's some links to help you:

      http://git-annex.branchable.co...

      Specifically for Glacier:
      http://git-annex.branchable.co...

      --
      You can't be ahead of the curve, if you're stuck in a loop.
    2. Re:Backup software? by coofercat · · Score: 1

      I recently 'discovered' duplicity - it's very good for this sort of thing, but it can't use this or Glacier as a store. It can use S3 though, which you can use as staging for Glacier.

      Personally, I use Duplicity to back up my NAS to another disk. I then have a script that copies full backups up to Glacier (and then deletes them). I'm working on a nicer Glacier client for this, but the Java one I downloaded from GitHub works well enough to get going.

  8. Keep up the good work Google by watchcriclive1111 · · Score: 1

    Google is a tech giant... and there is no stopping it. No wonder it will overtake everything in its way at this rate.

  9. Re:Egress rate by mbourgon · · Score: 1

    Actually, not that simple. Neither have egress costs if you use their VMs - it's only going to the internet. Amazon Glacier to Internet is free for the first 1GB, $.09/gb for the first 10tb, $.085/gb until 50tb (at between 10-50tb Nearline is cheaper), then $.07 until 150tb.

    Both charge 1 cent/gb for reads, though AWS is free for the first 5%.
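
The tiered egress prices quoted above can be turned into a small calculator. This is a sketch under assumptions: tier boundaries are treated as cumulative totals, 1 TB is taken as 1000 GB, and only the rates given in the comment are used, so anything past 150 TB is rejected rather than guessed.

```python
from decimal import Decimal

GB_PER_TB = 1000  # assumption: decimal terabytes

# (cumulative upper bound in GB, USD per GB), as quoted in the comment above.
TIERS = [
    (1, Decimal("0")),                      # first 1 GB free
    (10 * GB_PER_TB, Decimal("0.09")),      # up to 10 TB
    (50 * GB_PER_TB, Decimal("0.085")),     # up to 50 TB
    (150 * GB_PER_TB, Decimal("0.07")),     # up to 150 TB
]


def egress_cost(gb: int) -> Decimal:
    """Cost in USD of sending `gb` gigabytes from Glacier to the internet."""
    if gb > TIERS[-1][0]:
        raise ValueError("no rate quoted beyond 150 TB")
    cost = Decimal("0")
    lower = 0
    for upper, rate in TIERS:
        if gb <= lower:
            break
        cost += (min(gb, upper) - lower) * rate   # bill only the slice in this tier
        lower = upper
    return cost


for gb in (1, 1001, 20000):
    print(gb, "GB ->", egress_cost(gb), "USD")
```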

    --
    "Sometimes a woman is a kind of religion, she can save your soul & set you free from all your sins" - Bad Examples
  10. what about redundancy? by SethJohnson · · Score: 2

    Good analysis here, Shanghai.

    In terms of the prediction of "$360/TB off a $30/TB investment", does that take into account redundancy to protect their liability for drive failure? I'm thinking they have at least two copies of everything a customer uploads. Maybe three. It's still great money, but I think the numbers are more like $360/TB off a $60/TB investment.
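
A sketch of how the number of redundant copies changes that ratio. The revenue and drive cost are the thread's estimates, and the copy counts are guesses, as the comment itself says.

```python
REVENUE_PER_TB = 360      # USD over three years, from the earlier estimate
DRIVE_COST_PER_TB = 30    # USD, from the earlier estimate


def return_multiple(copies: int) -> float:
    """Revenue per customer terabyte divided by the disk cost of all copies."""
    return REVENUE_PER_TB / (DRIVE_COST_PER_TB * copies)


for copies in (1, 2, 3):
    print(f"{copies} copies: ${DRIVE_COST_PER_TB * copies}/TB of disk, "
          f"{return_multiple(copies):.0f}x return")
```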

  11. Compared to tape by dshk · · Score: 1

    I thought that our tape backup system is luxury, for such a small company. Quite the contrary, it seems that tape is very cheap. Back of the envelope calculation: Our daily full backup is about 600 gigabytes. We are using 6 pieces of LTO-3 tapes for the last days and 1 for each month, plus 1 for each year. That is about 23 tapes in use. Total of 23×600GB is 13800GB, 138 dollar each month on Google Nearline, which is 1700$ per year. The total cost of the tape drive, the tapes and the SCSI adapter was less than 1700$. And I expect that to work for at least 5 years, not 1. That means that for backup tape is 80% cheaper. Of course deduplication would reduce the data amount to a few percentage of its current size. But then we would lose the plenty of redundancy we have with tapes. Google Nearline is offsite, that is good, or actually, that is required for backup. Offline copies are required too, and that is where the entire thing fails for this purpose. Google nearline is online storage from a backup point of view. In other words it cannot be used for backup. It can be part of a backup strategy, though. It could be good for saving backup copies of family photos, if the account password is managed very cautiously. Otherwise I do not see the use cases for this service, but I am sure there are some.
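
The back-of-the-envelope numbers above can be reproduced in a few lines. All inputs come from the comment; the exact yearly Nearline figure is $1656, which the comment rounds to $1700, so the saving comes out just under the quoted 80%.

```python
TAPES_IN_USE = 23
GB_PER_TAPE = 600               # size of one full backup
PRICE_CENTS_PER_GB_MONTH = 1    # Nearline storage price
TAPE_SYSTEM_COST = 1700         # USD: drive, tapes, and SCSI adapter (upper bound)
TAPE_LIFETIME_YEARS = 5         # the commenter's expectation

stored_gb = TAPES_IN_USE * GB_PER_TAPE
nearline_yearly = stored_gb * PRICE_CENTS_PER_GB_MONTH * 12 / 100   # USD per year
tape_yearly = TAPE_SYSTEM_COST / TAPE_LIFETIME_YEARS                # USD per year
saving = 1 - tape_yearly / nearline_yearly

print(f"{stored_gb} GB stored")
print(f"Nearline: ${nearline_yearly:.0f}/year, tape: ${tape_yearly:.0f}/year")
print(f"Tape is about {saving:.0%} cheaper")
```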