Google Nearline Delivers Some Serious Competition To Amazon Glacier
SpzToid writes Google is offering a new kind of data storage service – and revealing its cloud computing strategy against Amazon Web Services and Microsoft Azure. The company said on Wednesday that it would offer a service called Nearline, for non-essential data. Like an AWS product called Glacier, this storage costs just a penny a month per gigabyte. Microsoft's cheapest listed online storage is about 2.4 cents a gigabyte. While Glacier storage has a retrieval time of several hours, Google said Nearline data will be available in about three seconds. From the announcement: "Today, we're excited to introduce Google Cloud Storage Nearline, a simple, low-cost, fast-response storage service with quick data backup, retrieval and access. Many of you operate a tiered data storage and archival process, in which data moves from expensive online storage to offline cold storage. We know the value of having access to all of your data on demand, so Nearline enables you to easily backup and store limitless amounts of data at a very low cost and access it at any time in a matter of seconds."
Yeah I'd like some more meat to the story as well. Amazon Glacier achieves its pricing by using low-RPM consumer drives plugged into some sort of high-density backplanes; supposedly they are so densely packed that you can only spin up a few drives at once due to power and heat issues. Hence the delay.
I assume Google is doing something similar, maybe with somewhat better power or cooling since they're offering faster retrieval times which implies that perhaps they can spin up a higher percentage of drives at a time.
"Ladies and gentlemen, my killbot features Lotus Notes and a machine gun. It is the finest available."
A penny a month per gigabyte... that's $10/month per terabyte... that is already what Dropbox charges for "fast" storage. So what gives? Why would I pay $10/month for a terabyte of slow storage when I can get the same amount of storage for the same price in a regular, fast format with Dropbox?
I thought Glacier was a tape storage system.
Mercury delay lines.
How do they do it?
They use idle HDDs. The three seconds is the time it takes for them to spin up. Google pays less than $30/TB for HDDs ($120 for a 4TB HDD). If they charge $0.01/GB/Month that is $120/TB/year. If a drive lasts three years, they make $360/TB off a $30/TB investment.
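The arithmetic in that comment can be checked with a minimal sketch. It assumes decimal units (1 TB = 1000 GB) and takes the drive price, capacity, and three-year lifetime straight from the comment. None of these are confirmed Google figures, and the function name is made up for illustration.

```python
# Back-of-the-envelope check of the revenue estimate above.
# Assumptions: 1 TB = 1000 GB; price, drive cost, and lifetime are the
# commenter's guesses, not published figures.
PRICE_CENTS_PER_GB_MONTH = 1   # $0.01 per GB per month
DRIVE_COST_USD = 120           # one 4 TB drive
DRIVE_CAPACITY_TB = 4
DRIVE_LIFETIME_YEARS = 3


def revenue_per_tb_usd(years):
    """Storage revenue for one TB held for the given number of years."""
    cents = PRICE_CENTS_PER_GB_MONTH * 1000 * 12 * years
    return cents / 100


hardware_cost_per_tb_usd = DRIVE_COST_USD / DRIVE_CAPACITY_TB

print(revenue_per_tb_usd(1))                     # 120.0 per TB per year
print(revenue_per_tb_usd(DRIVE_LIFETIME_YEARS))  # 360.0 over the drive's life
print(hardware_cost_per_tb_usd)                  # 30.0 per TB of raw disk
```

This only covers the raw disk. Power, servers, networking, and redundant copies are left out.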
At 1 cent per GB per month, 4TB costs $40 per month, or about 1/5 of what an external 4TB USB 3.0 disk costs. As this is "nonessential" data, backup is optional. Sure, the external disk somehow needs to be connected to your server, and there are other factors, but doing this yourself seems to be a lot cheaper.
Yes, I know that if you do it yourself there is a cost for the person doing it as well, but you need to manage the cloud storage too, over a worse interface and with less control, and you cannot put anything confidential there (unless you are not bothered by various TLAs searching through it).
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
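The disk-versus-cloud comparison above comes down to a break-even calculation, sketched roughly here. The $200 disk price is only what the comment's "1/5" figure implies, not a quoted price, and the function names are made up for illustration. Power, admin time, and drive failures are ignored.

```python
import math

# Rough break-even sketch: after how many months does cloud storage
# cost as much as buying one external disk outright?
# Assumptions: 1 TB = 1000 GB; the $200 disk price is implied by the
# comment's "1/5" figure; running costs of the disk are ignored.


def monthly_cloud_cost_usd(capacity_gb, cents_per_gb_month=1):
    """Monthly bill for storing capacity_gb at the given per-GB price."""
    return capacity_gb * cents_per_gb_month / 100


def break_even_months(disk_price_usd, capacity_gb):
    """Months of cloud fees needed to equal the disk's purchase price."""
    return math.ceil(disk_price_usd / monthly_cloud_cost_usd(capacity_gb))


print(monthly_cloud_cost_usd(4000))   # 40.0 per month for 4 TB
print(break_even_months(200, 4000))   # 5 months
```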
I don't think they are spun down. That would kill their reliability. The 3 seconds is more likely to be network delay if the data is scattered across far-flung locations.
Keeping them spun up for rarely accessed data doesn't make a lot of sense to me. This is the sort of data you're probably not constantly accessing, like backups or other data archives. Keeping the hard drives spinning would also increase their power draw and heat.
Besides which, I don't know of any network that would cause a three-second lag. You can typically send data halfway around the world in a few hundred milliseconds. So, I'm not sure what else that delay would be.
Irony: Agile development has too much inertia to be abandoned now.
This service uses Google's normal hard drive architecture, but makes use of the fact that most of their drives have free capacity in terms of gigabytes while running out of I/O bandwidth (i.e., so many users are trying to read and write the data that if they filled the drives to capacity, not everyone would get good read and write performance).
This product basically fills the drive to capacity, but gives you the lowest priority for reads and writes. That is why it takes 3 seconds to read data, whereas a normal hard drive typically takes 20 milliseconds.
Overreacting much? You still have a choice on HOW you store your data in the cloud. I keep a backup of my personal data in Amazon S3, but I'm using Duplicity, which encrypts my data with GPG.
And solved for 30 years? Really? I don't recall having a backup service like this 30 years ago with such uptime, and certainly not in my own home.
...You are over-qualified and under-paid. If we give you a raise, we will break the cosmic balance of the universe.
Me neither. I have ~250 gigabytes of pictures to back up, and I wanted to do it offsite (they're our family memories). Before Glacier came along, I was looking at building NAS machines for my brother and me so that we could host each other's backup data. It would've worked, but what a PITA, and a lot of up-front expense. Glacier is easy and cheap: my AWS bill last month was $2.50. For that kind of money, it's hard to justify the time and expense of rolling my own remote NAS solution. (I know that over the long run I might be able to build the remote NAS solution for less money, but figure in electricity costs and potential drive replacements, and I'm not sure that solution would be that much cheaper. It would all depend on how long the drives last.)
git-annex and Amazon Glacier might serve you well. Encrypting your git/Glacier archive using your PGP key is a one-click-and-save option. With Google's recent announcement of Nearline, I imagine it will be supported over time as well. git-annex came about through a Kickstarter campaign, and you're welcome to support the project.
Here are some links to help you:
http://git-annex.branchable.co...
Specifically for Glacier:
http://git-annex.branchable.co...
You can't be ahead of the curve, if you're stuck in a loop.
Good analysis here, Shanghai.
In terms of the prediction of "$360/TB off a $30/TB investment", does that take into account redundancy to protect their liability for drive failure? I'm thinking they have at least two copies of everything a customer uploads. Maybe three. It's still great money, but I think the numbers are more like $360/TB off a $60/TB investment.
$5 / month hosted VPS on linux = awesome!
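The redundancy question raised above can be put into numbers with a small sketch. It reuses the thread's guesses of $360/TB in lifetime revenue and $30/TB for raw disk, and treats the number of copies as the unknown. The function names are made up for illustration, and nothing here reflects Google's actual replication scheme.

```python
# How the revenue-to-hardware ratio shrinks as more copies are kept.
# Assumptions: $360/TB lifetime revenue and $30/TB raw disk cost are the
# thread's guesses; the replication factor is unknown.
REVENUE_PER_TB_USD = 360
RAW_COST_PER_TB_USD = 30


def hardware_cost_per_tb_usd(copies):
    """Disk cost per customer TB when every byte is stored `copies` times."""
    return RAW_COST_PER_TB_USD * copies


def revenue_to_cost_ratio(copies):
    """Lifetime revenue divided by disk cost for the given replication."""
    return REVENUE_PER_TB_USD / hardware_cost_per_tb_usd(copies)


for copies in (1, 2, 3):
    print(copies, hardware_cost_per_tb_usd(copies), revenue_to_cost_ratio(copies))
# 1 30 12.0
# 2 60 6.0
# 3 90 4.0
```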