Google Prefers DRAM to Hard Disks
KP writes: "I came across this interview with Google's CEO. A very interesting read." It's interesting in part because that CEO (Eric Schmidt) claims that for Google's purposes, "it costs less money and it is more efficient to use DRAM as storage as opposed to hard disks." "I still cannot figure out how he says storing data on DRAM is cheaper than storing it on hard-disks. Maybe, if you buy in bulk?"
In the hallowed halls of Google... Row upon row of uber-boxen with a Bagillion megabytes of ram...
Then someone trips over the power chord...
-- Dan =)
I think it's only cheaper on a cost-versus-speed basis. I am sure the Google archive is only a few hundred GB, and that's not too much to buy in RAM. A hard disk would be cheaper but a lot slower, costing the company extra money in the long run.
Cruise TT
I was always wondering how Google could mirror almost the entire internet and serve millions of hits, I mean, it would need super super super fast storage... DRAM is at least a step in that direction... They must have a fsckin LOT of it tho :) A few TB...
How often do you see DRAM fail compared to Hard Disks? A bit more reliability IMHO.
If Google has something like 10,000 Linux PCs, I would definitely think that using RAM and a ramdisk for the root partition would be cheaper than putting a hard drive in every PC. I would imagine that the hard drives would be the first to go if something failed.
Obviously, if they used DRAM for their HUGE central databases, it would not be a cheaper solution.
But, I'm talking out of my ass, because I don't know how their datacenter works.. anyone anyone?
-metric
They make their money on hits served, so speed matters far more than the cost of the storage medium. If they can speed up serving hits, they're ahead of the game.
I still cannot figure out how he says storing data on DRAM is cheaper than storing it on hard-disks. Maybe, if you buy in bulk?
When you pay for DRAM, you get read latency measured in nanoseconds rather than milliseconds, which lets you get more queries done faster with less processing hardware. The key metric here is seeks per second. From the article:
With a rotating disk, if you wanted to access a million different pieces of data, you would have to either wait for a million seeks or set up a 1,000-way mirror and wait for 1,000 seeks. Because DRAM seeks several orders of magnitude more quickly, you don't need as many mirrors of the data to get the same number of seeks per second.
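The seeks-per-second argument can be put into a rough calculation. This is only a sketch: the per-copy access rates (about 100 random seeks per second for a disk, about 10 million per second for DRAM) are illustrative assumptions, not figures from the interview, and `copies_needed` is a made-up helper name.

```python
def copies_needed(target_rate: int, per_copy_rate: int) -> int:
    """Copies (mirrors) of the data needed to reach a target random-access
    rate, assuming each copy serves one seek at a time."""
    return -(-target_rate // per_copy_rate)  # ceiling division on integers

# Assumed rates, for illustration only.
DISK_SEEKS_PER_SEC = 100         # roughly 10 ms per random disk seek
DRAM_SEEKS_PER_SEC = 10_000_000  # roughly 100 ns per random DRAM access

target = 1_000_000  # one million random lookups per second

disk_copies = copies_needed(target, DISK_SEEKS_PER_SEC)
dram_copies = copies_needed(target, DRAM_SEEKS_PER_SEC)

print("disk mirrors needed:", disk_copies)
print("DRAM copies needed:", dram_copies)
```

Under these assumptions a single DRAM copy covers a load that would take thousands of disk mirrors, which is the sense in which the more expensive medium can come out cheaper.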
Will I retire or break 10K?
AFAIK Linux and OpenBSD cannot do this either. It seems amazing to me that people have missed this idea.
Google reads all the newspapers on the Web every hour and constructs a newspaper for the world by computer--no humans are involved.
Now if only Google could go out and do its own fact-checking, it wouldn't need to rely on other newspapers at all. Mark my words, by 2010 google will be the only place you go when you need information. Forget askjeeves, try listentogoogle. No humans will be involved. Scary.
By the way, this guy can't speak for beans.
The speech I give everyday is: "This is what we do. Is what you are doing consistent with that, and does it change the world?"
I often see comments like this from people who have little experience in business.
What you pay for the initial product is not what it "costs" in the long-term. Businesses have a term for this called TCO or Total Cost of Ownership. It includes all the other time and materials needed to keep the item in use.
I would imagine in this case that the simple reason is that while DRAM is more expensive to purchase it is a *lot* less expensive to run, the primary cost being power.
Also consider that if speed is of the essence, as it is with Google, it's not 50GB of RAM vs. a 50GB cheap-n-cheerful IDE drive. A 50GB Ultra160 drive costs considerably more than an IDE drive and still won't come near the DRAM for speed.
[)amien
That it can handle many clients with little latency... You'd have to duplicate the data across a huge number of disks to provide similar response time to clients. Sure, if you were the only client, you couldn't tell the difference but with thousands upon thousands of clients all seeking data that would be stored in different locations on a disk things would quickly grind to a halt. Because so much unrelated data is being requested, seek time is the key. Sure, memory is more expensive per meg but its ability to serve so many more clients makes it less expensive overall.
I had an opportunity to play with one on a 20 CPU Starfire domain and it was pretty impressive. The unit I was using had 8 wide SCSI ports on it, which were all connected. Interestingly, when the system was pegged, it was off the scale in system time. There's probably a locking problem in the Solaris kernel that's the real bottleneck.
This just shows how limited the lifespan is of 32-bit 4GB architecture, especially for servers.
At my dad's work, they use a type of chip, but it's not dram. They use E^2prom. True, you do take a performance hit, but they have 10 "gig ethernet ports" on the thing. The last price quote I got was $12000 for a terabyte of this stuff. Don't forget to compare price/performance ratios to the best chipsets of IDE (or if you're a scsi bigot, SCSI). Pulling random data is very easy for chips, but HD's of ANY speed and quality are still slower.
Josh Crawley
If they made a 2GB RAM Drive in each of their 10,000 machines then that would be 20 TB of storage. This seems sufficient to me for most storage needs.
You would still need to be able to direct searches to the machines that have the part of the data you need. This would take a high speed network and some clever programming. But it is doable.
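One way to picture "directing searches to the machines that have the part of the data you need" is simple hash partitioning. This is a sketch of the general idea only, not a description of how Google actually does it; the machine count and RAM size come from the post above, while `machine_for` and the choice of CRC32 are assumptions made for illustration.

```python
import zlib

NUM_MACHINES = 10_000    # figure from the post
RAM_PER_MACHINE_GB = 2   # figure from the post

def machine_for(term: str, num_machines: int = NUM_MACHINES) -> int:
    """Map a search term to the machine assumed to hold its slice of the index."""
    return zlib.crc32(term.encode("utf-8")) % num_machines

# Aggregate RAM across the cluster, in decimal terabytes.
total_tb = NUM_MACHINES * RAM_PER_MACHINE_GB / 1000

print("aggregate RAM:", total_tb, "TB")
for term in ["dram", "hard disk", "google"]:
    print(term, "-> machine", machine_for(term))
```

The same term always lands on the same machine, so a front end only needs the hash function, not a lookup table, to know where to send a query.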
I always was amazed at the speed of Google's search engine, now I have a little more clue as to why it is so fast.
Sounds to me like they might be able to sell their database software as a money making product at some point. Oracle, watch out!
-- Never make a general statement.
See The Five-Minute Rule, ten years later (Word Doc) or its HTML-ified Google Cache
Reasonably priced DRAM goes for about $250/gig; a reasonably priced SCSI RAID setup goes for about $10/gig.
In order to say that the DRAM option is cheaper than the hard drive option, the performance of the DRAM option would have to exceed the performance of the hard drive option by a factor of greater than 25. If you do the math, it's possible.
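The factor of 25 is just the ratio of the two quoted prices. A small sketch, using the per-gigabyte prices from the post; `dram_is_cheaper` is a hypothetical helper, and "speedup" here loosely means how many times more work a gigabyte of DRAM serves than a gigabyte of disk.

```python
DRAM_PER_GB = 250.0  # $/GB, from the post
DISK_PER_GB = 10.0   # $/GB for a SCSI RAID setup, from the post

# DRAM has to do this many times more work per GB to break even on cost.
break_even = DRAM_PER_GB / DISK_PER_GB

def dram_is_cheaper(speedup: float) -> bool:
    """True if DRAM wins on cost per unit of performance at the given speedup."""
    return speedup > break_even

print("break-even speedup:", break_even)
print("DRAM cheaper at 10x? ", dram_is_cheaper(10))
print("DRAM cheaper at 100x?", dram_is_cheaper(100))
```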
Years ago, I worked in a VAX shop that used RAM drives for some installed/shared images that required high concurrency. The performance was impressive - and was factored into the overall cost analysis of the purchase.
Maybe he's talking in terms of TCO (total cost of ownership). Over its lifetime, RAM costs less than its hard drive counterpart?
Another point... as long as you don't store your METADATA 100% in RAM, you can store at least your data (cached web pages) in RAM. What happens if it gets dumped? Simple. Just respider the pages you lost and go on. Small amounts of data loss can be covered.
Okay. It may sound like I'm talking out of my ass because I am. It is really hard to cover for a statement like that. But let's talk again on the performance angle that has been covered (but with a little more emphasis on RAID disks).
You *may* be able to get better cost/performance with LOCAL memory (not ram-based drives) than you could with a RAID array. And a raid array could never equal the performance you get with local memory. Of course, local memory could never reach the storage you achieve with a raid array. So these two paths seem to diverge (bulk storage vs speed) when comparing local DRAM to RAID'd disks.
His statement MAY make sense, but it would have to be put into a larger context. (RAM is better than disk in X circumstances.)
Today new computers have 256 or 512 MB of RAM; that's what we had 10 years ago (386-486 era). Every day RAM gets cheaper, and IMHO a spinning disk fails too often and is far too slow to work with on overloaded servers. RAM provides almost instant access to data and doesn't fail the way a hard disk does.
I hope soon we'll only use some kind of RAM for everything and not a disk.
less mirrors = less computers = less space
real estate is expensive.
The masses are the crack whores of religion.
The major advantage DRAM has over hard drives in Google is that when the machine reboots the memory will be cleared and then it will go scan pages again. No need to save what was in memory the previous time. Good idea for google bad idea for accounting software.
...has an article on this very subject. The listed article "How to hack from a RAM disk" is what you're looking for.
The simple truth is that interstellar distances will not fit into the human imagination
- Douglas Adams
DRAM is probably much cheaper than hard drives in the sense of their electricity bill. Think of how many nodes their clusters have and then imagine each of them each having at least two hard drive motors spinning 24/7.
More often than not with a database your bottleneck is I/O. When you run a database you cannot have enough disks, and you cannot have enough FAST disks. In order to accomplish the kind of I/O bandwidth that a place like Google is going to need you're going to need the best EMC arrays (or perhaps an IBM Shark) money can buy. And guess what? They run you megabucks. You can't just take a bunch of SCSI disks and expect them to perform as well as Fibre Channel arrays. You gotta have controllers with multiple caches. Everyone who's never dealt with databases thinks that SCSI is the beginning and the end of hard drives, and it's so far from being the truth it's not funny.
I've really no idea how complex the queries are or whether or not they use a relational database, but that being said it still has to hit the disk to retrieve the data and that's where every decently designed database's bottleneck is. Besides, Google caches all its pages. Egads! Do you have any idea how much RAM they must need for just that alone? Yes, RAM is faster. Oracle even teaches you to try to keep your frequently used tables in cache anyhow, because it's fastest; of course they qualify that with the word "small," realizing that most people don't have the gobs of memory needed to cache large tables.
Although it's not mentioned in the Slashdot writeup, I think that probably the most important part of this interview was the discussion of Google's business model and future. It's good to see that they're committed to not getting in over their heads with extraneous services. They've found a business model that works and they're sticking to it, rather than getting greedy and adding dumb new services that have nothing to do with searching, or "search," as he put it.
A lot of technology companies would do very well to follow Google's example, it seems to me. They're proving that Internet services are a perfectly sound venture if the company has a sensible business model and always keeps focused on providing quality technology and services in the area that they know best.
Lots of other posters have mentioned pieces of the puzzle, so I risk being redundant here. But, it seems the whole equation goes something like this:
1. If each box only handles a part of the web, it is possible that most of the space on its drive (or drives) is wasted anyway.
2. If disk latency means that cpus spend idle time, eliminating that latency means more throughput per box, hence fewer boxes. More money spent on DRAM, less money spent on CPU, power supplies, etc.
3. Even with same number of boxes, lower power draw, smaller and/or fewer UPS(s) required. With fewer boxes, even more reduction.
4. Which leads, of course, to lower A/C bills during the warm weather.
5. Fewer boxes, fewer pieces, whatever, means fewer things breaking. The impact of a single outage may be greater, but, from the cost standpoint, you need fewer man-hours to manage the outages, fewer spare-parts, etc.
6. Lower medical expenses from sysadmins going insane due to the noise from all those drives and the associated larger power supplies and extra cooling fans.
OK, that last item is a stretch, but how many sysadmins are more than a step from insanity anyway?
Another service that takes advantage of recency is something we just added called Overview of Today's Headlines. Google reads all the newspapers on the Web every hour and constructs a newspaper for the world by computer--no humans are involved.
This is a pretty cool idea. I only hope they make an RSS feed out of it so that I can use it in my company's new Portal environment. That would be really great! I love Google!
Check it out here.
KangarooBox - We make IT simple!
Additionally DRAM has a much longer time between failures than hard drives do; so maintenance costs are lower.
The throughput from RAM-RAM is on a totally different order of magnitude than HDD. The read-time alone makes RAM more "economical" than HDD (at current memory costs). If Google were to switch to HDD, then they would need one copy of their entire DB for each search - which would mean thousands of copies of their DB. With RAM they only need a few copies - making the total cost lower with RAM.
DRAM requires little electricity and produces almost no heat.
Hard disks consume large amounts of electricity, and produce large amounts of heat, since they consist of pieces of metal spinning at 7200rpm.
Using DRAM costs quite a bit more upfront, but uses less electricity and requires fewer chillers, condensers, etc. to keep cool.
Conformity is the jailer of freedom and enemy of growth. -JFK
What was the search string, so the rest of us can slay (slashdot) the mighty google.
Anyone remember the Tandy 1000 that had MS-DOS and Deskmate in ROM? :)
BytesTemplar.com
Individually, the mean time between failures for a brick isn't that bad, but when you get enough of them, it's a constant drain on the pocket and on person-hours.
-Eldurbarn
images.google
catalogs.google
groups.google
the catalog part is still in beta, but it's really amazing. when you do a search it actually highlights the words within what appear to be images. really cool. i could see how the three above could easily up their capacity to 50tb.
-- john
i would imagine they have backups of some sort. even if its just dram rsyncing across the internet.
-- john
...but they'll get a million times better as soon as they'll allow boolean searches. Man sometimes it's frustrating!!
It's not a fair comparison to put 1GB worth of DRAM on one side of the scale, and 1GB worth of physical storage on the other. The hard disk will obviously come out to be the cheaper of the two. However, to a company like Google, who undoubtedly uses RAID technology for storage, you're effectively not getting the same "bang for your buck" as you would with a JBOD array. In order to have 1TB worth of DRAM on a scale next to 1TB of physical storage, you're going to have to amass like 2TB of storage on the plate in order to have just the 1TB worth of usable free space.
Mind you, that's not to say that RAID is a bad technology..heh, hardly. It's just that you can't make a 1 to 1 comparison from DRAM to physical without taking into account the storage methods employed by each.
Cheers
Bowie J. Poag
Every piece of drivel that you spew forth and put on the web is going to be permanently enshrined in its own little piece of DRAM at Google (probably including this stupid comment). Each bit of each and every word ever put on the web is destined to be endlessly and pointlessly refreshed every few milliseconds, expending its own minuscule amount of energy and waiting in vain for that one stray alpha particle to cause a soft error and finally put it out of its misery. It seems like something Andy Warhol would have predicted.
Hard drive latency is too high. If they used hard drives the machines would be sitting there most of the time waiting for the drive to find things.
The sound a Mac makes when you turn it on.
See that "mature content filter"?
How about a "mature content ONLY search"?
********* sig: If you don't like the law, get filthy stinking rich, and buy a better one.
Macs have a studied and proven lower TCO than Windows PCs. But the ability to buy some POS sub $1000 box rules over everything.
One would have expected /. nerds could do better at price comparisons than what we have seen so far.
Quick, what is a better price a 1994 Ford Fiesta at $10,000 or a brand new Ferrari at $12,000?
Clearly the Ferrari is a better deal. To do a proper price comparison you have to look beyond the sticker price alone.
What is the performance you get? resale value? maintenance cost? operation costs?
If all you wanted to buy is megabytes of storage you would be better off buying backup tapes. They are hard to beat price wise.
But in all likelihood you need to store that data for some purpose, so depending on frequency of access, latency, total cost of operation (tapes are operator/robot mounted), alternative solutions with higher sticker price, might well end up being cheaper.
What Eric Schmidt claims is that if you have a ton of data and you are accessing it all the time DRAM is more cost effective than (a) a large mirrored RAID array server or (b) a zillion tapes being mounted by operators.
...and they also do it faster but Newsnow isn't nearly as big or as popular as Google. They also seem to aggregate more sources than google (slashdot is aggregated for example).
Recently I was fortunate enough to be able to play with (test) some RAMdisk products from a company called Platypus Technologies (do a Google search for platypus linux) on Solaris workstations and servers. And of course I just had to try them out on the Slackware boxes too.
These Platypus drives are PCI cards and have dual power source ability; they plug into the wall as a secondary supply and get power off the PCI bus as primary. Very cool to be able to shut down the machine to do whatever and still have your RAMdrive ready to go upon boot. Feature wise, they use expensive RAM and the manufacturer strongly suggests you not just grab any ole ECC to stick in the card but order from them (probably has to do with the grade of RAM they use in their cards.)
Performance was absolutely unreal: more than twice the speed of SCSI, in fact, practically as fast as the PCI bus in the machine will allow. I used the cards briefly while doing a small database conversion project and was totally bummed when I had to send the RAMdrives home. *sniff*
If you have to do anything requiring lots of I/O (like database,) you _really_ do want one of these things or something like it.
Cost-wise they are a little spendy up front (even when compared to a SCSI setup with controller and drives) but if you are at all measuring time, then everything else loses the comparison; if you are measuring lost data on dead drives, the time required to make many redundant backups to avoid lost data on dead drives, the time required to shut down and swap out dead drives, etc. -- RAM wins! Just be sure to factor in the cost of quality UPS units because they truly are part of the cost (read: necessary).
Hook up a Qikdrive2 with one GB RAM, plug it into your UPS, make sure it gets backed up to the hard drive regularly (plenty of tools to do that) and I promise you that you will not want to be without one. If you have the resources, get one of the big ones (6 or 8 GB RAM, I forget.) Look on CDW, search Platypus for prices. The Platypus site has links to purchasing sites.
As always, be sure drivers/modules are available which will work for you. Ack, I'm rambling.
Everything in the Universe sucks: It's the law!
...
"I love my job, but I hate talking to people like you" (Freddie Mercury)
I would wager that by the time Google recovers the overall cost of buying this, hard drives will have advanced to the point where they are on par with today's RAM...
That's a great calculation, but just figures the space needed for caching the raw data.
What about the indexes required to actually access that data in a timely manner? Once you factor in the extra stuff needed to actually make it a viable search engine, you could easily imagine a PB or more of storage being required.
As for the other poster going on about compressing the data - I doubt they'd want to compress the data when all they are concerned about is raw speed of processing requests!
.
"There is more worth loving than we have strength to love." - Brian Jay Stanley
Let's assume that Google needs 100 TB of data. Possibly not correct, but probably not off by more than an order of magnitude either high or low.
Let's just take a look at Sharky's RAM price guide, and we see that a 512 MB module costs about $75, or $125 if it's ECC. So one gig of RAM costs between $150 and $250.
Assuming they used some sort of non-standard computer system that supports vast quantities of ram (so the system price is almost entirely dependent on RAM prices) then we find that one TB costs about $200,000 or $300,000. This assumes that a box which can hold 1 TB of ram (2,000 of the 512 mb modules) costs about $50,000. Perhaps not beyond reason. Maybe it costs more, but once again it should be within an order of magnitude (no more than a million $ or so).
If they have 100 TB of stuff they need to store then that comes to a grand total of $30,000,000 to store it in ECC dram. Not unreasonable.
Of course, if the database size is only about 10 TB, then the total cost is more like $3,000,000, which is pennies for Google (probably). Basically, RAM is not so expensive that huge quantities of data cannot be stored in it, if one is determined.
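The estimate above can be reproduced in a few lines. All inputs are the poster's own guesses (module prices, 2,000 modules per terabyte, a $50,000 chassis); the function names are made up for this sketch.

```python
MODULES_PER_TB = 2_000   # 512 MB modules per terabyte, as stated in the post
BOX_COST = 50_000        # guessed cost of a chassis that holds 1 TB of RAM
NON_ECC_PRICE = 75       # $ per 512 MB module
ECC_PRICE = 125          # $ per 512 MB ECC module

def cost_per_tb(module_price: int) -> int:
    """Cost of one terabyte of RAM including its (hypothetical) chassis."""
    return MODULES_PER_TB * module_price + BOX_COST

def total_cost(terabytes: int, module_price: int) -> int:
    """Cost of holding the given number of terabytes entirely in RAM."""
    return terabytes * cost_per_tb(module_price)

print("per TB, non-ECC:", cost_per_tb(NON_ECC_PRICE))
print("per TB, ECC:    ", cost_per_tb(ECC_PRICE))
print("100 TB, ECC:    ", total_cost(100, ECC_PRICE))
print("10 TB, ECC:     ", total_cost(10, ECC_PRICE))
```

Changing the data-size guess by an order of magnitude in either direction moves the total by the same factor, which is the sensitivity the post is pointing at.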
In addition, the power dissipation would be very low, fewer power supplies, fewer servers of every sort, etc.... Do you think you could build a massive fiber channel RAID array that would serve Google's needs for $3-30 million?
My $.02
Tyler Ward
tjw19@columbia.edu
Fixed head hard drives have no seek time, since tracks have a many to many relationship to heads. That's also why you can't get them at compusa. ( expensive )
I still cannot figure out how he says storing data on DRAM is cheaper than storing it on hard-disks. Maybe, if you buy in bulk?"
Does anyone remember the SNL skit concerning a bank which specialized in making change?
"From a dollar, you can get 20 nickels. You could get 10 nickels and 50 pennies--if you want. How do we do this...? Volume!"
That's what I thought of.
this means that we all will have google (and suchlike) to thank in a few years when we're all using computers with no moving parts. :)
Also, Google's searchable data is considerably smaller than the total size of the pages searched, even excluding the images. Read their white papers. And I doubt that they store the cached pages and images in DRAM. Those don't get hit that often.
Cranking the amp too loud is bad for the computers. Or did you mean power cord ?
--JoeProgram Intellivision!
a least completely.
I had the same question myself over the years. Especially recently, as memory prices dropped through the floor.
Linux has the option of loading itself into a ramdrive, and that's great. But why not Windows 98 or ME? Is it because it was technically hard, or was it instead that the concept was too alien to the developers? (One ALWAYS uses disk! Don't bother me!)
RAM is faster -- always. I realize that you can't live off of RAM alone, but at the very least the swap file shouldn't be on disk. I've spent too much time in the past ten years listening to hard drives slice meat as I waited for Windows to move pages off of and into RAM.
Well, if XP provides the option, fine. But I won't use XP. Don't like subscription OSes. Maybe the 2K version permits it. I'll try.
Wonder how much of computing is just bad habits?
Many of you are comparing DRAM to HDs solely on an overall price scheme (DRAM is $150/GB and HDs are $3/GB). Some of you have taken this a step further and compared things based on cost (DRAM may be 50x more expensive, but it requires 1/10th the power, so over a period of time, the DRAM will wind up costing less). Ultimately, anyone with a good sense of business will look at the perceived value or return on investment (ROI) of the proposed solution over a period of time considering the time-value of money. This is called a net present value (NPV).
::Colz Grigor
In order to achieve the lowest possible NPV, a high-tech financier will break their disparate technologies into a common measure and place a value on the item utilizing that measure. In other words, they'll compare DRAM and hard drives on a price per performance basis.
At Google, I'm positive they're far more interested in latency (gotta get the fastest hit times, right?), so the calculation they use to compare disparate technologies will likely be price per gigabyte cross latency. Since the latency on DRAM is much less than the latency on HDs, the 150x price suddenly flips, and we find that DRAM is 6x more valuable than HDs.
But they've still got to put that into a spreadsheet and add up all the associated costs for each solution (including maintenance, power, expected failure rates and costs associated with the failure (including costs associated to the loss of information stored on volatile memory), etc.) over a period of time.
It's an extremely complex calculation, from a business perspective, so I doubt that Mr. Schmidt has his head up his ass.
For Google's purposes and given Google's attitudes toward generating a ROI, DRAM costs less than HDs. This does not mean that the same would be true for Akamai or NetworkAppliance or you.
--
In many clusters today like KLAT2 they only use hard drives for the root nodes, and the other 98% of nodes use 2GIG of ram.
This saves you at least $150 per slave node by not buying a hard drive, thousands from having to deal with fewer hard drive failures, and access times are orders of magnitude better.
Let's do the math. 512MB of PC133 on Pricewatch today was $67. For 2GB of RAM that comes out to $268 per node. For a terabyte: 2 modules per GB * $67 * 1000 GB = $134,000.
That blows my mind. A small research lab can now own a terabyte of PC133 for under $150,000. Man, do I feel old.
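The same arithmetic as a quick script, using the single quoted price of $67 per 512 MB module; `ram_cost` is a hypothetical helper and a terabyte is taken as 1000 GB, as in the post.

```python
PRICE_PER_512MB = 67   # $ per 512 MB PC133 module, from the post
MODULES_PER_GB = 2     # two 512 MB modules make one gigabyte

def ram_cost(gigabytes: int) -> int:
    """Dollar cost of the given amount of RAM at the quoted module price."""
    return gigabytes * MODULES_PER_GB * PRICE_PER_512MB

per_node = ram_cost(2)         # a 2 GB slave node
per_terabyte = ram_cost(1000)  # 1 TB counted as 1000 GB

print("per 2 GB node:", per_node)
print("per terabyte: ", per_terabyte)
```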
bash-2.04$
bash-2.04$yes "Don't you hate dialup connections?"| write USERNAME
I'd sure like to be the salesman selling Google their hardware. 10,000 RAM heavy servers, KA-CHING, KA-CHING, KA-CHING, KA-CHING, KA-CHING! My eyes are filled with dollar signs of massive commissions.
What the article doesn't point out is how they are doing the RAM thing. Are they buying solid state drives (which physically look like hard drives, but are nothing but RAM) or are they just cramming RAM into the servers so the database and its data are all in RAM? That's common to do with databases for performance.
The cost differential between RAM and disk has been eroding for some time, particularly if you compare RAM with SCSI disk. While the price of IDE had dropped, SCSI is still premium priced for the business market, even though there is no reason why a SCSI controller should cost a cent more than IDE.
An 80GB SCSI-160 drive costs $800; RAM costs $150 for a 512MB DIMM. So disk costs $10 per GB compared to $300 per GB for RAM.
The problem with the raw comparison is that you still need a lot of RAM to service a large disk, caching etc. There is also a limit on the amount of disk data one CPU can effectively manage. From experience I can assure folk that that limit is certainly less than 80GB if the lookups are frequent!
So when you add the cost of a CPU and box into the equation the RAM solution is going to look much better. I doubt that a single CPU could effectively manage more than 4GB of disk data, but 4GB of RAM data is quite viable. And you probably need at least 1GB of RAM to support the disk data in any case, so the all-RAM solution looks good.
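A rough per-node comparison shows how the gap closes once the box is counted. Prices for the drive and DIMMs come from the post; `BOX_COST` is a made-up placeholder for CPU plus chassis, and the sketch assumes the whole $800 drive has to be bought even though only about 4 GB of it is usefully served per CPU.

```python
DRIVE_PRICE = 800      # 80 GB SCSI-160 drive, from the post
DRIVE_GB = 80
DIMM_PRICE = 150       # 512 MB DIMM, from the post
DIMMS_PER_GB = 2
BOX_COST = 1_000       # hypothetical CPU + chassis cost, not from the post

raw_disk_per_gb = DRIVE_PRICE / DRIVE_GB       # raw $/GB for disk
raw_ram_per_gb = DIMM_PRICE * DIMMS_PER_GB     # raw $/GB for RAM
raw_ratio = raw_ram_per_gb / raw_disk_per_gb   # how much dearer RAM looks

# Node serving ~4 GB of hot data from disk: box + drive + 1 GB RAM for caching.
disk_node = BOX_COST + DRIVE_PRICE + 1 * DIMMS_PER_GB * DIMM_PRICE
# Node serving 4 GB of hot data from RAM only: box + 4 GB RAM.
ram_node = BOX_COST + 4 * DIMMS_PER_GB * DIMM_PRICE

print("raw price ratio RAM/disk:", raw_ratio)
print("disk-backed node:", disk_node)
print("all-RAM node:    ", ram_node)
```

With these inputs the thirty-fold raw price gap shrinks to a few percent per node, before any difference in throughput is counted.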
For most database applications RAM wins hands down. On top of the cost of the disk you have to count on
The main problem for the RAM route is getting persistence on transactions. So you need some secondary storage in case of power failure or disaster. This could be tape, but ironically disk is cheaper to run these days than almost all tape systems. A 40Gb cartridge for a tape drive can easily cost $150, which is more than an IDE disk drive that outperforms on practically every level (probably even longevity).
The key is that you use your secondary storage to write out the transaction log, you don't attempt to maintain the data structure on disk like SQL databases do. For high reliability you use a complete duplicate of the system to provide your first level backup with disaster recovery at a remote site.
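The "data in RAM, transaction log on secondary storage" idea can be sketched as a tiny key-value store. This is an illustration of the general technique only, with a made-up `LoggedStore` class and a JSON-lines log format chosen for brevity; it is not taken from any particular database.

```python
import json
import os
import tempfile

class LoggedStore:
    """Keeps all data in an in-memory dict; every write is appended to a
    log file first so the dict can be rebuilt after a crash or power loss."""

    def __init__(self, log_path: str):
        self.log_path = log_path
        self.data = {}
        self._replay()                       # rebuild RAM state from the log
        self._log = open(log_path, "a", encoding="utf-8")

    def _replay(self) -> None:
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path, encoding="utf-8") as f:
            for line in f:
                key, value = json.loads(line)
                self.data[key] = value

    def put(self, key: str, value) -> None:
        self._log.write(json.dumps([key, value]) + "\n")
        self._log.flush()
        os.fsync(self._log.fileno())         # force the record to stable storage
        self.data[key] = value               # only then update the RAM copy

    def get(self, key: str):
        return self.data.get(key)            # reads never touch the disk

    def close(self) -> None:
        self._log.close()

# Usage: write, "crash" (close), then recover purely from the log.
path = os.path.join(tempfile.mkdtemp(), "txn.log")
store = LoggedStore(path)
store.put("page:1", "cached text")
store.put("page:1", "updated text")
store.close()

recovered = LoggedStore(path)
value_after_recovery = recovered.get("page:1")
recovered.close()
print(value_after_recovery)
```

Writes to the log are sequential appends, which is the access pattern disks handle best, while all random reads are served from memory.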
Looking for an Information Security student project suggestion?
Try http://dotcrimeManifesto.com/
The difference in power consumption is huge between HDD's and RAM. When you consider that modern memory uses something along the lines of up to 600mW at 1V-5V per DIMM at most, it is a lot cheaper than buying a data warehouse of HDD's.
Google will also likely break their technology into three components:
spidering and indexing
searching
caching
::Colz Grigor
Each of the financial analysts for the business groups responsible for each aspect of Google's technology may calculate the value of DRAM vs. HD differently. For searching, latency is extremely critical, but it's not so critical for caching, and there may be some physical problems with solely using DRAM for indexing.
That being said, I would expect Google to use HDs for spidering and indexing, DRAM for searching, and HDs for caching. Mr. Schmidt was probably only discussing technology on the most visible component of Google's technologies: searching.
The average disk can sustain between 100 and 200 IOPS, while the average memory module can sustain about 10,000,000 IOPS (100ns latency). At $120/disk, this works out to $0.60-$1.20 per IOPS, and at $250/GB for memory, this works out to $0.000025 per IOPS.
Google currently claims to index about 2G pages. If one assumes on average each page is 4KB, and that the inverted index takes half the space of the original text, then this means 4TB of index. 4TB of RAM at $250/GB is 4K memory modules for $1M. Assuming their motherboards can hold 2GB each, this means 2K machines at perhaps $120 each for another $250K. Now, those 4K memory modules on 2K motherboards can sustain something like 40G IOPs. $1M of disk is roughly 8K disks for 1.6M IOPs. In a real system, load is never evenly distributed so you are almost never able to approach the theoretical limit.
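The figures in this estimate can be recomputed from the stated inputs. The sketch below assumes 1 GB memory modules (implied by "4K memory modules" for 4TB), decimal units, and the upper 200 IOPS figure for the disk fleet; dividing $120 by 100-200 IOPS gives $0.60 to $1.20 per IOPS for disk.

```python
DISK_PRICE = 120               # $ per disk
DISK_IOPS = (100, 200)         # sustained random I/Os per second, low/high
MODULE_PRICE = 250             # $ per 1 GB memory module (assumed module size)
MODULE_IOPS = 10_000_000       # 100 ns per access

disk_cost_per_iops = [DISK_PRICE / rate for rate in DISK_IOPS]
mem_cost_per_iops = MODULE_PRICE / MODULE_IOPS

pages = 2_000_000_000          # indexed pages
page_kb = 4                    # assumed average page size
index_tb = pages * page_kb // 2 // 10**9   # index is half the text size

modules = index_tb * 1000      # one module per GB
ram_cost = modules * MODULE_PRICE
ram_iops = modules * MODULE_IOPS

disks = 1_000_000 // DISK_PRICE            # what $1M buys in disks
disk_iops = disks * DISK_IOPS[1]

print("disk $/IOPS (low, high rate):", disk_cost_per_iops)
print("memory $/IOPS:", mem_cost_per_iops)
print("index size (TB):", index_tb)
print("RAM cost ($):", ram_cost, " RAM IOPS:", ram_iops)
print("disks for $1M:", disks, " disk IOPS:", disk_iops)
```

For the same million dollars, the memory configuration comes out roughly four orders of magnitude ahead in random I/O capacity, which is the core of the argument.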
For more details on the (original) Google implementation, please see The anatomy of a large-scale hypertextual web search engine , by Sergey Brin and Larry Page.
From dim memory, to do a search, you need to:
Each of the lookups (dictionary, inverted index, page rank, document) is a random access (IOP). So, to make a long story short, memory is cheaper for Google because throughput (and latency) is critical to their business and their access patterns are generally random and the cost of enough memory to hold the index is less than the comparable cost of enough disk to support the IO rates they require.
Cheers,
Carl Staelin
Does NetBSD run on IBM big iron? If not, there's always Sun kit with NetBSD.They don't have to use Linux (or DOS 4.0).
Or maybe Google are stupid?
Sent from my ASR33 using ASCII
Assuming 80GB drives each drawing 40 watts of power, and electricity rates at $0.20/kWh, you're looking at an annual power cost of less than $1 per gigabyte of spinning disk storage. That hardly accounts for the difference.
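The power estimate works out as follows, using only the numbers given in the post and a 365-day year.

```python
DRIVE_WATTS = 40        # power draw per drive, from the post
DRIVE_GB = 80           # capacity per drive, from the post
RATE_PER_KWH = 0.20     # $ per kWh, from the post
HOURS_PER_YEAR = 24 * 365

kwh_per_year = DRIVE_WATTS * HOURS_PER_YEAR / 1000   # energy per drive per year
cost_per_drive = kwh_per_year * RATE_PER_KWH         # $ per drive per year
cost_per_gb = cost_per_drive / DRIVE_GB              # $ per GB per year

print("kWh per drive per year:", kwh_per_year)
print("$ per drive per year:  ", round(cost_per_drive, 2))
print("$ per GB per year:     ", round(cost_per_gb, 3))
```

That is about $70 per drive per year, or a little under $0.90 per gigabyte, which is small next to a purchase-price gap of a few hundred dollars per gigabyte.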
"Biped! Good cranial development. Evidently considerable human ancestry."
You guys crack me up some times.
I'll lay it out. Obviously Google is not storing the master copy of the full multi-terabyte database in RAM, but they are certainly storing as big a chunk in RAM as they can, and the cost model ought to be easy for anyone to understand if you sit down and think about it.
Consider the cost difference between the following EQUAL amounts of hard disk storage:
* A 160GB IDE drive
* A 160GB SCSI drive
* Four 40GB drives in an external RAID system
* The cost of a small medium-performance RAID system.
* The cost of a larger high-performance RAID system scalable to a terabyte.
* The cost of an *EXTREMELY* high performance RAID system scalable to multiple terabytes.
Now consider the cost of building, say, a 40 terabyte data store (let's not worry about backups for this experiment). If you build it out of a bunch of huge SCSI drives connected to a bunch of PCs it can be fairly cheap. But if you build it out of, say, high performance EMC arrays it could cost millions of dollars more to get the same theoretical performance.
So when you consider the cost of storage, you always have to consider the cost of the PERFORMANCE you want to get out of that storage. All the Google CEO is saying is that, Doh! It's a hellofalot cheaper to improve the performance aspects of the system by buying DRAM in a distributed-PC environment in order to avoid having to purchase extremely high performance (and extremely expensive) disk subsystems. The cost of purchasing the DRAM to make up for the lower-performing disk subsystem is actually LOWER than the cost of purchasing an equivalent higher-performance disk subsystem.
The same is true in the ISP world. When RAM was expensive we had to rely on big whopping HD systems to scale machines up. But when RAM became cheap it turned out that you could simply throw in a very high density drive with 1/4 the performance that four smaller drives would give you, and the operating system's RAM cache would take care of the problem. Suddenly we no longer needed to purchase big whopping disk arrays.
Think about it.
-Matt
I think you must have had problems with air-conditioning.
Our lab has about 100 servers (mostly Sun Netras and HP L-class), each with 5 drives on average, about 100 GB per server (rough average).
That's 10 TB. This amount of storage has been "active" since the beginning of last year, and we haven't had a single failure.
Sigged!
Ok, in cases where the general idea is to set up a honking huge virtual disk in RAM for unbelievably fast I/O instead of using actual disks, but where for some reason you still have to go through the motions of disk usage, what filesystem is best? (ext2, ext3, ReiserFS, etc.)
Would we ever need to run fsck on a RAM disk?
This
You have no idea. Why don't you read a book on hardware? When you have finished you will realize that your comment is 'funny', to be polite.
The electricity needed to refresh RAM is a whole lot less than what it takes to keep a hard disk turning. It takes about 9 amps to start up a HD and about 4 to keep it turning. DRAM uses power in milliamps.
Just wondering about the effect of power and UPS failing.
All DRAMs being erased...
I liked it, even though somebody apparently thought it was redundant. It doesn't directly apply to Google, but the principles of trading off speed and cost are still relevant even though the problem's a bit different. One thing I'd find interesting is knowing how much of Google's index data is replicated - one master copy (which might be backed up on disk) kept on N search engine boxes - vs. how much do queries get spread across multiple boxes? Does it make sense to cache the spidering on disk (probably, because rerunning spidering takes a long time, and because the article caches probably don't get hit as often, and don't need the same response speed as the indexing.)
Bill Stewart
New Fast-Compression-only CPR http://preview.tinyurl.com/dy575ks
;0)
The point is not that a single disk is used for parity, just that there is a disk's worth of parity being used (NOT 2:1 as YOU originally said).
Also, he specifically said they weren't doing JBOD.
So fuck the hell off, bitch.
There IS a dedicated parity disk in RAID-3; one needs RAID-5 for spread-around parity. Modern storage that claims RAID-3 usually does RAID-5 without telling the user. (In fact, Sun T3's ("Purple") offer both RAID-3 and RAID-5, but only do RAID-5 internally.)
YHBT YHL HAND!
This sounds plausible, since you can use fewer machines. But the problem I see is, where do you find a machine that can address 80+ gigabytes of memory? Otherwise, you have to buy just as many commodity boxes to hold the RAM, which ruins the cost benefit.
Does anyone have any insight on what machines you would use to support this scheme? Does a SAN-type device for RAM exist? Some network-attached box that holds tens or hundreds of gigabytes of RAM?
He was speaking of costs in other areas, not the purchase price of DRAM compared to hard disks.
But at least the reporter could have picked up on that and specified it. Where is a good tech reporter when you need one?
Don't Tread on OpenSource
Hard disks are recoverable (more or less, depending on the filesystem, whether they were shut down cleanly, etc.) If it's all in DRAM, and the power goes, you've just lost decades of indexes!
Unless you back it up on disk, of course...
Ceterum censeo subscriptionem esse delendam.
Dave: "Google, search on 'what happened to the communications'"
Google: "Why do you want me to do that.. Dave"
Dave: "Google, search on 'How to turn communications dish to manual'"
Google: "I don't think you should do that... Dave"
I mean, obviously you have some kind of grudge against it, to abuse it that way.
Take the RAM out of your computer and throw it at your workmate/housemate/mum. He or she will say 'Ow!', and it's not because he or she was hit by electrons!
This would, indeed, be the use of RAM as a mechanical object but this type of use is not characteristic. You appear to be claiming with this example that any solid object (and possibly any matter) is a "mechanical component," which is wrong and would be harmful to meaningful communication if accepted.
Any solid object's atoms move in relation to each other. This does not mean it can be said to have "moving parts" (this useful phrase would be rendered meaningless, otherwise), or make it a "mechanical device" (ditto).
Every electrical device is utterly reliant on its physical structure to function properly, and will cease to function properly if its structure is altered beyond certain limits. A broken connection is not a mechanical failure.
Sure, the clip that holds it in place is mechanical, and can suffer mechanical failure, but that is not part of the RAM. To note Telstra's odd problem as evidence of RAM being subject to mechanical failure is like talking about a wind-up alarm clock being struck by lightning as evidence of such clocks being subject to electrical failure (this would, of course, actually be an electrical event causing a mechanical failure).
Last year it was 6000 boxes of:
Celeron CPU
motherboard IDE
2 plain IDE drives
lots of RAM
rackmount case
SCSI and secondary controller cards lose big.
You gain more by just adding another PC.
Lots of other posters have mentioned pieces of the puzzle, so I risk being redundant here. But, it seems the whole equation goes something like this:
1. If each box only handles a part of the web, it is possible that most of the space on its drive (or drives) is wasted anyway.
2. If disk latency means that cpus spend idle time, eliminating that latency means more throughput per box, hence fewer boxes. More money spent on DRAM, less money spent on CPU, power supplies, etc.
3. Even with same number of boxes, lower power draw, smaller and/or fewer UPS(s) required. With fewer boxes, even more reduction.
4. Which leads, of course, to lower A/C bills during the warm weather.
5. Fewer boxes, fewer pieces, whatever, means fewer things breaking. The impact of a single outage may be greater, but, from the cost standpoint, you need fewer man-hours to manage the outages, fewer spare-parts, etc.
6. Lower medical expenses from sysadmins going insane due to the noise from all those drives and the associated larger power supplies and extra cooling fans.
OK, that last item is a stretch, but how many sysadmins are more than a step from insanity anyway?
"It's funny. On the outside, I was an honest man. Straight as an arrow. I had to come to prison to be a crook."
When we talk about what is "cheaper", you first have to set a standard of performance. If you want X data to always be retrieved in Y or less time, then you have a point of comparison. Memory becomes cheaper than disk when the number of drives you need to ensure your level of performance becomes excessive in comparison to the amount of data each drive is storing. This is particularly true when you have to index a large amount of data. If you need to do 7 or 8 disk arm seeks to get to the data and you have a standard of performance, you may need many more disks than the capacity of the platters dictates. I do not believe that either all disk or all memory is ever the best solution; a blend is always needed. That blend ranges from the traditional 1 to 100 ratio of memory to hard drive up to 1 to 1. Remember that the DRAM still needs backup for the most unusual power failures and hardware failures. In a well-performing transaction management system, you really don't want more than 1 or 2 physical I/Os to the hard drive, which means you need intelligent indexes or hashing routines and a proper amount of memory for caching. It really is an interesting performance tuning topic. In fact, some operating systems manage the difference between disk and memory for the programmer, so you are always referencing data in memory, and the OS and the systems programmer control how much data is really in memory for the application and how much is on the hard drive. A very similar concept to virtual memory for programs.
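The point about needing "many more disks than the capacity dictates" can be made concrete. The sketch below is illustrative only: the data size, lookup rate, disk size, and seeks-per-second figure are hypothetical numbers chosen for the example, not measurements from any real system, and `disks_needed` is a made-up helper.

```python
import math

def disks_needed(data_gb, lookups_per_sec, seeks_per_lookup,
                 disk_gb, disk_seeks_per_sec):
    """Return (disks for capacity, disks for seek rate, disks to buy)."""
    by_capacity = math.ceil(data_gb / disk_gb)
    by_seeks = math.ceil(lookups_per_sec * seeks_per_lookup / disk_seeks_per_sec)
    # You have to satisfy both constraints, so the larger count wins.
    return by_capacity, by_seeks, max(by_capacity, by_seeks)

# Hypothetical workload: 1000 GB of data, 1000 lookups per second,
# 80 GB drives that each sustain about 100 seeks per second.
for seeks in (8, 2):
    cap, io, total = disks_needed(1000, 1000, seeks, 80, 100)
    print(f"{seeks} seeks/lookup: {cap} disks for capacity, "
          f"{io} for seek rate, buy {total}")
```

With 8 seeks per lookup the seek rate, not the capacity, decides how many drives you buy; cutting the seeks to 2 with better indexing or caching shrinks the array from 80 drives to 20 in this example.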
Early in your boot scripts, perhaps in a file called /etc/rc.local, /etc/rc.d/rc.init, or similar, switch to RAM. You'll need a lot. The script must run this only once, and it should be run before /proc is mounted.

cd /tmp
mkdir old
mount --bind / /tmp/old
mkdir new
mount -t ramfs none /tmp/new
(cd old && tar cf - .) | (cd new && tar xf -)
umount /tmp/old
cd new
rmdir old ; mkdir old
pivot_root . tmp/old
exec chroot . /bin/bash <dev/console >dev/console 2>&1
telinit U
# restart anything started before pivot_root
# to free up the old filesystem for unmounting
umount /tmp/old
Silly, silly people... don't you understand how a business works? OK, here is what they do: they buy expensive hardware and then use it to sell their product at a loss and make it up in bulk sales...
It does. But it doesn't help much and means you have to reload the whole RAMdrive (generally over a LAN) when the box dies. Admittedly, it is a more efficient use of RAM than just handing it to Windows, since Windows (particularly the 9X stream) is a hopelessly inefficient user of RAM.
You must really have spent a lot of time and looked hard before saying that... )-:
``And death and hell were cast into the lake of RAM. Diskless Windows is the second death.'' -- Revelation 20:14, Geek Modified Version
Got time? Spend some of it coding or testing
I would think that using solid state memory would save money over hard disks because of the cost of electricity and cooling. Those mp3 players can run for like 10 hours on a SINGLE "AA", but most hard-drive based players kill batteries faster than you can recharge them.
They said that their problem right now is growth. What happens when they stop growing for a little while and there is a surplus of DRAM on the market? Some of that is going to go in the consumer direction and get cheap.
~~Apathy alert: Approaching the Point of No Concearn
When we were still using floppy disks in PCs, nobody saw anything wrong with first loading a database (which had to be quite small though) into RAM and then serving it from there. Nobody would imagine more than one person waiting for a floppy disk to load something. When a floppy disk held 360 KB and memory was 640 KB, you could read almost TWO FULL DISKS into memory. No one would have bought a million computers with two floppy drives each just to serve some little database. And you also need to save that data there.
Now that we have all the "fast" hard drives almost nobody keeps stuff in memory. It's not the same, but if your hard drive is 10000 times as fast as your good old floppy drive, and you have a million users instead of the 10 you used to have... are you going to buy a million computers? No, you increase the memory cache. At some point however there is so much "memory cache" that you can actually get some more RAM and throw that slow hard drive in the "Recycle Bin".
You save the powerbill for hard-drives, you save the powerbill for cooling, and you don't need that many machines.
Also for reliability. RAM fails, yeah. But so do hard drives. So double the power bill saved, as nobody will be running a non-RAID hard disk for a serious server. And then compare the time wasted when copying all the data to the newly added hard drive. Yes, SCSI can do it without the CPU. But you also lose performance from the disk access.
In a SERVER environment every little saving counts, everything breaks, and the more of it you can have running and the faster it will run.. well.. the cheaper it will be.. nobody actually cares what the hardware will cost. It will be little compared to what administration, power, spare parts, replacement servers, whatever .. will cost in the long run.
What if slashdot did no caching ?
Software should be free as in speech, but if we also get some free beer, all the better.
When Moffett Field (now NASA Ames Research Center) in the Bay Area had an open house many years ago, I stopped in the Computer Museum/Warehouse...
I will never forget looking at a bookshelf-sized board of ram. They were quite literally wires crisscrossed with small cheerio sized hunks of metal at each intersection. You could charge these cheerios on and off (creating 0s and 1s) by sending electricity through a wire on its x or y coordinate.
It was soo cool. I could sit and count the number of bits on that board of RAM. Imagine counting today's 128 MB that come standard.
Does anyone know if it's still there?
Just wondering, but where exactly does Google make their money? If they own thousands of computers and have this huge pipe to the rest of the world, generators, UPS systems, people working there, electricity, and running water... don't they have to make money somewhere? There are no ads that I have ever seen, but they give really good info on many subjects, such as Linux, and have been up for a few years now. This isn't 1998, you actually have to make money on the internet now to maintain yourself, so where does the cash flow come from? Columbia?
You can learn a bit more about these results from our short paper (PDF) just presented at FAST, or wait for the June Usenix conference to see a longer paper.
Studies of database performance show that the most effective way to speed up database access is to cache it in DRAM. I think Google would be quite sluggish if the index was kept on disk.
One of the advantages that AltaVista had when it first came out was that the index was kept in DRAM. The servers at that time held 12GB of DRAM.
...oh, never mind.
he was comparing the cost of a lot of ram to maybe the cost of buying quite a few Seagate SCSI 15000 RPM drives on some crazy hardware RAID array. Otherwise, I have no idea how a lot of RAM is cheaper than hard disks.
http://www.cs.uiuc.edu/whatsnew/abstracts/hoelz
have fun
I don't know, but surmise that the seekable data may be held in RAM. Given Google's likely loads, they're looking at a lot of load distribution. With search data in RAM, each machine can handle more load. Therefore fewer machines are needed. The additional RAM costs less than the additional machines (and MANAGEMENT AND MAINTENANCE of those machines) otherwise needed.
The answer is SSD (solid state disk). By using a disc drive made of SDRAM, the disc can be shared by all the servers. About 80% of all data traffic hits only 2-4% of your data. What is that 2-4%? Typically your database index. You can't get anything out of the database without hitting the database index first. What if you took that 2-4% and put it on something that was 250 times faster than the world's fastest RAID? You'll have taken 80% of the slow moving data requests/responses and replaced them with extremely fast data requests/responses.
Think about it: the slowest pieces on your entire network are your storage mechanisms... disc drives, tape drives and CDs. Ironically, besides the power supplies, these are the only mechanical devices on your network! Everything else is solid state. The key is to put the files that get hit every second of the day, like a database index, on the fastest thing you can find... put the files that get hit less frequently on RAID or disc based storage... and the files which get hit infrequently on tape.
The second point to note is that it's not necessarily how fast a storage device is in milliseconds, but how many I/Os (transactions) per second you are able to achieve. The biggest, fastest RAID systems in existence can do fewer than 5000 I/Os per second. A company called Texas Memory Systems, Inc. makes a product which allows 50,000 I/Os per second from each port! With the ability to have from 2-15 ports, you can get up to ¾ million I/Os per second.
It's not rocket science what Google is doing. Beef up the network with an SSD and everything runs faster... your network, your RAID, and the customer responses. This is not meant to be an advertisement; however, if you have questions, please feel free to email me at halsaver@juno.com Thanks. Ric Halsaver
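For what it's worth, the overall gain implied by those figures can be estimated with a simple weighted average. This is a rough sketch that assumes every request costs the same time on plain disk and that the hot 80% of requests become exactly 250 times faster; `overall_speedup` is a made-up helper, and the port figures are the ones quoted above.

```python
def overall_speedup(hot_fraction, hot_speedup):
    """Mean speedup when only the hot fraction of requests gets faster."""
    # Time per request relative to an all-disk baseline of 1.0.
    new_time = (1.0 - hot_fraction) + hot_fraction / hot_speedup
    return 1.0 / new_time

# Figures quoted in the comment above.
speedup = overall_speedup(0.80, 250)
max_ios_per_sec = 15 * 50_000     # 15 ports at 50,000 I/Os per second each

print(f"overall speedup: {speedup:.2f}x")
print(f"peak I/Os per second with 15 ports: {max_ios_per_sec:,}")
```

Under these assumptions the overall gain is a little under 5x rather than 250x, because the 20% of requests that still go to disk dominate the remaining time. No amount of speedup on the hot data can push it past 5x.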