Keeping Google's In-house Database Ticking
An anonymous reader writes "ZDNet has a short but interesting piece on what Google did with its 12GB database when it became a challenge for the finance department. The database was split into three, says Chris Schulze, technical program manager for Google: one for the current financial planning projections, one for the actual current data from existing HR and general ledger systems, and one storing historic information. The article says Google has been using a variety of products from Hyperion (recently bought by Oracle) to manage its internal financial systems since 2001."
Gigabytes?
"Right now, we're on a not very powerful Windows box," Couglin said. "We definitely are wanting to go to Unix when we go to System 9."
45 5F E1 04 22 CA 29 C4 93 3F 95 05 2B 79 2A B2
It has Google in the name, that magically transforms it from a lame press release to "Stuff that Matters"
12 GB? You call that big? I haven't seen an Exchange mail store that small!
Gamingmuseum.com: Give your 3D accelerator a rest.
I don't get it, that doesn't seem like much to me.
We have many databases that are larger here from MSSQL to Oracle, some around the 600GB mark.
What's so special about Google's database?
Is it just me, or does this seem like it is absolutely silly and pointless? The only thing that I see us getting out of this are some "LOL WINDOWS" posts.
This is the bit that gets me in the summary:
ZDNet has a short but interesting piece
Interesting to whom, precisely? Hyperion's marketing department? Scant technical details and really only notable for the link to the photos of Google's new Sydney office which are kind of interesting, I suppose, in an "ooh wow shiny...okay what's next?" kind of way.
Guess this is more of the Google/MySQL database bonanza...
Please guys, if you're looking to pump before you IPO and dump, do it on Wall Street, not Slashdot.
Google have terabytes of data. I imagine Google engineers are now horribly embarrassed by what the stupid finance department got up to.
The databases google uses to run searches are quite a bit over 12 gigs...as are the database backends behind gmail and google maps. What the heck is this article even talking about?
It is as if the reader who submitted it thinks a 12 gig DB is big for google.
1. Move on, nothing to see
2. Sack Zonk (sorry man, you post some good stories, but this one's a stinker)
12 Terabytes maybe? What the hell are they talking about? This is supposed to advertise some company? Ohhhh! Look at you!! Your system manages 12 GB of data!!!
12 GB ain't nothing. Hell, my NNTP db is way bigger than that and that only contains header information.
Also, I think they are talking about AU only. I highly doubt the US only has a 12 GB database.
Curiosity was framed; ignorance killed the cat. -- Author unknown
As far as I can tell, the only reason this is news is that it's Google. I manage several very large databases, some in the hundreds of GB. Probably the most interesting of the big ones involves auditing people who are accessing a medical records system. The tricky part isn't managing every command passed by tens of thousands of users, but rather trying to find ways to pull out the needle of bad behavior from the endless normal activities. Was doctor A supposed to look at patient B's record? Is user A somehow related to patient B?
The only thing of technical note in the article is the ordinary problem with database jobs taking a long time. On a related note, I've kept waffling on whether or not to break off the above audit database to its own server. The processing time for some of the import jobs is over an hour. Strangely enough, advances in hardware have been such that it still resides on our main database/web server without any problem. Maybe Google just needed to throw hardware at the problem.
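To make the audit question above concrete (was doctor A supposed to look at patient B's record, and is user A related to patient B?), here is a minimal sketch. Everything in it is an assumption for illustration: the names `ASSIGNED`, `RELATED`, `ACCESS_LOG` and `flag_suspicious` are hypothetical, and a real system would pull these sets from scheduling and HR data rather than hard-coding them.

```python
# Hypothetical data: which clinicians are assigned to which patients.
ASSIGNED = {("dr_a", "patient_1"), ("dr_b", "patient_2")}
# Hypothetical known personal relationships between users and patients.
RELATED = {("dr_b", "patient_1")}

# Hypothetical access log: (user, patient) pairs, one per record view.
ACCESS_LOG = [
    ("dr_a", "patient_1"),
    ("dr_a", "patient_2"),
    ("dr_b", "patient_1"),
    ("dr_b", "patient_2"),
]


def flag_suspicious(log, assigned, related):
    """Return (user, patient, reason) for accesses worth a human look."""
    flagged = []
    for user, patient in log:
        if (user, patient) in related:
            # A personal relationship is flagged even if an assignment exists.
            flagged.append((user, patient, "related to patient"))
        elif (user, patient) not in assigned:
            flagged.append((user, patient, "no assignment on file"))
    return flagged


flagged = flag_suspicious(ACCESS_LOG, ASSIGNED, RELATED)
```

The set lookups are the easy part. The hard part the comment describes is building trustworthy `ASSIGNED` and `RELATED` data in the first place.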
Obviously that's 12 GOOGLE-Bytes*. Which are far huger than ordinary bytes, or even gigabytes, and therefore much more interesting.
* Note that GoogleBytes are still in beta and therefore the exact amount of storage in a single GB is yet to be determined.
picpix image polls. create - share - vote. fun!
The point of the piece is not to laud Google's technical prowess (cough) in partitioning the data but to embarrass the (potentially non-responsive) vendor into improving their product.
Certainly Hyperion can't be happy about having an extremely high profile customer publicize needing to partition a modest 12GB database even if it is running on a Windows platform. And Google is not likely to have gone public with this unless they've already tried to unsuccessfully resolve the problem with Hyperion.
12 GB of _relational_ database falls under "nothing to see, move along". But Essbase http://en.wikipedia.org/wiki/Essbase is doing OLAP http://en.wikipedia.org/wiki/OLAP , which means that data is pre-aggregated across multiple _hierarchies_. Those 150 users are likely the top management looking at the revenue, or reviewing the budget.
In Open Source land there are similar projects: http://freshmeat.net/search/?q=olap&section=proje
http://revj.sourceforge.net
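For anyone wondering why pre-aggregating across hierarchies makes a cube bigger than its raw input, here is a rough sketch. It is not how Essbase stores anything; the fact rows, the `REGION_OF` mapping and the function names are all made up for illustration. The idea is only that every fact gets rolled up into every combination of levels in each dimension.

```python
from collections import defaultdict
from itertools import product

# Hypothetical fact rows: (country, quarter, amount). Not real data.
FACTS = [
    ("AU", "2007-Q1", 100.0),
    ("AU", "2007-Q2", 150.0),
    ("US", "2007-Q1", 400.0),
    ("US", "2007-Q2", 350.0),
]

# Hypothetical geography hierarchy: country -> region -> World.
REGION_OF = {"AU": "APAC", "US": "Americas"}


def geo_levels(country):
    """Return the country plus every ancestor in the geography hierarchy."""
    return [country, REGION_OF[country], "World"]


def time_levels(quarter):
    """Return the quarter plus every ancestor in the time hierarchy."""
    year = quarter.split("-")[0]
    return [quarter, year, "AllTime"]


def build_cube(facts):
    """Add each fact into every (geo level, time level) cell it belongs to."""
    cube = defaultdict(float)
    for country, quarter, amount in facts:
        for geo, period in product(geo_levels(country), time_levels(quarter)):
            cube[(geo, period)] += amount
    return dict(cube)


cube = build_cube(FACTS)
print(len(FACTS), "facts ->", len(cube), "pre-aggregated cells")
print("World revenue for 2007:", cube[("World", "2007")])
```

Four input rows already produce twenty stored cells with just two small dimensions. Add accounts, cost centres, scenarios and versions as further dimensions and the cell count multiplies again, which is one plausible reason a financial cube gets unwieldy at sizes that look tiny for a relational table.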
Hmm, suddenly I realise what next year's real April 1st product will be.
No no no! It stands for Googlebytes. Each Googlebyte is approximately 1024x10^10,241,024 bytes. So as you can see, a 12 Google Byte database is quite substantial...
Is it me, or does it seem like they have no clue regarding ERP architectures? And, as everyone has said, 12GB is an impossibly small size. The 'BASE' install of most financials packages is more than twice as large as this. And they can scale very easily, we have some databases approaching 2 TB.
Uhmm, maybe it's some other Google, right...?
I can't be reading a press release from Google, the one that has more or less a copy of the whole Internet on its servers, whining about the difficulties of managing a small database on a slow Windows machine.
So Google used horizontal partitioning to split load across servers? Wow, that's rocket science. None of us in the database community have thought of doing this before. :-) But, if you want to find some news here, you can. One nice thing that Google did recently was to donate their horizontal partitioning code for Hibernate to the open source community. Hibernate Shards definitely needs a lot of work to get it to the point where it does a lot of stuff that people would want, but, hey, release early and often!
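A minimal sketch of the kind of routing involved, assuming the three-way split from the summary (planning projections, current actuals, historic data). This is not the Hibernate Shards API and not Google's setup; `Shard`, `Router`, the `kind` and `year` fields and the rule for what counts as historic are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Shard:
    """Stand-in for one physical database; rows are kept in a list."""
    name: str
    rows: list = field(default_factory=list)


class Router:
    """Sends each row to one of three shards based on its kind and year."""

    def __init__(self, current_year):
        self.current_year = current_year
        self.shards = {n: Shard(n) for n in ("plan", "actuals", "history")}

    def shard_for(self, row):
        # Assumption: anything from before the current year is historic.
        if row["year"] < self.current_year:
            return self.shards["history"]
        if row["kind"] == "plan":
            return self.shards["plan"]
        return self.shards["actuals"]

    def insert(self, row):
        self.shard_for(row).rows.append(row)

    def query(self, predicate):
        """Fan out to every shard and merge the matching rows."""
        return [r for s in self.shards.values() for r in s.rows if predicate(r)]


router = Router(current_year=2007)
router.insert({"kind": "plan", "year": 2007, "amount": 120})
router.insert({"kind": "actual", "year": 2007, "amount": 110})
router.insert({"kind": "actual", "year": 2005, "amount": 90})

sizes = {name: len(shard.rows) for name, shard in router.shards.items()}
```

Writes touch exactly one shard, which is where the load relief comes from. Any query that spans shards, such as comparing this year with history, has to fan out and merge, and that is the part a library like Hibernate Shards tries to hide.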
The next Google article's in the pipes:
In this article we'll follow through a case study on the difficulties of watching a HD DVD on a slow Windows XP machine, and the hurdles the Google team had to meet, to watch the entire rental movie and return it in time.
"We had to Google for the keys, decode it, split it in pieces and setup a small cluster of servers to play the audio and picture separately, then feed the low-def stream back to our slow Windows box in real-time" said Chris Schulze, technical program manager for Google, said during the presentation.
He added: "We could've rented the DVD instead, but we like to solve tough challenges the Google way".
Their Hyperion Essbase cube was 12 GB? And they had to partition it into 3? That's nothing. We have MS Analysis Services cubes of almost 400 GB (partitioned into 3 separate ones, like Google). If this is supposed to be an advertisement for Hyperion, it's not very impressive. Of course, we are using 3 separate 8 processor Itanium boxes with 64 GB RAM. That helps some.
Everyone here seems to be forgetting that Hyperion is an OLAP cube holding highly aggregated data, so it doesn't have to store enormous amounts of data. It probably only holds last year's actuals and this year's actuals and budget data, which even for a very large company is pretty small. Consequently 12GB is actually a lot of data for the product. Think about the purpose of the product before picking holes in it. I don't work for Hyperion, but have done a few projects with its Essbase product, which is actually shit hot.
Dum spiro spero
Read the freakin article:
"Google is used to sifting through huge amounts of information to generate its search results, but a 12 gigabyte database proved something more of a challenge for its own financial management and planning systems."
The whole point is that Google is used to dealing with terabytes and a simple little 12GB database actually posed a challenge to them. I really hope the irony isn't lost on this crowd...
12 GB is very small beer nowadays. I work on a data warehouse of >20Tb, once considered very large indeed, now just another database. Sheeeeit, I wouldn't hesitate to run a 12Gb database on my laptop....
Why the hell do they think this is news???
Well, we all know that Google is feverishly working on their free broadband service. They don't have enough time to worry about a measly 12GB database. They are too focused on getting installation instructions correct!
Coderz 4 Life
I have a spare PC running centos and mysql that can handle those troublesome 12GBs like a chainsaw cutting through butter.
Call me and I can drive over to the plex today and get it running over the weekend for a very reasonable fee...
This must be a joke? right? Google has problems with 12GB of data?
someone please tell me it's at least 12 TB w/thousands of concurrent users...
Couldn't they have at least partitioned it among three iPod Nanos running Linux?
What everyone needs to realize is that this is Financial Data. I worked with a database of over 4GB of nothing but sales orders for Cisco, and that was only for one technology group. This translates to a lot of money, and keeping the integrity, security and performance of these kinds of databases is very, very important and very stressful due to the responsibility. Also, for financial data, correctness is more important than mondo fast algorithms that add complexity. Divide the 12GB by average value of each record to see how big the database is in business value.