Zvents Releases Open Source Cluster Database Based on Google
An anonymous reader writes "Local search engine company, Zvents, has released an open source distributed data storage system based on Google's released design specs. 'The new software, Hypertable, is designed to scale to 1000 nodes, all commodity PCs [...] The Google database design on which Hypertable is based, Bigtable, attracted a lot of developer buzz and a "Best Paper" award from the USENIX Association for "Bigtable: A Distributed Storage System for Structured Data" a 2006 publication from nine Google researchers including Fay Chang, Jeffrey Dean, and Sanjay Ghemawat. Google's Bigtable uses the company's in-house Google File System for storage.'"
..designed to scale to 1000 nodes, all commodity PCs... I'm just curious whether anyone has had any experience with these types of systems using commodity PCs. How is performance, and how well does it scale as you increase the number of nodes?
I'm sick of following my dreams. I'm just going to ask where they're goin' and hook up with 'em later.
I've been interested in this question for the last few years. How much do people value the ability to use a relational language and transactional consistency? Or, for most of these uses, are these things just historical artifacts?
This is a classic column-oriented DBMS, à la Sybase. You use these for data warehousing since they are optimized for read queries and not transactions. Stuff like Google search queries. It also allows you to quickly build cubes of data across a timeline, since you have data in columns instead of rows.
I.e.:
a,b,c,d,e; 1,2,3,4,5; a,b,c,d,e;
instead of:
a, 1, a;
b, 2, b;
c, 3, c;
d, 4, d;
e, 5, e;
A cube using the time dimension would look like:
01:01:01; a,b,c,d,e; 1,2,3,4,5; a,b,c,d,e;
01:01:02; a,b,c,d,e; 1,2,6,4,5; a,b,c,d,e;
It's pretty difficult to do the same thing with a row-based DBMS. However, you can see that doing an insert is going to be costly, since one new record touches every column.
This looks like a pretty good try. I know there were some other projects trying to replicate what BigTable does. And after hearing that IBM story the other day about one computer running the entire internet, I started thinking about Google.
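For anyone who wants to poke at the column versus row layout described above, here is a minimal Python sketch. It is only an illustration of the idea, not how Hypertable, Bigtable, or Sybase actually store data; the sample values mirror the a..e / 1..5 example, and the helper names (`to_columns`, `column_sum`, `insert_row`) are made up for this post.

```python
# Hypothetical sample data: five records with three attributes each,
# stored the row-oriented way (one tuple per record).
rows = [
    ("a", 1, "a"),
    ("b", 2, "b"),
    ("c", 3, "c"),
    ("d", 4, "d"),
    ("e", 5, "e"),
]


def to_columns(records):
    """Transpose row tuples into one list per column."""
    return [list(col) for col in zip(*records)]


def column_sum(columns, index):
    """Aggregate a single column without reading the others."""
    return sum(columns[index])


def insert_row(columns, record):
    """Insert one record: every column list has to be touched."""
    for col, value in zip(columns, record):
        col.append(value)


columns = to_columns(rows)
total = column_sum(columns, 1)

# A time-keyed "cube": one columnar snapshot per timestamp.
# The second snapshot changes the third record's value from 3 to 6.
cube = {
    "01:01:01": to_columns(rows),
    "01:01:02": to_columns(rows[:2] + [("c", 6, "c")] + rows[3:]),
}
```

The read side (`column_sum`) only walks one list, which is why this layout suits warehouse-style queries, while `insert_row` has to write to every column, which is the insert cost mentioned above.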
More interesting is their distributed file system, which is what makes this really work well.
Cool! Amazing Toys.
A cube using the time dimension looks more like this.
HAND.
What?
Wikipedia lists no fewer than eight Linux distributions designed specifically for building Beowulf clusters.
Using OpenMosix, a single-system-image cluster can be created by booting cluster nodes with LiveCDs and with very little configuration. It's even been done with Xboxes, although they have very poor performance per watt consumed by modern standards.
"You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
I think Google Forms is more interesting. (Based on Google Spreadsheets.)
Mnesia has been able to handle things far in excess of the numbers cited, and with far better control of placement, for more than a decade. So has KDB. Also Coral8. This wouldn't even be on the map if people didn't start drooling the second they heard "based on Google." When they find out it's unstable and in alpha?
Yawn.
StoneCypher is Full of BS