MySQL Creator Contemplates RAM-only Databases
Aavidwriter writes "Peter Wayner asks Michael 'Monty' Widenius of MySQL, 'When will RAM prices make disk drives obsolete for database developers?' From Monty's answers, it sounds like hard drives may be nothing but backups before long." From experience, I'd wager that RAM failure rates are less than hard drive failure rates, so it might also mean more stability from that perspective.
There are these things called batteries. Of course the article mentions them, but who reads the articles anyway...
RAM only needs a trickle to keep refreshed...
Wow! $25,000 for 16 GB of RAM disk seems a tad high or widespread adoption.
It's also interesting to note that Microsoft was going to release what they called In Memory Database(IMDB) support in Windows 2000. However, this feature was removed after Windows 2000 RC2 due to technical issues.
And large image-bases are too much for RAM today. Remeber the 640K limit? I'd guess that more than 50% current corporate databases would fit in a single gigabyte, including indexes.
The real speed improvements, according to the guys working on projects like Bamboo on sourceforge come not from the fact that it's in RAM, they test against SQL in RAM and show that most of the performance improvements come from keeping the data in the same process space as the application operating on the data. If they're right, putting MySQL in RAM or Oracle or Microsoft SQL Server is a small improvement.
The large speed improvements come from lowering the process boundary hopping overhead--marshaling and all that good stuff.
There are some cool ideas there. They use two copies on disk for backup in case of system failure. Because of this they don't have to do page-latching.
In some configurations, though, this is irrelevant, because write transactions lock the whole database! Because they know all transactions will be extremely short, this is faster than locking at page or row level.
Not true, you are describing Parity RAM. Let's review:
Normal RAM: no errors detected
Parity RAM: an extra bit is used to store the XOR of all the othe bits on the line. If the math doesn't work, the OS can be informed the data at a particular address is corrupt, and let the software decide how to best cope with the condition. Only odd numbers of bit errors can be detected.
ECC RAM: Uses a more complicated method and even more bits to provide single-bit error *correction* and double-bit error detection. The odds of two bits on the same line being corrupt is slim unless the ram is really bad. Single bit errors cause no problems. Double bit errors are at *least* detected. More than that and chances are the memory module is so screwed that even if the OS doesn't get it reported, the problem will be obvious to the user.
XML is like violence. If it doesn't solve the problem, use more.
UPS is old technology, the battery needs constant replacement, and very few have multiple redundant batteries and/or transistors to deal with wear and tear. Yes even a simple MOSFET transistor is not 100% reliable. Usually the only way to tell a battery is dead is your UPS fails when you need it (this happened to us when my MD was demonstrating our service live to customers, afterwards was the only time he's taken less than 9 months to sign off a purchase order on new equipment). A UPS also has a power cord to pull out when you recoil after burning your fingers on a Seagate Cheetah 15000RPM HD in the server room. A UPS also trips if you overload it, which again means the UPS fails when you most need it.
Other posts mention cosmic radiation at high altitude makes RAM fail. Last time I checked there were no Quad-Xeon Oracle databases on Concorde, although if the International Space Station were to use one this might pose a problem for non-ECC RAM. Anyway, somebody could always write a driver to do software-ECC with Reed-Solomon for RAM if it becomes necessary.
Huge databases (>500 Gigabyes) would benefit most from this as running a simple OUTER JOIN query on the biggest tables will require most of the database to be called into RAM.
- Small databases become slow due to HD latency problems if they do a lot of WRITE operations (the database is stored in RAM, the transaction log is appended to, COMMIT TRANS). This would benefit least FROM RAMdisk because a HD append operation is cheap, however it would benefit database speed in mid-backup
- Mid-size databases become HD-intensive due to aggregate queries/triggered operations over large '>RAM' datasets. For instance enforced cascading deletes where millions of tuples are being deleted cascaded to hundreds of other unindexed tables (in my job I go to the toilet whenever I run a query like this).
- Huge databases where 'Index size' > 'RAM size' - the simplest query would benefit hugely from more RAM or faster storage or RAM-storage. With current databases this would be a 10Gig Eth connection to a Terabyte RAMSan solid-state disk.
In the future, who knows, maybe a FPGA/ASIC DPU (Database Processing Unit) for INSTANT COMMIT like NVidia's GPU?A caveman dreams of being us, the incalculable power and riches. We dream of being Q, then what?
Any Perl programmers in the audience may wish to check out DBD::RAM. From the CPAN documentation: "DBD::RAM allows you to import almost any type of Perl data structure into an in-memory table and then use DBI and SQL to access and modify it. It also allows direct access to almost any kind of file, supporting SQL manipulation of the file without converting the file out of its native format." More information here
Modern storage solutions (like EMC) use redundant battery backed ram to buffer writes, greatly reducing perceived write latency. This gives you a lot of the performance gain of a ram only database, and also scales very well to large loads. (in fact, when choosing RAID stripe size you take into account whether writes are buffered; if not, keep stripes small for log files)
...most databases don't cleanup indexes after deletes, forcing periodic rebuilding. Other index schemes not generally considered because of poor locality prinicles could be considered. Note that Hash Indexes would probably still use Linear Hashing.
If you know that your data will always fit into available ram then there are a number of performance optimizations that can be done. I'm not sure about ACID becoming "trivial"; You still need most of the same db components: indexes, lock managers, operation journaling, etc. But many of these could be greatly simplified:
1. Page/Buffer Manager Eliminated. Since no disk IO will be required for the normal running of the db, there will be no need for a page manager. This eliminates complexity such as block prefetch and marking and replacement strategies. In fact, the data will probably not be stored on pages at all. Details such as block checksum, flip flop, log position, page latches etc can all be removed. The values in the rows would be sitting in memory in native language formats rather than packed making retrieval much faster. There would be no need for block chaining.
2. More flexible indexing. Since it is not necessary to store data in pages, traditional B-Trees are not absolutely required. Other index structures like AVL trees would be faster and might allow better concurrency. These trees would also be easier to keep balanced
3. Lock Manager Simplified. Row level locking (and MVC) are still desired features, but keeping the locks all in memory simplifies implementation. Oracle and InnoDB store lock information in the blocks (associated with transaction) to allow update transactions larger than memory.
4. Log manager simplified. You will still need journaling capability for rollback, replication, recovery from backup etc. But the implementation of the log need not be traditional. Any structure that maintains information about transactions and contains causal ordering will do. Techniques such as keeping old versions of rows adjacted to current versions that are unacceptable for disk based databases (ahem, Postgres) could be used.
Although these may seem like small things, they can add up: less code to run is faster code. A company called TimesTen offered a product that they claimed was 10x faster than Oracle using an all memory DB. Generally the corporate world doesn't care to split hairs. They want something that works, and they are willing to throw some money and iron at it. Thats why battery backed ram in the disk controller to buffer writes is probably going to be fine for now.
A last note: modern databases already know to not bother with indexes when a table is sufficiently small.
JJ