Slashdot Mirror


MySQL Creator Contemplates RAM-only Databases

Aavidwriter writes "Peter Wayner asks Michael 'Monty' Widenius of MySQL, 'When will RAM prices make disk drives obsolete for database developers?' From Monty's answers, it sounds like hard drives may be nothing but backups before long." From experience, I'd wager that RAM failure rates are less than hard drive failure rates, so it might also mean more stability from that perspective.

23 of 288 comments (clear)

  1. Re:Correct me if I'm wrong by sholden · · Score: 4, Informative

    There are these things called batteries. Of course the article mentions them, but who reads the articles anyway...

    RAM only needs a trickle to keep refreshed...

  2. Comment removed by account_deleted · · Score: 2, Informative

    Comment removed based on user account deletion

  3. Re:I already do this in SQL... by FreeLinux · · Score: 4, Informative

    Wow! $25,000 for 16 GB of RAM disk seems a tad high or widespread adoption.

    It's also interesting to note that Microsoft was going to release what they called In Memory Database(IMDB) support in Windows 2000. However, this feature was removed after Windows 2000 RC2 due to technical issues.

  4. Re:Correct me if I'm wrong by ichimunki · · Score: 2, Informative

    You mean there's articles we're supposed to read? If we did that, how we would ever get in on the discussion while it was still "live"?

    If you ask me, RAM is still too expensive to make this feasible. The article seems to assume that the examples listed are typical. But for so many purposes the idea of keeping the whole database in memory is a huge waste unless RAM gets a lot cheaper. For $100 you can get 120GB of HD or 256MB of RAM. That's not a comparable expense. And depending on the database, you are not likely to need the whole thing in RAM most of the time. In fact, how ofen is speed going to be important enough to justify this?

    If you have a system where speed is of the essence and the size is not prohibitive, can't existing DB engines be tuned to cache as much as possible if not all of the data in RAM already?

    --
    I do not have a signature
  5. Re:Umm...now? by Anonymous Coward · · Score: 3, Informative

    And large image-bases are too much for RAM today. Remeber the 640K limit? I'd guess that more than 50% current corporate databases would fit in a single gigabyte, including indexes.

    The real speed improvements, according to the guys working on projects like Bamboo on sourceforge come not from the fact that it's in RAM, they test against SQL in RAM and show that most of the performance improvements come from keeping the data in the same process space as the application operating on the data. If they're right, putting MySQL in RAM or Oracle or Microsoft SQL Server is a small improvement.

    The large speed improvements come from lowering the process boundary hopping overhead--marshaling and all that good stuff.

  6. This is nothing new by smackdaddy · · Score: 2, Informative

    There is already a leading in Memory database that is extremely fast. Check out TimesTen. That is what we use. There is also another one called Polyhedra. But the redundancy on Polyhedra doesn't appear to be as good as TimesTen, and it doesn't support Unicode either.

  7. Already in Use by NearlyHeadless · · Score: 3, Informative
    There are already memory-resident databases in use. For example, Lucent uses them for creating products which process cell-phone transactions. See http://www.bell-labs.com/project/dali/.

    There are some cool ideas there. They use two copies on disk for backup in case of system failure. Because of this they don't have to do page-latching.

    In some configurations, though, this is irrelevant, because write transactions lock the whole database! Because they know all transactions will be extremely short, this is faster than locking at page or row level.

  8. Just wait for a crash .. by satsuke · · Score: 2, Informative

    Memory sounds like a good idea in theory .. but what about power failures or momentary blips .. UPS can help but not eliminate that risk.

    A recent hardware write up I read from HP / Compaq has ram partitioning / raid'ing on some of the higher end x86 servers .. with some options for active standby and hot replacement available.

    Another little burb was that with ram .. as the number of individual ram components increases the risk of a single bit non ecc correctable fault scales up accordingly .. such that with 8 gig + arrays the chance of uncorrectable error approches 50% per time interval

    I know memory can develop stuck bits without any warning .. several of the Sun Fire 6800 series machines I work with on a regular basis develop these kinds of errors occasionally .. though with Sun the hardware is smart enough to map around the error and make relevent OBP console & syslog entries.

  9. Re:ECC RAM? by Junta · · Score: 4, Informative

    Not true, you are describing Parity RAM. Let's review:

    Normal RAM: no errors detected
    Parity RAM: an extra bit is used to store the XOR of all the othe bits on the line. If the math doesn't work, the OS can be informed the data at a particular address is corrupt, and let the software decide how to best cope with the condition. Only odd numbers of bit errors can be detected.
    ECC RAM: Uses a more complicated method and even more bits to provide single-bit error *correction* and double-bit error detection. The odds of two bits on the same line being corrupt is slim unless the ram is really bad. Single bit errors cause no problems. Double bit errors are at *least* detected. More than that and chances are the memory module is so screwed that even if the OS doesn't get it reported, the problem will be obvious to the user.

    --
    XML is like violence. If it doesn't solve the problem, use more.
  10. Re:Correct me if I'm wrong by Beliskner · · Score: 4, Informative
    RAM for our SQL servers, all it would take is one fool to trip over a power chord (or just a power-outtage) to lose one heck of a lot of data
    The military and Banks already have databases on SANs with battery-backed RAM and HD upon battery failure 'solid-state disks'

    UPS is old technology, the battery needs constant replacement, and very few have multiple redundant batteries and/or transistors to deal with wear and tear. Yes even a simple MOSFET transistor is not 100% reliable. Usually the only way to tell a battery is dead is your UPS fails when you need it (this happened to us when my MD was demonstrating our service live to customers, afterwards was the only time he's taken less than 9 months to sign off a purchase order on new equipment). A UPS also has a power cord to pull out when you recoil after burning your fingers on a Seagate Cheetah 15000RPM HD in the server room. A UPS also trips if you overload it, which again means the UPS fails when you most need it.

    Other posts mention cosmic radiation at high altitude makes RAM fail. Last time I checked there were no Quad-Xeon Oracle databases on Concorde, although if the International Space Station were to use one this might pose a problem for non-ECC RAM. Anyway, somebody could always write a driver to do software-ECC with Reed-Solomon for RAM if it becomes necessary.

    Huge databases (>500 Gigabyes) would benefit most from this as running a simple OUTER JOIN query on the biggest tables will require most of the database to be called into RAM.

    • Small databases become slow due to HD latency problems if they do a lot of WRITE operations (the database is stored in RAM, the transaction log is appended to, COMMIT TRANS). This would benefit least FROM RAMdisk because a HD append operation is cheap, however it would benefit database speed in mid-backup
    • Mid-size databases become HD-intensive due to aggregate queries/triggered operations over large '>RAM' datasets. For instance enforced cascading deletes where millions of tuples are being deleted cascaded to hundreds of other unindexed tables (in my job I go to the toilet whenever I run a query like this).
    • Huge databases where 'Index size' > 'RAM size' - the simplest query would benefit hugely from more RAM or faster storage or RAM-storage. With current databases this would be a 10Gig Eth connection to a Terabyte RAMSan solid-state disk.
    In the future, who knows, maybe a FPGA/ASIC DPU (Database Processing Unit) for INSTANT COMMIT like NVidia's GPU?
    --
    A caveman dreams of being us, the incalculable power and riches. We dream of being Q, then what?
  11. Re:ECC RAM? by PhilHibbs · · Score: 2, Informative

    I think ECC RAM inserts extra bits after the first, second, fourth, eighth, etc. data bits. So for 32-bit memory, you need 6 extra bits. I would guess that 32-bits is probably a good balance between efficiency and protection.

  12. Re:Not only need oodles of RAM... by Anonymous Coward · · Score: 2, Informative

    Not exactly an original idea, in memory DB's have been around for quite some time.

    Depending on the application it may not be feasible to have a in memory DB for a long time.

    I currently work on DB's that are many hundreds of gigabytes as well as terabytes, some are even pentabytes(sp?). It will be a long time before a we can afford to buy a terabyte of memory to hold our DB's.

    Eventually it could be a possibility but its still cheaper to buy hard drives and let the DB just cache in memory the active stored procedures, indexes etc.(instead of the whole DB).

    In Memory would be a performance improvement obviously but Im not holding my breath for it to be a practical solution anytime soon.

    Maybe for smaller scale DB's systems.

    Have to remember that its more then just data and indexes its also everything the optimizer needs to figure out the best way to run a query for example(histograms, statistics, for cost based analysis) could also be partitioning information and other performance related information that also has to be loaded into memory.

    So there is a good chance its the future, but for some its still a long way off.

  13. Re:Some thoughts on RAM by yarbo · · Score: 2, Informative

    "OTOH a mobo supporting even 4GB of RAM could cost over $2000, and it's likely a proprietary design." My $200 MSI K7D Master L supports 4 Gigabytes of RAM link

  14. Re:What's at stake by Second_Derivative · · Score: 2, Informative

    Rarely? When you work in a bank, "Rarely" is a 100 billion dollar word. You don't want that sort of damocles hanging over you. When one disk fails, generally the other disks don't. If you've got 100 copies of your stock exchange, you're not going to cry if you've only got 99. The problem is that when power fails, every single copy of your data dies.

    Then again, bank dataservers are presumably distributed clusters, and taking out all of them at once is pretty damn hard. Guaranteeing power isn't all that hard -- like you said, have some batteries in a UPS type arrangement; enough to keep your entire cluster running for a minute or two while your diesel generator starts spinning. Have several UPSes, several diesel generators, and a very strict schedule for testing and replacing batteries and you should have a system that can survive anything a hard disk can.

    Just don't use the word "rarely" when you're telling the CEO about something that's going to handle more money than some nation states have.

  15. DBD::RAM by suntse · · Score: 3, Informative

    Any Perl programmers in the audience may wish to check out DBD::RAM. From the CPAN documentation: "DBD::RAM allows you to import almost any type of Perl data structure into an in-memory table and then use DBI and SQL to access and modify it. It also allows direct access to almost any kind of file, supporting SQL manipulation of the file without converting the file out of its native format." More information here

  16. Re:shows MySQL != "real" database by rnicey · · Score: 2, Informative

    You really don't know what you're talking about do you?

    a) Of course MySQL has RAM cache. Here is one part of it:
    http://www.mysql.com/documentation/mysql/bych apter /manual_Reference.html#Query_Cache

    b) More than a bit here or there and your disk data is probably toast also. Where do you think the data on the disk is computed from?

    c) There are hot-swap/raid type RAM motherboards available also. But that's not really the point.

    I run a master MySQL/innodb database from disks and I replicate to 20 slaves each of which has the data that I need to move fastest on a 6GB ramdisk. Load balancing software ensures that should one drop dead (and it happens occasionally) it just tries another. You ought see my throughput, I'll put it against your disk array anytime.

    Right tool, right job. Bias is not a tool.

  17. He just doesn't realize that... by Insurgent2 · · Score: 2, Informative

    ...But such a condition (DB in RAM) will make his product pretty much obsolete.
    The Prevayler Project is a RAM-only Java persistence project that works and is so simple not a single bug has been found in the production release.
    3000 times faster that MySQL (9000 times faster than Oracle) even with the database in caches entirely in RAM simply because of the JDBC overhead that is eliminated .
    The only sticking points I've seen are:
    1. Normal PC's boards generally will only take 1GB of RAM. Sure there are thos expensive Sun machines...
    2. Querying objects in an efficient manner.
    3. Others, but I've gotta take a dump real bad...

  18. Prevayler already does this by Anonymous Coward · · Score: 1, Informative

    Prevayler already does this. It's written in Java.

  19. Re:Some thoughts on RAM by vadim_t · · Score: 2, Informative

    Tyan Tiger MP: Dual AMD CPU, up to 4GB ECC Reg RAM (IIRC), PCI64, about $200
    Tyan Tiger MPX: Dual AMD CPU, up to 4GB ECC Reg RAM, PCI64, about $260. Registered RAM not needed in the first 2 slots.

    A perfectly decent board, btw, I have the second one and it's rock solid. Now, to be completely fair, according to Tyan you will only get 3.5GB or so, depending on configuration due to the address space allocated to AGP and other things. Probably you can get very close to 4GB by using a PCI video card.

  20. Ten Times Faster by Sir_Dill · · Score: 2, Informative

    Several companies are already doing this. As one poster stated before, Lucent is doing this, however the leader in this type of RDBMS is a company called TimesTen. http://www.timesten.com/ It was originally part of HP and it was spun off into its own company. Its a rather brilliant product. It only uses the disks for logging and disaster recovery. The unfortunate thing about the whole deal is the volatility of the ram and the fact that it is still cheaper to have 1TB of disk space for a huge database then it is to have half as much ram.

  21. Ram DBs can be faster by godofredo · · Score: 3, Informative

    Modern storage solutions (like EMC) use redundant battery backed ram to buffer writes, greatly reducing perceived write latency. This gives you a lot of the performance gain of a ram only database, and also scales very well to large loads. (in fact, when choosing RAID stripe size you take into account whether writes are buffered; if not, keep stripes small for log files)

    If you know that your data will always fit into available ram then there are a number of performance optimizations that can be done. I'm not sure about ACID becoming "trivial"; You still need most of the same db components: indexes, lock managers, operation journaling, etc. But many of these could be greatly simplified:

    1. Page/Buffer Manager Eliminated. Since no disk IO will be required for the normal running of the db, there will be no need for a page manager. This eliminates complexity such as block prefetch and marking and replacement strategies. In fact, the data will probably not be stored on pages at all. Details such as block checksum, flip flop, log position, page latches etc can all be removed. The values in the rows would be sitting in memory in native language formats rather than packed making retrieval much faster. There would be no need for block chaining.

    2. More flexible indexing. Since it is not necessary to store data in pages, traditional B-Trees are not absolutely required. Other index structures like AVL trees would be faster and might allow better concurrency. These trees would also be easier to keep balanced ...most databases don't cleanup indexes after deletes, forcing periodic rebuilding. Other index schemes not generally considered because of poor locality prinicles could be considered. Note that Hash Indexes would probably still use Linear Hashing.

    3. Lock Manager Simplified. Row level locking (and MVC) are still desired features, but keeping the locks all in memory simplifies implementation. Oracle and InnoDB store lock information in the blocks (associated with transaction) to allow update transactions larger than memory.

    4. Log manager simplified. You will still need journaling capability for rollback, replication, recovery from backup etc. But the implementation of the log need not be traditional. Any structure that maintains information about transactions and contains causal ordering will do. Techniques such as keeping old versions of rows adjacted to current versions that are unacceptable for disk based databases (ahem, Postgres) could be used.

    Although these may seem like small things, they can add up: less code to run is faster code. A company called TimesTen offered a product that they claimed was 10x faster than Oracle using an all memory DB. Generally the corporate world doesn't care to split hairs. They want something that works, and they are willing to throw some money and iron at it. Thats why battery backed ram in the disk controller to buffer writes is probably going to be fine for now.

    A last note: modern databases already know to not bother with indexes when a table is sufficiently small.

    JJ

  22. Re:shows MySQL != "real" database by MattRog · · Score: 2, Informative

    You really don't know what you're talking about do you?

    It is fairly clear you are the one who is a bit confused here. Because, typically, disk is the primary data storage mechanism and main memory capacity is less than the total size of a database the enterprise DBMS vendors (this does NOT include MySQL) have what is commonly referred to as a 'data cache' (vendors may call it something else) which stores data pages in main memory (there are other caches for other data structures, but we're only concerned with the data cache at this point). I'll explain common techniques and then why the data cache is typically superior to MySQL's 'query cache'.

    Note: I am going to generalize throughout, so there may be differences between your favorite enterprise DBMS and what I say here so please don't post 'My DBMS does all but this one tiny thing the same': the theory is more or less the same.

    Data caches are managed by the DBMS and are strictly controlled by the DBA. Typically most caches are managed by a MRU/LRU (most/least recently used) algorithm to keep stale pages out of the cache (although the DBA can configure the algorithm to suit the particular needs of the system). Also it is possible to split the data cache up into smaller chunks (called cachelets, pools, etc.) and bind those to different physical database objects (tables, indexes, etc.). You can thus ensure that an important table is always going to have room in the cache by binding a dedicated cache to it. You can also prioritize objects so that if more than one object is sharing a cache you ensure it has priority over another one.

    MySQL relies upon the OS filesystem cache to do this work for them. This is an obvious advantage from a MySQL code developer's standpoint since they do not have to write any additional code to handle this. However there are some fairly significant drawbacks and limitations with this simplistic approach. Simply put: the DBMS (via knowledge from the DBA and query statistics) has a much better 'big picture' than the OS. As such, the DBMS can set an optimal strategy for maintaining the cache. I am not intimately familiar with OS data caches so I will not comment on them except I guess that they 'assume' that most data accesses will be of the 'sequential file access type' (e.g. apps will request blocks of a particular file in a sequential fashion). Individual physical row retrieval issues aside, this is very rarely the case in OLTP systems. Even if not, they still are dealing with limited knowledge and could never be as efficient as a DBMS-run cache.

    OS caches are also not nearly as configurable, which comes as no surprise since they are not really designed for this type of use (or mis-use as the case may be). MySQL zealots often quote "best tool for the job" which I wholeheartedly agree. The OS cache is obviously NOT the best tool for the job for DBMS use. Not having a configurable data cache is a huge limitation - and considering past MySQL AB views of things like referential integrity (everyone remembers the infamous 'what for?' quote from the TODO list) it is probably due to ignorance of DBMS products and/or laziness (as an aside, isn't it a little scary that misguided people attempt to deploy MySQL in mission-critical use when it was developed by people who really have no clue about DBMS in general?).

    Enough rambling aside, we now know both what a DBMS-managed data cache is and why they are superior to OS-level caches. What about the vaunted MySQL 'query cache'?

    Pure junk.

    Ok, I typed that firmly tounge-in-cheek, in the context of a feature for MySQL, it is not pure junk. However, this is not a superior feature that MySQL has over enterprise DBMS products. To put it bluntly: the query cache is a cheap imitation to a data cache and is primarily needed because of the lack of a data cache. Before we can discuss it, go

    --

    Thanks,
    --
    Matt
  23. Re:ACID? by dardem · · Score: 2, Informative

    Not sure why no one mentioned this yet, but the principals of ACIDity have nothing to do with how the database is stored (disk or ram). It governs the use of multiple transactions to ensure database consistency before and after committing transactions. Having the database in ram just means you also need separate space to record the transaction prior to commital. Without ACIDity, each operation in a transaction is committed before ensuring the entire transaction is correct/complete, which may cause locking, etc.

    Therefore, weather in ram or on a disk, the DBMS should work the same.

    --

    "Ceilean Súil an ní ná feiceann..."