MySQL Falcon Storage Engine Open Sourced
An anonymous reader writes "The code for the Falcon Storage Engine for MySQL has been released as open source. Jim Starkey, known as the father of Interbase, is behind its creation; previously he was involved with the Firebird SQL database project. Falcon looks to be the long-awaited open source storage engine that may become the primary choice for MySQL, and along the way offer some innovation and performance improvements over current alternatives." This is an alpha release for Windows (32-bit) and Linux (32- and 64-bit) only, and is available only in a specially forked release of MySQL 5.1.
I've been very excited since I first heard about this new storage engine adapted from Netfrastructure. Not only does it give MySQL a transactional storage engine that is not controlled by a hostile company, but the engine appears to be designed from the bottom up to support web traffic. Jim gave a great talk at the Boston MySQL meetup that you can watch here: http://video.google.com/videoplay?docid=1929002440950908895
MySQL itself is open source, but that only gives you a few storage engines. The specific storage engines have different licenses. It is perfectly possible to have a commercial storage engine for MySQL.
.... other storage engines also exist
MySQL has no "native" way to store or obtain data - everything goes through plugins, some of which ship with MySQL and some of which don't.
MyISAM - the most common and fastest. But no transactions, no ACID, etc. Good for many read-only or non-critical tables.
InnoDB - licensed from Innobase (now owned by Oracle). GPL for non-commercial use, extra dollars for commercial. Transactions, ACID, but a bit slow.
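To make the "everything goes through plugins" point concrete, here is a minimal Python sketch of a server that hands each table to whichever engine was asked for. Every name in it (`MyISAMLike`, `TransactionalLike`, `ENGINES`, `create_table`) is made up for illustration; this is not MySQL's real plugin API. In actual MySQL the engine is chosen per table with the `ENGINE=` clause of `CREATE TABLE`.

```python
# Toy model of pluggable storage engines. All class and function names are
# hypothetical; MySQL's real storage engine interface is far more involved.

class MyISAMLike:
    """Non-transactional engine: every insert is applied immediately."""
    supports_transactions = False

    def __init__(self):
        self.rows = []

    def insert(self, row):
        self.rows.append(row)

    def rollback(self):
        raise NotImplementedError("this engine has no transactions")


class TransactionalLike:
    """Transactional engine: inserts stay pending until commit."""
    supports_transactions = True

    def __init__(self):
        self.rows = []
        self.pending = []

    def insert(self, row):
        self.pending.append(row)

    def commit(self):
        self.rows.extend(self.pending)
        self.pending.clear()

    def rollback(self):
        self.pending.clear()


# The "server" only knows engines by name and never stores rows itself.
ENGINES = {
    "myisam_like": MyISAMLike,
    "transactional_like": TransactionalLike,
}


def create_table(engine_name):
    """Return a new empty table backed by the named engine."""
    return ENGINES[engine_name]()


if __name__ == "__main__":
    table = create_table("transactional_like")
    table.insert({"id": 1})
    table.rollback()
    print(table.rows)
```

The trade-off in the list above shows up directly: the non-transactional toy does less work per insert, but it has nothing to undo with.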
Stolen directly from the MySQL website:
Falcon has been specially developed for systems that are able to support larger memory architectures and multi-threaded or multi-core CPU environments. Most 64-bit architectures are ideal platforms for the Falcon engine, where there is a larger available memory space and 2-, 4- or 8-core CPUs available. It can also be deployed within a standard 32-bit environment.
The Falcon storage engine is designed to work within high-traffic transactional applications. It supports a number of key features that make this possible:
* True Multi Version Concurrency Control (MVCC) enables records and tables to be updated without the overhead associated with row-level locking mechanisms. The MVCC implementation virtually eliminates the need to lock tables or rows during the update process.
* Flexible locking, including flexible locking levels and smart deadlock detection, keeps data protected and transactions and operations flowing at full speed.
* Optimized for modern CPUs and environments to support multiple threads allowing multiple transactions and fast transaction handling.
* Transaction-safe (fully ACID-compliant) and able to handle multiple concurrent transactions.
* Serial Log provides recovery capabilities without sacrificing performance.
* Advanced B-Tree indexes.
* Data compression stores the information on disk in a compressed format, compressing and decompressing data on the fly. The result is smaller and more efficient physical data sizes.
* Intelligent disk management automatically manages disk file size, extensions and space reclamation.
* Data and index caching provides quick access to data without the requirement to load index data from disk.
* Implicit savepoints ensure data integrity during transactions.
The big thing Falcon brings is MVCC, which allows safe simultaneous reading and writing without locks.
Here is a good explanation of PostgreSQL's MVCC.
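For anyone who hasn't seen MVCC before, the core idea fits in a few lines of Python. This is a toy sketch, not how Falcon or PostgreSQL actually implement it: the `MVCCStore` class, its method names, and the integer timestamps are all assumptions made for the example. Each write appends a new version instead of overwriting, and a reader only sees versions committed at or before the moment it started, so it never has to wait on a writer.

```python
import itertools


class MVCCStore:
    """Toy multi-version store: writers append versions, readers pick one."""

    def __init__(self):
        # key -> list of (commit_timestamp, value), oldest first
        self._versions = {}
        self._clock = itertools.count(1)
        self._last_commit = 0

    def begin(self):
        """Start a reader and return its snapshot timestamp."""
        return self._last_commit

    def commit_write(self, key, value):
        """Commit a new version of key; older versions are kept."""
        ts = next(self._clock)
        self._versions.setdefault(key, []).append((ts, value))
        self._last_commit = ts
        return ts

    def read(self, key, snapshot_ts):
        """Return the newest value committed at or before snapshot_ts."""
        visible = None
        for ts, value in self._versions.get(key, []):
            if ts <= snapshot_ts:
                visible = value
        return visible


if __name__ == "__main__":
    store = MVCCStore()
    store.commit_write("balance", 100)
    old_reader = store.begin()           # snapshot taken here
    store.commit_write("balance", 250)   # writer is not blocked by the reader
    print(store.read("balance", old_reader))     # still the old version
    print(store.read("balance", store.begin()))  # a new reader sees the update
```

A real engine also has to handle uncommitted versions, write-write conflicts, and garbage collection of versions no reader can see any more; none of that is modelled here.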
I've read through all comments with 2 or more in rating, and it seems that people really underestimate what Jim is doing here.
We're talking in-memory MVCC here. This means you can add 1000 records, do a rollback, and the hard disk hasn't been accessed. Even if you commit, performance will eventually be magnificent compared with on-disk MVCC systems. You can run larger systems on one server with this than you would be able to run on a cluster with other database systems.
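The "1000 records, rollback, no disk access" claim can be illustrated with a small Python sketch. It is only a model of the idea under stated assumptions: `CountingDisk` and `InMemoryTransaction` are hypothetical names, the "disk" is a dict with a write counter, and commit here writes rows straight to that fake disk rather than going through anything like Falcon's serial log.

```python
class CountingDisk:
    """Stand-in for a disk that counts how often it is written to."""

    def __init__(self):
        self.pages = {}
        self.writes = 0

    def write(self, key, value):
        self.pages[key] = value
        self.writes += 1


class InMemoryTransaction:
    """Keeps uncommitted changes in memory; touches disk only on commit."""

    def __init__(self, disk):
        self.disk = disk
        self.pending = {}

    def insert(self, key, value):
        # Pending rows live only in this dict until commit.
        self.pending[key] = value

    def rollback(self):
        # Nothing was written, so dropping the dict is the whole rollback.
        self.pending.clear()

    def commit(self):
        for key, value in self.pending.items():
            self.disk.write(key, value)
        self.pending.clear()


if __name__ == "__main__":
    disk = CountingDisk()
    tx = InMemoryTransaction(disk)
    for i in range(1000):
        tx.insert(i, f"record {i}")
    tx.rollback()
    print(disk.writes)  # the fake disk was never written to
```

An engine that writes uncommitted row versions into its data files has to clean them up after a rollback; in this model there is nothing to clean up.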
This system has been designed to provide very good performance improvements for those who do know how to create SQL statements, but probably even better performance improvements for those who don't. And we don't have a tradeoff between performance and transactions any more - transactions and better performance are both included.
Also, please note that this technology will make MySQL a trustworthy data store for many commercial applications out there, giving added value to their apps and their businesses. It will also enable small but very skilled development teams to use MySQL as a trustworthy database for specialized applications - previously only Firebird and PostgreSQL were able to provide this for free, and even though Firebird has a very high deployment in USA's top 500 companies, PostgreSQL seems to be very much *nix only in deployment statistics.
I have been programming database applications for more than 20 years, and have worked with Oracle, MSSQL, MySQL, PostgreSQL, Firebird, dBase, Paradox, Access and other databases. I see Jim's contributions to MySQL as extremely important for the database market. Instead of having "just" a transaction layer on top of a storage layer, MySQL now provides mechanisms that give this design an advantage over those database systems where the transactions are stored on disk (like Firebird, PostgreSQL).
And - by the way - this has NOTHING to do with "optimizing for web applications". Web applications are just as diverse as GUI applications and other systems, and GUI applications will benefit from this as much as web applications.