MySQL Falcon Storage Engine Open Sourced
An anonymous reader writes "The code for the Falcon Storage Engine for MySQL has been released as open source. Jim Starkey, known as the father of Interbase, is behind its creation; previously he was involved with the Firebird SQL database project. Falcon looks to be the long-awaited open source storage engine that may become the primary choice for MySQL, and along the way offer some innovation and performance improvements over current alternatives." This is an alpha release for Windows (32-bit) and Linux (32- and 64-bit) only, and is available only in a specially forked release of MySQL 5.1.
"This Falcon Storage Engine better be released as Falcon open source, or I'm gonna be Falcon pissed."
I'm a developer, but I've never used MySQL.
Isn't MySQL already open source? If so, how does the Falcon storage engine differ from the "regular" storage engine that comes with MySQL?
The most rabid believers in American Exceptionalism are the exact same people whose policies are destroying it.
I've been very excited since I first heard about this new storage engine adapted from Netfrastructure. Not only does it give MySQL a transactional storage engine that is not controlled by a hostile company, but the engine appears to be designed from the bottom up to support web traffic. Jim gave a great talk at the Boston MySQL meetup that you can watch here http://video.google.com/videoplay?docid=1929002440 950908895
What does Falcon do that the InnoDB engine doesn't currently do? The description seems rather vague.
Ugh, no you haven't. The storage engine is how MySQL stores the actual data on disk. There are different formats, each have their own benefits, issues and work arounds. Think of MySQL as a SQL interpreter talking to different storage engines on the back. The default is MyISAM, but InnoDB and ISAM are others.
I just setup a phpBB site that uses MySQL and I'm always keeping an eye on storage space and performance now. I don't suppose my host will upgrade MySQL any time soon, but this is still cool news...
OK, although I don't agree with it, I know that a lot of people use MySQL because it's the most common database supported by web hosts. Isn't it almost as likely for a hosting company to have PostgreSQL, though, as to upgrade to a bleeding-edge version of MySQL when this is finally readied for public consumption? Will this new backend give any extra functionality that PostgreSQL doesn't offer?
Dewey, what part of this looks like authorities should be involved?
Okay. Thanks; I didn't realize that MySQL abstracted the actual file I/O layer, although that certainly makes sense.
Clear, Dark Skies
for smaller databases, but limiting the tablespace to a single file per database/schema doesn't sound very flexible, and won't allow DBAs to maximize their disk throughput.
I am guessing that this is more of a MyISAM replacement than an InnoDB replacement, so it's not really a shot across Oracle's bow (as some comments make it sound like).
Actually, a good question would be : What is the difference between Mysql and Mysql falcon?
Mysql maximizes rich channels and empowers cross-platform convergence letting you drive mission-critical niches whereas Falcon utilizes scalable initiatives by scaling end-to-end networks for reintermediate granular platforms (win32 and linux 32/64).
So you posit that people are no longer using MySQL?
I wouldn't rush to call it pairity yet. It may be that Falcon brings MySQL into to paritiy with Postgres but we'll probably have to wait and see. My gut feeling is that Falcon may not be as mature as the storage engine used by Postgres, and may not be for some time.
So if Falcon uses MVCC, does it require something like PostgreSQL's VACUUM? Or does it have some other way to detect and remove dead tuples?
Also, has anyone looked at making PostgreSQL a storage plugin for MySQL? :-)
I realise your post doesn't say that Slashdot are using MyISAM, but just to make it clear, Slashdot are using InnoDB http://www.mysql.org/doc/refman/5.1/en/innodb-over view.html.
What's your GCNSEQNO?
I was forced to learn Postgres because I wanted to do a SELECT * FROM foo WHERE bar NOT IN ( SELECT x FROM y WHERE z > 4 ) type query - basically a negative subselect. I couldn't do (or couldn't work out how to) use table JOINs to perform the same thing, and at the time, MySQL didn't do it.
I'm so thankful. Postgres is so much better. And the table JOIns I did have ran so much faster in PG. s/mysql_/pg_/g *.php
Get your own free personal location tracker
But in something as mission-critical as a database, of all things, reliability trumps everything. I don't think they could have developed PostgreSQL any other way and still supported its primary goal of safety.
What gave you the (wrong) impression that PostgreSQL folks have been sitting around twiddling their thumbs? Version 8.2 just came out within the month and includes several performance boosts that make it fly on our production systems.
Dewey, what part of this looks like authorities should be involved?
Falcon looks to be the long-awaited open source storage engine that may become the primary choice for MySQL, and along the way offer some innovation and performance improvements over current alternatives.
Come on. Give me PostgreSQL any day. After fiddling with MySQL at work for a few custom developments (both versions 4 and 5.0.2) i'm ready for anything else. Gave PSGSQL a shot at home and it runs very nice, with lots of advanced features. Anyone with real-world deploying feedback to share?
New Microsoft Sql Server coming soon to a store near you. Cant wait for M$ to secretly add this to their SQL server and call it innovation.
Laugh all you want, but MS is one of the leaders in the database world with SQL Server for Windows, and have been for a while now. SQLS is by a long, long shot the best software product Microsoft has ever released, IMHO, and keep doing, right to the latest version. It works great.
Unless you think using drupal makes me a sql developer.
Clear, Dark Skies
I write device drivers, actually. I use Mysql as the backend for a club website I manage.
Clear, Dark Skies
when i first read falcon storage engine, i initially thought of a storage software called falconstor. falconstor is more in the data backup/archive/recovery market, often used as a disk-to-disk-to-tape solution, or sometimes as a virtual tape device, but i imagine that often times a name that's close but different is almost always considered "too close" and forces a name change by somebody... let's see how this gets resolved. who knows, maybe the other falcon store(NSFW) might be the winner...
three can keep a secret, if two are dead - benjamin franklin
That depends on the database.
Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
"MyISAM - the most common and fastest. But no transactions, no ACID, etc. Good for many read-only or non critical tables. .... other storage engines also exist"
InnoDB - licensed from InnoSoft (now oracle). GPL for non commercial, extra dollars for commercial. Transactions, ACID, but a bit slow.
Unfortunately none of them handle binary blobs very well. e.g. video, audio, pictures, etc. Text is what they're optimized for.
The big thing Falcon brings is MVCC which allows safe simultaneous reading and and writing without locks.
Here is a good explanation of PostgreSQL's MVCC.
We don't see the world as it is, we see it as we are.
-- Anais Nin
Ironically PostGreSQL has a bsd license so it's destined to die, not for me though, I prefer a unix like bsd system instead of a minix like windows os.
Just like I prefer a "free for whatever" license instead of a "free as long as its free" license.
The way I prefer Python over Perl or Java. All are personal choices, I'll never say what somebody else must do, only what you could do and I do judge people on their choices.
Just as I'm judged on my choices.
Microsoft SQL Server has had almost all of these features since its first release in the early 90s. MVCC was just introduced in Microsoft SQL Server 2005. There is no row-level compression in SQL Server (or Oracle, or DB2, or PostGreSQL... which is probably a *good thing* from a performance perspective).
This is a nice step forward for MySQL, but for the most part it is just a means for catching up to the other commercial DBs and PostgreSQL. ACID compliance, granular locking, MVCC, and multithreading are *not* differentiating features in the database world.
Umm, Postgres has had automatic row-level compression (automatic compression of TOASTed) values for some time now.
You've got to be kidding. I used SQL Server 2003 a year ago on a game server, and it was horrible. Just to name a few things:
1. Near-incomprehensible administration panel. The fact that they gave it a fancy name ("Enterprise Manager" or something like that) didn't help. The usability of it absolutely sucks.
2. Bad documentation. All I want is to automatically make database backups every night at 3 AM and upload it to a remote server in encrypted form. With Linux, MySQL and SSH, I can write such a script in 30 minutes. With Windows 2003 and MS SQL Server, I spent 1 to 2 weeks. The documentation told me that it's important to have a backup plan, and that every business should have a backup plan, and more of that nonsense, but nowhere did it mention HOW to actually make a backup! Only after days of searching did I finally find the command to make a backup, but I couldn't find any commandline tools like mysqldump! Eventually I found out that something can be done with the SQL server scheduler (I think it was called MS SQL Agent). I spent more than a week trying to figure out how their scheduler works and writing non-graphical tools to upload the backup to a remote server.
3. Bad, bad performance. While the game server was running (which used 100% CPU; the game server software sucks but that's another story), the SQL Query Analyzer's performance dropped to near zero. It took 2 minutes to execute a simple "SELECT * FROM foo WHERE bar = 2". One day I wrote a tool in Perl (using DBI and ODBC) to automate some database actions. And guess what? Running the same select statement in the Perl tool was almost instantaneous! I have no idea what SQL Query Analyzer is doing to make the query literally 100 times slower.
4. One day I noticed that the server ran out of disk space. It turned out that the database transaction log (not the database itself!) grew to 120 GB! The database itself was only about 100 MB. The documentation told me everything that I don't want to know, but it didn't tell me how to clear the transaction log. I spent 2 days searching Google on how to clear the transaction log, but I didn't find anything useful. Eventually I managed to clear it anyway by playing with the Enterprise Manager for an entire day. I don't remember how I did it, but it was absolutely non-intuitive.
5. It runs on Windows. It's hard to automate things. It doesn't have a good shell like Unix. Yes I know Cygwin exist, but scripts cannot run in the background and always need to have a DOS window. Yes there are tools to hide the DOS window but I don't want to add yet more bandaids to the fact that Windows's shell absolutely sucks. I ended up having to write nearly everything, while on Unix nearly everything is already there.
I'm sure someone will tell me that points 1 and 2 say I'm an incompetent database administrator. Well, this isn't a multi-million dollar business and I don't want to waste money on a database administrator which costs me more than $1000 per month! I just want to run the damn game server! I'm competent in SQL. On Linux I don't need a database administrator to maintain a database of this size, so why should I with Windows?
Seriously, this is not a troll. Yes I know I sound frustrated, but that's because I am. This is my own experience with Win2003 and MS SQL Server. This, together with other Windows quirks which I will not mention in this post, have convinced me that maintaining a Windows is an absolute nightmare.
Actually, compression can improve performance. Often disk I/O can slow you down more than the extra CPU work to compress/decompress. I have written some flat-file processing code at my work and it runs much faster reading from and writing to zipped text files instead of uncompressed text files. The space savings are just a bonus.
I've ended up having to support and/or migrate several Access databases designed by end users ...
... ;)
These guys who've set up MySQL are gurus by all comparisons
"The large print giveth, and the small print taketh away" -- "Step Right Up", Tom Waits
The summary does not mention that the alpha only supports x86 platforms... ie, on Linux, it may be compiled for 80486 and better 32 bit IA32 compatible microprocessors, and Intel EM32T compatible 64bit microprocessors such as the AMD's Athlon 64 family and of course the newer non-Itaniam 64bit Intel microprocessors.
Support for other families of microprocessors, such as PowerPC and Sparc will come later.
No sig. Move along - nothing to see here.
There is of course a work-around. Having a search table that is MyISAM and has fulltext enabled.
Better yet, use something that's been built from the gound-up to search: Apache Lucene (which has even been ported to PHP).
Using Lucene will give you much better weighted results than the MySQL FullText, plus it has a natural language query parser so people can use the kind of search queries they are used to using in google etc.
I am NaN
MySQL *have* done so with Falcon. MySQL *do* own Falcon.
[Jim has worked on other RDBMS in the past] != [MySQL do not own the one he's currently working on]
Il n'y a pas de Planet B.
Meh, I liked it better when they called it Sybase.
"I use a Mac because I'm just better than you are."
First of all, I have a somewhat hard time taking someone seriously when they can't even get the name right. Perhaps it was just a typo, but there is no such thing as SQL Server 2003. There is SQL Server 2000 and SQL Server 2005. I'm presuming you were using SQL Server 2000, since the management util is called SQL Server Studio in 2k5(and is more intuitive IMO).
1. It's been a while since I've used SQL Server 2000, so I don't remember what my thoughts were on Enterprise Manager then, when I was just getting started using SQL Server. Irregardless though, SQL Server 2005's tools seem fairly user-friendly to me.
2. I'm not sure why you found it so difficult to do backups. I can understand a little bit of awkwardness if you didn't know to use SQL Server Agent, but performing the actual backup is fairly simple. I use SQL Server Agent to backup to a file(which is then backed up by BackupExec, since BE 10 seems to be bitchy as far as working with SQL Server 2005 goes. Typical clueless Symantec...), and all I use is
BACKUP DATABASE [dbname] TO [Name of Backup Device Created in SQLSvr]
WITH
DESCRIPTION = 'blah blah blah',
RETAINDAYS=14;
for each database. It probably took me a few hours when I first had to set this up (on SQL Server 2005, and I hadn't used backups in SQL Server or SQL Server Agent before). Admittedly, it probably took longer than it did for me to make by backup shell script for my MySQL and PostgreSQL databases running on Linux.
3. I've never had any problem with SQL Server's performance. Its generally been pretty good actually. Why on earth were you running a database on a server that had 100% CPU usage? That is a really bad idea, regardless of what database you're using, if you actually care about being able to access that data.
4. Yeah, I've had that sort of thing happen too(not the that extent, but the logs were still several gigs for a few meg DB). It is pretty damn annoying, but it is pretty simple to fix. All you have to do to stop it from happening (before it happens) is to change the Recovery Model (under the Options tab for the database's properties in SQL Server 2005) to Simple instead of Full. To shrink the DB logs, I think you run dbcc shrinkfile (Name_of_The_DB's_Log). Atleast that's what I put in my Knowledge Base. After changing the recovery model and running that command if necessary, I've never had any problems with it. More information can be found here
5. If you want to automate things in SQL Server, use SQL Server Agent. That's one of the things its there for. And if you really want to, you can use Task Scheduling in Windows to run a batch script or something automatically. I use task scheduler to automatically delete the file that my SQL Server backs its databases up to after BackupExec backs it up. I don't know what you were trying to do, but it sort of seems like you're wanting to do everything like you would on *nix. You sort of have to think differently and use different features to accomplish tasks on Windows. You're not going to be able to think about everything in the same way.
SQL Server and Windows aren't really that hard, once you get familiar with it. If they're used correctly, and their usage is well planned, then they are pretty damn solid. Not much different than PostgreSQL and Linux, or in some circumstances MySQL and Linux. I've used (and currently use) MySQL, PostgreSQL, and SQL Server. I don't know if I really have a favorite, but I do find a lot of SQL Server's and PostgreSQL's features really useful (things that MySQL isn't as strong in). You can't just expect to be able to start using a different piece of software and be able to do everything in the same way as you have done with another piece of software, and that goes for pretty much any of those databases.
Like I said, I use both Linux and Windows, as well as MySQL, PostgeSQL and MSSQL Server, so I'm
Every time you post an article on Slashdot, I kill a server. Think of the servers!
Perhaps we've tried different versions, but my experience was nowhere as awful...
:) I've created fairly complex DB systems involving triggers, linked tables, stored procedures and lots of functions, and the performance of the server (running on a non-dedicated PIII with 512mb, Win2k3 Server) was stellar. This was both from the query analyzer and being polled from a website (about 10-15 stored procedures every 5 seconds per user).
1. Near-incomprehensible administration panel. The fact that they gave it a fancy name ("Enterprise Manager" or something like that) didn't help. The usability of it absolutely sucks.
Given. It's not unusable, but i've seen better.
2. Bad documentation. All I want is to automatically make database backups every night at 3 AM and upload it to a remote server in encrypted form. With Linux, MySQL and SSH, I can write such a script in 30 minutes. With Windows 2003 and MS SQL Server, I spent 1 to 2 weeks.
I did something similar with a few installations and never took me more than 5 minutes - automatic backups and stored procedures triggering. The dedicated utility (SQL Server Agent) is very easy to setup via the GUI, IMHO.
3. Bad, bad performance. While the game server was running (which used 100% CPU; the game server software sucks but that's another story), the SQL Query Analyzer's performance dropped to near zero. It took 2 minutes to execute a simple "SELECT * FROM foo WHERE bar = 2". One day I wrote a tool in Perl (using DBI and ODBC) to automate some database actions. And guess what? Running the same select statement in the Perl tool was almost instantaneous! I have no idea what SQL Query Analyzer is doing to make the query literally 100 times slower.
I have to politely disagree here
4. One day I noticed that the server ran out of disk space. It turned out that the database transaction log (not the database itself!) grew to 120 GB! The database itself was only about 100 MB. The documentation told me everything that I don't want to know, but it didn't tell me how to clear the transaction log.
I limited the log size to a sensible value when i first installed the DB, and was a pretty straightforward procedure. Given, i was warned about this issue beforehand by a fellow programmer.
I'm sure someone will tell me that points 1 and 2 say I'm an incompetent database administrator. Well, this isn't a multi-million dollar business and I don't want to waste money on a database administrator which costs me more than $1000 per month! I just want to run the damn game server! I'm competent in SQL. On Linux I don't need a database administrator to maintain a database of this size, so why should I with Windows?
Oh, much agreed. Sadly, we do a lot of work both deploying and customizing SCADA setups, and this kind of systems run only on Windows platforms, which make SQL Server somehow a viable option (even when setting up a dedicated Linux/BSD server can be done for a fraction of the cost). That's when databases other than SQLS are an option - most SCADA software is "certified" to work with a handful of databases. If i was given the choice, i'd reccomend PostgreSQL, which, as far as i've tried, seems to be perhaps the most complete OSS database engine arround, miles ahead of MySQL.
Having said that, i found SQL Server to be an extremely competent product. If it's worth or not its price is debatable, but i really meant that is the best piece of software i've seen MS releasing that i can recall. The database engine is extremely fast and well-featured, and comes with a nice pack of utilities - notably, the Query Analyzer. I've seen commercial alternatives for MySQL, PSSQL and the works, but i've yet to come across an OSS one.
rotflmao, then you haven't had to deal with any relatively complex data with mysql in the past 5 years... There's been a lot of issues with how it deals with data. Even in "compliance" mode... which of course has to be disabled to be compatible with too many PHP apps out there.
.Net interfaces in it, but some must...
I can't stand MS's politics, but there are plenty of command-line tools for MS-SQL if you prefer that route... most of the management interfaces run T-SQL script commands in the background. You can use isql, or the older interface, I happen to like the newer management interface for SQL Server 2005... Honestly, I don't see a great benefit to using the
PostgreSQL is much richer in terms of programming and standards than mySQL is... mySQL 5.x is better, but too much legacy crap is built on quirks in mySQL now... *sigh*
Michael J. Ryan - tracker1.info
I really have to agree for the most part... I will use MS-SQL when there is already a server in place, but prefer Postgres myself. I've never had much hassle with documentation, but agree the GUIs need a lot of work. I think the additional windows support is great... Distributed databases would be a really nice thing though... implementation of GUIDs similar to MS's for database replication/distribution would be great too...
Michael J. Ryan - tracker1.info
slashdot uses a very customized version of mysql and a LOT of caching. Remember a few months back when threaded comment posting was disabled for 24+ hours so they could rebuild the comment index?
Do you even lift?
These aren't the 'roids you're looking for.
Why would you be concerned about the license for a database? The licence is only important if you are modifying the code AND distributing the modified code. Most database users I know just install the binary and use that in which case you don't care if it's BSD or GPL licenced.
evil is as evil does
MSSQL only runs on windows. It also only supports 4 gigs of data unless you want to pay. I would much rather use db/2 which has no data storage limit for the free version or an open source product and as a bonus I get to use any platform I want.
evil is as evil does
So what if you have to provide the source for InnoDB? There's a process boundary between InnoDB and the program that uses the database. Why would you have to provide the source for your program that only communicated with MySQL over IPC, even under the GPLv3?
But in something as mission-critical as a database, of all things, reliability trumps everything. I don't think they could have developed PostgreSQL any other way and still supported its primary goal of safety.
...
I agree that reliability trumps all, but it sounds like you're saying Postgres was always super-stable. Not so, sadly
If you want to make crappy code into good code, you can. You just need to decide that you want to.
The MySQL guys *could* decide that version N+1 will be ACID-compliant, 100%, with every backend, and then work to make that happen.
Perhaps the damning thing about MySQL is not that they can't, but that they haven't chosen to.
I've used MySQL with had ACID compliance, row-level locking, MVCC, foreign key and a multithreaded backend since 2001 - MySQL 3.23 w/InnoDB. Falcon adds no new features to MySQL.
I really apprecite some of the sweet improvements coming out of Microsoft these days, unlike a lot of the Linux fanbois here who will knock anything Microsoft does. I know SQL Server 2005 has some massive enhancements over SQL Server 2000, and some of them are amazing. I actually just finished reading Inside Microsoft SQL Server 2005: The Storage Engine, and it was a very interesting read. I was considering switching from InnoDB to SQL Server 2005 depending on the results of my research and in-house testing of the performance. But the one thing that really disappointed me was that Microsoft JUST added MVCC in 2005. I personally consider the concurrency of MVCC to be essential for modern applications, at least in the field I work. The fact that they didn't even have it available as an option until now is incredible, and from what I read, you have to maintain the tempdb database when you're using MVCC (aka snapshot) or you could receive errors on update operations. It seems like they hacked snapshot on, rather than making MVCC a core feature of their database engine. I know row level locking gives a great deal of concurrency, and Microsoft's locking types are amazingly fine grained, but MVCC's memory footprint for large numbers of row level locks destroys Microsoft's which just starts escalating locks to save memory, and MVCC beats simple row level locking for concurrency any day. I'll take the 100% non-locking selects.
Also, although my applications rarely involve large transactions, I find it interesting that InnoDB's default mode is repeatable read, while MS sticks with read committed by default (especially if you want snapshot that isn't nearly serialized, you use RCSI). InnoDB's repeatable read is more than Microsoft's RR, because it also protects against phantom rows, giving you near serialized consistancy with fairly high performance. I'm afraid that to get comparable performance on MS SQL 2005, I'd have to give up certain levels of transactional consistancy, in addition to MVCC.
Most MS SQL 2005 users tell me "concurrency / transaction consistancy has never been a problem for me" because they're running their servers at 10-25% load and not exactly serving real time systems with it. But in my industry (telecom) I have to have the best possible concurrency while maintaining data consistancy, and I run my database servers a lot harder than most people do, and have a lot higher requirements for response times.
I am still looking into the advantages of both systems, and intend to test MS SQL 2005 head to head with MySQL 5.0.30 Enterprise / InnoDB in the near future. If anyone has any input on this subject, I would be very interested in hearing it. Feel free to email basharteg@basharteg.com
I've read through all comments with 2 or more in rating, and it seems that people really underestimate what Jim is doing here.
We're talking in-memory MVCC here. This means you can add 1000 records, do a rollback, and the harddisk hasn't been accessed. Even if you commit, performance will eventually be magnificent compared with on-disk MVCC systems. You can run larger systems on one server with this, than you would be able to run on a cluster with other database systems.
This system has been designed to provide very good performance improvements for those who do know how to create SQL statements, but probably even better performance improvements for those who don't. And we don't have a tradeoff between performance and transactions any more - transactions and better performance are both included.
Also, please note that this technology will make MySQL a trustworthy data storage for many commercial applications out there, giving added value to their apps and their businesses. It will also enable small but very skilled development teams able to use MySQL as a trustworthy database for specialized applications - previously only Firebird and Postgresql were able to provide this for free, and even though Firebird has a very high deployment in USA's top 500 companies, postgresql seems to be very much *nix only in deployment statistics.
I have been programming database applications for more than 20 years, and have been programming Oracle, MSSQL, MySQL, postgresql, Firebird, dBase, Paradox, Access and other databases. I see Jim's contributions to MySQL as extremely important for the database market. Instead of having "just" a transaction layer on top of a storage layer, MySQL now provides mechanisms that give this design an advantage over those database systems where the transactions are stored on disk (like Firebird, Postgresql).
And - by the way - this has NOTHING to do with "optimizing for web applications". Web applications are just as diverse as GUI applications and other systems, and GUI applications will benefit from this as much as web applications.
It depends how it is being accessed from the applications. Are they using the c APIs for example? If so they are linking to the client library which would usually be considered a derivative work and so would be bound by the terms of the license.
If they are doing something like using JDBC it becomes less clear and you would need to talk to a lawyer. (actually, you may want to talk to a lawyer anyway)
-- MartinG To mail me: echo kewyjlcxyzvjfxbqwh | tr bcefhjklqvwxyz
Most people use ODBC/JDBC either that or the libraries that come with their language like ruby, python, haskell or whatever. Anyway no need to talk to a lawyer unless you are modifying code and distributing it. The GPL does not apply to anybody who is simply using the binaries.
Nice FUD though too bad it's so easily countered.
evil is as evil does
SQL Server may have some extra features, but to web developers they're lost behind the "Server Management Studio" and its buggy, archaic and inefficient UI.
What, you would maybe prefer maybe the buggy, archaic and utterly lacking-in-functionality tools that oracle gives? You do know that SQL server has commandline tools too?
I haven't encountered any PC application in the past 5 years more buggy than SQL server.
That's a very odd statement. This opinion isn't shared by anyone that I have spoken to.
My Karma: ran over your Dogma
StrawberryFrog
That's a good thing. For many applications ACID isn't worth the overhead. There are many applications where performance is more important than reliability, for instance because the data can be trivially rebuilt if something goes wrong.
I wasn't talking about SQL Server's performance, but the Query Analyzer's performance. As I wrote in my post, the Perl script I write could perform a query in a fraction of the time that Query Analyzer could.
This is the reason why I ran the database server on a machine with 100% CPU usage:
1. I already had 2 server machines. 66.6% of the game server was run on server A, while the other 33.3% of the game server and SQL Server ran on server B.
2. Both machines were dual core. On server B, core 1 would be running the game server program, and core 2 would be running the SQL Server.
3. The game itself ran just fine, and the Perl script that I wrote could perform queries very quickly, so I know the machine can handle it. The problem lies in the *Query Analyzer*, not SQL Server. Query Analyzer needed 2 minutes for a simple query while my Perl script needed 2 seconds.
Yes, I realized that. Making Agent make the backups is not hard. But I also want to automatically compress the backups and send it to a remote Linux server, preferably by ssh. The backup server is not on the same network.
Agent made things more complex by running commands as the SYSTEM user. If any of the commands need user input, then I'm screwed, and the entire command will freeze, because the user input window is never displayed to the user. Windows is highly graphical in nature so it's hard to find any tools that are non-interactive. I couldn't find any non-graphical scp tools for Windows, except Cygwin ssh (in combination with ssh public keys), so that's all I could use. But then I found out that Cygwin ssh would look in the SYSTEM user's home folder for the ssh keys, and the SYSTEM user doesn't have a home folder! In the end I had to write a Perl script that periodically checks the backup folder for new backups. This Perl script is run every time a user logs in (I put it in the Startup folder).
The guy who took over the server setup ODBC for Unix. And how he makes backups by sending the BACKUP command from the backup (Linux) server to the database server. He then copies the backup from the database server to the backup server through ssh (a Cygwin ssh daemon was running on the database server).
Is there a better way than this?
My conclusion is that SQL Server may work for large enterprises that have expensive and trained database administrators, but for small users (like me, who just want to setup a non-commercial game server, and who don't have tons of money to waste), SQL Server is way too complex and phrone.
Ironically, PostgreSQL often benchmarks much faster than MySQL, particularly when non-trivial loads are involved. When you can formally prove that a class of error conditions can't happen, you can stop testing for them.
Dewey, what part of this looks like authorities should be involved?
The GPL does not apply to anybody who is simply using the binaries.
No, it does _not_ just apply to a modification of the original work, but also to any derived works. (see GPL snippet below)
Linking usually constitutes a derived work. Whether using GPL jdbc drivers by putting it in yourl classpath etc. constitutes a derived work is a subject of debate. Seriously, there is legal opinion in both directions on this issue. I have personally discussed this with people at a well known open source java application server vendor who face this exact issue. There is even disagreement between FSF and The Apache Foundation about shipping LGPL classes, never mind GPL ones.
Seriously, if you are building an application that relies on GPL libraries of any kind, you may technically be creating a derived work and you really should talk to a lawyer.
From the GPL:
"b) You must cause any work that you distribute or publish, that in whole or in part contains or is derived from the Program or any part thereof, to be licensed as a whole at no charge to all third parties under the terms of this License."
-- MartinG To mail me: echo kewyjlcxyzvjfxbqwh | tr bcefhjklqvwxyz
No it doesn't.
A DBMS featuring a "/dev/null" backend would be worthless for all applications other than entertainment value (not that MySQL has ever been anywhere near that bad). Unless a system is reliable enough to actually manage your data, every other feature is moot.
Dewey, what part of this looks like authorities should be involved?
Surprised someone would mention that; I am a developer for eAthena (every now and then, college gets in the way).
But I'm anxious to see what performance differences and nifty features Falcon will hold when the Open Source Community gets their hands on it. I currently use MySQL with a mixture of MyISAM and InnoDB tables, but that may all change depending on what new storage types appear.
I think it's a case of typical front-end bloat. I'm tasked with looking after a few MSSQL DBs - 7, 2000 and 2005 as well as occasionally developing light apps with them and when they are set up and running well, they are indeed decent RDBMS and very feature rich - as in *useful* features. But day to day admin tasks are horrible and documentation is poor. Lot's of it, but as you mentioned it's all very "business speak" and only skims the technical stuff you actually need.
Far from being an MS fan, I have to admit that a fair amount of software they produce *is* impressive. But then they invariably tack on horrible interfaces or bloated extra features until you have to spend days digging through the morass before you find the functionality you were looking for.
In fact, this applies to the OS as a whole - good core, crappy shell.
"...So I hung back and lurked. For 18 months. Can't beat a good old-fashioned lurking."
InnoDB does not appear to be ACID-compliant, at least according to the ANSI standard. See:o r-handling.html
http://dev.mysql.com/doc/refman/5.1/en/innodb-err
See the 2nd bullet point. Silently accepting some statements in a transaction while discarding others is a pretty bad idea.
That's not remotely the same as saying that reliability trumps everything. For example, consider an application that uses a database as a read-only, or nearly read-only, working store, but requires very high performance. In that case, reliability just has to be reasonably good, performance is the driving requirement. Or consider a low-end home photo management tool: it's annoying if the database gets lost or corrupted, but since it only stores metadata, not the photos themselves, it's not a huge loss. The driving requirement there is simplicity of installation and configuration. I could easily come up with a half-dozen more scenarios where once a basic level of reliability is achieved, ease of use, performance, size, developer-friendliness, or even popularity would be more important than adding another nine -- or even two or three of them -- to reliability.
As I said, it depends on the database.
Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
Oracle 9i and newer supports row compression. Apart from the obvious space usage benefits, there can, under appropriate circumstances be benefits in performance as well. This tends to be in applications that compression is most needed such as large datawarehouses where queries can have a huge overhead in i/o and memory usage as the blocks remain in compressed form even when in memory. The downside is the potential increase in processing required but as large queries tend to be disk bound, it is often the case that the increase in CPU usage is in no way detrimental.
In other words, the first priority is reliability. After that, other considerations come into play. See? We agree.
Dewey, what part of this looks like authorities should be involved?
Whether OSTG uses a "customized version of MySQL" has absolutely nothing to do with what happened with the threaded comment posting being disabled a few months ago. That was a schema issue, plain and simple.
I've never heard they used *anything* custom in MySQL. InnoDB, MyISAM, replication and multiple masters on some beefy hardware is the general consensus. For caching, it's mainly memcached, just like a lot of the bigger sites.
http://slashdot.org/~tf23/journal
In other words, the first priority is reliability. After that, other considerations come into play. See? We agree.
Perhaps where we disagree is the meaning of "trumps". I interpreted your comment to mean that until a very high level of reliability was achieved, nothing else mattered. I think a much lower level of reliability -- the level provided by MyISAM, or perhaps a bit less -- is perfectly adequate for plenty of applications.
Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
That's not remotely the same as saying that reliability trumps everything. For example, consider an application that uses a database as a read-only, or nearly read-only, working store, but requires very high performance.
Then you should probably not be looking at an RDBMS. Rational databases are about reliability (ACID is generally assumed) and relationships; not performance. Your criteria is about performance whereby, the dataset can provide for optimal performance outside of relational trappings.
Remember, SQL is not an end-all, be-all of databases.
The phrase "very high level" is not quantitive. Besides, what good is it to deliver lots of bad data fast? And if you're strictly delivering read-only data, then a data cache or some other representation of your data should provide much faster data access then the layers associated with a SQL engine. In other words, given your criteria, there probably exists a far better solution than MySQL or PostgreSQL; and the solution wont include SQL in it.
"Linking usually constitutes a derived work."
Yes but linking in the code sense not in the "I am using this library" sense. If I write a driver and I link to the mysql libraries then I have to release my code under the GPL.
If you are not writing drivers you are fine.
Again this is old FUD, you need to find better FUD.
evil is as evil does
Yes but it means that the database snobs should stop bitching about MySQL. Although they still will, its just that now they will look like bigger asses than before.
As a long-time fan of object databases and the blistering performance they can deliver (often as much as three orders of magnitude faster than RDBMSs), I agree with you. However, there's another constraint, and that is that RDBMSs and SQL is what developers know.
In addition, for many applications the data integrity requirement depends on the nature of the data stored, rather than the nature of the application. Two installations of the same app can have radically different performance and reliability requirements. The nice thing about SQL -- particularly if it's DB-independent SQL -- is that you can choose a storage engine that is appropriate to the requirements.
Of course, a decent OODBMS would be both reliable and hugely faster than any RDMBS, but now we're getting back to "what the developers know".
Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
:-)
Besides, what good is it to deliver lots of bad data fast?Obviously none. But the only scenario where that's likely is a non-ACID database, with insufficient locking and competing writes resulting writers overwriting one another's commits, which may even break referential integrity. If you have an application with that access pattern, you obviously must use a better DB. Many apps, however, are read-mostly, write-rarely, so heavy-handed locking is okay, and even ignoring the issue of consistency altogether is occasionally fine.
Heck, even with "real" databases, it's not uncommon to turn the transaction isolation level down, because the app developers know that full ACID consistency isn't required. I'm working on a project right now that uses a big DB/2 cluster as its storage engine, and we set ANSI SQL isolation level 1 (READ COMMITTED). An analysis of transaction patterns showed that we really could use level 0.
And if you're strictly delivering read-only data, then a data cache or some other representation of your data should provide much faster data access then the layers associated with a SQL engine. In other words, given your criteria, there probably exists a far better solution than MySQL or PostgreSQL; and the solution wont include SQL in it.See my other response.
Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
ask Jamie McCarthy over at slashcode.org. Slashdot is 6 servers: 4 of them serve cached content, 1 serves dynamic (perl) content, 1 runs mysql (a custom version). When a logged in user reads a story, front page, etc, the db is queried for the user account and preferences, everything else comes from cached html files. An AC reading doesn't even require a database hit.
Do you even lift?
These aren't the 'roids you're looking for.
The problem is that nobody have ever made a "decent OODBMS".
I have at least not been able to find any, but if you know of any, i would be really interested in our expierence with it.
Martin
It depends, of course, on what your requirements are for a "decent" OODBMS. For C++, I think ObjectStore rocks.
Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
Thanks for the link to ObjectStore. It really seems like they have updated and improved their website and product since I last visited them a year ago :}
One more thing. Do you have any idear about the price(Or just price range(1000$ 10000$ ??) for ObjectStore? All I could find on their website was a "call us for pricing", and it is my expierence that when a companies don't dare to offer a price on their website it's because their product is extreamly expensive.
Well I think I will just write to them and find out.
a transactional storage engine that is not controlled by a hostile company
Which brings the total to at least 3 such so far: SolidDB, PBXT in addition to the well-known Falcon project.
If Oracle intended to weaken MySQL's market position by the acquisitions of Sleepycat and Innobase, they seriously miscalculated.
you had me at #!
This is certainly not FUD. Please stop simply labelling what you disagree with as FUD and instead find something to back up what you are saying.
How is it so clear to you that "writing drivers" constitutes creating a derived work, but using mysql via jdbc does not? I'm not saying you are wrong, I'm just saying it is legally unclear and nothing you have presented suggests otherwise. If you have examples from case law to cite, please go ahead.
-- MartinG To mail me: echo kewyjlcxyzvjfxbqwh | tr bcefhjklqvwxyz
Here let me try and make it clear for you.
If you are not writing code you are not violating the GPL. Got that? Wring SQL statements doesn't count.
If you are writing code and you are using a library you did not write then you are not violating the GPL. If there is a GPL violation it was done by the guy who wrote the library. SO if you are using JDBC, ruby, python, php, ocaml, haskell, rebol, or brainfuck to write your program that access your mysql server you are safe. Also if you are not distributing your application then you are safe no matter what.
If you are writing a library to access mysql and you do not use the mysql libs then you are not violating the GPL.
If you are writing a library to access mysql and you are using the mysql libs and you are not distributing those libraries to anybody outside of your company then you are not violating the GPL.
Got that? Now stop spreading FUD. Read the fucking GPL and understand it, it's not that fucking complicated. In order to violate the GPL you have to write code derived form GPLed code AND you have to distribute it.
evil is as evil does
I used ObjectStore in the mid to late 90s and I thought it was a great product then. It was moderately expensive then, though cheaper than the high-end RDBMSs. I would suppose it would be about the same now.
It's so easy to work with, I think they really should GPL their PSE version -- dual-license it like Trolltech, Sleepycat, etc. I think it would help them get some traction among developers which would help them sell both PSE and their higher-end products. But, ObjectStore has always seemed to be great technology saddled with lousy management, so I'm not surprised they don't see the opportunity.
Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
Sigh. You are as wrong as you are ignorant. MySQL converted their client-library licenses to GPLv2 (from LGPL) several years ago. Both the GNU zealots and MySQL believe that any applications which use libraries licensed under the GPL are derivative and thus must be wholly licensed under the GPL. This is not an academic point either as this dual-licensing model is where MySQL has derived most of their revenues from (which, depends, on the GPL's force over client libraries).
From the GPL v2:
"If your program is a subroutine library, you may consider it more useful to permit linking proprietary applications with the library. If this is what you want to do, use the GNU Lesser General Public License instead of this License"
From MySQL:
Previously, the MySQL client libraries were licensed under the LGPL (the Lesser General Public License) and now they use the GPL (the General Public License). What prompted this change?
MySQL's goal is to provide all its software under a free software/open source license. The change from the LGPL to the GPL for the client libraries was made in 2001 during the development of MySQL 4.0 to help MySQL AB more easily differentiate between a proprietary user who should buy a commercial license and a free software user who should use the GPL license. Previously there were people that were misusing the GPL by distributing the MySQL server tightly coupled with their applications and claiming that the GPL doesn't affect them because the client libraries were free to use.
This change has allowed MySQL to support its dual licensing model by better identifying when someone is using MySQL software in a closed source fashion without commitment to the open source philosophy. While MySQL supports the open source ideals, we also believe in the notion of "Quid Pro Quo" or fair exchange. For developers building open source applications using MySQL, the change in the client licensing policy has no effect.
Based on feedback from our users, MySQL has introduced an exception that makes it possible to combine the MySQL client libraries with software that uses various Free and Open Source software ("FOSS") licenses. This is known as the FOSS License Exception.
Also from MySQL:
"If you include one or more of the MySQL drivers in your non-GPL application (so that your application can run with MySQL), you need a commercial license for the driver(s) in question. The MySQL drivers currently include an ODBC driver, a JDBC driver and the C language library."
"If you develop and distribute a commercial application and as part of utilizing your application, the end-user must download a copy of MySQL; for each derivative work, you (or, in some cases, your end-user) need a commercial license for the MySQL server and/or MySQL client libraries."
Wrong. The GPL affords no such protections. PHP and other open source projects were at risk when they made the change. MySQL had to create a special special exception for these FLOSS programs. A MySQL client-library derivative work (e.g., PHP) is OK as long as the source code is released (under one of the enumerated FLOSS licenses or GPL). MySQL, however, still claims that if you write an application on top of one of these other FLOSS programs, e.g., PHP, and is considered derivative, i.e., connects to MySQL with their libraries, then you must release your source code with your distribution. In other