Ask Database Guru Brian Aker
Brian Aker is Director of Architecture for MySQL AB. He has also worked on the code (and database) that runs Slashdot, and is well-known in both Apache and Perl circles. Outside of the arcane world of open source "back-end" programming, though, hardly anyone has heard of him. This is your chance to ask Brian (hopefully after looking at his blog and Wikipedia listing) about anything you like, from Perl to database architecture to open source philosophy to upcoming events in Seattle. We'll send Brian 10 of the highest-moderated questions approximately 24 hours after this post appears. His (verbatim) answers will appear late this week or early next week.
What is your take on Google's enhancements? Do you fear that Google's upcoming code donation is going to make them a large stakeholder in the future of MySQL? What impacts (if any) do you think this will have--especially on your decisions as a director?
My work here is dung.
My work here is dung.
Do you see a future for object databases? They are quite nice for some applications, but there are no real industrial-grade OpenSource implementations.
Also, why do you select the nickname "Krow" for both Slashdot & your blog?
My work here is dung.
Do you ever get out to Archie McPhee's?
:-)
I've only been to their store once, but I mail order stuff all the time. Great cubicle toys. The best rubber chickens money can buy outside an art museum
Infuriate left and right
From your perspective, what's the biggest public misconception about MySQL and what (if anything) is MySQL AB doing to correct it?
There are no karma whores, only moderation johns
What do you see as being the biggest inhibiting factor in the world of databases? If you could snap your fingers and have one piece of software/hardware changed or removed or improved today, what would it be?
My work here is dung.
When can we get replication that isn't brutally difficult to setup and maintain.
Example, slaves that can be started blank and copy all necessary files without needing to have a copy of the databases already.
MABASPLOOM!
What were you largest challenges or features that gave you the most greif when implementing them in MySQL?
Im a gamer, not a grammer major. This post is full of spelling and grammer mistakes.
What is it about MySQL that caused you to create so much for it and to become a director for it? You're obviously very talented, why didn't you pick some other open source database (I don't mean to incite a religious war here)? Was it political? Was it because of friendships or really for technological reasons?
My work here is dung.
What was your bigest ah ha moment? Did you ever have a solution to a problem that got you excited? Or had you drive in on your off hours/day off just to see if it would work? Or called someone in the middle of the night about?
Im a gamer, not a grammer major. This post is full of spelling and grammer mistakes.
Are you ever tempted to throw away your open source development position for a possibly much higher paid closed source development position? If so, how do you cope with living in a capitalistic society and not taking advantage of (or even seeking for that matter) higher paying opportunities?
I mean, come on, you can't tell me you've never been offered more money (although probably a lower position) at another company. What 'keeps' you at MySQL AB?
My work here is dung.
Is it valid behavior to silently tru
Norris/Palin 2012
Fact: We deserve leaders who can kick your ass and field dress your carcass.
I'd really like to know your response to the post MySQL Quality of old and new features.
Are there any plans to enhance the authentication capabilities of the server to match what's available with Postgres/Oracle?
One thing I've always wished for is the ability to authenticate user/service access to databases and tables via other backends (pam_krb5, SASL, ldap, etc). This ability (imho) would result in wider adoption in certain IT shops where compliance officers and checklist junkies would opt to instead shell out for Oracle Advanced Security or something similar.
-s
One thing that is puzzling is that an obvious thing like choosing the directory in which to place the database is made so complicated, having to go into configuration files, setting permissions etc. so that one can choose where to put the database seems overly complicated to most people who are perhaps not so technically adept. How would you rate the possibilities of MySQL being put into broader useage, if you made it easier to choose save directories and possibly provided a simple webbased gui as part of the installation package. Thanks for a nice product though!
MS, ALS, Aphasia ? http://globability.org - Me http://einarpetersen.com
So, let me get this straight: you (mySQL) use a dolphin to fetch data while PostgreSQL uses an elephant to fetch data. Would that explain why PostgreSQL is better at fetching large datasets? Like, the elephant can haul more, but is slower while the dolphin is faster, but can't carry as well? Have you thought about using a non-animal to fetch your data? Maybe a racecar? Those are fast and could probably haul as much as an elephant. Plus, then I wouldn't need to have fish or peanuts in my server room.
Why are so many databases still tabular today and not hierarchal? I'm thinking of how Cache (or M / Mumps) is a database which basically lets you store associative arrays of data nested at any level, and access them in the normal programmatic way which you would access any other array.
SELECT name FROM table WHERE key='xyz'; seems so archaic and limited compared to something PHP-like such as $table['xyz']['name'] where $table is a disk-based global variable accessible from all scripts and automatically synchronized at all times.
Morphing Software
I like ponies. Do u like ponies? Ponies are liek OMG awesome!
Well, I will ask something not related to databases, or even computing at all. From the "About Me" section in your blog, I can see you named your lovely cats Nausicaä and Kiki. I will assume that you are indeed a fan of Japanese Animation, especially the works of Hayao Myazaki. So I ask:
- What's your favorite Myazaki movie to date?
- What's the reason for this choice?
- And finally: I also love Myazaki's movies, but lately I have been struggling to find anime that I really enjoy.
They all seem repetitive and cliché. Would you dare recommend one? =)
Thank you a lot.
My best wishes for you and your cats!
If I clone myself, can I call it a thread?
If a girl winks to us, can I call it a race condition?
What one book or example would you recommend to someone who wanted to learn LAMP?
DMCA, Hollings, Palladium. What might have sounded like paranoia is now common sense.
How come you haven't hired Joe Celko yet?
-C
Why is MySQL still not standards compliant?
The success of MySQL in the light of databases that have always been architecturally better and better-featured, from the free PostgreSQL right up to Oracle, is more than anything a reminder that we live in the "good enough" world.
A successful database product, it seems, doesn't have to be an engineering masterpiece: it doesn't have to be architected for grand scaling, and it never seemed to bother people how late to the game MySQL was with transaction reliability; features which advance the state of the art, such as object-relational systems, are eschewed in favour of the simple and familiar. In the gold rush, MySQL AB is, as, it were, selling the Mark I shovel; don't get me wrong, it's a fine shovel, but it's still just a shovel.
With all this in mind, and seeing a similar trend across the open source landscape, do you not feel a duty to take the F/OSS community away from its reputation as feeding this mediocre requirement? ESR's bazaar has become a great facsimile machine.
There are several behaviors in MySQL that are quite key to the core of the system, and which are still not being revisited. For example:
These affect performance of almost any large system. However, even with the new storage back-end, some of these fundamental "characteristics" aren't being revisited, in particular the universally derided failure to support foreign key constraint checking.
Why is MySQL so fundamentally incapable of revisiting decisions that have proven to be incorrect over time? I mean, Monty may have had reasons for making his database incapable of supporting key relational database constructs. But why are you incapable of accepting that for MySQL to have much wider scale adoption you have to remove the Monty Personality Constructs from the core limitations of the system?
For those of you who are interested, the key part here is that systems like InnoDB (which are able to process FK constraints internally) aren't able to do so efficiently, because the core locks them into row-at-a-time constraint processing, meaning that large inserts into fact tables with small dimension tables are painfully slow if you don't turn of FK constraint checking)
Brian, are you that same guru guy that does the yellowbook.com commercials?
In engineering the quality of the information (telemetry, simulation models, engineering reports, etc) in databases always will degrade over time as the database exponentially grows. Sooner or later changes in personnel will cause data logging procedures to be ignored, errors or missing sensors not to be corrected, users will stop placing extensive comments connected to the data, bad data will simply be dumped and so forth.
To your knowledge, which development are currently ongoing to make a database automatically keep the quality of the information high? Place a "forgetting" algorithm which cancels out old data, automatic metrics for the quality of the information, recognize and resolve ambiguities in the information, or other similar potential solutions?
Any intentions of implementing any of C.J. Date's Third Manifesto proposals for implementing the new generation of relational databases? If not, why?
The bitter lessons of a veteran coder: http://bitterprogrammer.blogspot.com
Brian:
;)
A thought-experiment question I've been meaning to ask you for a while, keep forgetting, and Hey, look, an interview
Are you truly in favor of secession of the Northwestern states into an independent country, and if so, could you shed some light on you'd favor in the result?
I don't mean complete with a war (why this is just a thought experiment), but I wonder if you could outline how you anticipate such a country would look and operate. Keep Washington / Oregon / etc as separate entities with unique approaches to questions like income taxes, gambling, etc? Do you think some form of (awfully named) "universal health care system" would emerge? What geographic area would you draw the red pen around on the map? How much would you charge Californians to visit? In what aspect of the resulting economy would you welcome a free market, and in which ones would you not?
Cheers!
timothy
jrnl: http://tinyurl.com/c2l8yr / foes: http://tinyurl.com/ckjno5
Hi, Brian -
Are relational database the end-all, or do we have other promising database models to look out for? OO-databases apparently went the way of the Dodo, but what else is out there that you find interesting?
Assorted stuff I do sometimes: Lemuria.org
How much do you know about PostgreSQL and what do you think about it?
Think-o i there: Should read "Are you truly in favor of secession of the Northwestern states into an independent country, and if so, could you shed some light on what you'd favor in the result?
jrnl: http://tinyurl.com/c2l8yr / foes: http://tinyurl.com/ckjno5
Do you think there is a better way of handling multithreading than mutexes and semaphores? If so, what would set of primitives do you think would best suit the task?
Need a Python, C++, Unix, Linux develop
One of the things I've always felt is most lacking for Open Source databases is good client-side GUI tools to do ad-hoc queries, look at the database structure, assist in copying/importing/exporting data, etc. Microsoft has a pretty good tool in this area with either the SQL Enterprise Manager/Query Analyzer combo or SQL Server Management Studio.
MySQL on the other hand has a poorly-implemented, not-well-supported equivalent to Query Analyzer called MySQL Query Browser: http://www.mysql.com/products/tools/query-browser/ It's slow, crashes often (especially when called upon to list large datasets), and has tons and tons of usability flaws (the most glaring being the impossibility to select/copy the dataset to any other applications, and the failure to support standard keyboard shortcuts like Control/Command-A to Select All.)
It seems to me that MySQL Query Browser is treated, at best, as a second-class citizen in the MySQL world. Is that the case? If so, is it simply due to a lack of qualified developers for it, or is it part of a larger strategy to keep more resources working on the back-end?
In my opinion, MySQL could benefit greatly from having some really great (or at least passable!) client-side tools.
Comment of the year
You do know that PostgreSQL is much better, don't you?
What would you suggest as a good first "learner" program for Database driven development? For Perl development?
It strikes me that a lot of the work that has gone into optimizing databases in the past (especially storage, index structure, and buffer management) works on the assumption that the database will be stored to a hard drive, which is way slower than main memory and has variable access times depending on the relative position of the disk head and the required data.
Nowadays there's a lot of large solid-state flash drives coming out, and as time goes on we can expect to see sizes increase and costs decrease. For small-to-medium size databases, it's now reasonable to keep the whole database on, say, one of the new 64GB memory drives. The use of drives which have a fixed, cheap cost for random access seems to invalidate and simplify a lot of the assumptions made about conventional back-end data storage, and opens up possibilities for new types of optimization (binary tree indexes instead of B+ trees, more use of indirection).
I'm wondering what you see as some of the biggest opportunities for memory-resident databases, and what work is going into MySQL to take advantage of the changes that are happening in storage.
Anonymous Luddite: "What do you think of the dehumanizing effects of the Internet?"
Andy Grove: "Not Much."
What's your opinion of the developments that companies like Vertica are making with columnar databases? They tout improved performance and better data compression as two benefits. Will we see similar developments in MySQL? New products?
Do feeble and frankly unfunny bread|biscuit|cake puns really get on your nerves?
Confucius say, "Find worm in apple - bad. Find half a worm - worse."
(*) This problem manifests as many inappropriately moderated (and unmoderated) posts in each slashdot discussion, making it absolutely impossible to read any discussion unless your threshold is set to -1 or you are willing to put up with a disjointed set of posts, apparently unrelated to each other (and sometimes the subject at hand), while also knowing with absolute certainly you're missing many good, on-topic or otherwise high quality posts. One presumes that at this late date, with absolutely no attention from those people running slashdot, that the problem must be something they don't understand about the site (such as the complex and very difficult to understand idea that working moderation would help the userbase), and so devolves upon a technical person.
I've fallen off your lawn, and I can't get up.
and I've been a MySQL DBA for nearly a decade
Next lame question to waste Brian's time...
What's your opinion of SQLite, PostgreSQL, and Firebird? What do you like about them? What don't you like about them?
Do you even lift?
These aren't the 'roids you're looking for.
Open Source databases like PostgreSQL?
Free (beer) "Express" editions of commercial databases?
Hypothetical MySQL spinoffs no longer controlled by your company?
I would really like to see the answer to this one...
Online Starcraft RPG? At
Dietary fiber is like asynchronous IO-- Non-blocking!
In addition, what current work is being done on Multi-Master replication? Something more reliable/recoverable than a circle/star with the auto_increment_offset?
Large corporation bashing aside, Active Directory and Oracle clusters do this very nicely, with low replication traffic volume. Is there any chance this feature will be added and improved upon in Mysql?
What choices have you made provisioning hardware for production MySql systems? Specifically, what do you rely on for storage; raid levels, hardware vs. software, etc. What, if any, tools do you use for performance analysis of production systems beyond the usual *nix monitoring programs (sysstat etc.)?
Lurking at the bottom of the gravity well, getting old
A common problem in the enterprise world is data auditing. By auditing I mean keeping a record of data as it changes.
I've implemented various solutions, usually consisting of copying the row into another table, slapping some sort of version id on it, and the updating the row with new values.
When a new column is added to a table you typically have to add one to your auditing table. Then there is the concept of who made the change. I've implemented a solution using triggers and generic table to hold the changes, but it wasn't as fast as I'd like. Even with triggers I typically regenerated the code when a new column was added. I did not analyze database meta-data during trigger execution.
This can be useful from a governance standpoint all the way down to a debugging standpoint. I've been able to concretely identify how data corruption occurred more times then I can count with these systems.
Have you ever thought of baking this into the database server. Perhaps having MySQL automatically create an auditing table and update it as records are updated. Deletes could be handled the same way, by simple marking the last row in the audit table as such.
Any thoughts?
I'm wondering if you've considered the idea of "dynamic" RDBMS for rapid prototyping or rapid app development?
Column types would be optional, and new columns and tables could be added willy-nilly by a regular INSERT or UPDATE statement. There would be no (default) distinction between non-existent columns/tables and empty columns/tables.
One could later and incrementally add validation to enforce types or existence to selected tables or columns. In other words, the "lock-down" could be gradual as the project grows more formal over time.
We have dynamic/scriptish languages, so I think it makes sense to also have dynamic databases in our arsenal of tools. Or, at least as a cool experiment.
Thank You
Table-ized A.I.
Will MySQL consider positioning itself for Document Databases too (http://couchdb.org/) (Damien Katz works for MySQL: http://damienkatz.net/2004/12/about-me.html).
I would like to see MySQL provide a set of "best practice" data models for common types of problems including triggers, stored procs (or even clusters and replication) so we can all benefit from your expertise.
For example:
What is the best way to set up your data model to deal with recurring schedules? How do you store the schedule for something that needs to be performed every other tuesday after 10:00 AM but before 5:00 PM at location X. How to you generate the individual events. How do you deal with somebody trying to view what their schedule would look like in 2020? How do you rescedule next tuesdays event to wednesday but still keep the scedule active.
Thanks.
evil is as evil does
You can do some nifty calculations with mysql but as far as how would somthing look in 2020? just craft the table and pass the right date range query to it.. please do not bother this man with such quesitons..
While the origin of databases was in holding massive amounts of spreadsheet style buisness data that is only one way that DBMSs are used today. For instance their is an increasing use of DBMSs as data stores for web applications or even desktop applications (Core Data and sqlite). Also some people are advocating column oriented DBMSs for data warehousing applications while unstructed databases like couchDB. Do you think that the standard relational model will continue to dominate the database field or will we see a splintering as some people have suggested with specialized databases developed for different applications? If so what sort of role do you see MySQL evolving to occupy?
I'm particularly interested in whether you think we will see a return to Object Oriented Databases for use as a model store for applications possibly supplemented by some kind of language integration like microsofts LINQ.
If you liked this thought maybe you would find my blog nice too:
Hello Brian,
do you believe the query engine of MySQL could be rebuilt to become a tuple flow system (or any other mechanism) which would be able to scale single-query perfomance on a multi-CPU (core) system?
Thanks,
Oliver
There seems to be a trend toward other areas of IT implementing concepts and ideas that used to be primarily to province of databases. For instance transactional memory has proved to be a powerful model for concurrency in software languages while journaling and copy on write filesystems have been implementing many ACID style guarantees using methods borrowed from databases. Also operating systems seem to be evolving more and more APIs to handle structured data storage (Core Data) as well as to efficently store and search metadata (spotlight, windows search). OS X even stores a great deal of it's configuration data in a centralized Netinfo database and even the windows registry is a sort of database (arguably it's flaws stem from the choice not to implement a full database).
Do you think these trends will continue and we will see even greater integration of DBMS technology into the operating system itself? Will (should?) we one day regard database like structured data storage as an indispensable OS level feature like a filesystem? If so do you worry that Apple, MS and Linux will make MySQL irrelevant by integrating this functionality into the OS? Do you think that something like SQL will remain the primary interface to this sort of structured storage or will programming languages implement an integrated native syntax for both transactional memory and database access?
If you liked this thought maybe you would find my blog nice too:
What advice would you give to students coming out of college who are getting into database related areas?
I'm convinced that transactional memory (TM) is the right model for concurrent programming in most common situations but software transactional memory (STM) suffers from some performance problems. There has been a great deal of research on hardware support for transactional memory and recently Sun announced hardware support for hybrid transactional memory in Rock. Do you think hardware support for TM will catch on and migrate down to commodity hardware? If so would this translate into significant performance improvements for databases like MySQL? If not do you think specific hardware optimizations for DB applications will ever make sense in the mainstream market?
If you liked this thought maybe you would find my blog nice too:
For the most part, I've moved on to Postgres (although master-master isn't that much better over there). Anyway, this is just another vote for this to be one of the questions asked as I think that is a very important area that is only now starting to be somewhat given the attention that it deserves.
When I have a kid, I want to put him in one of those strollers for twins and then run around the mall looking frantic.
I have been watching PGCluster and Postgres-R as well. It shouldn't be too long before there is an open and reliable alternative to the "big features" of the commercial databases.
MySQL has generally gotten a lot of flak for not being a "real" RDBMS. I.e. for a long time it didn't have transactions, the ability to really enforce data consistancy and the like. While many of these are being addressed they look to many of us who do RDBMS-backed buisness apps like they are only partial solutions (strict mode can be turned off by any client application, if we distribute software we have no way of knowing that transactions really are enabled on the server, etc).
At the same time, MySQL has traditionally excelled for things like light-weight web content management and the like. It is almost as if MySQL was designed originally for this sort of task as its core market.
What were the original design goals for MySQL? Has MySQL outgrown them and moving on to become something else?
LedgerSMB: Open source Accounting/ERP
Why doesn't your Wikipedia entry list your time at the Cobalt Group? And what did you really think of Raja and his IBM boys?
If I understand correctly, InnoDB, Oracle, and PostgreSQL storage models all use multiple row versions, so what are the tradeoffs? How do these tradeoffs explain some of the performance differences, such as concurrent performance and serial performance? Stability of performance versus erratic performance? How do they affect maintenance and performance stability over time?
InnoDB and Oracle both use rollback segments (I may be mistaken here), while postgresql uses non-overwriting storage and reclaims it later. What's your opinion of the two approaches?
Why do databases like SQL Server and DB2 still use heavy locking rather than multiple row versions (I may be mistaken here)? Is that an antiquated design, or does it still have potential?
Where can I find more detail about this information?
Social scientists are inspired by theories; scientists are humbled by facts.
Document databases (or textual databases, for that matter) simply aren't a good fit with the relational model, so you're unlikely to see them incorporated with MySQL any time soon.
CouchDB is interesting, but I'd steer clear of it for the simple reason that it doesn't have any support for any relevant standards (e.g. Z39.50, MARC, XQuery). Vendor lock-in is bad, even when it's open source.
sub f{($f)=@_;print"$f(q{$f});";}f(q{sub f{($f)=@_;print"$f(q{$f});";}f});
How do you see the development of native XML databases? What would you have to say about the advantages and disadvantages of a native XML database model against the relational DB model?
I've also been watching PGCluster. It currently looks the most promising for what I am wanting. Even though it does appear to be pretty stable the versions have been incompatible with each other (to some extent). So, I'm probably going to wait a little bit and let it mature before I put it into production. This is an interesting area. On one hand, I would like the Postgres team to have something "official" for replication. On the other, there are so many different types of needs that it would be difficult to handle them all. Also, their time may be better spent working on the core feature (although replication is starting to become a core feature). I don't mind third party applications, but its also difficult to trust your entire operation to a (potentially) unsupported or stagnant project.
When I have a kid, I want to put him in one of those strollers for twins and then run around the mall looking frantic.
-J
(Bug submitted Feb 2005)
Can't be so hard. About every other database does it.
True enough.
It's the way you interact with it that's critical here. Think about how you work with your favourite SQL DBMS and with Google, and you'll see that they're quite different at a very basic leve.
The fundamental result data type in a relational DBMS is the stream of tuples, and tuples contain real data. In other words, querying (i.e. finding what matches some criteria) is essentially the same as presentation (i.e. getting real data out).
The fundamental result data type in a document-based DBMS is the sequence of document numbers. Document numbers do not contain real data. You perform a query, and get a sequence of document numbers. Then you let the user refine the sequence. Maybe you present some metadata, or KWIC information. Maybe you sort on a field. Maybe you add more constraints. Eventually, you get to the point where you present real data.
Yes, you can do this in principle with relational databases, but it costs. It's not uncommon for a relational model for a document database to use 5-10x the disk space of a native document database engine.
SQL is a poor fit for textual data, too. Partly because support for textual querying operators (e.g. phrase or adjacency queries, query-based ranking) is poor, and partly because SQL has poor support for managing result sets on the server.
sub f{($f)=@_;print"$f(q{$f});";}f(q{sub f{($f)=@_;print"$f(q{$f});";}f});
As an addendum, when can we have distributed databases that aren't brutally difficult to setup, have good performance, and that don't cost a fortune?
Sequences in Oracle, Postgres and other databases allow for a wide range of interesting effects. Knowing your primary key before you insert into the database, shared unique identifiers between more than one table, creating circular not-null references around groups of objects before you insert into the database, etc.
After all this time, why does MySQL not have sequence support?
1) TotalRekall (Python-based)
2) PgAccess (a TCL front-end and form builder for PostgreSQL)
3) Once:Radix (a web-based front-end builder for PostgreSQL).
OnceRadix is quite new and I think it is well thought out in a lot of ways.
LedgerSMB: Open source Accounting/ERP
MySQL 5 introduced stored procedures and triggers. However, their language, SQL, is very limited to say the least. Aside from its obvious shortcomings (from no FOR loop to not being able to raise errors from triggers; from no recursive functions to very limited cursors), there's the fact you're going to handle tables and tuples, and this language cannot handle either as first-class types. On top of that, you're reinventing the wheel by writing a new programming language.
Is MySQL going to add support for procedures written in more powerful, productive, flexible languages? Python comes to my mind as the obvious choice, but Lua, Guile (Scheme) and others are also powerful and easy to embed.
PostgreSQL, your open-source competition, supports this.
I was about to say 13256278887989457651018865901401704640, but it appears this number is private property.
The biggest problem I've encountered in testing the INNODB engine is that it doesn't return deleted data space back to the system. I once ran a test taking up 40 gigs, and when the tables were deleted, the space was not returned to the system. After researching this, it's apparently a fatal bug in that it will perpetually keep getting larger. So when will this be fixed? I like the engine, but I simply can't use it on a large scale if this problem persists.
I'll second that. I've yet to try SQL Developer with MySQL but I think I saw it being listed in an update a while ago.
I really like it as a tool. The one issue I have is my company only uses TOAD and keeps shelling out money they don't need to. For any TOAD users out there, I'm not saying it's not worth the money, but most of what my co-workers do can be done in SQL Developer for free.
One of the best things about SQL Developer in my opinion is the simple lightweight and clean interface. I'd say my one complaint is not being able to determine the size of the dataset it returns on queries. Seems to max out at 50, then slowly load in groups of 100 or so.
I have to second that. Sequences see like such a simple but useful feature. Why haven't sequences been added when more complex things like views and stored procedures are part of Mysql?
"If they have both, tell them we use Linux. And if they have that, tell them the computers are down." -Dave Chapelle
Do you think Native XML Databases (such as eXist Open, Software AG Tamino, etc...) could replace relational databases in the future? Will the MySQL team make any attempt at delivering one?
I would just add that textual searching in many RDBMS's is possible. PostgreSQL has good full text searching capabilities, and query ranking capabilities. But you are right-- basic SQL does not handle the sorts of information required, so extensions are required speciically for this purpose.
Note that XML doesn't provide a perfect fit for this either. What one really needs is a good full text querying standard (which doesn't exist).
LedgerSMB: Open source Accounting/ERP
My question is:
Why database types are so different from programming language types?
Lots of CPU processing power is spent in translating from/to database types. Do you think that the types of SQL are outdated? Do you think that the industry should move towards something like DB4O (www.db4o.com) where objects are written in directly to/read directly from the database?
Databases will certainly be faster and simpler if they adopted the data types and principles of modern programming languages.
PHP started as a templating engine built on Perl. It took on some of the stranger and some of the nicer habits of Perl and eventually developed into the main server-side-included language for web applications with it's own distinct charater. I find myself using PHP for more and more projects and tasks (cli scripting included) and keep wondering if PHP is a dead end or if PHP is basically a natural progesssion into the web-aera and has become so popular for mostly the right reasons.
Since you are a member of the MySQL Core team (the database pratically married to PHP) and a Perl Guru (PHPs anchestor) I'd like to know what you think of PHP and it's breathtaking success as the 'web-generations basic' and the road it's come along since PHP 3. Do you think that the nonchalance with which PHP treats programming newcomers, the one or other strange web programming habit it allows/fosters and the large set of default web-centric functions is a fair trade for the downsides PHP is said to have? Or do you think PHP should go the way of the dodo in favour of Perl or other languages?
We suffer more in our imagination than in reality. - Seneca
So they come across an RDBMS, which is a completely different paradigm for data storage/integrity enforcing/query. It sure as hell isn't OO, and it doesn't fit 1-1 with their code written in the One True Proramming Style, and therefore is hacky and outdated.
Le sigh.
A while ago we read some stuff here on /. about column stores being the "new thing" (here). Some of the ideas seem reasonable, so my question is: Is this a simple case of using a different storage engine? Does MySQL already have one, or is one in the works? Or is this simply more neat than useful? (Oh, and is it a case of - "even if useful, not for the average user"?)
Yes, I am a biological organism. All rumors to the contrary are just that, rumors.
Many people consider it a normal thing, that whenever the persistance layer is accessed one must use a special language, a strangely arcane and historically grown thing called SQL. Since you are proficient in Perl, a language notably more powerfull than SQL (or most other PLs) in the handling, retrieval and filtering of data, I wonder if you've ever felt annoyed by this circumstance. Do you think it would make sense to attempt to remove SQL from the stack or at least make it an optional layer? How feasible is it to attempt to connect a storage eninge directly to a high-level programming level like, f.i., perl and start ridding humanity for this plague called SQL?
Since you know both ends very well, a high level PL and details about storage I'd like to hear your opinion on this.
We suffer more in our imagination than in reality. - Seneca
I said "nt" dammit!
Do you think that is time for Complete Rewrite of RDBMS as Stonebraker says?
Thank you.
A common opinion is that Mysql has less features than a full-blown database, and therefore is faster (as long as the operations are not too complicated). This is a surprising fact for me. First, the source code of MySQL looks pretty messy. The abstraction levels between the main program and the database handlers (i.e. BerkeleyDB) are fuzzy, to say it in a friendly word. Also, and i think this gives you a real performance penalty, MySQL uses table locking for write access, and does not lock on a page level. And MySQL implements transactions on top of the handlers, which should also be slower than implementing transactions IN the handlers. On the other side, DBMS like Oracle are optimized to the last bit, but are still slower in many operations. So either MySQL has a secret performance trick which i have not yet discovered, or its performance advantages are actually a Myth. What do you think? I also understand that you (MySQL AB) is working on your own handler, which is optimized for SQL way better than i.e. a generic handler like BerkeleyDB. Does this mean that we will have to expect a major architectural change of MySQL in the near future?
I welcome the stored procedures in MySQL 5.x. I can now move the db code which is used internally and on the web sides to SP's eliminating redundancy and the associated maintenance nightmares. Bravo!
However, are there plans to implement execution plans and precompile the procs so we also get a performance boost(like with PostGreSQL, MSSQL, and Oracle) when using them?
As it is, there isn't much incentive to move to MySQL from one of the more mature enterprise RDBMSs outside of cost or learning curve. Don't get me wrong, I'm a huge MySQL fan in certain situations, but I'd be looking to something else for the heavy lifting where the applications have a lot of users that are updating lots of data _and_ needs to be transactional and pass ACID.
It's hard to beat procedures written in C on PostGreSQL, provided you have the experience to know what you are doing and make things efficient. In my own performance tests, PGSQL is king of the open source hill right now... However, so far, most employers are reluctant to implement on PGSQL because you need a much larger base of knowledge to use it effectively.
I've heard "Who will replace you if you get hit by a bus?" way too many times... I can count the number of people I personally know that understand this database in the mirror. MySQL people are everywhere.
Will MySQL fill this performance void eventually? It'd turn MySQL into a MSSQL killer... few could say no to it and it would open up a lot of market for it.
-AC
tru dat.
XML is a document representation format, and is completely orthogonal to the problem of querying databases.
Allow me to introduce you to Z39.50, though you may remember it as WAIS before it ISO-standardised. If all you want to do is query (and sort, rank, etc), it does the job.
What it doesn't support is database update, but that's coming in the next version.
sub f{($f)=@_;print"$f(q{$f});";}f(q{sub f{($f)=@_;print"$f(q{$f});";}f});
So what's a typical day for you like? ie When do you wake up? You work from home, yes? Do you eat in most of the time? (you seem to have a non-typical diet from some of your prior comments). How much of your day is spent in front of a computer screen M-F and on weekends?
Do you find working remotely through an IRC/IM Window with fellow staffers easy/difficult/tiring/??? What's the best thing about working like this, and the worst?
http://slashdot.org/~tf23/journal
Many people ask about relational fidelity and MySQL. However, my experience out in the developer world is that a lot of developers want to put most of the business rules of a domain model in the application code. For example, stored procedures, triggers, check constraints all seem to be disfavoured over implementing these things in application code, object models or frameworks. This type of focus seems to be the most important relational fidelity concern. Do you feel that is better to put the busines rules in the application code? If so for what reasons.
Shared-nothing is fantastic, but how can you realistically expect us to keep all indexes in data server memory? Indexes need to (as 5.1 did with records) be made available by secondary (disk) storage. Can you please comment?