Porting between dbms products depends primarily on two issues:
1. usage of vendor extensions
2. usage of standard relational functionality
Generally speaking, if you've minimized #1 in your application you can easily port between Oracle, DB2, SQL Server, Sybase, Postgesql, etc: sure, you could hit some issues with jdbc drivers, and may need to port a few idioms (partitioning for example), but it shouldn't be a killer. But going from any of the above list to mysql isn't suggested: you'll get hung up on #2 (it doesn't support standard SQL or DDL)
Realistically, if I wanted to go to a less expensive product than oracle I'd look down this list:
- db2 (1/3 to 1/2 oracle cost)
- sybase (cheaper than oracle, but dwindling market share)
- firebird (very low cost)
- postgresql (free) All of the above are mature relational databases that you could port oracle applications from.
But you mentioned 'mission critical'. At this point I'd be very cautious about either postgesql or mysql in a mission-critical role. How important is it to you that you can recover 100% of your data in the event of a database crash? I'd put my money (and career) on db2 or oracle delivering that kind of quality over mysql...
> I'm under the impression that most "bugs" in software (certainly most bugs in my code) aren't bugs like these in the article > (null dereferences, uninitialized variables, etc), but they're algorithm bugs.
Nah, they're bugs in requirements gathering. *Way* harder to find and fix.
In this case a few are obvious - and point to the functional gaps in this product. Once they fix them I'll look forward to using the product.
Which commercial counterparts? oracle, db2, informix? I seriously doubt it - and eWeek didn't say.
Given that a database is a core software component I'd *expect* a far lower defect density than you'd typically find in an application or especially a desktop application.
Good to hear that they're working on reducing their defects, but please - this is hardly a big deal.
> Even today, I don't care about comparing to, say, Oracle or MS SQL Server. IBM DB2 would be a better > baseline, but best of all the real competitors: PostgreSQL and Alphora Dataphor.
I think you've got your dbms' mixed-up: Oracle, Informix, and DB2 are all of comparable complexity and power: Oracle's partitioning is the simplest and its clustering the most complex. DB2 & Informix have more complex partitioning - but can scale beowulf-style to hundreds (if not thousands of separate servers).
SQL Server is less functional than the above servers, though obviously similar to Sybase (due to its heritage).
Postgresql is less functional than SQL Server - though it's a fine product anyway.
MySQL is less functional than Postgresql.
Not aware of any other database that occupies the limited transaction support / limited ANSI support niche that mysql does. MSQL perhaps?
> Database developers like to go on and on about how difficult it is, and how "non-experts" can't > possibly learn to develop good databases.
No - the problem is that the development of best practice databases requires that the designer understand more than just the syntax of insert, update, delete, and select. And when you get right down to it, most developers don't even have the syntax straight - let alone understand the beginings of relational theory.
> How come your database "*experts*" can't learn java? For the same reason that I don't expect java experts to also develop expertise in database tuning - each is a large and complex discipline.
> For people who know OOP it is a matter of days. Really? They'll understand all the commonly-used libraries? They'll understand when it is best to go straight from jsps to the database, when they should use entity beans, when to use EJBs and when to use CMP? They'll understand the strengths and weaknesses of each solution and exactly how to develop each?
these must be the 'magical unattainable' programmers - you know, the kind I've never met any on any of dozens of projects over the last four years.
No, a stripped-down version of PHP, python, etc would be far better than java.
Does the LAMP crowd write code in java? nope.
Do the DBAs write code in java? nope.
Does the java crowd use mysql? nope (if you're going to the cost of developing apps in java, it makes more sense to use postgresql if not oracle, db2, etc).
In java you can abstract the data persistence thru a variety of encasulation techniques. All of which creates such a mess than everyone's scrambling these days to figure out how to simplify it. Sounds like a mess to introduce deeper into the database.
And portability? Do you really think that java functions created for mysql will port *seamlessly* to postgresql, db2, or oracle? No way. And if they don't - why don't you simply write the functions in a tiny, simplistic language that everyone can use easily?
how many folks can you find who are *experts* good with both SQL & Java? (BTW,I don't mean that they can write simple joins, group bys and unions. I mean good enough to understand access paths and parallelism choices). Of the 100+ java developers I've worked with over the last four or so years I've only met *1* who would meet that critieria. So, exactly who's going to be making the performance-tuning decisions? Nope - bad idea, a simple tuning problem will need a committee to figure it out.
keep in mind that since we've mostly dropped the idea of writing all business logic in stored procedures, they're primarily being used these days for very simple procedures. Nothing fancy - convert an ip from a string to an integer, etc. You don't need *or want* a big language for this. Nor should you sweat too much that it is proprietary - who cares when you can learn the basics of the language in 5 minutes.
Traditional frameworks are fine - but the productivity benefits come at a cost - flexibility.
What I've found that often works far better is:
- divide system into major business-oriented (vertical) sub-systems (assemblies, whatever). Examples of these sub-sytems would include 'party', 'inventory', 'order', etc.
- if possible build these sub-systems using highly adaptable code or based upon well-conceived patterns
- glue the assembles together using a highly productive / adaptable language - python, etc.
If I end up using a framework within one of these classic sub-systems, fine. I can always chuck it out the window when we hit its limits...
I mean, come on - postgesql is a great little database with a bright future...but it isn't ready to tackle large ERP or CRM implementations yet. Or even your typical 300 gbyte warehouse or data mart if you get right down to it...
are probably comparing this system to some old ibm benchmark. They didn't say in the press release, so I'd assume the worst.
IBM appears to dominate the TPC-Hs at the top & bottom, with oracle owning it in the middle.
The only really interesting benchmark out there at the moment is the IBM DB2 ICE configuration - in which they spread db2 across dozens of low-end AMD Opteron dual-cpu servers. DB2 (and informix god bless them) partition differently than oracle - more like a database implementation of beowulf (that they've been doing for 8+ years). Way cheaper than anything from oracle, and you can toss up to 1000 servers into it. Their benchmark is in the 300 gbyte range, not 1000 - but it'll scale way beyond oracle, and is cheap for that kind of power: http://www.tpc.org/tpch/results/tpch_result_detail.asp?id=103073001
Makes me wonder how many pcs I've got laying around the house...
> Anyone who needs to use COUNT() is completely out of luck. Can't use PgSQL. This problem has not been > fixed in the latest release:
why would you be doing count(*) on *huge* tables without a where clause anyway? I mean, sure - I run queries like that all the time - in adhocs, where a scan of 100+ million rows might take 10-30 seconds.
But I can't think of the last time I've built an application and needed to do count(*):
- without a where clause
- on a huge table
- and needed fast response time.
Also note that most databases don't seem to store the min/max values for non-index columns to speed up this situation. Informix does, IIRC, but I don't think that Oracle does.
So yeah, it would be nice if this was improved, but really - I use this functionality all the time, and am completely unaffected by this issue.
> That is completely different. The UNIX file system doesn't have transaction management at all, yet > people build robust applications on top of it.
Right - but they don't build *transactional* systems on top of flat files. And when they occasionally have ACID considerations they either build two-phase commit, behave unpredictably, or reprocess/require manual intervention. Note that none of these solutions are as easy or elegant as asking a database: hey - do all of this work, or none.
> I'm also sure that there are many people like you who pick PostgreSQL because they don't understand > how to make good engineering tradeoffs and build good applications on top of databases that don't > have all the bells and whistles. Yours is a different level of ignorance, but it's ignorance > all the same.
No, the difference is that I've been around long enough to have developed dozens of major applications *before* we had transaction support - using flat files, isam, and hierarchical structure. And I was there when the entire industry shifted to robust databases - because the work required to make reliable and accurate applications before them was beyond the ability and time available to most development teams.
So - "been there, done that, ain't going back". I'll stick to modern technology, thank you.
You on the other hand, are free to repeat the mistakes of the 70s.
Might not always be the fastest...but I can't imagine developing a database application - and putting all the join code in the app - that could instead be a subselect, inline view, etc.
I'm not a huge fan of triggers - but often find circumstances in which their careful application has saved a huge number of hours.
Stored procedures are another example of a technology that when used wisely can result in a huge improvements to flexibility and adaptability. My favorite use of them is when on a fast-moving project - I can have the developers create an object model and almost immediately start developing to it. My first task is to knock out a set of stubbed stored procedure (or sometimes views) that map to that object model. Often within a week they've got a fully working application (with a fake backend) to continue testing and tweaking. Then I can take my time mapping the object model / stored procedures to a relational model - perhaps using some advanced features that would have been time-consuming to build and test. The net result is drastic reduction in development time, complete dba-control over all queries, and the ability to change the model occasionally (as performance or whatever demands) without any impact to the developers.
The concern that stored procedures aren't portable isn't completely valid either. Stored procs shoudn't be complex - simple ones can be easily ported between most databases.
Views, unions, and subselects are other sql capabilities that the mysql-ab team has frequently referred to as being only useful to 1% of the applications out there. And unlike the triggers and stored procs described above - I'd consider these core database capabilities.
> Feature-wise, I agree PostgreSQL is way more loaded.
*Way* more loaded? nah, just a little. It's the fundamentals where it shines: build-in support for transactions, ANSI-sql compliance, etc.
> So use it if you are building a banking system... Or any operational system where data is valuable. Which is just about any application I'd spend my time building...
> But Mysql is designed for speed and average use on websites, where performance is important. Data quality isn't important on websites? How about portability then? Or time-to-market? How about manageability and maintainability? Postgresql outshines mysql on all counts.
Performance is important - but it's just one of a half-dozen factors you should be looking at.
> So stop saying Mysql sucks because it doesn't have feature X or feature Y.
Deal - I won't complain that it doesn't have some of the legacy bells & whistles of the 80s that the commercial dbms are still carrying around, or some of new frills either (built-in etl, etc).
But, aside from features, it still sucks since it lacks good support of transaction and ansi sql compliance. Now, once it has those capabilities, then I'll take a serious look. But until then it isn't worth the time as long as other, better, options exist.
> Just because mySQL fails to implement many parts that standard SQL programmers consider important doesn't mean it's unreliable.
No, sweeping exceptions under the carpet and failing to provide an integrated and robust transaction management facility means that it is exceedingly difficult to build a reliable application using it.
Which really isn't much different than simply saying it's unreliable.
> The real question is: when implementing my system, how much effort and money does it take to achieve a required level of performance > and reliability on top of a given database. MySQL does a lot less than the other systems, but it may still often be a better tradeoff in > that space than PostgreSQL or Oracle. In fact, that's probably why it's so popular.
No - it's popular because it has been trivial to develop trivial apps on, because it is somewhat free, because it was on windows first. And because most of its advocates aren't familiar enough with database technology to weigh the differences between mysql and postgresql.
Frankly, it has the least portability of any popular database solution, and lacks *so* many sql capabilities that simplify application development that I wouldn't consider using it for a moment - as long as postgresql is a reasonable alternative. The only time I've had to recommend mysql in the last year is when a client needed to implement an application that only supported mysql. Fortunately, that's been rare.
The parent probably wasn't as concerned with space as much as required resources. But aside from trying to figure out how you'd feed, cloth, house, and provide drinking water to 6 billion people in Texas...would you really want to live there? I mean, really, Texas is pretty obnoxious with just 10 million, let alone 6000 million!
And back the main point (this issue is apparently pretty emotional) - what's a reasonable upper population goal for the US? If 500 million in 50 years seems reasonable to many - does 4 billion in 200 years still sound good?
Don't know about you, but personally I enjoy having trails to hike on within five minutes of my home. In many more populated parts of the US, you'd have to drive 60-120 minutes to get to a similar quiet and natural surrounding. And that's at 250 million - not 500...
A quick search on the wiki showed no hits for the word 'report'.
Note that the classic problem with object databases is that they focus on transactional queries, and that DSS or reporting queries are either too slow or too difficult to perform.
So, yeah it sounds nice if you want *both* an object database and a relational one. Not a bad solution if you already have a data warehouse on the side. But if you don't it just a lot of extra work.
> Those things sound like stuff you'd write for > those useless reports the bosses always want > comparing apples to porcupines. Most database apps > I've seen use pretty simple queries; > it keeps your memory overhead down, and makes your > app run more smoothly.
A few thoughts:
1. If you think reports are useless, then you probably put tape over the guages on the dashboard of your car as well. I can't help you there.
2. And you deliver customer portal, don't you want to show info about sales they've made in the past, credits they've accumulated, savings they've made via your 'preferred customer program', etc? if not, then you're behind the curve on portal design. if you do - are you going to send them a separate application? Or are you going to run some of these queries from your portal. Hint: pick the last option.
3. Most database apps only do simple queries. You're right. That's because the average developer wants to keep the job simple, can only write basic SQL, and doesn't have experience with usability.
4. Yep, it can take more memory. Then again, memory's cheap.
> If you're using multiple outer joins for > anything other than reports, your schema's > probably screwy.
5. The schema shouldn't be limited by your inability to code multiple outer joins or deal with optional data.
6. See #2 above. The concept of a 'report' being something that somehow is done in other applications is antiquated. Transactional apps have a choice: deliver only transactional views of the data - and force the user to guess what the heck's going on or go to another app, or encompass some basic reporting in the transactional app.
> All that stuffs fine if you're working for the > government, and they can buy you a billion > dollars worth of hardware, > but if you're putting together an app for > accounting and inventory control for a > relatively small company, and you're > using those types of queries, you're going to > make their hardware scream for mercy, and them > very unhappy with the speed of your fancy new > app.
Don't know where to start, but here's a try:
1. Use a real database
2. Tune it right
3. Put it on reasonable hardware
4. Identify the performance needs (based on usability objectives) for each step in each use case. Some queries will have to be lightning-fast, others won't. Learn the difference.
5. Redundancy in the database is your friend, just got to manage it right. It will allow you to take queries you would have thought would be very slow, and run them at blazing speeds. This is also best-practice.
I do this all the time and it always results in fast applications that users *love*. There's no need to limit your use of the database to trivial queries unless you're just prototyping, aren't being paid enough to finish the work, or are using ISAM files.
In my experience the result of this approach limits the database queries to simple selects of lists or individual objects, updates, and inserts. That's fine for many prototypes of simple applications (with very simple data models).
But it breaks down as you add more functionality to the application. For example (keeping it very general) lets say that you want users to select a category to drill-down into for further work. You might start out with just a category name being selected and presented to the user. However, the next step is usability will be to add additional attributes to the name - so that the user has a sense of which categories have the most activity, most pending 'requests', or whatever. So you then need to also select attributes for each category like: * number of orders/requests/etc performed in last 24 hours * total number of articles/orders/etc * number of orders/requests/etc in a critical status
How do you generate the SQL for those attributes. Now, if you are writing your own SQL - you could just join the category names with an inline view of your orders table (grouped) to pull all attributes in a single query. Beats running two separate queries.
So whether it's developed as one or two queries - how would a sql generator build that last query? And if it can't - do you generate 90% of your queries, and then do just 10% by hand. Does that sound useful?
Porting between dbms products depends primarily on two issues:
1. usage of vendor extensions
2. usage of standard relational functionality
Generally speaking, if you've minimized #1 in your application you can easily port between Oracle, DB2, SQL Server, Sybase, Postgesql, etc: sure, you could hit some issues with jdbc drivers, and may need to port a few idioms (partitioning for example), but it shouldn't be a killer. But going from any of the above list to mysql isn't suggested: you'll get hung up on #2 (it doesn't support standard SQL or DDL)
Realistically, if I wanted to go to a less expensive product than oracle I'd look down this list:
- db2 (1/3 to 1/2 oracle cost)
- sybase (cheaper than oracle, but dwindling market share)
- firebird (very low cost)
- postgresql (free)
All of the above are mature relational databases that you could port oracle applications from.
But you mentioned 'mission critical'. At this point I'd be very cautious about either postgesql or mysql in a mission-critical role. How important is it to you that you can recover 100% of your data in the event of a database crash? I'd put my money (and career) on db2 or oracle delivering that kind of quality over mysql...
> I'm under the impression that most "bugs" in software (certainly most bugs in my code) aren't bugs like these in the article
> (null dereferences, uninitialized variables, etc), but they're algorithm bugs.
Nah, they're bugs in requirements gathering. *Way* harder to find and fix.
In this case a few are obvious - and point to the functional gaps in this product. Once they fix them I'll look forward to using the product.
Which commercial counterparts? oracle, db2, informix? I seriously doubt it - and eWeek didn't say.
Given that a database is a core software component I'd *expect* a far lower defect density than you'd typically find in an application or especially a desktop application.
Good to hear that they're working on reducing their defects, but please - this is hardly a big deal.
> Even today, I don't care about comparing to, say, Oracle or MS SQL Server. IBM DB2 would be a better
> baseline, but best of all the real competitors: PostgreSQL and Alphora Dataphor.
I think you've got your dbms' mixed-up:
Oracle, Informix, and DB2 are all of comparable complexity and power: Oracle's partitioning is the simplest and its clustering the most complex. DB2 & Informix have more complex partitioning - but can scale beowulf-style to hundreds (if not thousands of separate servers).
SQL Server is less functional than the above servers, though obviously similar to Sybase (due to its heritage).
Postgresql is less functional than SQL Server - though it's a fine product anyway.
MySQL is less functional than Postgresql.
Not aware of any other database that occupies the limited transaction support / limited ANSI support niche that mysql does. MSQL perhaps?
> Database developers like to go on and on about how difficult it is, and how "non-experts" can't
> possibly learn to develop good databases.
No - the problem is that the development of best practice databases requires that the designer understand more than just the syntax of insert, update, delete, and select. And when you get right down to it, most developers don't even have the syntax straight - let alone understand the beginings of relational theory.
> How come your database "*experts*" can't learn java?
For the same reason that I don't expect java experts to also develop expertise in database tuning - each is a large and complex discipline.
> For people who know OOP it is a matter of days.
Really? They'll understand all the commonly-used libraries?
They'll understand when it is best to go straight from jsps to the database, when they should use entity beans, when to use EJBs and when to use CMP? They'll understand the strengths and weaknesses of each solution and exactly how to develop each?
these must be the 'magical unattainable' programmers - you know, the kind I've never met any on any of dozens of projects over the last four years.
No, a stripped-down version of PHP, python, etc would be far better than java.
Does the LAMP crowd write code in java? nope.
Do the DBAs write code in java? nope.
Does the java crowd use mysql? nope (if you're going to the cost of developing apps in java, it makes more sense to use postgresql if not oracle, db2, etc).
In java you can abstract the data persistence thru a variety of encasulation techniques. All of which creates such a mess than everyone's scrambling these days to figure out how to simplify it. Sounds like a mess to introduce deeper into the database.
And portability? Do you really think that java functions created for mysql will port *seamlessly* to postgresql, db2, or oracle? No way. And if they don't - why don't you simply write the functions in a tiny, simplistic language that everyone can use easily?
how many folks can you find who are *experts* good with both SQL & Java? (BTW,I don't mean that they can write simple joins, group bys and unions. I mean good enough to understand access paths and parallelism choices). Of the 100+ java developers I've worked with over the last four or so years I've only met *1* who would meet that critieria. So, exactly who's going to be making the performance-tuning decisions? Nope - bad idea, a simple tuning problem will need a committee to figure it out.
keep in mind that since we've mostly dropped the idea of writing all business logic in stored procedures, they're primarily being used these days for very simple procedures. Nothing fancy - convert an ip from a string to an integer, etc. You don't need *or want* a big language for this. Nor should you sweat too much that it is proprietary - who cares when you can learn the basics of the language in 5 minutes.
Great! now they get headlines simply by *not* lying
This is like ranking projects based on largest number of lines of code.
Without system descriptions (like in tcp) it merely shows that such a top-end is feasible.
What about total cost?
annual cost?
time to build?
software versions?
hardware?
staffing composition?
I mean really, a 500 gbyte database on a modest single CPU server is far more challenging than a 2 TB database on a 64-CPU E10k.
Traditional frameworks are fine - but the productivity benefits come at a cost - flexibility.
What I've found that often works far better is:
- divide system into major business-oriented (vertical) sub-systems (assemblies, whatever). Examples of these sub-sytems would include 'party', 'inventory', 'order', etc.
- if possible build these sub-systems using highly adaptable code or based upon well-conceived patterns
- glue the assembles together using a highly productive / adaptable language - python, etc.
If I end up using a framework within one of these classic sub-systems, fine. I can always chuck it out the window when we hit its limits...
But - db2 (and I think Oracle) can only access around 1 gbyte of memory on a 32-bit linux OS - without reliance upon extended memory functionality.
So - even if xeon if fast, the impact of 1 gbyte of memory per cpu - vs 8+ more than makes up for it.
> but PostgreSQL can do anything Oracle can do
oh please, you don't really believe that do you?
I mean, come on - postgesql is a great little database with a bright future...but it isn't ready to tackle large ERP or CRM implementations yet. Or even your typical 300 gbyte warehouse or data mart if you get right down to it...
are probably comparing this system to some old ibm benchmark. They didn't say in the press release, so I'd assume the worst.
l .asp?id=103073001
IBM appears to dominate the TPC-Hs at the top & bottom, with oracle owning it in the middle.
The only really interesting benchmark out there at the moment is the IBM DB2 ICE configuration - in which they spread db2 across dozens of low-end AMD Opteron dual-cpu servers. DB2 (and informix god bless them) partition differently than oracle - more like a database implementation of beowulf (that they've been doing for 8+ years). Way cheaper than anything from oracle, and you can toss up to 1000 servers into it. Their benchmark is in the 300 gbyte range, not 1000 - but it'll scale way beyond oracle, and is cheap for that kind of power: http://www.tpc.org/tpch/results/tpch_result_detai
Makes me wonder how many pcs I've got laying around the house...
> Anyone who needs to use COUNT() is completely out
of luck. Can't use PgSQL. This problem has not been
> fixed in the latest release:
why would you be doing count(*) on *huge* tables without a where clause anyway? I mean, sure - I run queries like that all the time - in adhocs, where a scan of 100+ million rows might take 10-30 seconds.
But I can't think of the last time I've built an application and needed to do count(*):
- without a where clause
- on a huge table
- and needed fast response time.
Also note that most databases don't seem to store the min/max values for non-index columns to speed up this situation. Informix does, IIRC, but I don't think that Oracle does.
So yeah, it would be nice if this was improved, but really - I use this functionality all the time, and am completely unaffected by this issue.
> That is completely different. The UNIX file system doesn't have transaction management at all, yet
> people build robust applications on top of it.
Right - but they don't build *transactional* systems on top of flat files. And when they occasionally have ACID considerations they either build two-phase commit, behave unpredictably, or reprocess/require manual intervention. Note that none of these solutions are as easy or elegant as asking a database: hey - do all of this work, or none.
> I'm also sure that there are many people like you who pick PostgreSQL because they don't understand
> how to make good engineering tradeoffs and build good applications on top of databases that don't
> have all the bells and whistles. Yours is a different level of ignorance, but it's ignorance
> all the same.
No, the difference is that I've been around long enough to have developed dozens of major applications *before* we had transaction support - using flat files, isam, and hierarchical structure. And I was there when the entire industry shifted to robust databases - because the work required to make reliable and accurate applications before them was beyond the ability and time available to most development teams.
So - "been there, done that, ain't going back". I'll stick to modern technology, thank you.
You on the other hand, are free to repeat the mistakes of the 70s.
Might not always be the fastest...but I can't imagine developing a database application - and putting all the join code in the app - that could instead be a subselect, inline view, etc.
I'm not a huge fan of triggers - but often find circumstances in which their careful application has saved a huge number of hours.
Stored procedures are another example of a technology that when used wisely can result in a huge improvements to flexibility and adaptability. My favorite use of them is when on a fast-moving project - I can have the developers create an object model and almost immediately start developing to it. My first task is to knock out a set of stubbed stored procedure (or sometimes views) that map to that object model. Often within a week they've got a fully working application (with a fake backend) to continue testing and tweaking. Then I can take my time mapping the object model / stored procedures to a relational model - perhaps using some advanced features that would have been time-consuming to build and test. The net result is drastic reduction in development time, complete dba-control over all queries, and the ability to change the model occasionally (as performance or whatever demands) without any impact to the developers.
The concern that stored procedures aren't portable isn't completely valid either. Stored procs shoudn't be complex - simple ones can be easily ported between most databases.
Views, unions, and subselects are other sql capabilities that the mysql-ab team has frequently referred to as being only useful to 1% of the applications out there. And unlike the triggers and stored procs described above - I'd consider these core database capabilities.
Can you easily return result sets from stored procedures with this release?
Also, anyone know if the python stored procedures are more native or still just external programs?
> personally i don't care about sql compliance (as long as it's not wildly different)
unfortunately, it isn't merely an academic issue: it means that applications and skills can't be easily ported.
> and, just to join the bashing fun, vacuum in postgresql is gay =b
lol, yeah nothing's perfect.
> Feature-wise, I agree PostgreSQL is way more loaded.
*Way* more loaded? nah, just a little. It's the fundamentals where it shines: build-in support for transactions, ANSI-sql compliance, etc.
> So use it if you are building a banking system...
Or any operational system where data is valuable. Which is just about any application I'd spend my time building...
> But Mysql is designed for speed and average use on websites, where performance is important.
Data quality isn't important on websites? How about portability then? Or time-to-market? How about manageability and maintainability? Postgresql outshines mysql on all counts.
Performance is important - but it's just one of a half-dozen factors you should be looking at.
> So stop saying Mysql sucks because it doesn't have feature X or feature Y.
Deal - I won't complain that it doesn't have some of the legacy bells & whistles of the 80s that the commercial dbms are still carrying around, or some of new frills either (built-in etl, etc).
But, aside from features, it still sucks since it lacks good support of transaction and ansi sql compliance. Now, once it has those capabilities, then I'll take a serious look. But until then it isn't worth the time as long as other, better, options exist.
> Just because mySQL fails to implement many parts that standard SQL programmers consider important doesn't mean it's unreliable.
No, sweeping exceptions under the carpet and failing to provide an integrated and robust transaction management facility means that it is exceedingly difficult to build a reliable application using it.
Which really isn't much different than simply saying it's unreliable.
> The real question is: when implementing my system, how much effort and money does it take to achieve a required level of performance
> and reliability on top of a given database. MySQL does a lot less than the other systems, but it may still often be a better tradeoff in
> that space than PostgreSQL or Oracle. In fact, that's probably why it's so popular.
No - it's popular because it has been trivial to develop trivial apps on, because it is somewhat free, because it was on windows first. And because most of its advocates aren't familiar enough with database technology to weigh the differences between mysql and postgresql.
Frankly, it has the least portability of any popular database solution, and lacks *so* many sql capabilities that simplify application development that I wouldn't consider using it for a moment - as long as postgresql is a reasonable alternative. The only time I've had to recommend mysql in the last year is when a client needed to implement an application that only supported mysql. Fortunately, that's been rare.
The parent probably wasn't as concerned with space as much as required resources. But aside from trying to figure out how you'd feed, cloth, house, and provide drinking water to 6 billion people in Texas...would you really want to live there? I mean, really, Texas is pretty obnoxious with just 10 million, let alone 6000 million!
And back the main point (this issue is apparently pretty emotional) - what's a reasonable upper population goal for the US? If 500 million in 50 years seems reasonable to many - does 4 billion in 200 years still sound good?
Don't know about you, but personally I enjoy having trails to hike on within five minutes of my home. In many more populated parts of the US, you'd have to drive 60-120 minutes to get to a similar quiet and natural surrounding. And that's at 250 million - not 500...
No, I just did a find on the main page.
But since reporting is about 50% of what databases are used for, it should be very easy to find.
Unfortunately, the link you discovered was simply to a question and speculation about using XML for reporting, rather than any actual solution.
Until they can actually solve the problem, it's just an curiosity.
A quick search on the wiki showed no hits for the word 'report'.
Note that the classic problem with object databases is that they focus on transactional queries, and that DSS or reporting queries are either too slow or too difficult to perform.
So, yeah it sounds nice if you want *both* an object database and a relational one. Not a bad solution if you already have a data warehouse on the side. But if you don't it just a lot of extra work.
Next...
> Those things sound like stuff you'd write for
> those useless reports the bosses always want
> comparing apples to porcupines. Most database apps
> I've seen use pretty simple queries;
> it keeps your memory overhead down, and makes your
> app run more smoothly.
A few thoughts:
1. If you think reports are useless, then you probably put tape over the guages on the dashboard of your car as well. I can't help you there.
2. And you deliver customer portal, don't you want to show info about sales they've made in the past, credits they've accumulated, savings they've made via your 'preferred customer program', etc? if not, then you're behind the curve on portal design. if you do - are you going to send them a separate application? Or are you going to run some of these queries from your portal. Hint: pick the last option.
3. Most database apps only do simple queries. You're right. That's because the average developer wants to keep the job simple, can only write basic SQL, and doesn't have experience with usability.
4. Yep, it can take more memory. Then again, memory's cheap.
> If you're using multiple outer joins for
> anything other than reports, your schema's
> probably screwy.
5. The schema shouldn't be limited by your inability to code multiple outer joins or deal with optional data.
6. See #2 above. The concept of a 'report' being something that somehow is done in other applications is antiquated. Transactional apps have a choice: deliver only transactional views of the data - and force the user to guess what the heck's going on or go to another app, or encompass some basic reporting in the transactional app.
> All that stuffs fine if you're working for the
> government, and they can buy you a billion
> dollars worth of hardware,
> but if you're putting together an app for
> accounting and inventory control for a
> relatively small company, and you're
> using those types of queries, you're going to
> make their hardware scream for mercy, and them
> very unhappy with the speed of your fancy new
> app.
Don't know where to start, but here's a try:
1. Use a real database
2. Tune it right
3. Put it on reasonable hardware
4. Identify the performance needs (based on usability objectives) for each step in each use case. Some queries will have to be lightning-fast, others won't. Learn the difference.
5. Redundancy in the database is your friend, just got to manage it right. It will allow you to take queries you would have thought would be very slow, and run them at blazing speeds. This is also best-practice.
I do this all the time and it always results in fast applications that users *love*. There's no need to limit your use of the database to trivial queries unless you're just prototyping, aren't being paid enough to finish the work, or are using ISAM files.
In my experience the result of this approach limits the database queries to simple selects of lists or individual objects, updates, and inserts. That's fine for many prototypes of simple applications (with very simple data models).
But it breaks down as you add more functionality to the application. For example (keeping it very general) lets say that you want users to select a category to drill-down into for further work. You might start out with just a category name being selected and presented to the user. However, the next step is usability will be to add additional attributes to the name - so that the user has a sense of which categories have the most activity, most pending 'requests', or whatever. So you then need to also select attributes for each category like:
* number of orders/requests/etc performed in last 24 hours
* total number of articles/orders/etc
* number of orders/requests/etc in a critical status
How do you generate the SQL for those attributes. Now, if you are writing your own SQL - you could just join the category names with an inline view of your orders table (grouped) to pull all attributes in a single query. Beats running two separate queries.
So whether it's developed as one or two queries - how would a sql generator build that last query? And if it can't - do you generate 90% of your queries, and then do just 10% by hand. Does that sound useful?