Either I'm missing your point, or you are only telling a partial truth.
because BSD is compatible with the GPL.
It is one-way compatible. Almost all open-source licenses are one-way compatible with GPL. BSD code goes in, nothing comes out. MS-PL code goes in, nothing comes out. GPL is the blackhole of open source licenses. Stuff goes in, nothing comes out. Why? The license prohibits it.
I think that clause is fairly reasonable if I use that license for my code. If somebody is gonna bring a patent claim against my stuff, screw them, they loose the license to use my work.
How is this different than similar patent clauses in other licenses?
Well, and also I am under the impression that Postgres can answer fewer types of queries (such as count(*)) from the index due to MVCC.
This is an aspect they are planning to improve. Somehwere on their roadmap, I recall reading they were gonna get it so certian queries can be answered right out of the index. But to take a dig at you, I'll point out Postgres can do crazy things with indexes that MySQL can't--like combine two separate indexes for a query (so you dont have to create a multi-column index) or do some weird scan on them for joins. MySQL only seems to have basic kinds of indexes.
COUNT(*) is actually a tough cookie on any MVCC. I suspect they will have a way to get an approximate count of a large dataset without a sequential scan... most people who do COUNT(*) on a large set of data are doing it for pagination or just a string of text saying "059345 results found". In those cases you can be off a few records and live to tell the story.
Btw, I fooled with that planner you linked to a while ago. How the hell do you come up with a number for their IO stuff? Like, is there a magic command on a unix-like system that will give some number you can plug in?
Heh. What fun would our jobs be without creative execution plans? I mean, we don't hunt tigers anymore, so we have to get our "kill" somehow and I'll take "trick the query planner into doing what I want" as a substitute--that or finding a good parking spot.
Hardware is cheap.
'Tis true. You can by trash webservers and string them out across a load balancer and be fine. It is much cheaper to by more webservers than spend developer time squeezing more performance out of code.
The database server is the one thing you can't cheap out on. You can only make that single thing bigger and bigger.
That said, never clustered a database so I dont know its pitfalls or gotchas:-)
By the way, EC2 + Database? Does it work? I've looked around for case studies, war stories, or anything. Are people doing it? Is their stuff fast enough? Don't you have to put the table on those "perma-disks" so they don't get wiped? Are those some SAN thing or are they like NFS? The IO is important for a database, and there was little info about what, exactly, those perma-disk mounts actually are. And don't they meter disk IO?
It looks intriguing, but I need answers and can't find them:-) I know it wouldn't be cost effective to run the database server "offsite" and run the web stuff in the cloud. You'd need the database "in the cloud" with the webservers...
Replication and clustering are only one component of the whole product. The sheer amount of legwork required to make MySQL performant on all but the smallest use cases make it fairly costly. It also has so many 'isms attached that if you aren't careful you can dig your database into a MySQL-only world that is impossible to migrate out of.
I also wonder how many people have to cluster their MySQL stuff because the database itself just doesn't perform very well. I wonder if they were using a different product if they'd even have to cluster at all.
it's only slipped one release... so far...
Probably because it is a hard problem, limited developer time, and the fact there are much cooler features the community wants like materialized views (which would probably make many setups not need clustering at all!).
Perhaps the worldview of MySQL folk has been tainted by how poorly it performs. All they think of is "OMG it doesn't cluster" but maybe if they looked at the tools PostgreSQL offers, they'd see clustering might not even be necessary on PostgreSQL. Maybe PostgreSQL doesn't have good clustering yet because in many cases, people really just dont need it!
I don't have much experience with Oracle, honestly. But what little experience I've had has led me to believe it is quirky in its own right. But I think it might be that way because you can pretty much get it to do anything...
It also seemed insanely complex. The amount of garbage its installer put on my disk (in non-standard locations) was amusing. The ugly skin the used was also amusing. But damn, the integration with visual studio was nice!
Basically, I felt like I was at one of those amusement park rides that have the "you must be this tall to go on this ride", only "your applications must be this tall to use this software". And that was their free version too!
Moral? Maybe that there isn't a good free database for that level.
There are a lot of knobs to turn. They are somewhat well-documented knobs, but knobs none the less. MySQL ships with like four sample configurations.
That said, the best solution would be to make it self-opimizing. It already tries to self-optimize its query plans (i.e.: "We aren't gonna support hints on indexes... if the query planner needs hints, the query planner is broken"). It could do something similar with its configuration. But honestly, that would take development time away from more important things like materialized views or WITH queries. I'd much rather have stuff like that.
By the way, you don't have to edit the config file by hand, you can actually edit it from the comfort of your desktop in PgAdmin and even trigger a reload of the file.
It is a bit of lore. It is a difference between the two products that you might not have considered before migrating that you better take into account. I didn't even think about it, and I almost got into a huge jam as a result.
SELECT count(uid) FROM users WHERE username=BINARY('coryking');
Except that breaks unicode. Characters aren't byte[]'s, they are characters.
The proper answer is to use the correct collation. The only thing is PostgreSQL doesn't really do that kind of thing yet.
My solution was to create a new datatype based on this example that uses case-insensitive operators. Instead of using the "varchar", I my spiffy new, case-insensitive "ltext" version. Fully indexable too!
i asked you a valid question, how does innodb not maintain referential integrity?
As long as you are careful not to mix it with MyISAM I would hope that it does. It is also the only table-type in MySQL that doesn't lock itself on writes.
Honestly though, I'm not sure if I trust the product itself enough to have faith that InnoDB-only transactions would roll back properly. I also am not sure have enough faith to trust it with cascading updates or deletes.
When I have to work on MySQL databases, I only use InnoDB on write-heavy tables where I want a boost of performance and dont need fulltext. It is usually a quick fix to boost performance.
Honestly, if you are at the stage where you are tossing around words like "high-availability", it is time to actually spend cash on a big-boy RDBMs. PostgreSQL is awesome, but it can only scale so far.
If you went MySQL, yeah you can do crazy load-balanced databases out of the box, but it will cost you as much in developer time to hack around the *other* limitations in the product that it would be cheaper to buy a real database.
MySQL is only free if your developer time is not. Good DBAs aren't cheap and MySQL is needs more developer time to work around its limitations than any other database (in my humble opinions).
Therefore I always supply a DEFAULT value, that the DB can safely insert in that column, IF I haven't specified anything different during the INSERT.
But that doesn't make any sense. Take this website. Each story has a unique story id--it is a string. It also has to be unique. Tell me a proper default value for the story id? If you think of one, tell me how you will deal with two people editing stories that forgot to add a story id?
Not let me insert NULL values. Not let me forget to put a value in a column when I insert something. I said "NOT NULL" for a reason. I dont want BS values.
I've had this exact argument with somebody before and quite frankly the argument makes no sense. The idea that the database should "do something" all the time is silly, I told it NOT NULL for a damn reason--there should be no default value for a NOT NULL column unless I give it on (like a datestamp on a table). Is NULL a valid username? Is "" a valid username? Nope! If you don't explictly set the "username" on insert or update, then you are trying to corrupt the database and it should stop you. An empty string for a username is an invalid username. In fact, there *is* no logical default value for a username, which is one of the reasons I set that column to NOT NULL!
If you didn't give a proper value or forgot to specify the username, the database should protect itself from corruption. DEFAULT "" is corrupting the database!
does it really stop you at creation time with a warning / error
The only proper thing. You tried to corrupt your database by feeding it trash. A real database would tell you "no way sir, I'm not gonna let you feed me that trash" and throw an error. That's right... invalid data is as bad as corrupt data--in fact, invalid data *is* corrupt data. Who cares about the filesystem corrupting your data when your own table specification corrupts it by feeding it empty strings for usernames and "0000-00-00" as a "default value" when you forget a date.
At least in PostgreSQL create indexed functions. In other words, you can create a unique index on LOWER(username). I'm sure the other guys let you index functions too. Either way:-)
but degrading everyone else that has no interest in using it is.
It isn't degrading. When you clean up enough messes left by people who have no business touching a database, it gets fustrating. Especially when things like this are said:
how exactly does innodb not take things like referential integrity and error reporting seriously?
Because InnoDB is only half of MySQL. InnoDB doesn't do fulltext search. If you want fulltext, you gotta go MyISAM. Guess what happens to your cute, cozy referential integrity when half your tables are InnoDB and half are MyISAM? Bye bye! And error reporting? What error reporting? The part where it doesn't tell you "bye bye referential integrity!" and rolls back half your transaction and commits the other half? Or did you mean the part where it automatically adds default values to fields you set not null*? Or were you talking about the part where it thinks invalid dates are valid and doesn't throw an error? Or are those examples only good for Fortune 10(tm) enterprises and not some piddly organization like yours.
No sir, nobody who is a developer that takes their profession seriously would make a claim that MySQL takes anything seriously. They've cleaned up enough MySQL messes, thank you.
[rant]*one of my biggest pet peeves of MySQL... when I say CREATE TABLE(varchar(255) blah NOT NULL) please do not add your own 'DEFAULT ""' to the end--it is considered by some to be rather rude! I can spot MySQL schema's a mile away by this single trait. Anything NOT NULL almost always has a bullshit default value. Got a NOT NULL int--it will have a DEFAULT 0! Got a NOT NULL date, "DEFAULT 0000-00-00", which isn't even a valid date! How is that for taking your data seriously?[/rant]
Indeed I did mark it unique. However, in PostgreSQL, "Jacuqes CHESter" and "jacques chester" are two different values. In MySQL, they are the same value.
I ran into that little nugget when I migrated webapp database from MySQL to PostgreSQL.
As with any webapp, the software would let you create an account on the website. Can't have duplicate users, so the code would check to see if the username existed.
SELECT count(uid) FROM users WHERE username='coryking';
Got "0"? Well, the user is okay... well, at least on MySQL. In MySQL, "coryking", "CorYKING", and "CORYKing" are all the same. As I would soon discover, that assumption is *not* true on PostgreSQL.
Result? A month after the migration, I discovered "unique" accounts for "coryking", "CoryKING" and "CORYKING". Obviously, this can't be. All usernames in webland are case-insensitive. Thankfully, there was only a handfull of these "unique" users to clean up. Had this been left alone for a year, I'd have quite a cleanup on my hands!
Moral? MySQL is case-insensitive, PostgreSQL isn't. Honestly, the proper thing is to be case-sensitive and I assume pretty much every database besides MySQL is case-sensitive. But you have to make your unique index on LOWER(username) instead of username--oh wait, except you can't do that in MySQL cause they dont support indexes on functions... sucks for them!
The RMDBS crowd sees data and application as being separate issues.
You forgot to broaden your scope. Otherwise, you are correct.
Your database is almost literally your company. It should reflect your way of doing business--any moderately skilled developer should be able to walk into an orginization they know nothing about and using only the database schema, infer pretty much what the company does, and how it does it.
You can always fix flawed software design, but it is almost impossible to fix a flawed database design. Do your database wrong, the growth of your company will be hindered. Do it right, and your company will flourish. No joke.
Good luck finding it. MySQL has horrible documentation. The whole structure of it is a mess.
Plus, when you say "and if there is a problem, it is documented". Yeah, that is great, but most of the "well-documented" problems are long-standing bugs for insanely stupid shit.
It does make me laugh though, because honestly, I agree that mysql is well documented. Every random question I google for usually has a hit. And funny enough, the top listing is usually a page here--failing that, at least on the first page of results.
Everything breaks for mobile browsing. The interface and interaction differences are too great to try to reconsile the difference between a mobile phone and a "real" browser. That is why you do a seperate website for them.
It was NEVER designed to give a pixel-picture representation of content
It was never designed to display images either. Quite frankly I dont care what it was designed for. Times change and now days, we expect all browsers to render the way we specify. Please note nobody is saying "pixel perfect accuracy"--we are saying "behaves the way we programmed it to behave". There is a subtle, but important difference.
That makes the guarantees you want - every pixel is in its proper place.
Funny, it is actually way easier to do this with HTML/CSS than it is to actually make a page that is *not* pixel perfect. Standards compliant, liquid layouts are a bitch to get working on all browsers. Fixed grid, "pixel perfect" is easy--just drop in a table and set all the "width=''" to whatever.
Too many designers, used to working in pamphlets where they had complete control, moved to web design.
Either I'm missing your point, or you are only telling a partial truth.
It is one-way compatible. Almost all open-source licenses are one-way compatible with GPL. BSD code goes in, nothing comes out. MS-PL code goes in, nothing comes out. GPL is the blackhole of open source licenses. Stuff goes in, nothing comes out. Why? The license prohibits it.
I think that clause is fairly reasonable if I use that license for my code. If somebody is gonna bring a patent claim against my stuff, screw them, they loose the license to use my work.
How is this different than similar patent clauses in other licenses?
Why? Just out of curiosity, why are they dinging you for IO? Is it because of the fact it is on network storage?
Well, yeah, obviously I know that. But that ILIKE doesn't help much with the UNIQUE index, does it :-)
This is an aspect they are planning to improve. Somehwere on their roadmap, I recall reading they were gonna get it so certian queries can be answered right out of the index. But to take a dig at you, I'll point out Postgres can do crazy things with indexes that MySQL can't--like combine two separate indexes for a query (so you dont have to create a multi-column index) or do some weird scan on them for joins. MySQL only seems to have basic kinds of indexes.
COUNT(*) is actually a tough cookie on any MVCC. I suspect they will have a way to get an approximate count of a large dataset without a sequential scan... most people who do COUNT(*) on a large set of data are doing it for pagination or just a string of text saying "059345 results found". In those cases you can be off a few records and live to tell the story.
Btw, I fooled with that planner you linked to a while ago. How the hell do you come up with a number for their IO stuff? Like, is there a magic command on a unix-like system that will give some number you can plug in?
Heh. What fun would our jobs be without creative execution plans? I mean, we don't hunt tigers anymore, so we have to get our "kill" somehow and I'll take "trick the query planner into doing what I want" as a substitute--that or finding a good parking spot.
'Tis true. You can by trash webservers and string them out across a load balancer and be fine. It is much cheaper to by more webservers than spend developer time squeezing more performance out of code.
The database server is the one thing you can't cheap out on. You can only make that single thing bigger and bigger.
That said, never clustered a database so I dont know its pitfalls or gotchas :-)
By the way, EC2 + Database? Does it work? I've looked around for case studies, war stories, or anything. Are people doing it? Is their stuff fast enough? Don't you have to put the table on those "perma-disks" so they don't get wiped? Are those some SAN thing or are they like NFS? The IO is important for a database, and there was little info about what, exactly, those perma-disk mounts actually are. And don't they meter disk IO?
It looks intriguing, but I need answers and can't find them :-) I know it wouldn't be cost effective to run the database server "offsite" and run the web stuff in the cloud. You'd need the database "in the cloud" with the webservers...
Replication and clustering are only one component of the whole product. The sheer amount of legwork required to make MySQL performant on all but the smallest use cases make it fairly costly. It also has so many 'isms attached that if you aren't careful you can dig your database into a MySQL-only world that is impossible to migrate out of.
I also wonder how many people have to cluster their MySQL stuff because the database itself just doesn't perform very well. I wonder if they were using a different product if they'd even have to cluster at all.
Probably because it is a hard problem, limited developer time, and the fact there are much cooler features the community wants like materialized views (which would probably make many setups not need clustering at all!).
Perhaps the worldview of MySQL folk has been tainted by how poorly it performs. All they think of is "OMG it doesn't cluster" but maybe if they looked at the tools PostgreSQL offers, they'd see clustering might not even be necessary on PostgreSQL. Maybe PostgreSQL doesn't have good clustering yet because in many cases, people really just dont need it!
Well, NOT NULL is only for writing. If you've got this:
CREATE TABLE blah (
blah_id SERIAL PRIMARY KEY,
username varchar(255) NOT NULL,
favorite_food text
)
(SERIAL = int auto_increment in mysql, primary key = duh)
You cannot do this:
INSERT INTO blah VALUES(favorite_food) ("plain yogurt");
Why? you forgot the username!
But there is nothing wrong with this:
SELECT blah_id,favorite_food FROM blah WHERE username IS NULL;
It wouldn't return anything of course--as there is no username that is NULL.
If your table was this:
CREATE TABLE blah (
blah_id SERIAL PRIMARY KEY,
username varchar(255),
favorite_food text
)
Now you can do that insert statement and the "SELECT blah_id FROM blah WHERE username IS NULL" would start to return stuff.
I don't have much experience with Oracle, honestly. But what little experience I've had has led me to believe it is quirky in its own right. But I think it might be that way because you can pretty much get it to do anything...
It also seemed insanely complex. The amount of garbage its installer put on my disk (in non-standard locations) was amusing. The ugly skin the used was also amusing. But damn, the integration with visual studio was nice!
Basically, I felt like I was at one of those amusement park rides that have the "you must be this tall to go on this ride", only "your applications must be this tall to use this software". And that was their free version too!
Moral? Maybe that there isn't a good free database for that level.
There are a lot of knobs to turn. They are somewhat well-documented knobs, but knobs none the less. MySQL ships with like four sample configurations.
That said, the best solution would be to make it self-opimizing. It already tries to self-optimize its query plans (i.e.: "We aren't gonna support hints on indexes... if the query planner needs hints, the query planner is broken"). It could do something similar with its configuration. But honestly, that would take development time away from more important things like materialized views or WITH queries. I'd much rather have stuff like that.
By the way, you don't have to edit the config file by hand, you can actually edit it from the comfort of your desktop in PgAdmin and even trigger a reload of the file.
It is a bit of lore. It is a difference between the two products that you might not have considered before migrating that you better take into account. I didn't even think about it, and I almost got into a huge jam as a result.
Except that breaks unicode. Characters aren't byte[]'s, they are characters.
The proper answer is to use the correct collation. The only thing is PostgreSQL doesn't really do that kind of thing yet.
My solution was to create a new datatype based on this example that uses case-insensitive operators. Instead of using the "varchar", I my spiffy new, case-insensitive "ltext" version. Fully indexable too!
As long as you are careful not to mix it with MyISAM I would hope that it does. It is also the only table-type in MySQL that doesn't lock itself on writes.
Honestly though, I'm not sure if I trust the product itself enough to have faith that InnoDB-only transactions would roll back properly. I also am not sure have enough faith to trust it with cascading updates or deletes.
When I have to work on MySQL databases, I only use InnoDB on write-heavy tables where I want a boost of performance and dont need fulltext. It is usually a quick fix to boost performance.
Honestly, if you are at the stage where you are tossing around words like "high-availability", it is time to actually spend cash on a big-boy RDBMs. PostgreSQL is awesome, but it can only scale so far.
If you went MySQL, yeah you can do crazy load-balanced databases out of the box, but it will cost you as much in developer time to hack around the *other* limitations in the product that it would be cheaper to buy a real database.
MySQL is only free if your developer time is not. Good DBAs aren't cheap and MySQL is needs more developer time to work around its limitations than any other database (in my humble opinions).
But that doesn't make any sense. Take this website. Each story has a unique story id--it is a string. It also has to be unique. Tell me a proper default value for the story id? If you think of one, tell me how you will deal with two people editing stories that forgot to add a story id?
Not let me insert NULL values. Not let me forget to put a value in a column when I insert something. I said "NOT NULL" for a reason. I dont want BS values.
I've had this exact argument with somebody before and quite frankly the argument makes no sense. The idea that the database should "do something" all the time is silly, I told it NOT NULL for a damn reason--there should be no default value for a NOT NULL column unless I give it on (like a datestamp on a table). Is NULL a valid username? Is "" a valid username? Nope! If you don't explictly set the "username" on insert or update, then you are trying to corrupt the database and it should stop you. An empty string for a username is an invalid username. In fact, there *is* no logical default value for a username, which is one of the reasons I set that column to NOT NULL!
If you didn't give a proper value or forgot to specify the username, the database should protect itself from corruption. DEFAULT "" is corrupting the database!
The only proper thing. You tried to corrupt your database by feeding it trash. A real database would tell you "no way sir, I'm not gonna let you feed me that trash" and throw an error. That's right... invalid data is as bad as corrupt data--in fact, invalid data *is* corrupt data. Who cares about the filesystem corrupting your data when your own table specification corrupts it by feeding it empty strings for usernames and "0000-00-00" as a "default value" when you forget a date.
At least in PostgreSQL create indexed functions. In other words, you can create a unique index on LOWER(username). I'm sure the other guys let you index functions too. Either way :-)
It isn't degrading. When you clean up enough messes left by people who have no business touching a database, it gets fustrating. Especially when things like this are said:
Because InnoDB is only half of MySQL. InnoDB doesn't do fulltext search. If you want fulltext, you gotta go MyISAM. Guess what happens to your cute, cozy referential integrity when half your tables are InnoDB and half are MyISAM? Bye bye! And error reporting? What error reporting? The part where it doesn't tell you "bye bye referential integrity!" and rolls back half your transaction and commits the other half? Or did you mean the part where it automatically adds default values to fields you set not null*? Or were you talking about the part where it thinks invalid dates are valid and doesn't throw an error? Or are those examples only good for Fortune 10(tm) enterprises and not some piddly organization like yours.
No sir, nobody who is a developer that takes their profession seriously would make a claim that MySQL takes anything seriously. They've cleaned up enough MySQL messes, thank you.
[rant]*one of my biggest pet peeves of MySQL... when I say CREATE TABLE(varchar(255) blah NOT NULL) please do not add your own 'DEFAULT ""' to the end--it is considered by some to be rather rude! I can spot MySQL schema's a mile away by this single trait. Anything NOT NULL almost always has a bullshit default value. Got a NOT NULL int--it will have a DEFAULT 0! Got a NOT NULL date, "DEFAULT 0000-00-00", which isn't even a valid date! How is that for taking your data seriously?[/rant]
Indeed I did mark it unique. However, in PostgreSQL, "Jacuqes CHESter" and "jacques chester" are two different values. In MySQL, they are the same value.
Indeed. However, at least when I had the problem, there was no case-insensitive collation in PostgreSQL. Dunno if that has been fixed yet.
I ran into that little nugget when I migrated webapp database from MySQL to PostgreSQL.
As with any webapp, the software would let you create an account on the website. Can't have duplicate users, so the code would check to see if the username existed.
SELECT count(uid) FROM users WHERE username='coryking';
Got "0"? Well, the user is okay... well, at least on MySQL. In MySQL, "coryking", "CorYKING", and "CORYKing" are all the same. As I would soon discover, that assumption is *not* true on PostgreSQL.
Result? A month after the migration, I discovered "unique" accounts for "coryking", "CoryKING" and "CORYKING". Obviously, this can't be. All usernames in webland are case-insensitive. Thankfully, there was only a handfull of these "unique" users to clean up. Had this been left alone for a year, I'd have quite a cleanup on my hands!
Moral? MySQL is case-insensitive, PostgreSQL isn't. Honestly, the proper thing is to be case-sensitive and I assume pretty much every database besides MySQL is case-sensitive. But you have to make your unique index on LOWER(username) instead of username--oh wait, except you can't do that in MySQL cause they dont support indexes on functions... sucks for them!
Good times.
You forgot to broaden your scope. Otherwise, you are correct.
Your database is almost literally your company. It should reflect your way of doing business--any moderately skilled developer should be able to walk into an orginization they know nothing about and using only the database schema, infer pretty much what the company does, and how it does it.
You can always fix flawed software design, but it is almost impossible to fix a flawed database design. Do your database wrong, the growth of your company will be hindered. Do it right, and your company will flourish. No joke.
Good luck finding it. MySQL has horrible documentation. The whole structure of it is a mess.
Plus, when you say "and if there is a problem, it is documented". Yeah, that is great, but most of the "well-documented" problems are long-standing bugs for insanely stupid shit.
It does make me laugh though, because honestly, I agree that mysql is well documented. Every random question I google for usually has a hit. And funny enough, the top listing is usually a page here--failing that, at least on the first page of results.
Because you know better.
You know I find it moderately amusing how everybody who makes arguments like yours has a horribly bland looking website.
Pro Tip: Presentation matters as much as the content. In fact, presentation *is* part of the content.
Everything breaks for mobile browsing. The interface and interaction differences are too great to try to reconsile the difference between a mobile phone and a "real" browser. That is why you do a seperate website for them.
It was never designed to display images either. Quite frankly I dont care what it was designed for. Times change and now days, we expect all browsers to render the way we specify. Please note nobody is saying "pixel perfect accuracy"--we are saying "behaves the way we programmed it to behave". There is a subtle, but important difference.
Funny, it is actually way easier to do this with HTML/CSS than it is to actually make a page that is *not* pixel perfect. Standards compliant, liquid layouts are a bitch to get working on all browsers. Fixed grid, "pixel perfect" is easy--just drop in a table and set all the "width=''" to whatever.
1998 called and want their arguments back.
But there is currently no way for your payment gateway to detect if somebody is an asshole. Somebody needs to work on that :-)