Open Source Databases Revisited

free software trickles up by _|()|\| · 2000-11-12 21:41 · Score: 1

ppl said "serious companies don't use linux"

Indeed, I'm working on a crisis for a customer who runs a large web site with three Linux/Apache/PHP app. servers hitting one database server (via ODBC) running our DBMS.

Guess which component is failing?

Here we go... why? by _N0EL · 2000-11-12 21:41 · Score: 1

Oh man, yet another case of Slashdot putting up a comparison that can't possibly be made, but it will generate a lot of traffic and bring down servers to make it look like this is the happening place when it's just a geek version of Jerry Springer. This survey has been an old story on the Oracle newsgroups for a week and it's not resolvable, just "We're better than you" followed by "no, we're better, biatch." If you use these systems, you know what happens... different products are work better under different circumstances, nothing new. You are starting up a website, have no money but lots of know-how and want to get something up and running until you get loaded with cash (okay, dream). Right, you download MySQL for nothing and get up and running. You are getting a huge number of hits from tons of users (row locks become a little bit of an issue, right?) and you have some money, you get set up with Oracle. Which is better? I hit a website (a newcomer site, not up and going for very long) Friday night to place an order and for the second weekend in a row (okay, it was a food delivery place - I have no life and order in, what's the surprise?) I was returned a MySQL error. But you know what? A week ago I got multiple ORA-00020, and ORA-00018 errors trying to access my work late at night (and guess whose product that is). Lot's of factors come in, not the least of which is the talent running the whole show. It seems to me like this is a big picture issue, and isn't it good for us all that there are free alternatives that get us through testing and maybe even up and running, but realistically, how far can something free take us? Then if our idea works, and the users come, we pony-up the dough for something else. That seems to work. So what does everyone think?

--

"My mother works for Microsoft now. A whole other cult."

Re:Here we go... why? by carlos_benj · 2000-11-13 03:43 · Score: 1

I stand corrected. That's one of the things I didn't like about it when checking things out earlier and must have missed the subsequent announcement. Thanks.
There's still the question of scalability. How does it fare in that category?

--
--

As a matter of fact, I am a lawyer. But I play an actor on TV.
Re:Here we go... why? by carlos_benj · 2000-11-12 22:56 · Score: 1

So, if Oracle were free it's abilities would change? I think your view is simplistic at best. For bigger, more intense stuff you'll be using one of the top-tier providers, but it's not just because they charge money and give you someone to blame when things go south (although that is a nice feature when you need some breathing space to get things working again), but because the technology, features, stability, etc. are there already and not around some bend in the future development curve. There are a lot of things PostgreSQL won't be suitable for until you can do a database backup online instead of having to bring the database down.

--
--

As a matter of fact, I am a lawyer. But I play an actor on TV.
Re:Here we go... why? by _N0EL · 2000-11-13 05:10 · Score: 1

If I eat at a bar, I can't sit here and read Slashdot during dinner... Thanks for the suggestion, mate!

--
"My mother works for Microsoft now. A whole other cult."
Re:Here we go... why? by lowen · 2000-11-12 23:40 · Score: 1

Online backup for PostgreSQL has worked since version 6.5, thanks to MVCC. You do not need to take the database down anymore to pull a consistent pg_dumpall snapshot.
Re:Here we go... why? by rodgerd · 2000-11-12 23:38 · Score: 1

I've yet to have to bring the database down to backup Postgres. What are you doing?

Re:Low end machine? by linuxgod · 2000-11-14 07:36 · Score: 1

Either im in class, at home studying, restoring my truck,
designing someone's page, setting up a router,
or fixing some lemming's (like yourself) machine because
"The browser crashes when they get online" bullshit.

My machine stays online all day. I don't bother hanging up the modem.

You are a complete bastard child, I would assume you are no
older than 13, living with your mommie. She fucked
some bum, and you have no idea who the hell your father is.

You know what? Im gonna go dig up your grandmother's grave
and let SIGNAL 11 have his way with her. In fact, ill make sure
he digs up your grandfather too.

ETRN x

Re:Low end machine? by linuxgod · 2000-11-14 09:28 · Score: 1

Sorry kat, i have more brain in my pinky than you do in
your whole body.

ETRN x

Re:linux in manufacturing by Johann · 2000-11-13 01:48 · Score: 1

You can compile postgres yourself and the compiler will optimize for your processor.

"Fat, drunk, and stupid is no way to go through life."

--
"You're gonna need a bigger boat." - Chief Brody

Re:Your Mileage May Vary by sql*kitten · 2000-11-14 17:23 · Score: 2

Please note that this benchmark (and any benchmark, for that matter) applies only to the system and application that it's testing.

Quite. Let's see how well these "open source" databases do in TPC-C and -H.

Re:Aditional questions: by Johann · 2000-11-13 01:49 · Score: 1

Here's the specifications for Interbase: http://www.interbase.com/open/research/tech_specs. html Maximum size of database: 32TB using multiple files; largest recorded InterBase database in production is over 200GB Maximum size of one file: 4GB on most platforms; 2GB on some platforms Maximum number of tables: 64K Tables Maximum size of one table: 32TB Maximum number of rows per table: 4G Rows Maximum row size: 64KB Maximum number of columns per table: Depends on the datatypes you use. (Example: 16,384 INTEGER (4 byte) values per row.) Maximum number of indexes per table: 64K indexes Maximum number of indexes per database: 4G indexes

"Fat, drunk, and stupid is no way to go through life."

--
"You're gonna need a bigger boat." - Chief Brody

Re:16 queries on one page? by denshi · 2000-11-16 02:54 · Score: 1

I remember reading on the topic of excessive joins in a Sybase Admin book once (excelent book on "practical" database, since it shows you exactly how one company solved all the DB problems)

And what book was this? (I really want to read it).

On another subject, did you drop all the users from blacknova.net? My user (tolket) is gone....

Interesting Info by CaptainZapp · 2000-11-13 01:53 · Score: 1

Is that the official release or is that still in beta ? I heard they plan to do it, but with those versions I worked with, didn't support it.

--
ich bin der musikant

mit taschenrechner in der hand

kraftwerk

Re:Interbase by musicmaster · 2000-11-16 22:06 · Score: 1

I copied the following message from http://www.phpbuilder.com/annotate/message.php3?id =1001760:
The opensource Interbase is alive and well: At: http://firebird.sourceforge.net
http://www.ibphoenix.com
http://www.interbase2000.com
And in newslists at http://www.mers.com
Generally it's called Firebird, and Inprise don't want to know or link to us.

Re:16 queries on one page? by maraist · 2000-11-16 22:54 · Score: 2

On another subject, did you drop all the users from blacknova.net? My user (tolket) is gone....

Laugh, I'm not in charge of that, though I'm one of the OpenSource developers.. Yes, the DB was completely flushed... You need to create a new character.. Read the forums.

-Michael

--
-Michael

Re:16 queries on one page? by Johann · 2000-11-13 02:01 · Score: 1

Actually, this probably means that the database was very well-designed.

Your comment minimizes the performance issues. You can create a beautiful, 3rd normal form database and find that it is slow as hell when you try to read or write to the schema.

Most real-world databases require a whole lot more {tables}.

For a web site, pages need to return in 1-2 seconds, but if your 16 table join takes 1-2 seconds, you're in trouble because your response time now depends entirely on the network conditions.

Theoretical ideals of relational theory applied to databases are fun to talk about and design, but the real-world dictates some compromizes. Often, denormalization is the first compromize. This reduces the number of joins required to read read the database. It also speeds up writes because you have to issue fewer SQL statements.

Later.

"Fat, drunk, and stupid is no way to go through life."

--
"You're gonna need a bigger boat." - Chief Brody

Re:real-world choosings.... by uradu · 2000-11-13 02:05 · Score: 2

I've evaluated several free (as in I can make money with it without paying anyone a dime) DB servers for a small startup company, and Interbase so far looks by far the best. Not only are the features attractive, it also has more name recognition than MySQL or Posgres, and for some people that's important. We're using it particularly from Delphi apps, and support for that is great. It's amazing that the entire server can be configured to take less space than the BDE itself.

Anyway, I'm not saying that Interbase is an Oracle or MS SQL killer. But for a lot of clients and a lot of applications it's a suitable choice, and the client saves a stack of money to boot. Add to that the fact that you can deploy it on Linux, and for cost conscious customers it's essentially a no-cost solution. Just buy the box, the rest is free.

If it goes beyond tinkering: Yes! by CaptainZapp · 2000-11-13 02:14 · Score: 1

Ok, pissing contest is on. Yet, I try to keep this civilized:

Your database software failed. I lost all my data, but I can at least sue you and get some money. My business might be ruined, but hey I can cash out from winning a law suit against you. Of course I have to pay the lawyers and might have no income while the suit goes on but who cares.

Trying to be cynical ? Probably. But your logic fails utterly: In a mission critical environment you sure as hell want a database vendor, who:

is likely to provide professional, contractual support on various levels

is not likely to go away next week

who can provide an annual report indicating the amount of $ invested in R&D

ultimately who is accountable.

I'm aware that you can purchase professional support with PostgreSQL (which is likely better then what the big 5 have to offer), but frankly, being responsible for a project I wouldn't bet my carreer on it.

I never said that I necessarily agree with this mindset, but from a (project) management's perspective it's understandable. If I invest $200 000 000 into a mission critical system, the possible savings by using open source products in core elements (database, middleware) is probably not an option.

Now for the rest:

Reliable Database How do you determine reliable ? For millions M$ Access is the idea of a reliable database. Tell you what: It's not! Further disks will and go inevitably bad. Even if you have redundant disk subsystems a logical error or a pointer going haywire may have been written to disk.

good backups We certainly don't have an argument here. However, if you can't afford down time or you can't afford even five minutes of lost data this might not be enough.

Security Of course, but it's incredible difficult to provide good security. Especially in a multi tiered environment. Granted, that has nothing to do with the fact if a database is open source or not.

--
ich bin der musikant

mit taschenrechner in der hand

kraftwerk

Re:16 queries on one page? by dhogaza · 2000-11-13 02:14 · Score: 1

Well, the effect of caching query plans, or pre-parsing or pre-compilation will depend heavily on the query. For a query doing many joins that return little data using tables with appropriate indices, optimization time most likely will be an expensive component of the total time of query execution.

On the other hand, a quick query on a single table might spend a fairly high percentage of the execution time in the parser, because the optimizer has few options to explore and the resulting execution is likely to be fast.

Preliminary tests by the person who put together the current test implementation of the query cache code reports about a 3x improvement in the execution of simple queries.

Here's a reason why your test of precompiled functions might not help as much as you'd expected: though the function itself is precompiled, the query you use to reference the function is not. The time required to parse and optimize this query might be enough to overcome the fact that the function itself is pre-compiled.

Also, you talk of "stored procedures". I don't use PG's stored SQL functions, only PL/pgSQL. Stored SQL functions might not be pre-compiled, I can't say one way or another.

I'm not sure exactly what you mean by "raw calls to table operations". I do know, though, that a PG plan consists of a tree of operations like "merge join", "indexed scan", etc. These are "table operations" and they're "raw", but they're complex table operations.

Re:large? by NineNine · 2000-11-13 02:16 · Score: 1

Sorry, Mr. AC, but ORacle 8i definately, and without a doubt, is Object Oriented. Read up, bucko.

Re:yeah baby by Theodore+Logan · 2000-11-12 21:42 · Score: 1

This way, Slashdot can within hours be infected by thousands of lenghty fps (although not all of them, obviously, a fp in the original meaning of the word), none of which could be easily be dismissed as a troll without first reading the entiry crappy post.

like your post couldn't be dismissed as a troll from the title alone...

--

"If you think education is expensive, try ignorance" - Derek Bok

linux in manufacturing by SuperQ · 2000-11-12 21:44 · Score: 4

Linux is great for manufacturing.. I work for a mid-sized auto remanufacutuer, and we use postgresql all over the place, to do inventory, and core checkin. postgres has made huge strides in the last few years.. we started out with an old pentium running 6.3 postgres.. it was slow as dirt.. we are currently running 6.5.3, on a 600mhz alpha system.. and will probably move to 7 one of these days.

you have a budget? what's that? *smirk* I'm lucky if i get $50 for a stack of CD-R's.

Re:linux in manufacturing by Cramer · 2000-11-13 00:58 · Score: 2

As I recall, postgres is designed (optimized) for a PPro. In my testing of postgres (it was the only DB the RADIUS application knew how to talk to), a 486DX50 was faster than a 533 Alpha -- running 6.2.1 (as you can see, that was a while back.) I've not done any tests since "time travel" was removed -- and boy did that make a difference.

Re:Insertion of large text objects by SamBeckett · 2000-11-12 21:50 · Score: 1

There used to be an 8K limit on row-sizes.

The author of the article says its going away next release.

Re:Why the partisan fuss? by Trinition · 2000-11-12 21:51 · Score: 3

Don't forget row-level locking. Last I heard, MySQL only supported tabel-level locking. Someone was supposed to be adding it to their own flavor of MySQL, but not the main package.

I think the lack of tools, row-level locking, transaction support, etc. in MySQL are a shame. Yet, people talk like its the best thing since sliced bread. This has lead me to discover two kinds of MySQL users:

The first-time DB user who jumped into MySQL because its free. They sing the praises because they like OpenSource, the popularity of MySQL and they don't know any better.
The experienced DB-user who knows of MySQL's shortcomings and have determined that those shortcomings are not significant compared to the cost of a DB that does have those features -- at least for their purposes.

Re:Insertion of large text objects by tzanger · 2000-11-12 21:54 · Score: 1

I noticed, though, that it would not understand insertion of large text object data within an sql INSERT command as opposed to MySQL.

Postgres supports Large Objects but I've never had to use them. In fact I'm not even sure how to use them. :-)

Re:Why even bother? by carlos_benj · 2000-11-12 21:55 · Score: 1

Of course, any serious, enterprise-class projects use Oracle or DB2. I don't think that that's even a question.

You're forgetting Informix which runs many enterprise-class installations, 24/7, OLTP, Tbytes of data, all that rot.

--

As a matter of fact, I am a lawyer. But I play an actor on TV.

Re:Test Methods Allert by Anonymous Coward · 2000-11-12 21:55 · Score: 2

Did anybody else notice that PostgreSQL actually improved performance with more concurrent users ?

I didn't notice that. In fact, I noticed the opposite (which is exactly what was expected.)

What I noticed was that as you increased the number of clients, the total number of requests served increased, while the number of requests per client was lower - which is exactly what should happen - the performance decreases as the number of clients increases.

The problem is that you don't know how to read a graph.

Re:Insertion of large text objects by AlexA · 2000-11-12 21:57 · Score: 2

Try looking here . Version 7.1 should allow you to insert large (>8kB) amounts of text without having to use the large object interface (i.e. a simple INSERT/SELECT/UPDATE will do). If you're planning on implementing large object storage under PHP, read here about the pg_lo* functions, it has some useful examples.

Re:large? by NineNine · 2000-11-12 21:59 · Score: 1

That's bullshit. First of all, jsut because you're doing open source stuff, and have never even seen a real enterprise app, don't tell me that one terrabyte is 'enormous'. It's fairly standard. I've worked on several projects that had databases this big.

Secondly, Oracle IS object oriented. You can use eithe ra relational or object-oriented structure in Oracle.

Thirdly, I've never even heard of these DB's that you mention. Do they even support databases of this size?

Then use Interbase by uradu · 2000-11-13 02:16 · Score: 2

It's got both stored procedures and triggers. Some people I know that moved from Interbase to MSSQL have liked both features better on Interbase. Of course, MSSQL comes with a lot of eye candy such as Enterprise Mangler. Also, a lot of people consider Interbase one of the most rock solid db servers out there. I'm running one setup at home using Interbase 6 on RedHat 6.2 on a 486DX2/66 with 32MB RAM. Performance is quite decent for 3-5 users. The limiting factor is the slow-ass drive in there (250MB IDE). More RAM and a faster drive would grease it up.

Re:16 queries on one page? by Cramer · 2000-11-13 02:28 · Score: 3

Not necessarily. Database design is one of a very few true black arts (read: serious voodoo.) The number of queries alone isn't a problem. The layout of the data, indexing, number of queries, complexity of queries, and the volume of returned data all factor in.

Let me provide a real world example. Some years ago, it was my task to roll out RADIUS for an ISP's dialup network. The stock setup for the USR RADIUS server -- it was the only one I could get to talk to a SecurID server correctly -- was, shall we say, non-optimal. USR's stats said one (1) authenication per second when using postgres. Let's ignore the way they had postgres setup (fsync enabled) for the moment. The reason everything was so bloody slow was the database schema. Even though there was only 1 query against the USERS table, it had to return 5k. Each row of the USERS table contained everything you could ever imagine (well, that USR could imagine.) Each row consumed almost 5k of space.

After three months of testing, data collection, and analysis, I redesigned the mess. USR literally laughed at my design because it made between three and six queries depending on the user. They stopped laughing when I showed one of their engineers the setup actively handling over 100 authentications per second. The USERS table, holding 10,000 entries, was about 2M.

Re:oracle by dhogaza · 2000-11-13 02:31 · Score: 2

Uhhh...you can use Oracle 8i for DEVELOPMENT for free for as long as you want, or use an EVALUATION copy for 30 days.

But ... when you deploy, you must pay. Read your licence, please. While this wouldn't bother me a lot, it might bother Larry's lawyer's, and if your resulting site's visible I suspect they'll notice. Since you've posted your URL on slashdot, the odds that they'll notice may be substantially greater than it was until you posted :)

The resulting lawsuit might not be terribly pleasant.

While you're free to do what you want, and while Oracle's not known for going after small-fry and seems to be much less obnoxious than MicroSoft in terms of license enforcement, folks should at least know that what you are proposing is a violation of your Oracle license.

At the risk of being redundant - deployment or other non-development, non-evaluation use = $$$.

Re:Your Mileage May Vary by bracher · 2000-11-13 02:44 · Score: 3

3. MySQL's "text indexing" is useless. The evaluation function returns every record that contains any of the search terms; there is no way I've found to require all search terms. No documentation, of course.

actually, in the mysql fulltext search docs the mysql guys do a decent job of detailing the _two_ options for querying the fulltext index. putting the "match (a,b) against ('words')" in the "select" list returns the relevance rank against all rows in the table, including those with zero relevance. if you put the match clause as part of the "where" clause then mysql returns only those rows with non-zero relevance, ordered by the relevance score (but the relevance score is not one of the return columns).

- mark

Re:real-world choosings.... by dhogaza · 2000-11-13 02:48 · Score: 2

While InterBase is a fine choice (the OpenACS project intends to port to InterBase if the politics surrounding it clear up, as well as continue to support PostgreSQL), let me clear up two misperceptions about PostgreSQL that you apparently hold:

1. PostgreSQL does run under NT. It runs under cygwin, and I have no personal experience running it under NT, but it does run under NT (and, yes, I do mean the server, not just various clients).

2. In PostgreSQL, writers never block readers, so your statement to the contrary is a factual error. InterBase and PostgreSQL use a very similar paradigm for the storing of actual data, and InterBase's "Multi-Generational" stuff is equivalent to PostgreSQL's "Multi-Variate Concurrency Control", or "MVCC" for short).

Again, I'm not disparaging your use of InterBase, a fine product that hopefully will survive the current political machinations that surround it. But I can't let your misconceptions about PostgreSQL pass without comment.

Re:Your Mileage May Vary by StormyMonday · 2000-11-13 03:06 · Score: 2

But "nonzero relevance" doesn't mean "contains all terms"! Go back to the docs and take a careful look at the examples they give.

The only use I've found for the full text search is in ordering items found by other means. Marginal utility, at best.

It would also be nice if they gave a little formula somewhere for the "relevance". What's it really measuring?

--

--
Welcome to the Turing Tarpit, where everything is possible but nothing interesting is easy.

PostgreSQL Beta by AlexA · 2000-11-12 21:06 · Score: 1

Does anyone know when the PostgreSQL 7.1 beta is supposed to be released?

Re:PostgreSQL Beta by bmomjian · 2000-11-12 23:43 · Score: 2

We hope to go beta in the next month, and final around January/February.
Re:PostgreSQL Beta by iamriley · 2000-11-12 23:30 · Score: 1

Does anyone know when the PostgreSQL 7.1 beta is supposed to be released?

Yeah, and does anyone know when /. is going to make the switch to PostgreSQL? I'm tired of waiting several seconds for each page to load on my cable modem connection.

--
If you can read this, then I forgot to check "Post Anonymously".

Re:real-world choosings.... by Johann · 2000-11-13 03:29 · Score: 1

PostgreSQL does run under NT.

I had not considered the cygwin option and I concluded that PG does not run on NT natively when I went to download a binary distribution. Have you tried it under cygwin? Is there a performance hit because PG would run in the cygwin emulation, rather than native?

I was unaware of the MVCC of PG because the NT part was the first issue, so I did not explore use of PG after I didn't find a binary NT version.

Thanks for the corrections!

"Fat, drunk, and stupid is no way to go through life."

--
"You're gonna need a bigger boat." - Chief Brody

Recent MySQL News at Boston Perl Mongers by ajs · 2000-11-13 03:30 · Score: 2

Well, I just recently attended the Boston.pm group meeting which this time around was hosted by NuSphere. These are the folks that are doing MySQL modifications for their release of MySQL/Perl/Apache and PHP.

The big news was that a re-worked version of the *Progress* database back-end would be turned into an open-source MySQL-compatible back-end called Genie. This would be pretty huge, as Progress is everything that people complain that MySQL isn't. Also, MySQL is adding (either in their next release or the one after that, depending on the feature):

* Subqueries
* A boolean type
* Foriegn key constraints

Great by djweis · 2000-11-12 21:09 · Score: 1

It's good to see that Postgres has shown you can include real RDBMS features and still be faster than MySQL.

Re:Great by Cramer · 2000-11-13 00:49 · Score: 1

Too be fair, let's remember postgres and mysql were both designed completely differently by different people for completely different tasks. Each does what the creators see as necessary.

And, for the record, MySQL does has transactional capabilities -- I'm really surprised no one has posted this. There was a press release some months ago about this complete with instructions on how to activate support for transaction when building. (Note: I've not tried it.)

And while we're comparing databases, why not compare against some commercial engines like Sybase and/or Oracle? I know Oracle will shit kitten, but do it anyway.

BDB tables? by aozilla · 2000-11-12 22:02 · Score: 2

Was the author using BDB tables? If so, that's probably the problem. MySQL allows you to use BDB tables, which support transactions, but they are much slower. It is also possible that transactions, which are somewhat new to MySQL, are not yet well optimized. Either optimize your database to make transactions unnecessary, or use postgresql, at least for now. Why MySQL made the mistake of supporting transactions, I will never understand.

--
ok then your [sic] infringing on my copyright! Could you as [sic] me next time before STEALING my comments for your own?

Re:BDB tables? by FlyingDragon · 2000-11-13 06:06 · Score: 1

If I'm running a query that is updating a number of financial records to place orders and proccess information, I don't want that process to bomb out half finished
Fair enough, but don't buy into the myth that the majority of dynamic pages on the web involve critical information.
The majority of dynamic web pages are a dizzying array of counters, polls, news, comments, and other trinkets. Given how rarely a box goes down unexpectedly, losing a vote or comment is trivial.
Even in an e-commerce situation, not all data needs transactions. Take product info, for example. It's almost all read-only and staff-updated.
Mysql works wonderfully for dodging concurrency problems in a load-balanced cluster and offers faster, easier ways to sort data than a file system. BDB or no, I wouldn't trust it with financial transactions, but it's great for the splash of interaction most sites need.
Re:BDB tables? by Soulfire56 · 2000-11-12 23:15 · Score: 2

MySQL made the "mistake", as you call it, of support transactions for the simple reason that in a web environment like MySQL is often being run, transactions become especially important.
If I'm running a query that is updating a number of financial records to place orders and proccess information, I don't want that process to bomb out half finished, I want it to either be completed or fail to complete. I want the security of knowing that there isn't random orphaned data hiding within the system.
MySQL is a great quick small end database for web applications, and I really hope they find a way to optimize thier transaction work, because I really liked working with it. Until then we will continue in our transaction dependant world.
Re:BDB tables? by Precision · 2000-11-13 01:32 · Score: 1

Being the guy that setup MySQL on that box I can tell you that we have BDB support compiled into our MySQL, but the tables are still MyISM. Basically Tim wanted to test out the MySQL transaction stuff so I compiled it in..

--
- U

Re:Aditional questions: by Surak · 2000-11-12 22:04 · Score: 3

Which of them is most stable (with HUGE databases) ?

What does HUGE mean? Gigabytes? Terabytes? Exebytes? Also, size is not the only metric involved in stability. Stability with a given number of records or fields, for instance. A 100 gigabyte database could have 10 records or a million records. Depends on how big each record is. Another metric is the number of simulatenous requests it can handle. One might work fine if it gets a 100 requests at a time, but chokes if it gets a 1000 requests at a time. A million record database might be only accessed by 2 clients at a time...size doesn't equal number of users.

Which of them is most likely to be able to recover withouth glich from an "impure" shutdown ?

Good point, but they're probably about the same in that regard (just taking an educated guess here).

Which will benefit more from a HUGE ram quantity or multiple cpus ?

Most well-written software will take advantage of as much RAM as it can get. The question would be which program is better at managing its own memory?

As far as mutiple CPUs go, programs have to be written specifically to support multiprocessing. On Linux or Windows NT, for the most part, this means writing the code to use multiple threads. I'm not sure if either of them are or are not, but that would be a very good point.

--
My journal has hot /. gossip.

Re:July Article by xneilj · 2000-11-12 22:05 · Score: 1

What are you talking about? The linked article has a date of 20001112. At the top of this (new) aticle, the guy refers to a previous article he wrote back in July...

--
rm -rf / is the evil of all root

Maybe I'm ignorant... by superdoo · 2000-11-12 22:06 · Score: 1

but how does SourceForge function when it has pages with 16 queries, crashes with 30 simultaneous clients, and serves .77 pages/s with only 5 clients? I'm no web/database guru but that kind of performance seems crazy. I guess those projects can't be all that busy or the site would be locked-up all the time? or what?? Judging by the performance of this column at phpbuilder.com maybe that site has some performance issues as well. I guess my question is, should I be using these sites to learn PHP/Database programming? Do these performance problems show actual problems with the database server or with the database schema and program designer(s)?

As I said, maybe I'm ignorant and missing something as I am new to this type of development...

Re:Maybe I'm ignorant... by jbellis · 2000-11-12 23:02 · Score: 1

I'm guessing that they probably have beefier machines in production than they do for "let's test these databases."
Re:Maybe I'm ignorant... by johnnyb · 2000-11-12 23:13 · Score: 4

I think you're missing how many clients 30 is. As far as hard-hitting clients go, 30 is a _huge_ number to have simultaneously. If you remember MathWorld, it served up 2 Gig worth of data per day, yet only had between 5 and 20 clients at any one time, most of which were downloading pictures, not pages. I'd say that we were probably getting about 3 page connections per second (I could be wrong, my data is obtained from watching a tail -f on a log file, and remembering the results).

Anyway, the number of tables you have depends greatly on the problem you are trying to tackle. If you've used SourceForge, you'll see that there is a whole lot of data displayed on each page. I would be more worried if he would have been able to get all of this from a single database query (could you imagine a table that would do that - yuch)

--
Engineering and the Ultimate
Re:Maybe I'm ignorant... by supersnail · 2000-11-13 00:29 · Score: 1

The guy did point out that he specially selected the database intensive pages for the tests, so, this would not represent the typical usage of real people sufrfing the site.
This seems fair enough as he wants to thrash the database rahter than the web server.
Completely off topic now. If we put togther a system for a TPC/C benchmark using donated hardware and free software could we get a "transactions per second per $" of infinity??

--
Old COBOL programmers never die. They just code in C.

Re:MySQL wins hands down. by dhogaza · 2000-11-12 22:11 · Score: 2

The link to the supposed IBM study returns an error. Which is too bad, because I'd love to critique the study - Tim Perdue's benchmarking work isn't the first to show that MySQL's table locking paradigm breaks down under high load. Remember that he's been studying alternatives because SourceForge, using MySQL, has had problems in this area. And remember that Slashdot has helped fund the integration of the Berkeley DB backend and MySQL for the same reason.

It's not all myth, folks. Table locking sucks, page-level locking (Sybase) is much better and row-level locking (Oracle, Interbase, Postgres) much better still in high concurrency systems.

Re:What about other Database Tools? by ichimunki · 2000-11-12 22:20 · Score: 3

I don't know about MySQL, but from what I can tell psql for postgreSQL is fairly full featured in terms of wrapping SQL in functional language-- and comparing the minimal work I've done in GUI DB tools to the experiences I've had working in command line or Perl with pgsql, I'd take the non-gui process any day, it's much more direct, flexible, and doesn't suffer from the same tool-specific learning curve (i.e. solid Perl + solid SQL = extremely portable skill, whereas any of the listed commercial tools may or may not be portable-- this is important for both ramp-up time on new experiences/jobs/whatever, and for personal development or career building). As for the tools you mention, do any of these function via ODBC, and therefore have the capability to connect to any database with drivers? Not to say that you don't have a valid point, I'm just wondering if there isn't a good reason why the Free Software types haven't embraced GUI and ungeek-friendly tools...

--
I do not have a signature

What's www.phpbuilder.com Run? by semicolon · 2000-11-12 22:21 · Score: 1

I wonder what www.phpbuilder.com uses, Postgres or MySQL, because right now, its slow as molasses.

Re:What's www.phpbuilder.com Run? by Sxooter · 2000-11-13 06:34 · Score: 1

mysql. tim had problems with the earlier (6.5.3) versions of postgresql and corrupted indexes I believe

--

--- It is not the things we do which we regret the most, but the things which we don't do.

interbase "open source" db by wheel · 2000-11-12 22:23 · Score: 1

A while ago, interbase was brought to my attention as a commercial grade, free, "OS" database. Their licence, the IPL, is prety non-restrictive, but I don't know how compatible it is with the GNU/GPL. Any opinions on this?

I haven't had time to evaluate it myself, either. Anybody out there used it?

Re:real-world choosings.... by bugger · 2000-11-12 22:23 · Score: 3

Interbase is free, it is not restricted by the GPL.

Interbase has been released under a variant of the MPL that "protects" the Interbase name (IOW, only Interbase may release certified Interbase builds). Anyone may grab the code and use it whichever way they like.

Somebody mod this up by jabbo · 2000-11-13 03:31 · Score: 2

This gets right at the heart of why good Oracle DBAs often pull salaries in excess of $150K. At the end of the day, speed and reliability count for a lot more than other peoples' preconceptions.

--
Remember that what's inside of you doesn't matter because nobody can see it.

Another real-world observation by Adam+Wiggins · 2000-11-13 03:32 · Score: 2

We were using Informix at my workplace for a while; in less than a month the thing completely "melted down" and dropped all of our data. The suits were in a panic.

We switched over the Postgres in one swift maneuver, and it's been running great ever since. Speed was never in question; we just needed a database which wouldn't drop all its data one night while no one was looking. :)

Besides which, the ultra-cool pg_dump command makes backups a no-brainer.

Kudos to the Postgres team. And my company won't be dump enough to try to use proprietary software again in the future, that's for sure...

Pg 7.x kicks ass! Re:linux in manufacturing by StandardDeviant · 2000-11-13 03:44 · Score: 2

I've been using Pg since the 6.4 series of builds, and about a month and a half ago switched to 7.0.2. It is worth the upgrade IMHO, but of course since you're relying on it for critical business data you'll probably want to test it some first... ;-)

Oh, and if you don't have a budget how come you have a 600mhz alpha? 600 amd mhz is pretty cheap, but my impression is that 600 Alpha mhz are very much not... :-)

--

--

News for Geeks in Austin, TX

Re:Why even bother? by carlos_benj · 2000-11-13 03:45 · Score: 1

That may be correct. I don't remember off-hand which product it sprang from but they've taken vastly divergent paths since then.

--

As a matter of fact, I am a lawyer. But I play an actor on TV.

View the PRINTABLE page - one page by aint · 2000-11-13 04:09 · Score: 1

Well, the printable version is one page, read it.

-- .sig --

Re:open source MOBILE database by ichimunki · 2000-11-13 05:10 · Score: 2

That license is the most insanely tortured piece of legalese I've seen in a long time. How is it useful to call something Open Source when, after reading the license, I have almost no clue what I'm allowed to do with the software? Is this not proof that Open Source is a phrase that is too vague to be useful, since so many seem to feel the need to write these kind of "Open Source" licenses that confuse the issue of user freedom to a point beyond which it becomes almost unrecognizable?

--
I do not have a signature

Re:large? by xeno-cat · 2000-11-12 22:24 · Score: 1

Please name situation in which your clients database has reached 1TB. I have worked with SAP, and Baan (When Baan was solvant), have not seen a dataspace of 1TB and these are enterprise apps.

I'm not saying that 1TB and greater DB's are not out there, I know one individual who designed a multi TB database for some phone company. These, as the previous poster mentioned, are specialized. You are probably a consultent in a specialized market without even realizing it. The majority of the market does not see databases this size yet.

Cheers

--
"A few great minds are enough to endow humanity with monstrous power, but a few great hearts are not enough to make us w

Re:open source MOBILE database by RocketJeff · 2000-11-12 22:31 · Score: 3

db.linux is OSS Data Management for Information Appliances
Open source and royalty free.

Except that this is only true when using it on an 'Open Source Operating System.' This means that your project can't use db.linux if you ever want it ported to Solaris, AIX, or Windows - even if your software is Free/Open.

I'd love to use db.linux on a couple of projects but this limitation is a killer. The complete license is at License Terms.

Your Mileage May Vary by StormyMonday · 2000-11-12 22:32 · Score: 5

Please note that this benchmark (and any benchmark, for that matter) applies only to the system and application that it's testing. Database applications vary so widely that it's very difficult to get any meaningful numbers outside of very specific areas.

As an example, I recently ran some tests that came up with the exact opposite results. My application is a large message tracking database. Query speed is secondary; insertion speed is critical. MySQL handles my test data set in 20 minutes; PostgreSQL takes over 3 hours (!). For this application, PostgreSQL is Right Straight Out.

Other notes:

1. PostgreSQL's tables took up roughly twice the space of MySQL's.

2. MySQL's lack of transactions is a real pain. I can, however, work around it (in this particular application, at least).

3. MySQL's "text indexing" is useless. The evaluation function returns every record that contains any of the search terms; there is no way I've found to require all search terms. No documentation, of course.

4. The latest beta of MySQL can use Berkeley DB tables to get real transaction handling. Unfortunately, this is even slower than PostgreSQL.

Obvious conclusion: Run your own tests and draw your own conclusions.

--

--
Welcome to the Turing Tarpit, where everything is possible but nothing interesting is easy.

Re:Your Mileage May Vary by King+Babar · 2000-11-12 23:07 · Score: 3

Please note that this benchmark (and any benchmark, for that matter) applies only to the system and application that it's testing. Database applications vary so widely that it's very difficult to get any meaningful numbers outside of very specific areas.
As an example, I recently ran some tests that came up with the exact opposite results. My application is a large message tracking database. Query speed is secondary; insertion speed is critical. MySQL handles my test data set in 20 minutes; PostgreSQL takes over 3 hours (!). For this application, PostgreSQL is Right Straight Out.

Well now, that information is completely useless, because you didn't mention anything that I would need to replicate your result. You didn't mention the versions (or the identities, really) of the databases you used, the hardware, the operating system, anything very interesting about the application...
So what exactly were we supposed to glean from this? In particular, the point of the target article, as I understand it, was really "PostrgeSQL used to be horribly slow for our application, but when we had some issues with our current set-up, we re-tested the very newest version of PostgreSQL and the latest mySQL we could deal with, and, wow, things have changed."
Now, I don't personally either believe or disbelieve what you posted, but only because I have no idea what proposition I'm supposed to be interested in here, because you didn't tell us.

--
Babar
Re:Your Mileage May Vary by Sxooter · 2000-11-13 00:35 · Score: 1

With regards to postgresql... Did you try bracing begin/end commands at each end of your super large insert series? On my box, 10000 inserts run in 11 seconds with begin end. 100 run in 12 seconds without them.

--

--- It is not the things we do which we regret the most, but the things which we don't do.
Re:Your Mileage May Vary by StormyMonday · 2000-11-13 01:36 · Score: 2

Yup. The results have the BEGIN/COMMIT brackets in there. You're right; it does make a really big difference.

Question is moot, anyway. We'll probably just stuff everything into a flat file and forget about "online response" (:-(.

--

--
Welcome to the Turing Tarpit, where everything is possible but nothing interesting is easy.

Re:Why the partisan fuss? by tuffy · 2000-11-12 22:42 · Score: 2

I think the lack of tools, row-level locking, transaction support, etc. in MySQL are a shame. Yet, people talk like its the best thing since sliced bread.

I'm pretty certain few people actually think MySQL is a wonderful DBM that's the best thing since sliced bread. Most are keenly aware of its deficiencies but defend it because of two features that are invaluable to a successful open-source product:

It is well-documented. MySQL has copious amounts of online documentation (much more than PostgreSQL, as I recall) and at least two large books (one if which is quite informative). The importance of good documentation cannot be overstated.
It interoperates well. MySQL has bindings for about a zillion different languages, whereas PostgreSQL is still catching up. This gives MySQL the appearance of being better supported, and in the world of open-source, the better supported product will win out over the technically superior one.

There's no doubt that PostgreSQL is a good DBM, but MySQL will remain more popular until the support for it arrives.

--

Ita erat quando hic adveni.

Re:MySQL wins hands down. by segmond · 2000-11-12 22:45 · Score: 2

the url you posted is wrong, please post the correct URL, I do like to read IBM's report.
thank you.

--
------ Curiosity killed the cat. {satisfaction brought it back | it didn't die ignorant | lack of it is killing mankind

Re: What about ... - They're out there by evolspit · 2000-11-12 22:49 · Score: 1

There are some admin tools for PostgreSQL here.

pgadmin - 32bit app for win*

phpPgAdminweb interface admin. This is a port of phpMyAdmin for MySQL (which I believe can be found at zend.com)

Re:16 queries on one page? by nebby · 2000-11-12 22:53 · Score: 3

I'm pretty sure k5 sets a limit of <100 queries on a page before they start worrying about performance. At least that's what Rusty said if I remember correctly. I'm guessing they probably run about 10-20 queries on most pages, however.

My site runs about 10-20, but thats only after caching lots of data in memory.

--
--

Re:Aditional questions: by maraist · 2000-11-12 22:54 · Score: 2

Mysql uses multi-threading (each connection gets it's own thread), but I've read slashdot posted benchmarks that show Mysql failing after so many threads are spawned via concurrent connections(I think 60). Perhaps they're using some finite resources and the thread spawner was not using semephores or some other such resource control.

Postgres uses multi-processing (each connection gets it's own process) with shared memory segments and a controlling/monitoring process (postmaster).

Thus if Mysql could do a better job of managing resources under heavy loads, then it would waste less physical memory. Additionally, I don't know how good Linux is with multi-threading under heavy loads. Can it faithfullly handle hundreds of potentially thousands of concurrent/competing threads? I've never seen any tests to show this... I'm aware that the default max processes/threads was hard-coded in the kernel to 512 at one point (I believe it requires a recompile to change). This might have something to do with mysql's crashing.. You'd have to go to single-user mode to max this puppy out. Additionally I don't know if it's possible for an app to determine how many more threads can be spawned by the OS.. If a thread fails to spawn, then you can corrupt the entire process.. In MP, if you fail to spawn a process, then you simply lose that connection.

In general, I'd prefer MP on a Linux system. It's a similar story with Apache. You get a lot of reliability, and not _that_ much memory waste when compiled correctly. Additionally, MP has the potential to span computers (as in a slasdot article on a multi-hosted extension to Linux)

-Michael

--
-Michael

Low end machine? by linuxgod · 2000-11-13 05:11 · Score: 1

How would these 2 compare on a low end machine?
Around 200mhz / single processor?

Since i run a small site that uses MySQL, Apache/Php4,
I would like to know. I might convert the server over to
postgres if it would show performance above mysql on
this type of machine.

I don't think it would make much difference, accept
that it would give me more flexability with my site
and the same/ or better speed. Most of my
traffic is on FTP anyway.

ETRN x

Re:Low end machine? by linuxgod · 2000-11-13 11:02 · Score: 1

More like you ridiculing me.
Your the only one on here doing it.
Its childish too.

ETRN x
Re:Low end machine? by linuxgod · 2000-11-14 05:25 · Score: 1

You are an idiot.
Repeating yourself.

Look, he is still at home at 8:00am.
What a looser.

ETRN x

Re:MySQL wins hands down. by nedlilly · 2000-11-13 05:15 · Score: 1

The link you included to IBM returns a 404. Got another one?

A soon-to-be-open-source by popocatapetl · 2000-11-13 05:58 · Score: 1

Yes, this is a plug for something that I am working on, but it's something that we are very proud of and which is going open source on x86 GNU/Linux.

Check out GT.M at http://sourceforge.net/projects/sanchez-gtm. This is a vetted, industrial strength high speed transaction processing database that is in production worldwide at a number of large (and small) banks, and has been benchmarked at 1,200+ online financial transactions per second on a high end RS/6000 (http://www.sanchez.com/news/pr000828.htm), every transaction having full ACID properties.

It's not open source yet, but it will be soon (other niceties such as a project home page are not quite there yet because we're busy scrambling to get a release out). Meanwhile, a binary can be downloaded, as can PDF manuals.

GT.M also has some unique database replication capabilities for creating an application that is continuously available 24x7, even if the data center is blown away or even as the underlying database schema is upgraded. Read the manual for details.

What's missing? SQL/ODBC/JDBC access requires a commercial product (an open source freeware solution is not yet available).

Re:Aditional questions: by The+Mayor · 2000-11-13 06:53 · Score: 1

Actually, I think "Exebyte" is a company that manufactures 8mm tapes and tape drives. Tee hee!

--
--Be human.

Re:open source MOBILE database by maw · 2000-11-13 08:14 · Score: 1

Open source but not free.

If it's not free then it is not open source.

But do realise that you can pay money for software which is open source (or free -- they're the same thing).

--
You're a suburbanite.

Re:Insertion of large text objects by dhogaza · 2000-11-13 08:25 · Score: 1

PostgreSQL, as distributed, comes configured with an 8KB blocksize, which effectively limits the size of varchar to a bit under that. However, you can configure it to use a blocksize of up to 32KB, which means you can handle the largest string Netscape will return from a textarea widget.

Among many other enhancements, PG 7.1, due for beta very shortly and the version Tim Perdue reviewed in his article, removes this restriction. You can have varchars (and other large types) of virtually any length - though at some cost in performance. Some types (like text) can also be compressed much as InterBase does with its text type.

Sourceforge is SLOW by ink · 2000-11-12 22:59 · Score: 1

I know that this may be off topic, but Sourceforge is one of the s l o w e s t sites on the internet. I wouldn't mind so much, if they hadn't co-opted so many open source projects. The other day I was trying to download Galeon and the FTP download kept timing out. An Australian mirror site handed out the goods in less than 10 seconds (we have very fast internet connections here). I doubt this is a side effect of using MySQL's table locking, but I suppose it could be; regardless it is frustrating to see more and more projects moved to an already incapable system.

Their HTTP serving seems to be a quick as can be, it's only while attempting to access download.sourceforge.net that it becomes unusable.

The wheel is turning but the hamster is dead.

--
The wheel is turning, but the hamster is dead.

Re:Sourceforge is SLOW by Precision · 2000-11-13 01:40 · Score: 1

This is something we know about and that we're fixing. Look for a big boost in a couple weeks.

--
- U

Re:Aditional questions: by god,+did+I+say+that · 2000-11-12 22:59 · Score: 1

As far as mutiple CPUs go, programs have to be written specifically to support multiprocessing.

No they do not.

this means writing the code to use multiple threads.

No it does not. It is the path of least resistance, though. Unfortunately, it tends to be the buggiest path, as well.

--

--

-- Eat right, exercise regularly, die anyway.

Stored Procedures? by guinsu · 2000-11-12 23:01 · Score: 1

Do either MySQL or PostgreSQL include the ability to use stored procedures? I've found them to be a life saver on SQL Server and wouldn't mind trying open source/gpl'd databases if they had that feature.

Re:Full Text indexing. by bigdisk · 2000-11-12 23:03 · Score: 1

I don't think the /contrib/ code for full text indexing in postgres counts as real FTI. There's no relevancy ranking and it's kludgy to set up and select data out of it.

However, it does make good use of postgres' triggers and someone with some time could enhance it to make it useful.

In particular, someone should enhance it to make the "noise words" list selected out of a database table, instead of hard-coded in the C-code of the.

Re:16 queries on one page? by johnnyb · 2000-11-12 23:04 · Score: 2

Actually, this probably means that the database was very well-designed. I have seen way too many databases that consisted of one to three tables. Most real-world databases require a whole lot more. Maybe you should also look at the page he is talking about before you comment. That page has a lot of functionality.

--
Engineering and the Ultimate

Re:What about other Database Tools? by Chagrin · 2000-11-12 23:06 · Score: 1

Just use the appropriate ODBC driver (myODBC for example) and keep using your tools. Works fine.

--

I/O Error G-17: Aborting Installation

Re:What about other Database Tools? by segmond · 2000-11-12 23:40 · Score: 2

Since you bring this up, you seem to bring it up as if you would like to use it. Open and free software come as a result of programmers wanting to scratch their itch, if you have this itch, why don't you start the project? What are you waiting for? You can do it, even if you are not a great coder, start something, look at slashdot, look at where it is, it didn't come from a great codebase, but it is here because it was started. Scratch that itch! :)

--
------ Curiosity killed the cat. {satisfaction brought it back | it didn't die ignorant | lack of it is killing mankind

Re:16 queries on one page? by jslag · 2000-11-12 23:06 · Score: 1

Anyone who has designed a site and its database in such a way that generating a page requires 16 queries should not taken seriously when conducting a database performance test.

Nonsense. One page may well be doing some extremely complicated processing, which has evolved over many iterations of the code - it's not hard to imagine that such a page could have 16 queries in it, especially if it's being written in an oo style. Sure, in an ivory tower you could spend days / weeks refactoring and cut down the number of queries a bit, but computer time is now significantly cheaper than programmer time.

Re:large? by NineNine · 2000-11-12 23:07 · Score: 1

Oh, easy. I'm an ASP/COM/Oracle developer. I jsut got finished converting a state's Medicaid system from Mainframe to Web-RDBMS based. The Medicaid claims filled approx 1.5 terabytes, and grew at a rate of 100 gig/month.

Re:16 queries on one page? by Vryl · 2000-11-12 23:44 · Score: 1

What about one that performed several hundred or even several thousand?

Such things exist, and you, Sir, are talking out of your hat.

Pray tell me your criteria for deciding that it was 'very badly designed', and how you would do better.

oracle by stepson · 2000-11-12 23:52 · Score: 2

You can get Oracle 8i for free from oracle, and while its a bear to setup, I've found it to be faster than PostGRE 7.0 in a lot of ways ... of course, you'll make up for that with all the money you'll spend in training costs ;).

Re:oracle by stephend · 2000-11-13 00:05 · Score: 1

For anyone that thinks MySQL is good, you should take a look at Oracle. It may be closed-source, but it does make MySQL look like a toy. The basics are no more complex than Postgres but there's a lot more to it.

And if you need help setting up Oracle, have a look at my web site (there's the Oracle on Linux Installation HOWTO there).
Re:oracle by Sensor · 2000-11-13 00:14 · Score: 1

thats for a non-commercial training/development license only!

Oh, good by karma+kameleon · 2000-11-12 23:59 · Score: 1

Finally, a SQ..., *cough* I mean, sequel to the the last database article.

Re:Test Methods Allert by Anonymous Coward · 2000-11-13 00:03 · Score: 1

What *I* noticed is that in one graph the number of request *per*second* increseased for PostgreSQL. And this does not look right, so the original poster is quite right.

Sorry, No.

Yes, the number of requests-per-second increased (by a large amount at first, then leveled off after you hit 20 clients) but keep in mind that you're MAKING MORE REQUESTS. If the database is properly scalable, the number of simultaneous connections will increases until the DB hit's its' processing limit, and then level off, which is exactly what happened in the graph on pages 3 and 4.

Think about it - (on page 3) the DB is capable of handling (approx) 3 "complex queries" per second - it doesn't matter how many clients are making those requests... now, if your clients take too long to process the data, and (with 5 clients) only request 1.5 pages per second, then that's all that the DB will serve.

The initial scaling period (from 1.5 to 3) tells me that at 5 clients, the DB is waiting around, and that the web server (or whatever is processing the returned data) is the bottleneck - you increase the number of clients, the DB sever has less waiting around to do.

The problem with the article is that there is no information on the queries themselves, or how he conducted the tests (did he run each one several times and average the results? what were the test systems?, etc) .. With some substantive data information, we could find out what the reasons for the results were, but right now, all we have is conjecture.

Re:Aditional questions: by AndroSyn · 2000-11-13 00:05 · Score: 1

One of the things not often mentioned with multiprocess vs multithreaded applications is the penalty you must pay for keeping the cache on multiple CPUs sync'ed when dealing with multiple threads. Also with linux threads are scheduled in the exact same fashion as processes so you gain even less.

Now on an OS like Solaris, you might gain from multiple threads, but only because Solaris' scheduler can tend to suck hairy nuts at times...

Re:Aditional questions: by Emil+Brink · 2000-11-13 00:05 · Score: 2

What does HUGE mean? Gigabytes? Terabytes? Exebytes?
Ah, "exebytes". That's the unit of choice for measuring the size of executables produced by that big & rich company, right? :^) The SI prefix you're looking for is exa. On the other hand, perhaps exe is handy as a "computerization", along the lines of the "K" for 1024 in KB (SI uses "k" for 10^3)... More information about SI prefixes is available here. Or, maybe your fingers slipped and I'm just being more nitpicking than usual. ;^)

--
main(O){10<putchar(4^--O?77-(15&5128 >>4*O):10)&&main(2+O);}

Re:Why the partisan fuss? by jidar · 2000-11-13 08:36 · Score: 1

paraphrased: Dumb newbies use Mysql, but us smart people use postgres.

This is insightful? Surely it should be moderated as flame bait.

--
Sigs are awesome huh?

Re:real-world choosings.... by TheLink · 2000-11-13 08:48 · Score: 1

Even though I haven't tried, I wouldn't recommend it.

My general rule is: don't use stuff made for Unix on NT, unless you're just messing about, looking at features.

One of the main reasons is often stuff made for Unix assumes forking isn't that expensive. But that is not true on NT. Forking in NT is a real killer. It's almost like double clicking and launching an app. You'll have to wait for the app to do threads _well_ first.

--

Too many replies beneath your current threshold

Open Source SAPDB by Aussie · 2000-11-13 09:47 · Score: 1

anyone looked at SAPDB by SAP ?

They claim Enterprise strength, Free and source available April 2001

Binaries available now at www.sapdb.org

Too many for your simple mind? by bartwol · 2000-11-13 10:40 · Score: 1

You are way out of your league here.

The functional needs of an application (and users) should drive the technical design, and scaling is just another challenge to be met after addressing functional requirements.

I just came off a site design that has 10+ complex queries per page, and this was after extensive optimization. Mind you, the pages have extraordinary, dynamic, user-responsive functionality. If we needed 16 to deliver the needs, then so it would have been.

As for database scaling, there are many options for clustering and replication and multi-processing. Your own words suggest you don't think in this realm. My world is perhaps not so simple as yours.

You speak with such authority, and yet, you are plainly presumptuous, uninformed, and hasty in your conclusions. Listen more. Talk less. THINK.

Re:Too many for your simple mind? by macpeep · 2000-11-13 15:52 · Score: 2

You are way out of your league here.
Not so. You say "you are uniformed and hasty in your conclusions" and then you say stuff like I quoted above. I have extensive experience in large scale web site back end design and have been a lead engineer in several sites that have very heavy loads (such as large dot-coms and large e-commerce sites for more traditional "chimney" companies). I do know what I'm talking about here.
I know quite well that a site might need to pull data from a large number of data sources to build up a page (especially in personalizable portal-type front pages). This doesn't mean that you have to do a large amount of database queries, however. Caching is one crucial strategy if you're going for scalability and speed, for instance. Planning the site and page layout so that you don't need so much data for one page is also crucial. If there is no clear simple way to cache the data you need to produce something more complex then cache the result (yes, the HTML!) and only update the HTML every so often - as needed - based on some flag. Read the flag together with some other data that you must get from the database anyway. This is what I'm talking about.
I'm honestly majorly surprised that the performance test mentioned that the site used a page that required 16 queries. When I saw the performance results, I wasn't so surprised they were so poor. I'd like to see the source code for that page..

Re:Test Methods Allert by Tom+Davies · 2000-11-13 10:51 · Score: 1

>Did anybody else notice that PostgreSQL actually
>improved performance with more concurrent users ?

>The only explanation here is the caching
>behavior of the data base.

No, the other explanation here is that the test was run on a quad processor machine!

Perhaps PostgreSQL can perform simultaneous queries in different threads?

Tom

Tom

--
I have discovered a wonderful .sig, but 120 characters is too small to contain it.

Re:Article summary: by paulschreiber · 2000-11-13 11:28 · Score: 1

It is not uncommon to have orders of magnitude performance differences between databases. I would love to know how what a tuned ORACLE would get.

Well, if you have 4 GB of RAM and 50 GB of disk space, you might be able to find out. :-)

Paul

Re:Aditional questions: by bigdisk · 2000-11-12 23:08 · Score: 2

Stable on huge databases? Geocrawler has 10GB of text in a giant table and runs with only 750MB RAM on a single server. Since updates/inserts/deletes can happen simultaneously, geocrawler does not shut down when new emails arrive.

Stability from an impure shutdown? This is definitely a win for MySQL. Postgres has completely self-destructed for me a handful of times when the machine it was on hard-locked. The only recovery method is to create a fresh database from a backup dump.

Huge RAM and multiple CPUs? This is probably a function of the OS more than anything. MySQL is limited more simply because of its locking problems, right? So no matter how many CPUs you have - only one can update a given table at a time. That's a pretty important limitation if you have a database that has any updates simultaneously with selects.

load control by woggo · 2000-11-12 23:10 · Score: 4

PostgreSQL increases total throughput with an increased number of clients, although it does not do so linearly. That's not the same as "increased performance". Think "scale-up", not "speed-up".

Why? The answer is simple: load controlling. Basically, as concurrency increases, the cost of locking and that of buffer pool misses become prohibitive and transactions begin to starve. So most "smart" DBMSes will queue a few transactions for later processing after a certain point.

Because MySQL uses such coarsely-grained locks (at the table level), the opportunity for concurrency is very low to start with; hence the decreased throughput and starvation.

If you're interested in this stuff, look at Jim Gray's seminal paper "Granularity of Locks and Degrees of Consistency in a Shared Database" and Chou and DeWitt's "An Evaluation of Buffer Management Strategies for Relational Database Systems." You should be able to find either from the ACM Digital Library or in an anthology (like Stonebraker's _Readings in Databse Systems_).

~wog

Re:16 queries on one page? by maraist · 2000-11-12 23:10 · Score: 3

Essentially what that test showed was that mysql had to lock all those tables in order to do the join.. And that with Postgres's versioning system, it was a trivial matter.

I remember reading on the topic of excessive joins in a Sybase Admin book once (excelent book on "practical" database, since it shows you exactly how one company solved all the DB problems). It essentially said that determining join order was actually a very slow process.. Exponential in fact.. Especially if your DB tries to determine which indexes to used based on statistics... The selection of indexes increases the number of possible variations incredibly. Sybas's solution was to only compare a select few of the possible join methods - which is a trade-off.. Alternatively, if you were to create a stored procedure, then which-ever method was best at creation time will be used from there-on-out...

Stored procedures - in Sybase at least - reduce a great deal of the overhead, especially in a 16-way join. Sadly, I don't know if Postgres does ANY pre-compiling of stored procedures. They have two main methods: straight named 'sql' statements, and 'psql' (a true stored procedure 'language'). My guess is that they do little more than pre-parse the statment (if even that). If this is the case, then using a stored procedure might even be slower than raw SQL since a call requires the parsing of the function name, then loading of the function into memory from a table, then the parsing of that statement (two more steps than raw processing). It is, however, necessary for triggers and rules.

Postgres (and I believe mysql) has raw-c extensions, but I really dislike these for general DBA operations (not to mention I haven't spent the time learning them).

As for the 16-table join, a general web page probably shouldn't do this (especially a heavily used page), but a database SHOULD at least let the DBA perform the ad-hoc query when trying to extract information. The only real solution is summary information, which often tends to be more trouble then their worth... Another solution is de-normalization (which makes me feel dirty)... Basically like what Blacknova.net uses... If you have 1 to N data, then just pick some upper bound N and include that many extra columns directly in your main table. Fast and efficient, provided that you choose the right N.

-Michael

--
-Michael

Re:Test Methods Allert by f5426 · 2000-11-12 23:12 · Score: 2

> What I noticed was that as you increased the number of clients, the total number of requests served increased, while the number of requests per client was lower - which is exactly what should happen - the performance decreases as the number of clients increases.
What *I* noticed is that in one graph the number of request *per*second* increseased for PostgreSQL. And this does not look right, so the original poster is quite right.

> The problem is that you don't know how to read a graph.

Compare graphs on page 3 and page 4. Then come back explaining us how *you* read those graphs.

Interpreting profiling results is a very difficult task. In that case, I suspect that it is impossible without getting the hands on the test configuration to try various alternative loads. Results are (IMHO) too strange to be accepted without further investigation.

Cheers,

--fred

--

1 reply beneath your current threshold.

Re:16 queries on one page? by bigdisk · 2000-11-12 23:15 · Score: 1

The source code to SourceForge is available in CVS and you are very welcome to make suggestions about how to reduce the number of queries on that page. As the author of much of the sourceforge codebase, I'd love to see a patch come in that somehow magically reduces the database load.

Also, the article clearly states that these are worst-case scenarios, chosen to test the database, not the PHP code.

Re:Why even bother? by johnnyb · 2000-11-12 23:16 · Score: 1

Isn't Informix based on an old version of Postgres?

--
Engineering and the Ultimate

Re:Why the partisan fuss? by tuffy · 2000-11-13 00:05 · Score: 2

Actually, you are quite wrong. PostgreSQL has more documentation

The online PostgreSQL book is ~490 pages and the latest online MySQL manual I have is ~460, which is comparable. But MySQL also has an >700 page "New Riders" book and an >460 page O'Reilly book. Other than the online text, I haven't found any other PostgreSQL books, which is unfortunate.

Python, in particular, has PostgreSQL bindings but they don't yet conform to the 2.0 DB spec, which is a pain. In general, the PostgreSQL modules seem older and less maintained than the MySQL modules, which is both a cause and a symptom of support problems.

Again, I see nothing technically wrong with PostgreSQL. But if it's going to get more popular, I still believe it needs more support in terms of documentation and languages.

--

Ita erat quando hic adveni.

Windows version? && PostgreSQL vs. MSSQL by Da+VinMan · 2000-11-13 00:08 · Score: 2

I have been unable to locate binaries for Windows NT/2000? Does anyone know where/if such a thing exists? I would like to evaluate PostgreSQL for our project instead of SQL Server 7.0 or 2000. I would also appreciate comments anyone has on such a move.

Thanks in advance!

--
Please mod this post only if you think others should/n't read this. I have enough ego^H^H^Hkarma. Thanks!

Re:16 queries on one page? by benwb · 2000-11-13 00:09 · Score: 1

On the other hand, if you throw gobs and gobs of memory at your db, shouldn't it be able to handle the caching for you?

Banner ads! by Da+VinMan · 2000-11-13 00:11 · Score: 2

Subject says it all.

--
Please mod this post only if you think others should/n't read this. I have enough ego^H^H^Hkarma. Thanks!

Re:What about other Database Tools? by realkiwi · 2000-11-13 00:16 · Score: 1

pgaccess.

Tcl/tk and in the distribution

Rocks

--
realkiwi

Re:Why the partisan fuss? by Jason+Earl · 2000-11-13 00:24 · Score: 2

This is absolutely not true. There are three Python 2.0 DB compliant PostgreSQL drivers. PyGreSQL has just added 2.0 compliancy, PoPy is 2.0 compliant, and there's another one that also purports to be 2.0 compliant.

Apart from Momjian's excellent online manual there is still the existing PostgreSQL documentation. Everything from basic queries to creating your own types, to hacking out extensions in C are covered by this documentation. MySQL may have more books about it, but most of the information in these books will be duplicate.

The fact of the matter is that PostgreSQL has more features than MySQL, is more standards compliant (SQL-92), is released under a more liberal license (BSD style versus GPL), and it is now even starting to outperform MySQL for simple selects and joins.

The reasons for staying within MySQL's limited abilities are getting smaller and smaller. Why not just use PostgreSQL in the first place? Eventually every successful database project is going to want the features that PostgreSQL has, why start with MySQL and face the prospect of migration later.

Threads vs processes by tmu · 2000-11-13 00:31 · Score: 4

I'm sorry but I think this grossly misses several points and undermines the appropriate difference between threads and processes.

One way to think of threads is as a solution to bloated processes. In operating systems like Solaris, which are designed to scale to 32 or 64 (or more) processors, the number of locks in the kernel is enormous. As a result, context switches and process creation time tend to be higher. In gross terms, you pay for the concurrency at high numbers of processors with poor performance at lower numbers of processors. This is not necessarily bad (especially if you plan to use 16 or more processors at some point) but it is just the way things are.

Linux, on the other hand, penalizes scalability for large numbers of processors in order to get much better performance with smaller numbers of processors. Linux does this because Linus doesn't believe that 16+ processor machines are common or sensical and that the kernel should be optimized for common (and sensical) cases.

Why do I mention all of this? Because Linux process creation times are slightly faster than Solaris thread creation times. Anyone who fetishizes multi-threading versus multi-processing doesn't really understand the difference between the two and when it really matters.

Processes are contexts of execution. Threads are contexts of execution that may share a memory space with other contexts of execution. These are fundamentally different things in some operating systems. In Linux, a thread is just a process that shares a memory space with another process. This is because processes are *fast* to create and switch to in Linux (optimized for the common case, remember?).

Anyway, I don't mean to bash maraist at all, just to point out a common set of misconceptions about these things. When considering these architectures, each database system must try to optimize across all of the operating systems they plan to be deployed on (just like the apache project, which i think was an excellent analogy).

real-world choosings.... by cjsteele · 2000-11-12 21:09 · Score: 4

I work for a large manufacturing company, and recently we submitted our budget for next year (2001)... well, on there we had SQLServer, and a bunch of other commercial software. We got our budget back in its final form, and it did NOT include our SQLServers... we needed a database system, so I threw together a proposal for some smaller MySQL servers.

My manager liked the idea, but wanted info on other open source databases... I pointed him towards Tim's article (this was last week) and we've been on our merry way since. We now have both MySQL and PostgreSQL boxes in production for testing, and will make our determination later on!

Tim's article played a HUGE roll in our initial impression forming, and I would recommend it to all and any who are looking for info on open source databases!
-C

--
"This above all, to thine own self be true" :x!

Re:real-world choosings.... by cjsteele · 2000-11-13 16:31 · Score: 1

Thanks for the tip... unfortunately, after poking around the interbase home page for a few minutes, it doesn't look like their documentation is quite up to snuff! :-(

It did look like Interbase supports everything that I need (triggers, stored procedures, transactions, views, and SELECT INTO statements), but I can't seem to locate much information on what the language syntax is for their stored procedures, or other such key documentation.

Any help? Have you used their database to do ODBC programing in windows? Fortunately, I'm just the lowly shmuck who has to administer whatever database we finally go with; I just have to make sure it works for everything we need!

cheers!
-C

--
"This above all, to thine own self be true" :x!
Re:real-world choosings.... by Johann · 2000-11-13 01:34 · Score: 2
I am building a data entry site in PHP + Interbase and Interbase has shown itself to be rock solid (both on NT and Linux).
I could not use PostGreSQL because it does not run on NT (client only has NT server and does not want a Linux box for this system). I don't know much about PG, so I don't know how good it is.
I did not use MySQL even though I have run it on Linux and NT for these reasons:
- No transactions (could not wait for MaxSQL before implementation).
- No integrity constraints. Unfortunately, data entry systems require integrity constraints and the attitude of the MySQL developers is IMHO assinine. I have worked on databases with and without integrity constraints and in my experience, the data is always worse in databases where the philosophy is to 'let the database developers' deal with it.
- Unlike MySQL, PG, and most databases, in Interbase, writers never block readers. Check the product overview for details.
Interbase may not have industrial strength for massive number of concurrent users, but this particular implementation is a data entry, low volume system. I cannot personally attest to the viability of Interbase in a large number of users.
Concerning the 'opennes' of Interbase, the source is released, there are no restrictions and this database is not going anywhere soon. The main problem is that Inprise's release of the source left a bad taste in the mouths of free software advocates. This issues have been remedied and they have delivered on their promise to release the source.
Check Interbase out, you'll be surprized.

"Fat, drunk, and stupid is no way to go through life."
--
"You're gonna need a bigger boat." - Chief Brody
Re:real-world choosings.... by deefer · 2000-11-12 21:34 · Score: 3

What about Interbase?
I'm a bit confused about the whole deal, Interbase is now Open Source, which is nice, but is it free beer? The website (here says it is freely downloadable and open source) but is it free as in beer?
Interbase is a nice database for small to mid range deployments. Fast, small, and rock solid. But nobody is getting behind it - the Interbase section on sourceforge has no files released yet.
Any other /.'ers out there with links to what is really going on? I gather there was some grief and a splinter group has been formed (FireBird?)

Strong data typing is for those with weak minds.

--
Strong data typing is for those with weak minds.

Full Text indexing. by Anonymous Coward · 2000-11-12 21:12 · Score: 1

Postgres has FTI.. Well sorta, it's a hack included with postgres but it works reasonably well.

They should plunder ispell for ideas.

So much for MySQL by jm91509 · 2000-11-12 21:13 · Score: 2

Which of the two is the more widely used? All you hear is that MySQL is the best. Seems that this is not the case...

Re:16 queries on one page? by paulschreiber · 2000-11-13 11:35 · Score: 1

Gee, all this talk about normalizing and denormalizing reminds me of the BCNF question on my databases midterm. :-)

There are methods for determing (theoretically) nice ways of breaking up tables to eliminate redundancy and ensure lossless joins. Look up Boyce-Codd Normal Form, 3rd Normal Form, et cetera.

Of course, if your DB gets real large, you may not want to do joins anyway.

Paul

Re:Why even bother? by Anonymous Coward · 2000-11-12 21:14 · Score: 3

We are a medium sized company developing web apps. We are not poor, but we would be if we ruled out open source software. We use db2 for most work, but postgres is being investigated and we have found it quite workable for our many needs.
Your comments remind me of similar ones a few years back when we started using linux and ppl said "serious companies don't use linux" Nobody (except a small-minded minority of change-haters) is saying that now.

Re:Why even bother? by jm91509 · 2000-11-12 21:17 · Score: 1

Is this for the support that you get, or are these genuinely better at handling large databases (>100Gb)? Any pointers to stats on MySQL/PostGRE handling large databases?

Kinda Ironic by FrankDrebin · 2000-11-13 12:16 · Score: 1

Anyone else find it ironic that in the published report the graphs look a lot like they were done in M$ Excel (at least to me it looks that way)? I guess it still shows that even us open-source pundits find it convenient to stick with commercial apps sometimes. One might have tried putting the benchmark data into either the MySQL or Postgres open-source DB, then whip up a bit of Perl or Python or something and fire it through some open-source tools for graphics. Does anybody doing this sort of thing on a regular basis have any suggestions?

--
Anybody want a peanut?

Test Methods Allert by CaptainZapp · 2000-11-12 21:18 · Score: 5

Did anybody else notice that PostgreSQL actually improved performance with more concurrent users ?

The only explanation here is the caching behavior of the data base. Howerver, this also indicates that benchmarking databases is not a really trivial task, because exactly such effects must be considered for database benchmarks.

Further, besides a few graphs. The test says actually nothing different then: PostgreSQL improved strongly, while MySQL is a dog. Actually it says nothing at all.

Not that I mind the results. I worked with both databases briefly and believe that PostgreSQL is far closer to an industrial strength database (Lack of transaction control disqualifies MySQL for that in the first place). Nevertheless, I think the results really lack any significance.

--
ich bin der musikant

mit taschenrechner in der hand

kraftwerk

Aditional questions: by psergiu · 2000-11-12 21:19 · Score: 4

- Which of them is most stable (with HUGE databases) ?

- Which of them is most likely to be able to recover withouth glich from an "impure" shutdown ?

- How much disk space can they waste in tablespaces with frequent spaces ?

- Which will benefit more from a HUGE ram quantity or multiple cpus ?

--

--
1% APY, No fees, Online Bank https://captl1.co/2uIErYq Don't let your $$$ sit in a no-interest acct.

Re:Aditional questions: by Surak · 2000-11-14 04:17 · Score: 2

Erhm... you just contradicted yourself.

First you say that programs do not need to be specifically written to support multiprocessing and then you say that using multiple threads is the path of least resistance [to support multiprocessing.]

Actually, applications can run on SMP machines with no problem, but if they use only ONE process or thread (about the same thing on Linux, but very different things on Solaris, as a previous poster mentioned), then they will not be running on more than one processor, thus not taking advantage of multiprocessing... Yes, there will be SOME benefit when running other processes, but the application taken by itself (not counting other processes...) will not have any added advantage other than the fact that the proocessor it is running on will have more cycles free.

--
My journal has hot /. gossip.

Re:Article summary: by f5426 · 2000-11-13 18:16 · Score: 1

> Well, if you have 4 GB of RAM and 50 GB of disk space, you might be able to find out. :-)

The article said the hardware was a Quad-Xeaon with 1Gb of RAM. I beleive that ORACLE may be able to get past the fabulous 3 request per-second MySQL / PostgreSQL did...

Cheers,

--fred

--

1 reply beneath your current threshold.

Re:16 queries on one page? by macpeep · 2000-11-12 23:20 · Score: 2

One page may well be doing some extremely complicated processing, which has evolved over many iterations of the code - it's not hard to imagine that such a page could have 16 queries in it, especially if it's being written in an oo style.

I don't agree with that statement. In cases where you need data using 16 different queries, no doubt many of those could have been cached for a huge speedup. I've written code for back ends for several major sites and I've seen very complex funcationality indeed where only a few queries were needed per page - most data was pulled from caches. Also, OO helps you write code where you can fetch data lazily. For instance in a discussion forum, you might keep all root-level messages headers in a cache, the 1000 last accessed bodies in a cache, 10000 last accessed headers in cache, parent-child relationships in cache.. In this way, you don't have to go to the database 16 times every time users come to the front page of the discussion forum and are served the exact same page - 30 times per second.

Like you said, computer time is now significantly cheaper than programmer time. Well, memory is also cheap. Keep stuff in memory and use good old abstract data types to get your data fast.. RDBMS's are very overused these days.

Re:16 queries on one page? by macpeep · 2000-11-12 23:22 · Score: 2

Perhaps the database was well designed from a data modelling point of view but not from a performance point of view. A good database design for a web site that needs to scale is such that it allows you to cache the data that is used most often. On a web site that gets 30 hits per second but where data only changes about once per 30 seconds, one should start thinking why the data is pulled from the database every single time. User profiles on personalized pages can be cached, data can be pre-fetched with one query, cached and then accessed from the cache when needed instead of doing several queries for a smaller result set etc. etc.. 16 queries is insanity!

Re:16 queries on one page? by dhogaza · 2000-11-12 23:26 · Score: 2

Sadly, I don't know if Postgres does ANY
pre-compiling of stored procedures. They have two main methods: straight named 'sql' statements, and 'psql' (a true stored procedure 'language'). My
guess is that they do little more than pre-parse the statment (if even that).

Commentary like this disgusts me, because the writer makes it clear he doesn't know anything about Postgres but merely presumes it's a poor implementation.

For the record, PL/pgSQL queries are pre-compiled, which is why there's no dynamic SQL capability in the language. You can create dynamic SQL from one of the other languages (pltcl, plperl) which exposes the SPI interface to the query engine to the user.

Work is under way to allow optional caching of individual query plans (i.e. by explicit statements), which allows one to avoid parsing and optimization overhead. When completed (7.2?) this will be great in the web environment.

I know little about Sybase so will avoid speculation about that RDBMS, might you be so kind as to avoid speculating about Postgres as well?

Re:16 queries on one page? by maraist · 2000-11-12 23:26 · Score: 3

As a followup idea.. 16 joins reminds me of a project that I did once where we used ID's for every table.. Thus every table stored meta data and a single piece of actual data.. Thus if you had 17 pieces of data to display, then you'd have to do 16 joins. There were three solutions to this:

First, cache all the smaller tables, and do the translation (from say user_id to user_name) in an external hash (I know perl can easily do this; don't know about php). For static fields such as user-names, email addresses, etc, this is easy. Unfortunately this doesn't help when trying to find all users with email addresses going to aol and souce code that's over-due, etc. But it does solve the summary report problem.

Second is to break up the query into multiple operations.. You need an incredible amount of memory to do certain types of joins, and sometimes the sql-processor guesses the wrong method. Even though postgres's "explain" command told me it was doing hash-lookups for a user-name summary, I found that it was faster to first get all the data with user-id's, then perform a second search with
"select user_name from users where user_id in (x,y,z,....)"

So long as you don't have more than a couple dozen entries, this works nicely.. ESPECIALLY if you have thousands of rows of real-data with only one or two unique user-names. This was a nice-short term fix, but "clean" sql should all be done on a single statement.. Soo onto the third

Denormalize back to form 4 or 3. Namely, whenever you have a user-name, don't use a user-id.. Sure you consume 8 more bytes of memory per row, but it greatly speeds up searches. The method is to ONLY use ID's when you can not uniquely represent a primary key with a single column.. (such as first-name, last-name,...).. But in our computer world, "user-name" tends to be unique. "project-name" should be unique, "working-branch" should be unique.. In which case, at the design level, it's worth the extra consumed space.. Now I know that Mysql advocates (including me a year ago), cringe at the idea of keying off variable length strings.. For me, it was the idea that you can compare integers much faster than strings.. BUT, if the string is an index, then your sorting is going to be MUCH faster, strings distribute much better than auto-incremented integers (which tend to want to be linear). Additionally, the fact that you won't have to perform a join at all removes almost all of the performance problems.

When you don't use meta-data as primary keys, then you find that you only perform joins when you actually have relational data... That joins based on summary data almost always goes away.

Additionally, if your DB ever get's corrupt (say you lose a table), then you don't have to fret over "who the hell was user-id 155?". Given that Mysql doesn't have roll-backs, the potential for data corruption with meta-data is enormous (when you do both reads/writes).

-Michael

--
-Michael

Re:large? by Nerds · 2000-11-12 23:31 · Score: 1

You obviously don't know the meaning of an OO database. Oracle is a relational database, end of story. Just because their dopey little GUI says "Create relational structure" does not make it a relational database.

What? Can you explain this to those of us who aren't as smart as you think you are? First of all, Oracle is an object relational database. Second, you first claim that Oracle is relational, and then go on to tell us that the GUI(as if anyone uses the crummy GUI that comes with Oracle) does not necessarily make it a relational database. Hello? Don't you have recess or something now?

--
My other .sig is 'The Art of Computer Programming'

Re:Why the partisan fuss? by segmond · 2000-11-12 23:32 · Score: 3

Actually, you are quite wrong. PostgreSQL has more documentation, I know this cuz I have combed every possible link on both site. Go back to PostgreSQL page, there is a full book free on PostgreSQL on their documentation page. Also, PostgreSQL has bindings for a zillion languages as well, what do you want? Perl? C? Python? C++? more? Before you post, please update yourself before you post your misinformations. I like both database, and I use both. :)

--
------ Curiosity killed the cat. {satisfaction brought it back | it didn't die ignorant | lack of it is killing mankind

Re:16 queries on one page? by maraist · 2000-11-12 23:39 · Score: 3

First, I started off with the words "Sadly, I don't know".. Therefore removing myself as an expert in that area. It was more a question than anything else.

As for the other languges, perl, tcl, etc, those can only be less optimial forms, and are more for procedural operations, which has nothing to do with over-comming the 16 join problem. (With the exception maybe of hashed-lookups, but that's dangerous)

It's great to hear about cached query plans (I'd forgotten the term).

As for the problem with speculation.. If nobody were to speculate, and those in the know don't impart their knowledge, then has anyone gained anything? 'Qualified' speculation does exactly as here; Inciting those knowing specific facts to quality errors. My speculation _was_ based on emperical analysis, however, so it wasn't as if I was blindly stateing something.

But now, as a question, is the compiled sql stored in any more optimal fashion? Or does postgres use an interpreter (which does little more than tokenize). To me, a compiled sql statement means that its making raw calls to table operations..

Simple benchmarks that I've done haven't shown any performance increases with using stored procedures, but that might just be because the compilation / interpretation stages were insignificant with respect to the overall operation. The other interpretation is that little is actually gained by the compilation of the sql statement (at least until the cached query plans come around).

-Michael

--
-Michael

Re:Windows version? && PostgreSQL vs. MSSQL by Jamie+Webb · 2000-11-13 01:03 · Score: 1

I think there's a good reason why there are no PosgreSQL binaries available for Win32 - because no-one with any sense uses them. After searching for binaries for some time, I gave up and downloaded the source, Cygwin, etc. It compiled, but crashed every five minutes when I actually tried to use it. I don't think PostgeSQL will be viable for Windows until someone does a proper port. I doubt that will happen.
Long live Linux! Ahem... Of course, for that system I ended up with SQL Server, because it needed to be Windows. Maybe one day...

Naturally, if I'm wrong and someone does have some stable binaries, I'd be most interested.

What about other Database Tools? by smoon · 2000-11-12 21:22 · Score: 4

Open source databases -- great. But how about tools to facilitate database development? In commercial terms you've got modeling systems (e.g.: Erwin), Extract-Transform-Load (ETL) tools (e.g.: Informatica, Sagent), Reporting tools (e.g.: Brio, Business Objects, etc.). GUI managers, GUI drag-n-drop development tools, etc.

In the open-source world you've got... Well you've got to custom code in some non-database-centric language, e.g.: C, C++, Java, Perl, etc.

Anyone want to start up an open-source alternative for an ETL tool? Target open-source databases as source and target systems. A usable open-source system would really raise the visibility of open source in the (hitherto) commercial-only data warehousing market.

--
"But actually trying to use m4 as a general-purpose langage would be deeply perverse" --ESR

large? by NineNine · 2000-11-12 21:22 · Score: 1

No way. 'Large' is a terabyte or bigger. 100 Gig is no big deal.

Re:large? by popocatapetl · 2000-11-14 01:46 · Score: 1

My guess is that this database is probably bloated because someone is storing images of paperwork as binary data within the database instead of storing the document somewhere else with a reference number in the database. Sounds like Oracle and the disk drive manufacturer are growing fat at taxpayer expense.

Large for a database is determined by the application.

Normally, large for a transaction processing database is in the 10s of GB. Large for decision support may be in the 100s of GB. TB databases would typically be found in data warehouses. These should be different database engines. You wouldn't use a Lamborgini to haul a 50 ton trailer any more than you would use your transaction processing database engine for a warehouse.

MySQL wins hands down. by Kiss+the+Blade · 2000-11-12 21:25 · Score: 1

A recent study by IBM showed that PostgreSQL has significant scalability problems when compared to MySQL. It seems obvious from this that PostgreSQL will be limited to the lower end of the market, and will never make it to the big time.

Also, thanks to the remarkable synergy that MySQL has with the commercial Website's staff (thats what my marketing manager says, anyway;), PostgreSQL will have difficulty generating the same loyal following. Lest we forget, technical superiority is only a very small part of the battle, you must also win Mind share. PostgreSQL is the BSD of the database world - limited to an irrelevent core market segment.

KTB:Lover, Poet, Artiste, Aesthete, Programmer.

--

KTB:Lover, Poet, Artiste, Aesthete, Programmer.
There is no

Re:MySQL wins hands down. by segmond · 2000-11-13 01:37 · Score: 2

I think he is a lying schmuck, many people have replied to his post requesting for the URL, and he hasn't responded. I checked google's cache and it returned nothing. he is just on a karma rampage.

--
------ Curiosity killed the cat. {satisfaction brought it back | it didn't die ignorant | lack of it is killing mankind
Re:MySQL wins hands down. by tommyq · 2000-11-12 21:40 · Score: 4

Lest we forget, technical superiority is only a very small part of the battle, you must also win Mind share

Well, that depends on what you mean by "win." Open source is not a popularity contest. In the past, it may have been true that to ensure the continued existence of a cool project, it needed to have a chance of dominating the market at large. But nowadays, as long as there are enough developers interested in a project--oftentimes only because of its "technical superiority"--it can stay alive, grow and improve. If a project can attain real technical superiority in this way, I call this winning.

That being said, I think it should be remembered that for many simple web sites, MySQL was the right choice because its superior performance with a read-oriented setup. Of course, Postgresql people claim a read-oriented setup isn't a real database application at all, but having designed several such small sites, I really wouldn't use anything else. (Flat files?? What an unnecessary pain!)

Just my two cents . . .

--
Respondeo dicendum quod . . .

16 queries on one page? by macpeep · 2000-11-12 21:27 · Score: 3

Anyone who has designed a site and its database in such a way that generating a page requires 16 queries should not taken seriously when conducting a database performance test. Still, it does seem to suggest that MySQL does not perform too great (which doesn't come as a surprise to me really.. All I'm saying is that a site with a page that requires 16 queries is *VERY BADLY DESIGNED* and will not scale very well at all!

open source MOBILE database by kootch · 2000-11-12 21:29 · Score: 3

I've posted this before, but I'll post it again.

Centura Software makes an open source mobile database software called db.star and db.linux

db.star is the first open source, embeddable data management solution for Information Appliances (IAs). "It is a high performance, small footprint, developer friendly database engine which facilitates the development of rich and powerful applications for popular IA platforms, such as the Palm® Computing Platform and Windows®-powered Pocket PC. "

Open source but not free.

db.linux is OSS Data Management for Information Appliances

Open source and royalty free.

July Article by Qarl · 2000-11-12 21:30 · Score: 1

The page currently linked to refers (and links) to the database benchmarking article as "my July Article". Way to stay recent, Slashdot.

--
--Carl

Article summary: by f5426 · 2000-11-12 21:31 · Score: 3

PostgreSQL is faster the MySQL for this guy.

But, as always in the case of database server choices, nothing replace real world test. You should test the various servers on *your* data.

Frankly, I found the numbers quoted in the article ridiculously low. But as we don't know te volumetry of the data, the hardware used, the database cache size, etc, etc, this boils down to 'someone found PostgreSQL better than MySQL for its usage'.

It is not uncommon to have orders of magnitude performance differences between databases. I would love to know how what a tuned ORACLE would get.

Cheers,

--fred

--

1 reply beneath your current threshold.

Yup... by Da+VinMan · 2000-11-13 01:08 · Score: 2

Thanks for your experiences. I may yet collapse and simply use Linux for a test RDBMS server, but I suspect my client will opt for the perceived safety of SQL Server 2000. That's OK, it'll look good on my resume! ;+)

--
Please mod this post only if you think others should/n't read this. I have enough ego^H^H^Hkarma. Thanks!

Scenerio. by linuxgod · 2000-11-14 05:33 · Score: 1

Scenerio:

Your IIS has just been hacked ( no surprise ). The hacker
has used a common exploit to gain access to your server,
and M$ will not release a fix untill 2 weeks later. You have
10,000 users depending on that server for every day work.
The hacker is getting in again, and you have 5,000 users
online. NOW FIX YOUR WEBSERVER !!!

Answer: You can't fix it.

Real life situation here funny boy. Id like to see
you fix this since 3 MCSE's can't.

You are a joke.

ETRN x

Insertion of large text objects by Kruemelmo · 2000-11-12 21:33 · Score: 2

After having used MySQL for quite a while I played with Postgresql and was quite impressed. I noticed, though, that it would not understand insertion of large text object data within an sql INSERT command as opposed to MySQL. I can't find comments about this anywhere... Does anybody know if it is something you wouldn't (shouldn't) do for some reason? Or is it another limitation of Postgresql?

Not to start an Urinating Competition by CaptainZapp · 2000-11-12 21:39 · Score: 1

but as long MySQL lacks fundamental database concepts like transaction control & integrity constraints (foreign keys, triggers) it's useless for deployment as a production database in a half-mission critical environment.

Also, what is the real value of scalabilty for Avergae Jos User? It's relevant for a phone company billing system where the CDRs come flushing in by the millions per hour. But if you require that horsepower, you certainly don't lack resources to spend mucho bucks on a commercial database (from a company you can sue if something goes wrong), the iron to run them all (with a few nifty EMC disk arrays) and serious consulting.

--
ich bin der musikant

mit taschenrechner in der hand

kraftwerk

Why the partisan fuss? by tmu · 2000-11-12 21:39 · Score: 4

I should say at the outset: I've used both Postgres and MySQL on production projects and like both for very different reasons. But here's the thing (and this is a common problem with technology product evaluations): these products have very different design goals and aren't really that comparable.

I'm not saying the standard "MySQL's not a database because it doesn't support transactions and databases have to. ACID!" (although I'm sympathetic to that point of view, I don't think individual words like "database" are worth fighting over--If MySQL wants to call itself a database, fine). What I am saying is that Postgres was designed to be a full-fledged SQL92-compliant database with transactions and triggers and foreign keys and the whole lot. MySQL was designed to be a SQL-based front-end to a very fast database file format. These products are not the same and comparing them without agknowledging that seems foolish.

I've been very pleased by the speed improvements in Postgres recently (partisan testing aside, Postgres 7 really is much faster). I've also been impressed by feature additions in MySQL (although it still isn't close to what you would expect to get if you're coming from the Oracle or DB2 world). But both remain inherently died to their design goals. This isn't a bad thing, at all, because different projects need different products with different design goals.

I compare this to the (often senseless) comparisons of NetBSD, OpenBSD and FreeBSD. One is designed for portability, another for security (features be damned!) and the last for performance, features and multi-processing under (primarily) intel. Although they come from a common code base, they have obviously diverged in design goals. Instead of bashing one product or another (all of the *BSD's and the two databases discussed have *major* problems that are bashable) use the one that seems most appropriate to your needs.

Slashdot Mirror

Open Source Databases Revisited

160 comments