Sun Eyes PostgreSQL
Da Massive writes "Sun is looking seriously into the database market - namely PostgreSQL. It says Oracle and IBM and even Microsoft licensing fees are way too expensive for the average punter.
This from John Loiacono, executive vice president of software: "We're not going to OEM Microsoft but we are looking at PostgreSQL right now," he said, adding that over time the database will become integrated into the operating system."
This really isn't a surprise. MySQL has both licensing problems and feature problems in the competitive high-end markets. PostgreSQL has none of these issues, and can hold its own in a comparison with Oracle or SQL Server. These features led RedHat to PostgreSQL for their RedHat Database product, and I see little reason why they wouldn't attract Sun as well.
The only thing that slightly bothers me about their strategy is that Sun has been pushing their Java Systems hard. If they actually wanted to bolster that strategy, they'd have three major options for a Java Enterprise Database:
1. Cloudscape/Derby - This product makes the most sense from a technology and licensing perspective, but the fact that it was an IBM product (even though Cloudscape was originally a separate entity before being acquired) taints the software in such a way as to make Sun look bad if they used it.
2. Daffodil - This database is an excellent choice, but it would require the acquisition of another company, a move that the Sun shareholders might question. It would also bring quite a bit of flak in Sun's direction as Daffodil is an Indian company.
3. McKoi SQL - An excellent choice for a Java database, but lacks brand recognition. The feature levels and scalability of the database are still considerable questions. The GPL license also allows Sun less freedom to modify the database in comparison to the BSD license used by PostgreSQL.
As for the choice of Sunbird, I think it's simply a matter of "why not?" It's not like there's any particular leader in the market, and Sunbird plays nice with Firebird/Mozilla.
Javascript + Nintendo DSi = DSiCade
it's clear why they wouldn't go with MySQL (technical shortcomings aside).
Actually, I'd say that the technical shortcomings have a LOT to do with it. PostgreSQL can be placed head to head with Oracle and still be pretty darn appealing. MySQL really doesn't have that capacity (yet), and is hampered by its non-ANSI-compatible design and SQL variant. So I'm not certain that the decision was made on licensing alone. After all, Sun does support the GNOME project as well, and that is solidly under the GPL.
Javascript + Nintendo DSi = DSiCade
On top of being closer to the standards Oracle uses, IIRC, PostgreSQL uses a transaction model that is essentially identical to Oracle's, even though it's implemented differently. In spite of the hype around database independence, the reality is that the differences in transactional behavior radically affect the ability to port from one database to another. The fact that PostgreSQL's native stored proc language already looks a lot like Oracle's PL/SQL, with an effort to make PostgreSQL run PL/SQL unmodified in the works elsewhere, is another big plus.
Sun's Java Enterprise System is about programming in Java rather than the tools being written in Java. The technology of the product isn't hugely important; it's the fact that the API and development are in Java. Databases are clearly easy with Java, as JDBC makes the actual choice a pure commodity. So what Sun want is a solid database, for free, that rounds out their platform effort and means that in one download and license a client can "get started"... which often means it is all they use.
An Eye for an Eye will make the whole world blind - Gandhi
SQL92:
PostgreSQL > MySQL; but MySQL is improving its feature set
SQL3:
PostgreSQL > MySQL; PostgreSQL has a few SQL3 features
Speed:
PostgreSQL ~= MySQL; sometimes faster, sometimes not
Database\table\row\... Size:
PostgreSQL > MySQL; PostgreSQL has fewer size restrictions, or at least, the limits are much larger than those of MySQL
Stored Procedures:
PostgreSQL > MySQL; MySQL doesn't have them yet, but version 5 adds SQL:2003-like stored procedures; PostgreSQL has SQL, C, PL/pgSQL, Tcl, Perl, Python and roll-your-own, plus a few languages not bundled with PostgreSQL
Installation\maintenance:
MySQL > PostgreSQL; MySQL is easier to set up
OS Support:
PostgreSQL ~= MySQL; postgres came a long way, e.g. there's now a stable Windows version.
A 'punter' is common British slang for 'your average joe'.
Also used to mean a gambler or a prostitute's client!
Yes. They have been compared.
A quite lengthy comparison can be found here.
SQL92 compliant is a relative term.
It's a British-ism meaning about the same as "bloke", only it can apply to men or women. Tends to have shades of "lowest common denominator" to it, meaning something like "an ordinary slob off the street picked at random".
If a job's not worth doing, it's not worth doing right.
It's a British-ism meaning about the same as "bloke", only it can apply to men or women.
I've mostly heard it used/used it myself to describe "customers", particularly gamblers. As in "I had a punt on that nag but lost my shirt". I believe politicians use it to describe their electorate, but I couldn't possibly comment.
Disclaimer: I've only heard the term used in Scotland; its usage elsewhere in Britain may be more/less general.
This is where the serious fun begins.
Athletic Scholarships to universities make as much sense as academic scholarships to sports teams.
It's British for "Joe Six-Pack"
Just junk food for thought...
I do find it interesting that Telstra is a Sun software customer
Telstra are also a big Microsoft customer and also a big Linux user. They use IBM GSA extensively too. What's your point?
I know one guy who worked on an implementation of part of the Telstra Mobile billing system for IBM GSA, as Telstra found out that they weren't catching the milliseconds to seconds in cell switch time and therefore weren't billing users for it.
This, just like the comment in the article, is just padding. It doesn't really add anything to the post.
IBM has DB2, Microsoft has Microsoft SQL Server, Sun has.... Oracle? No....
I doubt very highly that Sun would buy PostgreSQL Inc. They would partner with them and do some code development on PostgreSQL to get it to the level where it can definitely compete head on with Oracle (although Oracle do have a lot of other software that at present Sun doesn't have), MS SQL and DB2. The thing they would be best off doing (and probably will do) would be to go out and hire key developers of PostgreSQL to give more priority to the requirements that they are after.
Curiosity was framed; ignorance killed the cat. -- Author unknown
In this context punter is a buyer with shades of uninformed buyer. The term comes from the race tracks where betters became known as punters, and has evolved to refer to more uninformed buyers of especially tech and financial products.
Degaussing scares the bad magnetism out of the monitor and fills it with good karma.
MySQL > PostgreSQL; MySQL is easier to set up
I don't think this is much of an issue. I recently installed PostgreSQL on my Windows XP machine in order to try it out. The installation was 100% simple and painless.
My Karma: ran over your Dogma
StrawberryFrog
After all, Sun does support the GNOME project as well, and that is solidly under the GPL.
Actually, most of GNOME is licensed under the LGPL.
Sun already has several engineers working on Derby through Apache. Sun bundles Derby with Glassfish (the newly open-sourced Java EE 5 app server), which also integrates Derby into the app server for the EJB timer service, and bundles it with the Java Enterprise System stack. Sun is actively promoting Derby as a development database. There was a story about it here on Slashdot not too long ago.
Sun used to bundle Cloudscape before IBM bought Informix, and subsequently switched to Pointbase. For App Server 9/Glassfish, they pulled Pointbase and replaced it with Derby.
American slang for 'punter' is 'mark'. A gambler, but more specifically, a loser.
Oh well, what the hell...
Since you're unfamiliar with the term, you must be unfamiliar with The Register. The BOFH alone is worth the price of admission.
This next song is very sad. Please clap along. -- Robin Zander
Isn't a prostitute's client called a "john" or "trick"?
I know some people get confused and call the "ho" the "trick", thanks to the ill informed rappers claiming that "biatches aint nothin but hos and tricks"
...is probably the most fair comparison.
Don't know much about Postgres in production environments. It seems clean and I like the fact you have a choice of stored procedure languages.
I have had experience with both in production environments, and I've come to the conclusion that PostgreSQL is clearly a step above MSSQL in terms of features and scalability. It is much better than MSSQL with concurrency and managing contention (MSSQL's locking strategy is quite brain dead). There is much more flexibility and power to create user functions and stored procs in PGSQL--you can do things like make user-defined AGGREGATE functions and data types in addition to having a choice of languages (none of that is possible with MSSQL). I find that all things being equal PostgreSQL is probably faster as well (largely an assumption because the PostgreSQL systems I've worked with are running on considerably less powerful hardware than the MSSQL systems I am working on). A lot of people comment about the ease of administration of MSSQL but I find that PGSQL really isn't that hard to manage even if you don't use GUI tools.
Oracle is certainly one step above PGSQL in power--but of course that comes with a very hefty price tag. That price isn't just in licensing either--Oracle takes more time to administer and you also pay by losing flexibility, since enterprise systems based on Oracle better do things the "Oracle way" or you are inviting trouble (just like with Microsoft products, Oracle really pushes its single-vendor solutions).
I have not played with Yukon/MSSQL 2005 yet, though I've heard a fair bit about it. From what I've heard it closes the gap a fair bit and comes much closer to PGSQL in terms of features and performance--it is supposed to handle locking/contention better and it has embraced .NET--meaning that you can write stored procs and functions in any .NET language. So, they are probably a pretty close match except in a couple of areas--PGSQL is free (libre and gratis), and PGSQL is not platform dependent. I think that the fact MSSQL only works on Windows is a major drawback when all its competitors offer products that run on Windows, Linux and various UNIX derivatives. Various "facts" notwithstanding I still think that Windows servers are a greater administrative burden and more difficult to secure than other alternatives--perhaps the next server version after 2003 will have addressed that.
That's 3 in America, actually. Two in NY, one in California.
This is not the greatest sig in the world, no. This is a tribute.
Actually that's not remotely true. We're not talking about MySQL here. PostgreSQL is quickly gaining all the "high-end" features of Oracle: tablespaces, failover, replication, etc. In some cases, they aren't yet as fine-grained as Oracle. In other cases, they're superior. PostgreSQL is quickly coming into its own.
On top of this, it's a lot less painful to work with, and the SQL featureset is far nicer. After having worked with them both on a daily basis, the only reason I'd willingly use Oracle is if I was working with terabytes of data and had lots and lots of money to throw at Oracle to make it work and support it. Which I don't. Like Sun is saying, this is unjustified for most people.
Don't think of it as a flame---it's more like an argument that does 3d6 fire damage
Thank goodness for SQLite, then. Just add the dll (or dependency for linux-folks) to your package and voila. This also assumes you're using some DB-abstraction layer, such as PDO, ADO, ADO.Net, etc...
. Define sqrt(x) as something really evil like (x / rand()), and bury it deep. Watch your coworkers go nuts.
The biggest one that has made a difference in my life lately:
Table Partitioning:
PostgreSQL > MySQL; Mainline PostgreSQL has table partitioning as of 8.1-beta, by leveraging inheritance (Postgres is an Object-Relational Database).
Queries on the aggregate of the partitions are directed at the parent table, and optimized to look only into the appropriate sub-tables by checking the CHECK constraints of each sub-table against the query's WHERE clause.
Basically, you do it like this (contrived, but related to how I'm using them at the moment):
MyBigFatTable stores timestamped data from a bunch of machines at regular intervals, keying off of the machine id and the timestamp of the data:
CREATE TABLE MyBigFatTable (
    machineid INTEGER REFERENCES machines(machineid),
    stamp TIMESTAMP,
    data_x FLOAT,
    data_y FLOAT,
    -- [... lots more data fields ...]
    PRIMARY KEY (machineid, stamp)
);
Your problem is, the table size grows and grows and grows unbounded, and database operations continue to get slower and slower (inserts, updates, and selects) as the table grows. You have a policy to expire the data after a month which limits the maximum growth, but this in turn requires lots of deletes happening all the time, which again hurts performance.
The inheritance-based partitioning solution is to leave that table definition as it is, and also define:
-- The hyphenated table name has to be double-quoted to be a valid identifier.
CREATE TABLE "MyBigFatTable-2005-10-05" (
    PRIMARY KEY (machineid, stamp),
    FOREIGN KEY (machineid) REFERENCES machines(machineid),
    CHECK ( stamp >= '2005-10-05 00:00' AND stamp < '2005-10-06 00:00' )
) INHERITS (MyBigFatTable);
As you can see, the column definitions are inherited, but you must re-specify the PK/FK stuff. The added check clause says that only data from Oct 5, 2005 is valid in this subtable.
You set up a maintenance script to create your new time-based tables ahead of time (say, once a day create tables for the next day), and you do your data INSERTs into the specific subtable. You know the timestamp of the data you're inserting, so you can generate the appropriate table name from that (MyBigFatTable-2005-10-05).
You run your SELECTs against the original MyBigFatTable just as you did before. It automatically includes any rows from its child tables. Further, if your SELECT's WHERE-clause was constraining a query to a specific time-range, only those children of MyBigFatTable whose CHECK constraint indicates they could possibly have relevant data are checked.
And as for the problem of expiring data and the delete traffic you had before? You simply drop the old child tables with "DROP TABLE" from a maintenance script when they're a month old - no DELETEs necessary.
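For what it's worth, here is a rough Python sketch of what that maintenance script might generate. It only builds the SQL strings and doesn't talk to a database. The function names and the 30-day retention default are my own assumptions, not anything PostgreSQL ships. The child table names are double-quoted because an unquoted identifier can't contain hyphens.

```python
from datetime import date, timedelta

PARENT = "MyBigFatTable"  # parent table name from the example above


def child_name(day):
    """Name of the child table holding one day's rows (hypothetical naming scheme)."""
    return f"{PARENT}-{day.isoformat()}"


def create_child_sql(day):
    """DDL for one day's child table; the CHECK range is [day, day + 1 day)."""
    next_day = day + timedelta(days=1)
    return (
        f'CREATE TABLE "{child_name(day)}" (\n'
        f"    PRIMARY KEY (machineid, stamp),\n"
        f"    FOREIGN KEY (machineid) REFERENCES machines(machineid),\n"
        f"    CHECK ( stamp >= '{day} 00:00' AND stamp < '{next_day} 00:00' )\n"
        f") INHERITS ({PARENT});"
    )


def drop_child_sql(day):
    """Statement that expires one day's data by dropping its child table."""
    return f'DROP TABLE "{child_name(day)}";'


def maintenance_sql(today, keep_days=30):
    """Create tomorrow's table and drop the one that has just aged out."""
    return [
        create_child_sql(today + timedelta(days=1)),
        drop_child_sql(today - timedelta(days=keep_days)),
    ]
```

Run once a day, `maintenance_sql(date.today())` would give you the two statements to execute. The same `child_name()` helper picks the INSERT target from a row's timestamp.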
11*43+456^2
> Actually that's not remotely true. We're not talking about MySQL here.
> PostgreSQL is quickly gaining all the "high-end" features of Oracle:
> tablespaces, failover, replication, etc. In some cases, they aren't yet as
> fine-grained as Oracle. In other cases, they're superior. PostgreSQL is quickly
> coming into its own.
Hmmm, as much as I like postgresql I don't see it that way:
1. replication? it's most often used as a clunky way of implementing failover - yuck. In my large data architectures, replication is almost never used: it's almost always the worst solution to some problem.
2. tablespaces? yep, they're good things to have. that's fine - i think oracle and db2 have supported them for around twenty years, so it's hardly high-end technology tho.
3. failover? ok, this is critical - but there are also many different forms & flavors. I'm not familiar with what postgresql has so I won't comment - other than to say it needs to be rock-solid.
ok, how about a few more:
4. memory management: a high-end database should give you a ton of control over how memory is handled - especially when you plan to buy tons of it. Here the big databases allow you to assign different amounts of memory to different buffer pools, which are then assigned to different tablespaces. These bufferpools (caches) are how to easily ensure that hits against some tables or indexes occur 99% of the time from memory, and others 50% because they're so much larger. I'm pretty sure that neither postgresql nor mysql can do this.
5. process management: in db2 your application writes to a buffer pool, an asynchronous agent picks up that change and writes it to a log file, another asynchronous agent picks it up and writes it to the table. This heavily-asynchronous behavior (and yes, with memory & processor tuning available for each agent type) allows you to maximize write-throughput. Postgresql and mysql are still in the slower synchronous world.
6. parallelism: in mysql and postgresql all queries are single-threaded. In db2 and oracle a large query will actually split itself up into multiple sub-queries to maximize throughput for multiple cpus and storage arrays. This provides db2 & oracle with linear performance improvements up to 4-8 cpus. In other words, large queries that perform table scans can take advantage of SMP hardware for the commercial products - and cut down your query time by 75% on a 4-way compared to mysql and postgresql.
7. partitioning: btree indexes only work for very selective queries - like when you want 1% or less of the data of a table. But for many queries you need to crunch 5, 10, or 15% of the data. That's where range partitioning comes in: you just scan the data you absolutely need to. So, while db2 or oracle are scanning 10% of the data - postgresql or mysql still have to scan 100% of the data. That would result in a 10x increase in speed over postgresql or mysql.
that's just off the top of my head - given a little time this list would double.
Postgresql is a fine tool, and it has all the technology that db2 or oracle had 12-15 years ago. And that's a cool achievement, and qualifies it to do a ton of cool projects. Plus, with time it will catch up. But it still has a *long* way to go.
Yeah, Postgres doesn't currently support this. IMHO it isn't that useful -- the performance improvement I'd expect would be pretty small (for one thing, all Postgres buffering is done in addition to the kernel's buffering, so the net impact will be smaller). It also adds a significant administrative burden -- you need to configure which objects go in which pools, as well as how large each pool is.
DB2 may well be better than Postgres here, but your explanation above doesn't make a lot of sense. In Postgres, a committing transaction only needs to wait for the WAL record describing the transaction to be flushed to disk (multiple transactions that commit concurrently can be flushed via a single fsync(2)). That is the only I/O that needs to be done synchronously -- the rest can be done async (notably, this includes the table I/O itself -- the modified buffers are just marked dirty in memory and are subsequently flushed to disk). Note that a backend may also need to wait for dirty pages to be flushed from the buffer pool if it is trying to replace a dirty page with a clean one, but (a) those flushes are done via write(2), so there is not necessarily a disk flush involved (b) the background writer in 8.0+ is intended to resolve this by ensuring that most of the work of flushing dirty pages is not done by a normal backend.
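To make the "single fsync for several commits" point concrete, here is a toy Python model. It is emphatically not how PostgreSQL implements its WAL. The class and method names are made up for illustration, and the "log" is just an append-only file. The only thing it shows is that commits can queue their records and then share one flush.

```python
import os
import tempfile


class ToyWal:
    """Toy write-ahead log: commit records are queued, then made durable together."""

    def __init__(self, path):
        self.fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND)
        self.pending = []      # commit records not yet flushed to disk
        self.fsync_calls = 0   # how many disk flushes we have paid for

    def commit(self, txn_id):
        # A committing transaction only queues its WAL record here.
        self.pending.append(f"COMMIT {txn_id}\n".encode())

    def flush(self):
        # One write plus one fsync covers every commit queued so far.
        if not self.pending:
            return 0
        flushed = len(self.pending)
        os.write(self.fd, b"".join(self.pending))
        os.fsync(self.fd)
        self.fsync_calls += 1
        self.pending.clear()
        return flushed

    def close(self):
        os.close(self.fd)


def demo():
    """Three concurrent commits share a single fsync in this toy model."""
    with tempfile.TemporaryDirectory() as tmp:
        wal = ToyWal(os.path.join(tmp, "wal.log"))
        for txn_id in (1, 2, 3):
            wal.commit(txn_id)
        flushed = wal.flush()
        calls = wal.fsync_calls
        wal.close()
    return flushed, calls
```

In a real system the committing backends would block until the flush covering their record returns. The sketch leaves that waiting out and only counts flushes.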
PostgreSQL 8.1 (currently in beta) includes "constraint exclusion", which is essentially a primitive form of table partitioning (using inheritance and check constraints, you divide the data into tables with distinct check constraints; the optimizer has been improved to recognize when a child table can be omitted from the query plan by looking at the check constraints involved).
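As a back-of-the-envelope model of that pruning step (not the planner's actual code), you can think of each child table as carrying a half-open time range from its check constraint. A child is skipped when that range can't overlap the range in the query's WHERE clause. The table names and ranges below are hypothetical.

```python
from datetime import datetime

# Hypothetical child tables, each mapped to the [low, high) range
# stated by its CHECK constraint.
CHILDREN = {
    "readings_2005_10_04": (datetime(2005, 10, 4), datetime(2005, 10, 5)),
    "readings_2005_10_05": (datetime(2005, 10, 5), datetime(2005, 10, 6)),
    "readings_2005_10_06": (datetime(2005, 10, 6), datetime(2005, 10, 7)),
}


def tables_to_scan(query_low, query_high):
    """Return the children whose CHECK range overlaps [query_low, query_high)."""
    return [
        name
        for name, (low, high) in CHILDREN.items()
        # Two half-open ranges overlap when each starts before the other ends.
        if low < query_high and query_low < high
    ]
```

A query restricted to the afternoon of Oct 5 would touch only `readings_2005_10_05` in this model. A query with no time restriction would have to touch all three.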