Slashdot Mirror


UML, PostgreSQL Get Corporate Support

tcopeland writes "An article on NewsForge highlights some changes in the upcoming PostgreSQL release (v7.5) that are funded by Fujitsu. PostgreSQL core team member Josh Berkus says that "Tablespaces, Nested Transactions, and Java support" are being underwritten by Fujitsu; this has also been mentioned on the postgresql-hackers list. He also says that 7.5 will be "...the most significant new release of the software since version 7.0 almost four years ago". Good times for PostgreSQL users!" And ggoebel writes "Jeff Dike posted a notice to the UML [User-mode Linux] developers mailing list: 'The first bit of news is that as of last Monday, I am working for Intel. They generously offered a full-time position, off-site, with my time mostly spent on UML. This basically means that UML is no longer a part-time, after-hours thing for me, so we should start seeing more work happening on it, especially compared to the last month or two.'"

20 of 213 comments

  1. Solid stuff, that PostgreSQL... by tcopeland · · Score: 5, Interesting

    ...RubyForge has been running on it for almost a year now, no problems.

    Only a half million records and only about 75K queries a day, so it's not a huge DB... but it's definitely getting the job done.

  2. UML is no longer a part-time, after-hours thing by Anonymous Coward · · Score: 4, Funny

    This basically means that UML is no longer a part-time, after-hours thing for me

    You have my deepest sympathy.

  3. UML by Un+pobre+guey · · Score: 5, Informative
    OK, UML is User Mode Linux. Got it. No, no, I'm not confused, I get the coincidence with the other extremely widespread use of the acronym. No prob, Dude.

  4. Table spaces? by AKAImBatman · · Score: 5, Interesting

    Does this mean that PostgreSQL will actually be able to write *directly* to disk cluster? That would be one serious performance boost! My only request is that they do us all a favor and make sure that we can fragment the tables across spaces. It tends to suck when one table fills an entire drive, and it refuses to use all the space on the other drives.

    1. Re:Table spaces? by jadavis · · Score: 5, Informative

      "Tablespaces" allow you to put individual tables on different storage devices. Prior to tablespaces, an entire database had to be on one device*.

      You are referring to two completely different technologies:

      (1) "Writing directly to disk cluster" - By that you seem to mean direct disk access, not through the filesystem. I don't even think this is part of the PostgreSQL TODO, because there is just not a very strong need. Are you experiencing performance problems in this regard?

      (2) "fragment tables across spaces" - By that you mean "Table Partitioning". That allows you to break up a single table across multiple storage devices. That would be very valuable technology, but as far as I know, won't make 7.5.

      If all these features really work out for 7.5, they should call the release 8.0, and maybe they will.

      *: There are some tricks you can use if you need to move a single table to a different device prior to 7.5. I think symlinks work fine, but if it's important, I'd wait for 7.5 or ask on the -general list to make sure it's correct.

      --
      Social scientists are inspired by theories; scientists are humbled by facts.
    2. Re:Table spaces? by jadavis · · Score: 4, Insightful

      They are going to reconsider this if someone can write a caching system that can beat the os but so far that hasn't happened.

      It's a little more complicated I think. Using the filesystem has other advantages as well:

      (1) PostgreSQL can work well with other applications running. Let's say you invent the best caching algorithm possible, then you still have two separate caches, one for PostgreSQL and one for everything else. That means you have to dedicate the machine to PostgreSQL and have a high PostgreSQL cache (but any other app will suffer), or give postgres a low amount of cache space and it will suffer.

      (2) The postgres developers don't want to worry about the bugs involved in making their own filesystem. Also, who's to say they can make a filesystem as fast right off the bat? It might be a huge development effort, with relatively minor benefit for most people.

      --
      Social scientists are inspired by theories; scientists are humbled by facts.
    3. Re:Table spaces? by GooberToo · · Score: 4, Informative

      "Tablespaces" allow you to put individual tables on different storage devices. Prior to tablespaces, an entire database had to be on one device*.

      Strictly speaking, that's not true. You can move things around manually, and some have done so, but it's not pretty, not easy, and not easy to maintain. Implementation of tablespaces in PostgreSQL simply allows its users to easily do what was previously an arcane-voodoo art. So clearly, it's a big step up. But, you already knew that. ;)

      "Writing directly to disk cluster" - By that you seem to mean direct disk access, not through the filesystem. I don't even think this is part of the PostgreSQL TODO, because there is just not a very strong need. Are you experiencing performance problems in this regard?

      That's correct. AFAIK, there is no desire to implement raw partition support. The speed difference is minimal and the required code is large. Basically, you wind up writing a FS and associated buffer management into the database. The return generally is not very high. It used to be, many years ago. These days, filesystem technology and implementations are plenty fast. Those that want raw partition access, IMO, are simply living in the past.

      If all these features really work out for 7.5, they should call the release 8.0, and maybe they will.

      You are correct. According to the list, the numbering constantly goes back and forth. From what I gather, they are waiting to see what features actually make it in. Depending on the scope of changes, they'll then determine the version number. As a rule of thumb, people are calling it 7.5, simply because nothing else has been blessed.

      Please don't think I'm correcting what you've said. You've said nothing that I disagree with. I'm simply adding a followup remark. ;)

      Cheers!

    4. Re:Table spaces? by Just+Some+Guy · · Score: 4, Interesting
      By that you mean "Table Partitioning". That allows you to break up a single table across multiple storage devices.

      For the uninitiated and lazy, is there any compelling reason why that's better than putting the database files on a RAID and letting the OS split the table across devices?

      --
      Dewey, what part of this looks like authorities should be involved?
    5. Re:Table spaces? by kpharmer · · Score: 4, Informative

      > For the uninitiated and lazy, is there any compelling reason why that's better than putting the
      > database files on a RAID and letting the OS split the table across devices?

      Sure, you might want to distribute your data across multiple arrays. For example, keep your logs and tempspace on a fast & expensive RAID 0+1 array of fast (15k) drives. Then put small OLTP stuff on another RAID 0+1 array. Then put your huge graphic images, documents, etc. on a much more economical RAID 5 array.

      I use multiple arrays all the time for performance and economics (in DB2 & Oracle). It's cool to see postgres pick it up.
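A rough sketch of the layout described in this subthread: each storage array gets its own tablespace and individual tables are assigned to it. The statement forms (`CREATE TABLESPACE ... LOCATION` and `ALTER TABLE ... SET TABLESPACE`) are those of later PostgreSQL releases that include tablespace support; the syntax was not final when these comments were posted. The mount points, tablespace names, and table names are hypothetical, and the helper only builds SQL strings; it does not connect to a database.

```python
def tablespace_ddl(layout):
    """Build DDL strings from {tablespace: (directory, [tables])}."""
    statements = []
    for space, (directory, tables) in layout.items():
        # One tablespace per storage array.
        statements.append(f"CREATE TABLESPACE {space} LOCATION '{directory}';")
        for table in tables:
            # Move each listed table onto that array.
            statements.append(f"ALTER TABLE {table} SET TABLESPACE {space};")
    return statements


# Hypothetical mount points and table names, for illustration only.
layout = {
    "fast_oltp": ("/mnt/raid10/pgdata", ["orders"]),
    "bulk_docs": ("/mnt/raid5/pgdata", ["scanned_images"]),
}

for statement in tablespace_ddl(layout):
    print(statement)
```

Keeping the mapping in a plain dictionary makes it easy to review which table lands on which array before anything is run against a real server.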

  5. Re:UML is pretty awesome by gtrubetskoy · · Score: 5, Interesting
    It's really the future of "shared" webhosting because it balances the power of a full server against the cost of a shared one.

    I respectfully disagree. While UML gives you excellent isolation, it is an extremely inefficient way to virtualize your server since it does not take advantage (by design) of all the optimizations that UN*X provides. UML is great for kernel developers and applications where isolation is far more important than performance.

    In Linux virtual server hosting, the future will be Linux VServer Project

    (ok, I'm somewhat biased, I admit)

  6. Re:That's all fine and dandy, but... by jbolden · · Score: 4, Interesting

    Well, obviously if they are funding development then they will have influence on what gets worked on. What political agenda do you see Intel as likely to have on advancing user mode? It would seem to me that this is fairly typical of Intel software development for the last 15 years: making sure that there is publicly available code highlighting how to do cool things with their CPUs.

  7. Re:Good to Hear... by Twirlip+of+the+Mists · · Score: 4, Informative

    the primary DB System for so long has been MySQL. PHP coders don't have too much for an alternative

    Au contraire, there are PHP interfaces for PostgreSQL, Oracle, Sybase, and MSSQL built right into the source distribution. I seem to recall that back in the Bad Old Days before Mac OS X, when you had to compile things yourself, building PHP with all the necessary libraries was a huge pain, but now it's a trivial thing. Marc Liyanage maintains a PHP module package that snaps right into the built-in Apache web server on your Mac, and it already has most of the necessary bells and whistles built in.

    --

    I write in my journal
  8. All Welcome and expected - expect more.. by eamacnaghten · · Score: 4, Insightful
    This is great news, not only for the projects involved, but for FOSS in general.

    Also, this is consistent with the Open Source paradigm, where it is in the interests of companies to improve the software, and the advantages far outweigh the disadvantages of those improvements not being exclusive. It is this philosophy, in my opinion, that will beat proprietary software models such as Microsoft's, and it is these companies that are key in stopping those who want to halt the advancement of FOSS using idiotic patents and other invalid IP arguments.

    --

    Web Sig: Eddy Currents

  9. Re:That's all fine and dandy, but... by SIGALRM · · Score: 5, Insightful

    The true benefit of projects such as this is their independence from the big brother corporations

    You mean like Sun and HP funding the Apache group?

    Or Novell and Ximian underwriting the Mono Project?

    Or IBM contributing to F/OSS?

    Do you think these and other projects would be where they are today without the backing of serious money/resources?

    --
    Sigs cause cancer.
  10. Postgres is kicking butt by johnnyb · · Score: 5, Interesting

    and taking names. In addition to Fujitsu's additions, they are also doing point-in-time recovery. They have multiple replication solutions. It's an absolutely wonderful database to develop for.

    It's got several really cool features, such as the ability to create your own index types, the ability to create your own column types, the ability to create rules for updating views, and a lot of other things that make it an absolute joy to work with.

    The only thing I don't like about it is that it needs the ability to read bytea's as if they were BLOBs. Then life would be perfect!

    From Fujitsu's pile, tablespaces is the most interesting feature I see - and that's actually pretty cool. That's one of the things that really allows you to realize the logical/physical separation that relational databases promise.

  11. This rules! by Anonymous Coward · · Score: 5, Interesting

    I'm loading more than half a million records into a Postgres db on my iBook as I write this, and I gotta say that pgsql is cool as hell. The data type support alone (polygons?!?!) makes it worth the small amount of extra effort it takes to get it up and running.

    Postgres flat blows away MySQL in every way I can think of except for the fact that one has to "manually" vacuum (cleanup + reindex) the db ... but that's what cron is for. The only thing I miss from my MSSQL days is the ability to do on-the-fly data type changes on columns; this is actually a good thing because now I'm not so lazy about designing the db right in the first place. ;-)

    If you're out there playing with MySQL or MSSQL, you owe it to yourself to give Postgres a shot.
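For the "that's what cron is for" remark, here is a minimal sketch of generating a nightly crontab entry. It assumes the `vacuumdb` client program that ships with PostgreSQL is on the cron user's PATH; `--all`, `--analyze`, and `--quiet` are real `vacuumdb` options, while the 03:00 schedule and the helper name are arbitrary choices for illustration. Note that this reclaims dead rows and refreshes planner statistics; it does not rebuild indexes.

```python
def vacuum_cron_line(minute=0, hour=3,
                     command="vacuumdb --all --analyze --quiet"):
    """Return a crontab entry that runs `command` once a day."""
    # Crontab fields: minute hour day-of-month month day-of-week command
    return f"{minute} {hour} * * * {command}"


# Print the entry so it can be pasted into `crontab -e`.
print(vacuum_cron_line())
```

The function only formats the line; installing it in the right user's crontab (typically the account that owns the database cluster) is left to the administrator.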

  12. Good news! by rfernand79 · · Score: 5, Interesting

    Certainly good news! :) PostgreSQL is a very robust and complete database, enjoyed by many academic users (mostly because of its excellent implementation of different SQL standards...) It's nice to hear that a company is backing them up now. UML and Intel, really cool, too. It's not as good as Linus/OSDL, but definitely equivalent to the Linus/Transmeta years. So, in general, is this the road for the free world now? Backed up by powerful companies who also benefit? I certainly hope so.

  13. User-Mode Linux Management by Anonymous Coward · · Score: 5, Informative

    ...If you want to manage a lot of UML virtual machines, I _highly_ recommend UMLazi. It has a very slick configuration file format (configuration directories instead of a single file, which makes it really easy to manipulate with scripts), and they've obviously put a lot of thought into security.

    I had a few problems getting it started, but the developers were very helpful.

  14. Why corporate self-interest can be good for OSS by j.+andrew+rogers · · Score: 5, Interesting
    There is more than just Fujitsu supporting PostgreSQL, and the reason there is corporate interest is pure unadulterated self-interest of the best kind.

    Postgres is getting really close to the functionality and capabilities of the Big Commercial Enterprise DBMS, close enough that anyone can see that bridging that gap is quite doable. Most of the arguable weaknesses in Postgres are in the more esoteric high-end feature space, as it is already strong and quite feature complete for most routine RDBMS work. And the upcoming new version addresses a great many of those weaknesses. As the article said, this is going to be a major release.

    The self-interest part is that it is a HELL OF A LOT CHEAPER for a corporation to pay people to add those last few features and bits that they want to Postgres than to pay an unholy amount of money to buy the required Oracle licenses. The Postgres engine is clean and fundamentally pretty good in an engineering sense, and so enterprise feature tweaks are relatively cheap. It is all about dollars and sense at the end of the day. Purchasing Postgres plus feature development is almost always going to be vastly cheaper than buying Oracle. And unlike Oracle, it is pretty much a one-time fixed cost. It is worth repeating that the engineering strength and scalability of the underlying Postgres platform is the primary reason the market is evolving this way. The gap between MySQL and high-end RDBMS is comparatively much too great for a company to fund closing that gap because a lot of additional arguably unrelated work may be required because of the internals. This increases time to delivery of features, increases the cost of adding high-end features, and increases the risk of problems.

    If Oracle suddenly dropped its enterprise licensing costs by a couple of orders of magnitude, then it would seriously threaten Postgres development. But since that is unlikely to happen, corporate money will continue to flow into making Postgres a formidable Oracle replacement, which it is already well on its way to being.

  15. Re:what's the point? by sumbry · · Score: 4, Interesting

    You've obviously never run a large database before. While a single RAID partition is fine for most uses, when you get into situations where you measure queries by how many are run per second then things really start to hit the fan.

    Tablespaces allow you to do things like place a table that is 90 percent read and 10 percent write on one RAID array, put another table that is maybe 50 percent write and 50 percent read on another array, and then place the Postgres WAL on a completely different array.

    Table usage varies greatly across large databases. Some tables barely get touched, others get written to a lot, others get read from a lot.

    I'm currently running a database where our peak loads are around 35 queries per second. I've actually symlinked table locations to put my most heavily accessed tables on a separate RAID array from the rest of my database. This gave me a threefold increase in speed. This is especially noticeable when we do things like VACUUM the db.
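The symlink trick mentioned above can be sketched as follows. This is an illustration of the general move-and-link technique on throwaway temporary directories, not a procedure for a live database: on a real installation the server would need to be stopped first, and the file name used here is a hypothetical stand-in for a table's data file.

```python
import os
import shutil
import tempfile


def relocate_with_symlink(path, new_dir):
    """Move `path` into `new_dir` and leave a symlink at the old location."""
    target = os.path.join(new_dir, os.path.basename(path))
    shutil.move(path, target)   # put the file on the other "array"
    os.symlink(target, path)    # old path now points at the new location
    return target


# Two temporary directories stand in for the data directory and a second array.
with tempfile.TemporaryDirectory() as data_dir, \
        tempfile.TemporaryDirectory() as fast_array:
    table_file = os.path.join(data_dir, "16384")  # hypothetical file name
    with open(table_file, "w") as handle:
        handle.write("table data")
    relocate_with_symlink(table_file, fast_array)
    print(os.path.islink(table_file))
```

Anything that opens the original path keeps working because the symlink resolves to the moved file, which is why the trick is invisible to the database but also why it is fragile to maintain by hand.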