PostgreSQL 9.2 Out with Greatly Improved Scalability

← Back to Stories (view on slashdot.org)

PostgreSQL 9.2 Out with Greatly Improved Scalability

Posted by Unknown on Monday September 10, 2012 @11:07AM from the rev-your-engines dept.

The PostgreSQL project announced the release of PostgreSQL 9.2 today. The headliner: "With the addition of linear scalability to 64 cores, index-only scans and reductions in CPU power consumption, PostgreSQL 9.2 has significantly improved scalability and developer flexibility for the most demanding workloads. ... Up to 350,000 read queries per second (more than 4X faster) ... Index-only scans for data warehousing queries (2–20X faster) ... Up to 14,000 data writes per second (5X faster)" Additionally, there's now a JSON type (including the ability to retrieve row results in JSON directly from the database) ala the XML type (although lacking a broad set of utility functions). Minor, but probably a welcome relief to those who need them, 9.2 adds range restricted types. For the gory details, see the what's new page, or the full release notes.

10 of 146 comments (clear)

Min score:

Reason:

Sort:

Re:/. Poll by Eponymous+Hero · 2012-09-10 11:24 · Score: 4, Insightful

E) stop using oracle and start using postgres

--
insensitive clod overlords obligatory xkcd car analogy russian reversals whoosh pedant fanbois ftfy in 3...2...1..PROFIT
Re:That's great and all, but . . . by Anonymous Coward · 2012-09-10 12:31 · Score: 5, Informative

9.3. Seriously.
http://rhaas.blogspot.com/2012/06/absurd-shared-memory-limits.html
Re:That's great and all, but . . . by dragonk · 2012-09-10 13:17 · Score: 5, Informative

I just posted this to the blog, but I will repeat it here --
There is a very good reason we OS vendors do not ship with SysV default limits high enough to run a serious PostgreSQL database. There is very little software that uses SysV in any serious way other than PostgreSQL and there is a fixed overhead to increasing those limits. You end up wasting RAM for all the users who do not need the limits to be that high. That said, you are late to the party here, vendors have finally decided that the fixed overheads are low enough relative to modern RAM sizes that the defaults can be raised quite high, DragonFly BSD has shipped with greatly increased limits for a year or so and I believe FreeBSD also.
There is a serious problem with this patch on BSD kernels. All of the BSD sysv implementations have a shm_use_phys optimization which forces the kernel to wire up memory pages used to back SysV segments. This increases performance by not requiring the allocation of pv entries for these pages and also reduces memory pressure. Most serious users of PostgreSQL on BSD platforms use this well-documented optimization. After switching to 9.3, large and well optimized Pg installations that previously ran well in memory will be forced into swap because of the pv entry overhead.
Re:That's great and all, but . . . by cas2000 · 2012-09-10 13:40 · Score: 4, Funny

you atheists love to take all the fun out of things, don't you?
Eliminate the human sacrifice now and next you'll be saying we have to get rid of our Steve Jobs altars.
Range data types by slack_justyb · 2012-09-10 13:41 · Score: 5, Interesting

I think everyone has glossed over the single most important feature in the Postgre SQL that they have refined in this release, IMHO. Ranged data types. Let's say you have a meeting schedule DB application. Well currently if you want to restrict a room between two times (start and stop) so that no one else can have the room during that time, you are going to have to write that logic in your application.

Postgre's range data type allows you to create unique checks on ranges of time. This can in two lines of code, do every single logic check that is needed to ensure no two people schedule the same room at the same time.

How this is not showing up on anyone's radar is beyond me, or maybe we all just use Outlook or Google Calendar now. However, the range types are not just limited to the application of time, but of anything that requires uniqueness along a linear fashion, as opposed to just checking to see if any other record matches the one that you are trying to insert.
Postgres-Curious by kwalker · 2012-09-10 13:42 · Score: 4, Interesting

TL;DR: Is there an advanced PostgreSQL for MySQL Users guide out there somewhere? Something more than basic command-line equivalents? And preferably from the last two major releases of the software?
Long version
I've been using MySQL personally and professionally for a number of years now. I have setup read-only slaves, reporting servers, multi-master replication, converted between database types, setup hot backups (Regardless of database engine), recovered crashed databases, and I generally know most of the tricks. However I'm not happy with the rumors I'm hearing about Oracle's handling of the software since their acquisition of MySQL's grandparent company, and I'm open to something else if it's more flexible, powerful, and/or efficient.
I've always heard glowing, wonderful things online about PostgreSQL, but I know no one who knows anything about it, let alone advanced tricks like replication, performance tuning, or showing all the live database connections and operations at the current time. So for any Postgres fans on Slashdot, is there such a thing as a guide to PostgreSQL for MySQL admins, especially with advanced topics like replication, tuning, monitoring, and profiling?

--
... And so it comes to this.
1. Re:Postgres-Curious by Art3x · 2012-09-10 15:14 · Score: 4, Informative
  
  PostgreSQL replication is new (revision 9.1) so there may be little out there (Yes, there was replication, but with additional software, like Slony).
  I'm in the weird position of having used PostgreSQL mainly --- for seven years, writing dozens of applications --- but never MySQL. I've also used --- out of necessity only --- Microsoft SQL, Oracle, and Ingres, and PostgreSQL is much better. Just from a programming point of view, the syntax is, in my mind, simpler yet more powerful --- more ANSI-SQL-compliant, too, I've heard.
  Anyway, the point is, I've never used anything I like more. I adore PostgreSQL. It's so powerful. So many useful datatypes, functions, syntax. Not to mention it's ACIDity.
  To your question, though --- are there any good books to help a MySQLite move to PostgreSQL? Not that I've come across. But then again, I haven't found any good PostgreSQL books --- or even, for that matter, very well-written SQL books, period. They all are stupefyingly boring --- but I got what I could out of them.
  Actually, PostgreSQL's documentation is not that bad. In particular, try sections I, II, V, VI, and III, in that order. Skip anything that bores you at first. You can always come back. Honestly, there can't be that much of a learning curve for you, coming from MySQL.
2. Re:Postgres-Curious by rycamor · 2012-09-10 17:17 · Score: 4, Informative
  
  Unfortunately, I haven't found a really good guide of the type you are looking for. I can give you my experiences, going from MySQL to PostgreSQL, back to MySQL to support it at a large company, and then back to PostgreSQL. Generally, these days there is really *nothing* that I can find about MySQL that can't be done better in PostgreSQL. I mean it. At least for awhile MySQL could boast of native replication, but Postgres biw has that and it is arguably much more robust than MySQL's solution (had the misfortune to support MySQL replication for 2 years). Ditto with full-text indexing, and just about any other MySQL feature.
  Main differences:
  1. PostgreSQL is much more "correct" in how it handles data and has very little (essentially no) unpredictable or showstoppingly odd behavior of the sort you find in MySQL all the time. Your main problem in migrating an app to PostgreSQL will be all those corner cases that MySQL just "accepts" when it really shouldn't, such as entering '0000-00-00' into a date field, or allowing every month to have days 0-31. In other words, PostgreSQL forces you to be a lot more careful with your data. Annoying, perhaps, if you are developing a non-mission-critical system like a web CMS or some such, but absolutely a lifesaver if you deal with data where large numbers of dollars and cents (or lives) depend on correct handling.
  MySQL has provided for a fair amount of cleanup for those who enable ANSI standard behavior, but it is still nowhere close to PostgreSQL's level of data integrity enforcement.
  2. MySQL has different table types, each of which support different features. For example, you cannot have full-text indexing in InnoDB (transactional) tables. PostgreSQL has complete internal consistency in this regard.
  3. MySQL has an almost entirely useless error log. PostgreSQL's can be ratcheted up to an excruciating level of detail, depending on what you want to troubleshoot. Ditto with error messages themselves.
  4. MANY MANY more choices in datatypes and functions to manipulate them. Definitely a higher learning curve, but worth it for expressive capability.
  5. Don't get me started on performance. Yes, if you have a few flat tables, MySQL will be faster. Once you start doing anything complicated, you are in for a world of pain. Did you know that MySQL re-compiles every stored procedure in a database on every new connection? PHP websites with per-page-load connections can really suffer.
  6. Don't get the idea that PostgreSQL is more complex to work with. If you want simple, you can stick with the simple parts, but if you want to delve into complex database designs and methodologies, PostgreSQL pretty much opens up the world to you.
  - Glad to be back in the PostgreSQL world...
Re:That's great and all, but . . . by schmiddy · 2012-09-10 17:02 · Score: 4, Insightful

There is a serious problem with this patch on BSD kernels. All of the BSD sysv implementations have a shm_use_phys optimization which forces the kernel to wire up memory pages used to back SysV segments. This increases performance by not requiring the allocation of pv entries for these pages and also reduces memory pressure. Most serious users of PostgreSQL on BSD platforms use this well-documented optimization. After switching to 9.3, large and well optimized Pg installations that previously ran well in memory will be forced into swap because of the pv entry overhead.

I don't see your comment on the blog (maybe it has to be approved?), but the same issue was raised here during review of the patch. The concern was mostly blown off (most PG developers use Linux instead of BSD, that might well be part of it), but if you had some numbers to back up your post, the -hackers list would definitely be interested. Ideally, you could give numbers and a repeatable benchmark showing a deterioration of 9.3-post-patch vs. 9.3-pre-patch on a BSD. If that's too much work, just the numbers from a dumb C program reading/writing shared memory with mmap() vs. SysV would be a good discussion basis.

--
http://cltracker.net -- powerful craigslist multi-city search
Re:How PostgreSQL stacks up to Oracle ? by rycamor · 2012-09-10 17:04 · Score: 5, Informative

Generally there is very little in the sense of logical data manipulation capabilities in which Oracle exceeds PostgreSQL (usually the opposite, actually). The main advantage Oracle has is in the extreme high end of scalability and replication, and that benefit is offset by massive complexity in setup and configuration. Even there, PostgreSQL is closing fast these days, with built-in streaming replication, table partitioning, and all sorts of high-end goodies.
I do all sorts of PostgreSQL consulting, and you would be surprised at the number of large companies and government organizations considering migration from Oracle to PostgreSQL.
And if you *really* need PostgreSQL to go into high gear, just pay for the commercial Postgres Plus Advanced Server from EnterpriseDB and you will get a few heavy-duty add-ons, including an Oracle compatiblity layer.
Also, IMHO one of the really cool things about PostgreSQL is the number of very geeky tools it puts at your disposal, such as a rich library of datatypes and additional features, along with the ability to create your own user-defined datatypes.