Slashdot Mirror


New PostgreSQL Guns For NoSQL Market

angry tapir (1463043) writes "Embracing the widely used JSON data-exchange format, the new version of the PostgreSQL open-source database takes aim at the growing NoSQL market of nonrelational data stores, notably the popular MongoDB. The first beta version of PostgreSQL 9.4, released Thursday, includes a number of new features that address the rapidly growing market for Web applications, many of which require fast storage and retrieval of large amounts of user data."

88 of 162 comments

  1. next for NoSQL by SchroedingersCat · · Score: 5, Insightful

    Next, NoSQL databases will add schema and ACID support and the circle will be complete.

    1. Re:next for NoSQL by cowwoc2001 · · Score: 2

      Impossible.

      The entire premise behind NoSQL is trading consistency for availability (which actually means "latency" since everything is eventually available). You will never ever get ACID from NoSQL databases.

    2. Re:next for NoSQL by bdares · · Score: 4, Informative

      "NoSQL" doesn't mean "No SQL". At least, not all the time. I've heard it pronounced "Not Only SQL" more than once. RDF triple stores can also be considered NoSQL databases, and they can provide ACID. (They use SPARQL instead of SQL as a query language - hence being something other than a SQL DB.)

    3. Re:next for NoSQL by greg1104 · · Score: 5, Interesting

      All "NoSQL" means is that the database doesn't use SQL as its interface, nor the massive infrastructure needed to implement the SQL standard. This lets you build some things that are lighter than SQL-based things, like schemaless data stores. There are several consistency models that let you have a fair comparison. It's not the case that NoSQL must trade consistency for availability in a way that makes it impossible to move toward SQL spec behavior.

      Differences include:

      • Less durability for writes. Originally PostgreSQL only had high durability, NoSQL low. Now both have many options going from commit to memory being good enough to move on, up to requiring multiple nodes get the data first.
      • No heavy b-tree indexes on the data. Key-value indexes are small and efficient to navigate.
      • No complicated MVCC model for complicated read/write mixes.

      Today NoSQL solutions like MongoDB still have a better story for sharding data across multiple servers. NoSQL also gives you flexible schemaless design, scaling by adding nodes, and simpler/lighter queries and indexes.

      PostgreSQL is still working on a built-in answer for multi-node sharding. A lot of the small NoSQL features have been incorporated, with JSON and the hstore key-value index being how Postgres does that part. Both systems have converged so much that either one is good enough for many different types of applications.

    4. Re:next for NoSQL by Slackus · · Score: 1

      Impossible.

      The entire premise behind NoSQL is trading consistency for availability (which actually means "latency" since everything is eventually available). You will never ever get ACID from NoSQL databases.

      There are already NoSQL databases supporting ACID: http://ravendb.net/

    5. Re:next for NoSQL by cjc25 · · Score: 2

      the SQL standard.

      That's cute

    6. Re:next for NoSQL by fuzzytv · · Score: 1

      Except that indexes are only BASE. Good luck with querying it ....

    7. Re:next for NoSQL by Anonymous Coward · · Score: 2, Interesting

      "Schemaless design" always just sounds like a whitewashed buzzword for "Excel spreadsheet" to me.

      There's a very simple way to make a "schemaless design" within a relational database, and it's generally regarded as Not Best Practice (tm). You need a table with a unique PK (any old GUID or autoincrementing integer will do just fine), a FK to whatever bit of "real indexing" you need (user id or whatever), and two string fields (varchar, nvarchar, character varying, whatever your RDBMS likes to call them). One holds the "key" and the other holds the "value". Now, you need to create an index on the FK. Not a unique index, just a nonclustered, ordinary index. It really is a shit way to store data, but that's why it's Not Best Practice (tm). And now you've just reinvented "NoSQL". And best of all, you can use a real, set-theory-based data retrieval language (that is, SQL) to retrieve it! Of course, you lose all of the advantages of that very well-thought-out language by throwing all of your data into a shit-heap, but hey, you're a web designer, it's not like you're smart enough to make a query that does anything beyond "SELECT * FROM ShitHeap WHERE UserID = @UserID" anyway.
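
      A minimal sketch of the key/value table described above, assuming SQLite through Python's sqlite3 module as a stand-in RDBMS. The table and column names (users, user_attributes, attr_key, attr_value) are made up for illustration and do not come from the comment.

```python
import sqlite3


def build_key_value_store():
    """Create the generic key/value table (all names are hypothetical)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE user_attributes (
            id         INTEGER PRIMARY KEY AUTOINCREMENT,      -- surrogate PK
            user_id    INTEGER NOT NULL REFERENCES users(id),  -- FK to the "real" entity
            attr_key   TEXT NOT NULL,                          -- the "key" string
            attr_value TEXT                                    -- the "value" string
        );
        -- ordinary, non-unique index on the FK
        CREATE INDEX idx_user_attributes_user ON user_attributes(user_id);
        """
    )
    return conn


def attributes_for(conn, user_id):
    """Fetch every key/value pair stored for one user as a dict."""
    rows = conn.execute(
        "SELECT attr_key, attr_value FROM user_attributes "
        "WHERE user_id = ? ORDER BY id",
        (user_id,),
    )
    return dict(rows.fetchall())


def demo():
    conn = build_key_value_store()
    conn.execute("INSERT INTO users (id, name) VALUES (1, 'alice')")
    conn.executemany(
        "INSERT INTO user_attributes (user_id, attr_key, attr_value) "
        "VALUES (?, ?, ?)",
        [(1, "theme", "dark"), (1, "lang", "en")],
    )
    return attributes_for(conn, 1)
```

      Every value comes back as a string, so any typing or validation has to live in the application; that is the trade-off the comment is complaining about.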

      Of course, there are advantages to a shit-heap, which the NoSQL fanboys will no doubt express vehemently about 10 seconds after I make this post. But why would you bother with one when you're already incurring the overhead of running PostgreSQL? You have the power, and you have the system set up to handle that load. Why dumb it down? Because you're dumb? Not likely. Even the dumbest of managers know when to hire an expert.

      This just reeks of "me too!" on the part of PostgreSQL. Nobody that feels a need to use NoSQL is going to consider using PostgreSQL for that task, and nobody that uses PostgreSQL is going to feel the need to use Not Best Practices (tm) in their RDBMS schema. It's a solution in search of a problem, and it's going to flop. Don't invest your time, energy, or money in it, because it will be abandoned for non-use in a year or two.

    8. Re:next for NoSQL by drinkypoo · · Score: 1

      Nobody that feels a need to use NoSQL is going to consider using PostgreSQL for that task

      Unless they're already using Postgres for something else, and the server is already running, and they'd like to be able to use all the same tools and scripts to manage their NoSQL databases as their PostgreSQL databases. In which case, this could be a big help.

      Why so negative? Just a hobby?

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
    9. Re:next for NoSQL by fuzzytv · · Score: 1

      Except that the transactions in MongoDB can touch only a single document. Which kinda makes the whole ACID idea pointless, because that's about consistency of the whole database. Saying "it's ACID, but only within a single document" is a bit like "you can have any color, as long as it's black".
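
      To make the "whole database" point concrete, here is a small sketch of a transaction that spans two rows, assuming SQLite via Python's sqlite3 module as a stand-in. The accounts table and the transfer helper are invented for the example.

```python
import sqlite3


def make_bank():
    """Two accounts, with a CHECK constraint that forbids negative balances."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        " name TEXT PRIMARY KEY,"
        " balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
    )
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move money between two rows; either both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The debit violated the CHECK constraint, so the credit is undone too.
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))
```

      With these helpers, an oversized transfer such as transfer(conn, "alice", "bob", 500) returns False and leaves both rows as they were, even though the credit to bob ran first.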

      I'm not sure about CouchDB - I know it used the same approach (single-document transactions), but maybe that changed a bit.

      One of the absolutely terrible things coming from the whole NoSQL movement is redefinition of existing terms. "Consistency" is a great example of that, "availability" is another one.

    10. Re:next for NoSQL by Bill, Shooter of Bul · · Score: 4, Insightful

      http://en.wikipedia.org/wiki/S...

      "Popular implementations of SQL commonly omit support for basic features of Standard SQL, such as the DATE or TIME data types. The most obvious such examples, and incidentally the most popular commercial and proprietary SQL DBMSs, are Oracle (whose DATE behaves as DATETIME,[30][31] and lacks a TIME type)[32] and MS SQL Server (before the 2008 version). As a result, SQL code can rarely be ported between database systems without modifications."

      That's why it's cute.

      --
      Well.. maybe. Or Maybe not. But Definitely not sort of.
    11. Re:next for NoSQL by Anonymous Coward · · Score: 1

      MyISAM was considered ACID and it did table level locking, so I don't see why we should now disqualify MongoDB because it does document level locking.

      "Consistency" is a great example of that, "availability" is another one.

      Neither of those terms has been redefined. Hell, one of the key features of most NoSQL databases is "Eventually consistent", and I have no idea what you think you mean by "availability": I've had far, far better experiences clustering MongoDB and Redis than I ever have with (eurgh) MySQL or Postgres.

    12. Re:next for NoSQL by Bengie · · Score: 1

      It supports a small subset of cases that can be ACID.

    13. Re:next for NoSQL by Anonymous Coward · · Score: 1

      MyISAM was considered ACID

      Hahahahahahahahahaha.

    14. Re:next for NoSQL by umdesch4 · · Score: 1

      Dammit, I wish I had mod points for this. It restored my faith in humanity a little bit...so thanks for that.

    15. Re:next for NoSQL by Man Eating Duck · · Score: 1

      Yeah, it's irritating, but should not be an insurmountable obstacle for migrating schema+data. I have done this thrice for a database consisting of 40 tables and about 2.7 million rows total (DB2 -> PostgreSQL 7.something, DB2 -> SQL Server 2005, and SQL-Server -> PostgreSQL 9.1). Yes, I know that those numbers are small-ish, but the data contained a lot of user input and included every quirk and special character under the sun :)

      This database was storage for a Java application using Hibernate, which likely evaded some obstacles (see below). The procedure took a couple of days in each case: export schema, change a few data types in the schema (bool/int, date/datetime and so on), after which the inserts work if you pay attention to string escapes, encoding, and so on. I scripted the conversion in each case, so that every test iteration started from a fresh dump. I tested it by exporting all data from both DBs to native language data types and diffing the results.
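
      The "export both sides to native data types and diff" check can be sketched roughly as follows. This assumes two SQLite connections standing in for the old and new databases; dump_table, diff_dumps and the notes table are hypothetical names, not the poster's actual scripts.

```python
import sqlite3


def dump_table(conn, table, key_column):
    """Return {primary key: row tuple} using plain Python values.

    The table and column names are interpolated directly, so only pass
    trusted identifiers.
    """
    cursor = conn.execute(f"SELECT * FROM {table} ORDER BY {key_column}")
    columns = [description[0] for description in cursor.description]
    key_index = columns.index(key_column)
    return {row[key_index]: tuple(row) for row in cursor}


def diff_dumps(old, new):
    """Report keys that vanished, appeared, or changed between two dumps."""
    shared = set(old) & set(new)
    return {
        "missing": sorted(set(old) - set(new)),
        "extra": sorted(set(new) - set(old)),
        "changed": sorted(key for key in shared if old[key] != new[key]),
    }


def demo():
    source = sqlite3.connect(":memory:")
    target = sqlite3.connect(":memory:")
    for conn in (source, target):
        conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT)")
    source.executemany(
        "INSERT INTO notes VALUES (?, ?)",
        [(1, "plain"), (2, "it's quoted"), (3, "naïve café")],
    )
    # Simulate a migration that mangled an apostrophe, lost row 3,
    # and picked up a stray row 4.
    target.executemany(
        "INSERT INTO notes VALUES (?, ?)",
        [(1, "plain"), (2, "its quoted"), (4, "stray")],
    )
    return diff_dumps(
        dump_table(source, "notes", "id"), dump_table(target, "notes", "id")
    )
```

      An empty report on all three lists is the signal that escapes and encodings survived the trip.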

      Of course you can get in a lot of trouble if your client software is not using abstraction for db access; I suppose that some software contains quite a lot of literal SQL in the source, and SQL syntax differs in amusing ways. In some cases I can see no other reason for it than "because fuck you, that's why". Also, if your client software relies on non-standard features of a specific RDBMS you might have to rewrite to account for that.

      So yes, I would very much like for all vendors to have a standard-compliant default mode, from which exported data would seamlessly import into other RDBMS's (re. sig, how do you write that correctly?). Sadly, most vendors (apart from OSS alternatives like PostgreSQL which should bend over backwards to make migration easier) have no interest in making it easier to switch to another RDBMS, so this will never happen.

      Granted, I was part of the team developing said application, and DB portability was something of a pet peeve of mine, for which I was very thankful during the migrations. Due to that we had few DB-related issues during those migrations. I'm not even a DB-admin by trade, so most real DB-admins should have no problems doing what I did.

      --
      Are you a grammar Nazi? I'm trying to improve my English; please correct my errors! :)
    16. Re:next for NoSQL by Man Eating Duck · · Score: 1

      While your parent *was* a bit snarky in his reply, I can see only two reasons why you would try to finagle your NoSQL needs into a PostgreSQL server: you don't understand how to use a traditional RDBMS but would still like to advertise that you're using PostgreSQL instead of MongoDB (not likely for most devs), or the decision is made for you by management for the reasons you mention. If you need some NoSQL solution in a new project it's not very difficult to create an instance, and the infrastructure for the future production DB of your project should be a consideration based on your needs, not what is incidentally already there (hey, it's a DB, it should do the job, right?). Right tool for the job.

      For the record I am very used to working with traditional SQL databases, and I particularly like PostgreSQL. Still I know there are lots of use cases for the various NoSQL DBs. They are different beasts, some of which are tailored for very specific applications. I haven't scrutinised the new features of PostgreSQL, but if a NoSQL db were a better fit for the project I would need strong reasons not to go for it.

      *Analogy warning* If you have to change a large amount of Torx screws, you could probably accomplish it with a flat blade screwdriver of an approximate size if that's what you have in your shed, but it might save you a lot of destroyed screws to buy a Torx driver instead.

      --
      Are you a grammar Nazi? I'm trying to improve my English; please correct my errors! :)
  2. Re:Nice touch but too late! by Bryan Ischo · · Score: 1, Insightful

    References, please.

    I have a feeling you can't produce anyway, because relational databases are still widely used.

  3. Re:Nice touch but too late! by mwvdlee · · Score: 5, Insightful

    By "industry" you mean the 0.001% of websites that could actually benefit from NoSQL?
    How many sites you visit use NoSQL? Do most webshops, blogs, news sites and forums? Does Slashdot?

    --
    Slashdot social media options: AIM, ICQ, Yahoo, Jabber and Mobile Text. Why no MySpace?
  4. How about Parallel Query Execution? by wispoftow · · Score: 4, Interesting

    NB: I love PostgreSQL with all my heart. I always upgrade to the most recent version, because they implement features that I really need. Added to the existing features of Postgres, it's totally awesome.

    But as I have moved toward "Big Data" and the market segment that these new-fangled (non-relational) databases target, I find myself wishing that Postgres would be able to run my vanilla query (*singular*) using all processors. As it is now, I have to either write some awful functions that query manually-partitioned subtables, or simply wait while it plods through all billion or so rows.

    1. Re:How about Parallel Query Execution? by mlyle · · Score: 5, Interesting

      Look at Postgres-XL that we just released. It's a clustered database and can do MPP execution of your queries and has good write-scalability too. (To use all the processors in each machine, you'll want to have a few data nodes per machine.) It's pretty clever about planning a lot of things.

    2. Re:How about Parallel Query Execution? by CadentOrange · · Score: 1

      Do you know if there are any plans to merge Postgres-XL into vanilla Postgres?

    3. Re:How about Parallel Query Execution? by Anonymous Coward · · Score: 1

      Do you really have enough I/O bandwidth in the machine to keep multiple processors busy on a single query?

    4. Re:How about Parallel Query Execution? by Wootery · · Score: 1

      I like the way the linked page uses Web 2.0 when it means scalability.

      Great job with the buzzwords.

    5. Re:How about Parallel Query Execution? by KingSkippus · · Score: 2

      I like the way the linked page uses Web 2.0 when it means scalability.

      Great job with the buzzwords.

      You know, I was just going to let this go, chalked up as random Internet stranger being an asshat, but seriously. Are you SO bored or jealous of other people's achievements that you have nothing better to do than to sit around and nitpick the friggin' ad copy of a marketing page that was undoubtedly written not just for people who want to know the technical specifications of the product, but common usage applications for it also? What you're calling a "buzzword" is information that business wonks need to know when faced with the question, "Will this solve my problem/fulfill my needs?"

      When you develop your own database system, you can write your own ad copy to say whatever you want it to. Or if you prefer, apply for a job at Postgres as their chief marketing guru, and if they're dumb enough to hire you, you can write its ad copy to be purely technical-oriented until the product is completely irrelevant in an actual production environment. ("Now for OS/2 Warp and BeOS!") Otherwise, forgive me if I don't put much weight into your opinion on the matter over the people who have written a kick-ass enterprise-quality system that is pretty much given away for free.

      Seriously, what exactly are you implying by your comment, that PostgreSQL isn't a capable database system? That they just use buzzwords instead of actual technical brainpower and muscle as the basis of their software? Because I can tell you that to people who architect, engineer, administer, and eat database systems for breakfast, you are sadly off-base here, and this comment comes off as extremely pompous and ignorant.

    6. Re:How about Parallel Query Execution? by fuzzytv · · Score: 1

      So what exactly does 'web 2.0' mean? Because I can tell you it's a completely vague term, used to create hype around so many disparate concepts it lost all the meaning it once had. And even if you manage to come up with some definition, do you really think the business wonks will understand it / should be responsible for choosing the technology?

      Wootery implied nothing about Postgres being an incapable database system, just that 'web 2.0' is a buzzword. That says nothing about Postgres (or rather Postgres-XL, because that's what the website is about). And IMO he's right. Also, the ad hominem arguments are annoying.

      BTW Postgres is not a company. It's an open-source project, with a community developing it, so there's no 'chief marketing guru' position to apply for.

    7. Re:How about Parallel Query Execution? by mlyle · · Score: 2

      Really premature, and unlikely in any event.

      I think Postgres is good for what it is-- a clean, single-node database system. Clustering adds some complexity in deployment (we're working on making this easier) that you wouldn't want to incur for a typical Postgres install.

      I think there are pieces of Postgres-XL that make sense to be in core/vanilla PostgreSQL, and we'll be working to contribute them upstream. Likewise, there are more pieces from TransLattice's commercial database offering, TED, that Postgres-XL could benefit from that we intend to contribute.

    8. Re:How about Parallel Query Execution? by Wootery · · Score: 1

      Are you SO bored or jealous of other people's achievements that you have nothing better to do than to sit around and nitpick the friggin' ad copy of a marketing page that was undoubtedly written not just for people who want to know the technical specifications of the product, but common usage applications for it also?

      I'm a techie. Like many, I find vacuous marketing to be grating. That's really all I was trying to get across. It's a very minor detail, and a very minor dig. I'm not attacking your product.

      (I can play at quasi-psychoanalysis though, if that's the game: are you so insecure about your product you have to rant at snarky Slashdotters, and imagine further insult which isn't there?)

      What you're calling a "buzzword" is information that business wonks need to know when faced with the question, "Will this solve my problem/fulfill my needs?"

      Other than a mention of JSON support, nowhere on the page are there any web-specific points being made. The section labelled 'Web 2.0' then goes on to discuss scalability. I get that scalability is what's being hinted at, but really this situation should be reversed.

      I wasn't commenting about Postgres-XL, or PostGres. Indeed, I warm to Postgres - a rare example of a 'proper, grown-up DBMS' that prioritises correctness - and I like the general look of Postgres-XL.

    9. Re:How about Parallel Query Execution? by mlyle · · Score: 1

      Sure, and it inherits from Postgres-XC significant code.

      However, it has a few things Postgres-XC doesn't. First, if you write a complicated join, Postgres-XC sends the entire tables involved in the join back to the coordinator node, which then promptly dies/grinds to a halt. Postgres-XL has a planner that is able to A) push down much more of complicated queries, and B) allow tuples involved in sophisticated queries to flow directly between the different data nodes.

      Postgres-XL's planner is also able to run large queries in parallel across the entire system (MPP).
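
      As a toy illustration of why pushing work down to the data nodes matters, the sketch below compares two ways of computing an average over rows spread across nodes. It uses a plain aggregate rather than a join, the "nodes" are just Python lists, and none of it reflects how the Postgres-XL planner is actually implemented.

```python
def coordinator_only_average(nodes):
    """Naive plan: ship every row to the coordinator, then aggregate there.

    Returns (average, number of items sent over the wire).
    """
    shipped = [row for node in nodes for row in node]
    return sum(shipped) / len(shipped), len(shipped)


def pushed_down_average(nodes):
    """Push-down plan: each node returns a single (sum, count) pair.

    Returns (average, number of items sent over the wire).
    """
    partials = [(sum(node), len(node)) for node in nodes]
    total = sum(partial_sum for partial_sum, _ in partials)
    count = sum(partial_count for _, partial_count in partials)
    return total / count, len(partials)


# Three hypothetical data nodes holding six rows between them.
NODES = [[10, 20, 30], [40, 50], [60]]
```

      Both plans give the same answer for NODES, but the first moves six rows to the coordinator and the second moves three small partial results; with billions of rows that gap is what keeps the coordinator from grinding to a halt.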

      Finally, we've done a fair bit both to improve stability/maturity and to eliminate various kinds of bottlenecks that were limiting OLTP performance. We also have some small additional features (multitenancy-- hiding stuff from the catalog people shouldn't see and making databases true isolated containers). On the other hand, we are also missing some features that Postgres-XC has because of our focus on stability and production readiness.

    10. Re:How about Parallel Query Execution? by mlyle · · Score: 1

      You can try it in the cloud for free for small datasets/poke around by signing up at http://stormdb.com/ .

      The database server itself is a little harder to deploy than PostgreSQL, but we have a fairly decent set of documentation (feedback appreciated) at http://files.postgres-xl.org/d... . As far as application support goes--- if it runs on Postgres, odds are it will run on Postgres-XL.

  5. No SQL by TechyImmigrant · · Score: 2

    It would be nice if noSQL databases adhered to the promise in the name and replaced the query language with something sane and secure.

    --
    I should use this sig to advertise my book ISBN-13 : 978-1501515132.
    1. Re:No SQL by bhcompy · · Score: 1

      Depends on what you mean by safe. Pick ENGLISH queries are strictly read only.

    2. Re:No SQL by shrewdsheep · · Score: 1

      All databases are relational (noSQL or otherwise). SQL formalizes some aspects of relational algebra but this does not imply anything about the implementation nor necessarily about the interface. If you like "simpler" interfaces use ODBC/ORMs on SQL or noSQL databases.

    3. Re:No SQL by jbolden · · Score: 1

      No they aren't. Object databases are not relational they are hierarchical. Associative databases are associative, not relational. And in many of the "NoSQL" databases the partitioning scheme changes the outcome of queries, which is a complete violation of the relational concept that the order of rows doesn't affect the return from invoking queries.

    4. Re:No SQL by Anonymous Coward · · Score: 1

      So what's wrong with SQL? I've honestly never found myself thinking it sucks as a query language, and the parts I did think sucked outright monkey balls were to do with package-specific functions (Oracle sucks bad for this in places...)

      I hear this complaint a lot but it's never really quantified by those who say it and I'd love to know why. If what people meant to say is 'I hate having to organise my data into relational tables and hate having to deal with the process of upkeep and curating my data' then yes, that is a fair comment, but that's more about the database's mode of operation itself rather than its query language.

    5. Re:No SQL by Anonymous Coward · · Score: 2, Interesting

      The main problem is that SQL sucks.

      Compared to what? I'm not sure you have any idea what kinds of inflexible horrors preceded the relational systems. Furthermore, SQL is based on relational algebra, which underlies the whole RDBMS concept; if you need a data manipulation language closely matching the capabilities of an RDBMS, it has to be set-oriented and based on relational algebra (or relational calculus, which is equivalent). And there you have the root cause of the problem: a serious impedance mismatch between a set-oriented query language and a regular imperative programming language (OO notwithstanding.)
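
      The mismatch can be seen in miniature by answering the same question twice, once as a single set-oriented statement and once as an imperative loop over rows. This is only a sketch, assuming SQLite via Python's sqlite3 module; the orders table and both function names are invented for the example.

```python
import sqlite3


def make_orders():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (customer TEXT, amount INTEGER)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [("ann", 5), ("bob", 7), ("ann", 11), ("cy", 2)],
    )
    return conn


def totals_set_oriented(conn):
    """Describe the result set; the database decides how to compute it."""
    query = (
        "SELECT customer, SUM(amount) FROM orders "
        "GROUP BY customer HAVING SUM(amount) > 5"
    )
    return dict(conn.execute(query))


def totals_imperative(conn):
    """Pull every row into the program and do the grouping by hand."""
    totals = {}
    for customer, amount in conn.execute("SELECT customer, amount FROM orders"):
        totals[customer] = totals.get(customer, 0) + amount
    return {customer: total for customer, total in totals.items() if total > 5}
```

      Both return the same totals, but the second drags every row across the boundary and re-implements grouping and filtering in application code, which is roughly where ORMs and "4G" languages end up fighting the database.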

      The so called "4G" languages tried to bridge this gap and failed miserably. Various ORM schemes are not brilliant, either. Ruby on Rails seemed to offer a glimmer of hope with its "convention over configuration" approach, but that ran into a myriad of exceptions and performance problems. Nevertheless, SQL is too well matched to the strengths of relational systems to be discarded without thought. I don't know what the solution is, but ditching SQL in toto isn't likely to be part of it.

    6. Re:No SQL by drinkypoo · · Score: 1

      No they aren't. Object databases are not relational they are hierarchical.

      The only CORBA implementation with which I've ever been familiar definitely used a relational store, and made relational queries. I think you are speaking a little too generally.

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
    7. Re:No SQL by fuzzytv · · Score: 1

      Right. Implementing this without SQL is so much simpler.

      Oh wait ...

    8. Re:No SQL by jbolden · · Score: 1

      I'm talking about intermediate calculations. Just to pick a stupid example say you had a 3 row database and one of the columns was a 2*2 matrix. You wanted to take the product.

      a*b*c != b*a*c so row order matters but
      a*(b*c) = (a*b)*c so order of computation does not.
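
      A quick check of both claims with concrete 2x2 matrices, written in plain Python; the matrices A, B and C are arbitrary values chosen for the example.

```python
def matmul(x, y):
    """Multiply two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


A = ((1, 1), (0, 1))
B = ((1, 0), (1, 1))
C = ((2, 0), (0, 3))

# Swapping the first two "rows" changes the product (not commutative)...
row_order_matters = matmul(matmul(A, B), C) != matmul(matmul(B, A), C)

# ...but regrouping the same sequence does not (associative).
grouping_is_irrelevant = matmul(matmul(A, B), C) == matmul(A, matmul(B, C))
```

      So a store that hands rows back in a partition-dependent order can change the answer, while one that only changes how the work is grouped cannot.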

    9. Re:No SQL by jbolden · · Score: 1

      I haven't played around with Object Database and CORBA. But at least on first glance the two seem to be pulling in opposite directions. An Object Database at its best is an extension of memory for a family of applications all sharing a common object library. Essentially one big application. CORBA is moving in the idea of relational, but even further, trying to separate data from the underlying application. It seems to me that's an either / or choice.

  6. Horizontal scalability? by michaelmalak · · Score: 2, Interesting

    A hallmark of NoSQL is horizontal scalability. The lack of schema in NoSQL was a brief rebellion against ivory tower DBAs that has since been regretted once it was realized that it merely transferred the schema and schema versioning implicitly into the source code, spread throughout it. Sounds like PostgreSQL got the bad part of NoSQL but not the good part.
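
    What "the schema moves into the source code" tends to look like in practice is something like the sketch below: every reader has to know every shape a document was ever written in. The field names and the schema_version convention are assumptions made up for the example.

```python
def read_user(doc):
    """Normalise a stored document to the current shape, whichever version wrote it."""
    version = doc.get("schema_version", 1)
    if version == 1:
        # Version 1 documents stored a single "name" string.
        first, _, last = doc["name"].partition(" ")
        return {"first_name": first, "last_name": last, "email": doc.get("email")}
    if version == 2:
        # Version 2 split the name into two fields.
        return {
            "first_name": doc["first_name"],
            "last_name": doc["last_name"],
            "email": doc.get("email"),
        }
    raise ValueError(f"unknown schema_version: {version}")
```

    With a declared schema that branching would be a one-off migration; without one it has to be repeated, or at least remembered, wherever the documents are read.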

    1. Re:Horizontal scalability? by mlyle · · Score: 4, Informative

      We just released Postgres-XL so you can have horizontal scalability and MPP.

    2. Re:Horizontal scalability? by salimma · · Score: 1

      Is there a reason for adopting a different license from core PostgreSQL? Seems like it makes the information flow a one-way street (from SQL into XL but not vice versa). Looks like an interesting project! Throw in EnterpriseDB-level Oracle compatibility and there's a captive market out there.

      --
      Michel
      Fedora Project Contribut
    3. Re:Horizontal scalability? by mlyle · · Score: 1

      Honestly, just organizational readiness. We are likely to move to a BSD-style license in the future... but initially taking a license that made it a little harder to have a closed-source fork was easier to convince some members of the team/board. (There's been a significant amount of investment in producing this codebase and we wanted to make sure we didn't immediately enable a competitor).

  7. Is it web scale? by Anonymous Coward · · Score: 1, Funny

    Is Postgres now web scale?

    1. Re:Is it web scale? by gargleblast · · Score: 1

      Is Postgres now web scale?

      That depends: are you going to blow some project to hell because you get a woody playing with software like it's a sex doll?

  8. Re:Going in the wrong direction by Tablizer · · Score: 5, Insightful

    Most people don't need NoSQL. Last I checked, most people aren't Facebook or Google

    Some people get overly optimistic about their start-ups or new projects. It's like planning on where to park all the beemers before you even get your first sale.

  9. Mutt-hater! by Tablizer · · Score: 4, Funny

    Mixing/Mashing makes no sense

    No HalfSQL movement?

  10. Re:Nice touch but too late! by bhcompy · · Score: 1

    Well, not the industry being referenced, but many enterprise accounting and inventory databases are PICK-based because they're blazing fast and completely reliable. ADP is an example of a company that sells a PICK-based solution. OpenQM is the open source descendant of PICK.

  11. Re:If this is anything like MariaDB by Anonymous Coward · · Score: 5, Informative

    I am actually *using* this thing. Implemented a database with ~100K XML records, accessing them by arbitrary XPath expression.

    Of course "normal" access is slow, but once I agree with the customer on an access pattern, I can set up a functional index. Then we are at a couple of millisecs per access (on very low-end hardware). And with GIN indexes, I can even set up things like "find all records where tag A or tag B or tag C equals one of 'foo' or 'bar'". All for a handful of millisecs. No database tuning whatsoever -- plain vanilla PostgreSQL 9.1 as it comes to us with Debian.
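
    The idea behind an index on an expression can be sketched in a few lines of Python: evaluate the agreed path once per record up front, then answer lookups from the precomputed map. This is only a conceptual model using the standard library's ElementTree (which supports a limited XPath subset); it is not how PostgreSQL implements functional or GIN indexes, and the function names and sample records are made up.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def build_functional_index(records, path):
    """Map the text found at `path` in each XML record to the ids of matching records."""
    index = defaultdict(set)
    for record_id, xml_text in records.items():
        for element in ET.fromstring(xml_text).findall(path):
            index[element.text].add(record_id)
    return index


def lookup_any(index, values):
    """Ids of records whose indexed value equals any of `values`."""
    found = set()
    for value in values:
        found |= index.get(value, set())
    return found


RECORDS = {
    1: "<rec><tag>foo</tag></rec>",
    2: "<rec><tag>baz</tag></rec>",
    3: "<rec><tag>bar</tag><tag>foo</tag></rec>",
}
```

    Building the index parses every record once; after that, a "foo or bar" lookup such as lookup_any(build_functional_index(RECORDS, "./tag"), ["foo", "bar"]) touches only the map, which is why the access pattern has to be agreed before the index is worth creating.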

    Of course you can't compare it with -- say -- Elastic Search, but as soon as I finish uttering "Java" my box is out of memory :-)

    OK, on a more serious note: the usage patterns still are different: if you plan to have 100M biggish records, you'll probably want to throw a lot of boxes at the problem (unless the problem has a very specific structure). Then you'll probably be better off with Elastic Search or some such. OTOH if you want transactions, an SQL database it is. If you need both, you are in a tough place (cf. CAP theorem), so you gotta think hard.

    I don't fucking care whether it's called SQL or noSQL if it's well-done. And PostgreSQL is damn well done. The community rocks too.

  12. Re:If this is anything like MariaDB by Anonymous Coward · · Score: 1, Informative

    Actually PostgreSQL performs a lot better than MySQL/MariaDB, and looks after your data better. PostgreSQL has far fewer gotchas (no database is perfect!), and is not the mess that MySQL is.

    People started migrating from Oracle to PostgreSQL at version 8.0 or earlier, and now PostgreSQL is at 9.4 beta. Companies like Digg initially went to MySQL because Master/multislave was in core for MySQL - now as of 9.0, PostgreSQL has that too.

    With PostgreSQL supporting JSON, it allows people to use the NoSQL paradigm where it appears to be useful, without locking themselves out of a fully fledged relational database.

    If your data is important to you, and you might need some serious DB querying stuff, then PostgreSQL should be considered.

  13. the hype by Tom · · Score: 5, Insightful

    As a big fan of SQL databases, I've been watching this NoSQL hype for a few years now, and I'm still not impressed.

    No doubt, there are a few scenarios where a conventional database isn't the best solution. But quite honestly, 90% of the people jumping on the bandwagon would be served just as well with an SQL database - except that like so many things, you need to do it right.

    I'm no database expert (but I know a couple), so my SQL isn't perfectly optimized and stuff, but even with a little bit of interest I see that putting some effort into your database and query design pays off massively.

    And I've seen enough cases where someone scrapped their SQL database and went NoSQL for absolutely no good reason. You think you're huge and SQL is too slow? Unless you just sold to FB or Google for a couple of billion, you very likely are not as huge as you think. I'm running a PostGIS database doing fairly complex geography calculations on non-trivial datasets, and it's blazing fast, and whenever it isn't, one hour with an SQL expert and some experimenting makes it so, because it always, with no exception, turns out that my SQL or my database design is at fault, not the database itself.

    If you've got a billion users, I will grant you that you have special needs. But every NoSQL use I have seen has been a case of people moving database work to software code instead, mostly because programmers are plenty and cheap, while experienced database experts are not.

    So I'm still amused and very little impressed, and I'm certain NoSQL will go the way of Java or every other hype ever - for a while it's everyone's darling, then people realize it still doesn't give us AI and it can't make coffee, and will start to figure out where it actually is the best solution and stop using it for everything else.

    --
    Assorted stuff I do sometimes: Lemuria.org
    1. Re:the hype by CadentOrange · · Score: 2

      I've seen this a million times. People with poorly designed relational databases with no thought given to query plans complain that their database is slow. They then migrate said database to a NoSQL solution (typically a document database like MongoDB) and then find that it is still slow! In a few cases, the NoSQL solution is significantly slower.

      The problem is NoSQL encompasses many different types of solutions. Key value stores like Redis are pretty good (key lookups support wildcards!!!) and I use them as an alternative to memcache. Document databases like MongoDB? If you're excited about them because you don't need a schema, you're just asking for carloads of trouble down the line because you've mistakenly bought into the thinking that you can just chuck arbitrary data into Mongo and get it to perform well.

    2. Re:the hype by Common+Joe · · Score: 2

      There's a pretty good book I own in paperback (electronic versions available) for high performance stuff from PostgreSQL. It's called PostgreSQL 9.0 High Performance. It's probably beyond what you want, but if you're interested in looking at it, let me know and I'll bring it next time we get together.

    3. Re:the hype by dave420 · · Score: 1

      And your work is entirely not intended to be performed in a NoSQL DB. There are still plenty of uses for NoSQL, and it is most certainly not hype itself (although there has been lots of hype about it). It will be here for a long, long time, as it has some incredibly useful use-cases. You should accept that there are uses you are unaware of :)

    4. Re:the hype by complete+loony · · Score: 1

      I don't have a problem with relational databases, but even though I'm pretty good at writing SQL queries, I don't like SQL as a language.

      There should be a middle ground somewhere between SQL, Map-Reduce and Object Oriented coding styles. I want code reuse, I want encapsulation, I want LLVM-like compiler optimisations. I want to push as much processing down to where the data is, without needing to hand optimise everything.

      SQL doesn't provide much of those things.

      --
      09F91102 no, 455FE104 nope, F190A1E8 uh-uh, 7A5F8A09 that's not it, C87294CE no. Ah! 452F6E403CDF10714E41DFAA257D313F.
    5. Re:the hype by jbolden · · Score: 1

      Java hasn't exactly gone away.

      Anyway there are fundamental design choices.

      Relational requires that table row order is completely unimportant, i.e. essentially fully commutative.
      NoSQL requires only that computations over table row order are associative.

      Many, many mathematical operations are associative without being commutative. That can be a huge change in how data is manipulated, and it can make the computation far, far faster. You don't have to be Google or Facebook; you just have to have enough data and complexity that CPU is a serious constraint (rather than disk access).
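
      One way to read the associative-versus-commutative point, as a minimal Python sketch: if an operation is associative, a long sequence can be split into chunks, each chunk reduced independently (possibly on separate workers), and the partial results combined in the original order. The helper name `chunked_reduce` and the sample data are hypothetical, and string concatenation merely stands in for any associative, non-commutative operation.

```python
from functools import reduce


def chunked_reduce(op, items, chunk_size):
    """Reduce a non-empty list with `op` by reducing fixed-size chunks first,
    then combining the partial results in their original order.
    This matches a plain left-to-right reduce whenever `op` is associative."""
    chunks = [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]
    partials = [reduce(op, chunk) for chunk in chunks]  # each chunk is independent
    return reduce(op, partials)  # chunk order is preserved here


def concat(a, b):
    # String concatenation is associative but not commutative.
    return a + b


words = ["no", "sql", "is", "not", "magic"]
sequential = reduce(concat, words)
chunked = chunked_reduce(concat, words, chunk_size=2)
```

      The chunked result equals the sequential one only because the chunks are recombined in order; shuffling them would change the answer, which is the difference between needing associativity and needing commutativity.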

    6. Re:the hype by Tom · · Score: 2

      If your application is really, really simple and you need truly massive amounts of throughput, then NoSQL is no doubt the way to go.

      Somewhere between 1% and 10% of the shops doing NoSQL really fall into that category. Maybe as many again might, with enough growth.

      Many years ago, long before NoSQL was a thing, I was the sysadmin of one of the largest e-commerce operations in my country. We had enough users and data and throughput that a big consulting company that was tasked with developing the next-gen system for us proposed that we buy not one, but two Sun E10k. At that time, pretty much the biggest commercial machine money could buy. The current system ran on six quad-core Dell servers. Because we had optimized the living daylights out of it, with custom shared memory kernel modules for data exchange, a highly customized database installation (we were Europe's largest installation of this system, so we had pretty much every vendor support we could wish for) and, most importantly, an exceptionally well-crafted design and implementation.

      You can throw NoSQL at your problem for maximum scalability. But everything in this world comes with a price tag. You sacrifice all the advantages of SQL. As I said: The use of NoSQL very often means moving problems that a good SQL database solves for you into code. So it means more code with more potential bugs, all in order to re-invent the wheel because you think you can make it more round. :-)

      But again: For a small percentage of cases, it is the right way to go; I am not saying it's all nonsense. Just saying it's a hype and it'll calm down.

      --
      Assorted stuff I do sometimes: Lemuria.org
    7. Re:the hype by Tom · · Score: 1

      Java hasn't exactly gone away.

      Which is exactly what I'm saying.

      Every hype ever has always followed the same pattern. First it is the second coming of Christ (or, for hypes prior to 0 AD, the first). Then, it is the solution to all your problems and everyone uses it for everything. You can't get venture capital, employment or a marriage without it. After a while, people realize that for some strange reason, sliced bread is really cool, but it isn't really the best armour and the roof is always leaking and the wheels could really be more round. Every stone table / bard / newspaper / Twitter / microtelepathy chip then sings the song of crash and burn, while some tech geeks silently figure out just what it is really good for and what not. In the end, we get sliced bread for breakfast and the rest of our lives goes back to what it used to be before.

      Until the next hype.

      --
      Assorted stuff I do sometimes: Lemuria.org
    8. Re:the hype by Tom · · Score: 2

      Because it's a database.

      SQL is 40 years old. In that time, dozens of programming languages, patterns and styles have come and gone. And SQL is still here, exactly because it doesn't care if your language is OO, functional, redundant, brainfuck or agile deployment for optimized synergies with next generation engagement framework whatever.

      A database needs to concern itself with the data, not with the programming patterns of the application.

      --
      Assorted stuff I do sometimes: Lemuria.org
    9. Re:the hype by Tom · · Score: 1

      I do. Re-read my original posting, and this time until the end. :-)

      --
      Assorted stuff I do sometimes: Lemuria.org
    10. Re:the hype by Daniel+Hoffmann · · Score: 1

      Just like to point out that performance is not the _only_ reason to switch to a noSQL database. For example, in my project we are switching our very small DB to a noSQL solution to have schema-less data. Other examples include: proper Object-Oriented mapping to the database (no Hibernate hell), graph databases, and distributed databases with auto-synchronization (part of the database is on a mobile phone and when it connects to the internet it syncs with the remote server automatically).

      Sure you can do all that with a traditional relational database, but it brings a whole lot of pain.

    11. Re:the hype by complete+loony · · Score: 1

      Writing queries on databases with large numbers of heavily normalised tables is a pain. Many queries end up similar or the same, with common table joins, and similar filter criteria. Sometimes this means you end up writing front end code to emit SQL, so that you can increase the flexibility of your application. And then you want to change the structure of a few tables. Good luck hunting down all of the queries in your app that need to change.

      Or if a query gets too complicated for the database to use the right index. You end up re-writing the whole thing manually, using temporary tables, perhaps in a procedure.

      Compilers have come a long way since databases and SQL were invented. Surely we can shift building query plans into the compilation process? Eliminate dead columns from the result set. Why stick with a just-in-time language when I'm sure we could build something better?

      --
      09F91102 no, 455FE104 nope, F190A1E8 uh-uh, 7A5F8A09 that's not it, C87294CE no. Ah! 452F6E403CDF10714E41DFAA257D313F.
    12. Re:the hype by Daniel+Hoffmann · · Score: 1

      SQL is still there because it is (almost) pure relational algebra, the math is there to prove its correctness and the math can be used to optimize your queries. You can prove that one query returns the exact same lines as another, so the DB can find and run the faster query instead of your slower one.
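
      A small illustration of the kind of rewrite meant here, as a hedged Python sketch using the standard `sqlite3` module as a stand-in for any SQL database. The `users` and `orders` tables and their contents are made up for the example. The two queries differ only in whether the country filter is applied after the join or pushed down before it. Running them on one dataset demonstrates the equivalence but does not prove it; the proof comes from the relational algebra.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users  (id INTEGER PRIMARY KEY, country TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, total INTEGER);
    INSERT INTO users  VALUES (1, 'DE'), (2, 'US'), (3, 'DE');
    INSERT INTO orders VALUES (10, 1, 50), (11, 2, 70), (12, 3, 20), (13, 1, 5);
""")

# Variant A: join everything, then filter on the joined rows.
join_then_filter = """
    SELECT o.id, o.total
    FROM orders o JOIN users u ON u.id = o.user_id
    WHERE u.country = 'DE'
    ORDER BY o.id
"""

# Variant B: the same selection pushed down into a subquery before the join.
filter_then_join = """
    SELECT o.id, o.total
    FROM orders o
    JOIN (SELECT id FROM users WHERE country = 'DE') u ON u.id = o.user_id
    ORDER BY o.id
"""

rows_a = conn.execute(join_then_filter).fetchall()
rows_b = conn.execute(filter_then_join).fetchall()
```

      Because both forms are equivalent, a query planner is free to run whichever it estimates to be cheaper, regardless of which one was written.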

    13. Re:the hype by Jawnn · · Score: 1

      But every NoSQL use I have seen has been a case of people moving database work to software code instead, mostly because programmers are plentiful and cheap, while experienced database experts are not.

      This, in spades. And my life ( a large part of which is seeing to the performance of our applications) is hell because of it. If I had a dollar...

    14. Re:the hype by peter.kingsbury · · Score: 1

      But java IS coffee!

    15. Re:the hype by iggymanz · · Score: 1

      There are plenty of mature persistence layers out there, better than Hibernate, that work with SQL databases. You lose a lot when you lose schemas; structuring and knowing types of data is very useful for later analysis. Pay now, don't worry about it later.

    16. Re:the hype by Daniel+Hoffmann · · Score: 1

      Some noSQL allows for hybrid schemas (have part of the record be structured and optionally with indexes, but also allow any other data to be added to the record) and that is what I am using. Personally I have not used other persistence layers other than hibernate, but my time with hibernate left me very bitter.

      But anyway, that was not the point I was trying to make. The point is that whenever people say "database" they think they need the absolute most efficient thing ever. But there are a myriad of cases where performance is not that important, and some noSQL databases are much easier to use than the conventional JDBC/ODBC/persistence library used to interact with SQL databases. People often forget to weigh performance against programmer productivity.

    17. Re:the hype by Tom · · Score: 1

      People often forget to weigh performance against programmer productivity.

      That's exactly when you want to use a library, framework, etc.

      Personally, I've become a fan of Doctrine 2, which does all the nice object-orientation and other stuff and I still have a SQL database behind it all with its power and performance. Not to mention the crazy stuff you can do with good queries.

      --
      Assorted stuff I do sometimes: Lemuria.org
    18. Re:the hype by Tom · · Score: 1

      There are tools to accomplish this (orms) but this is incredibly kludgy and a pain to maintain.

      Really? Funny how I've never noticed. I've used several ORMs, recently Doctrine 2, and it's the opposite of pain.

      I can have an easy to maintain database connection up in mongo immediately with zero impedance mismatch and rapid development. I can push "mongo" collection objects all the way up to the angular UI and back down to the database with almost zero coding. I was playing with a little app last night and wanted to add crud support. It took about 10 seconds to source the mongo driver and have the code complete.

      There's also a mismatch between rapid development and production code. For my purposes, ORMs solve the gap perfectly. I don't know your use case, so I won't judge.

      However, there is a big difference between prototypes and hacking up a quick app and doing something for serious production use. I see you understand that as well. I've just gone a step further - I prototype in Symfony2 with Doctrine2, because it means I don't have to completely re-write everything for production use.

      --
      Assorted stuff I do sometimes: Lemuria.org
    19. Re:the hype by Tom · · Score: 1

      Good luck hunting down all of the queries in your app that need to change.

      You've heard of database abstraction layers, yes? :-)

      Here is how I and everyone sane I know develop: Write straightforward code and queries first. Then check where your performance bottlenecks are and optimize those, ignore the stuff where it doesn't matter if you could improve performance. In one current app, I have some queries that get executed on the order of 200,000 times whenever I make a run, and some that get executed ten times. Do I give a flying fuck if I could optimize the queries that get called 10 times?

      So maybe some queries are a pain. I'm not a DB expert so my most complicated queries join over 10 or so tables with but a single sub-select. But even so, the pain is small because only the most frequently executed queries get the hand-optimisation treatment. Everything else I leave to the ORM because I know that on less frequently called queries, caching gives me more performance gain than doing some SQL-fu.

      --
      Assorted stuff I do sometimes: Lemuria.org
  14. Re:Nice touch but too late! by Simon+Brooke · · Score: 5, Insightful

    A small minority of companies, with very special needs, are using NoSQL databases for a small proportion of their operations. Those companies do include some big ones, such as Google and Twitter, but still in absolute terms the numbers are small. A tiny minority of companies have moved away from relational databases altogether. But the numbers are statistically insignificant and are likely to remain so for decades. And the relational model does have some real and enduring benefits which will make it the right tool for many jobs far into the future.

    Remember this is an industry that advances very slowly indeed. Your bank, and your utility companies, are still using programs written in COBOL - technology which is fifty years behind the curve.

    --
    I'm old enough to remember when discussions on Slashdot were well informed.
  15. Re: apples and oranges by Anonymous Coward · · Score: 1

    I prefer to use Mongo when prototyping - when the data structures are still in flux, and I want something quick, flexible, and - most importantly - right now.

    That, and setting up MongoDB is a breeze: `aptitude install mongodb`, and you can start using it straight away.

  16. Re:Going in the wrong direction by Anonymous Coward · · Score: 1

    "NoSQL" is a band-aid while programmer competence catches up with resource constraints. This old cartoon continues to be relevant, but mostly it's, "We don't know how to use this wheel, so we're going to reinvent it poorly. Eventually our duplication of effort will be complex enough that we're going to need to move to something indistinguishable from the traditional system, but nobody needs to say that yet."

    It's like the programming language rule that everything eventually looks like a reimplementation of LISP with syntactic sugar.

  17. Re:If this is anything like MariaDB by fuzzytv · · Score: 3, Informative

    Well, yes and no. PostgreSQL has had a text-only JSON data type for a long time, and was able to index keys using expression indexes. That's nothing new.

    The 9.4 improvements are that (a) JSONB is stored in a binary form, and (b) a lot of ideas from the HSTORE data type, plus new ones, were implemented. That means that you can create a "universal" index without prior knowledge of what keys will be interesting. So then you can ask for data containing arbitrary keys, sets of keys, values, documents etc. See http://www.postgresql.org/docs...

    Sure, it's not perfect and the index may get somewhat big, but well ...
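
    To give a rough feel for what a "universal" index buys you, here is a toy Python model, not PostgreSQL code and not how PostgreSQL implements its indexes. It indexes every key/value pair of flat documents as they are inserted, so a containment query on any combination of keys can be answered by intersecting posting sets. The class name `ToyContainmentIndex` and the sample documents are invented for the illustration, and values are assumed to be hashable scalars.

```python
from collections import defaultdict


class ToyContainmentIndex:
    """Toy inverted index over flat JSON-like dicts (illustration only)."""

    def __init__(self):
        self.docs = {}
        # Maps each (key, value) pair to the ids of documents containing it.
        self.postings = defaultdict(set)

    def add(self, doc_id, doc):
        self.docs[doc_id] = doc
        for item in doc.items():
            self.postings[item].add(doc_id)

    def containing(self, query):
        """Return ids of documents holding every key/value pair in `query`."""
        id_sets = [self.postings.get(item, set()) for item in query.items()]
        if not id_sets:
            return set(self.docs)  # an empty query matches every document
        return set.intersection(*id_sets)


index = ToyContainmentIndex()
index.add(1, {"user": "tom", "lang": "sql"})
index.add(2, {"user": "joe", "lang": "sql", "os": "linux"})
index.add(3, {"user": "tom", "os": "linux"})

sql_docs = index.containing({"lang": "sql"})
tom_on_linux = index.containing({"user": "tom", "os": "linux"})
```

    No key had to be declared up front, which is the point being made above. It also shows why such an index can grow large: every pair in every document gets an entry.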

  18. Nice touch but too late! by fuzzytv · · Score: 4, Insightful

    I don't know whether angry tapir knows what relational means, but I see nothing in his post IMHO suggesting he has no clue. JSON is great for storing non-relational data (hierarchies, data without fixed set of columns, ...). Not all data are purely relational, it's often a mix.

  19. Re:Nice touch but too late! by psmears · · Score: 1

    (Todd Codd anyone?)

    Close... actually his name was Edgar Frank "Ted" Codd

  20. Going in the wrong direction by LinuxFreakus · · Score: 1

    Actually, more than you think should probably use NoSQL. It isn't really any harder if you build it that way from the start and if your startup happens to get gigantic you won't have a relational database to migrate away from as one of your problems. You'll still have problems though, and even with NoSQL you need to "do it right" or it will still have issues when it gets huge.

  21. Going in the wrong direction by LinuxFreakus · · Score: 1

    And I might add that one of the most painful parts of migrating away from relational databases after you are already huge and bursting at the seams is that usually folks will have relied on the transactional consistency they provide for all the app logic and business processes. Suddenly wanting to change all that code to handle eventual consistency is not trivial at all, but if you were doing it all along because you started out that way... fewer pains.

  22. the hype by LinuxFreakus · · Score: 1

    No matter how much you optimize your schema and your queries there are limits to what one machine can handle. Depending on what your application or business needs are, this may happen MUCH sooner than a billion users. For many, merely tens of millions of really active users are enough to exceed these limits, and when you are a startup trying to grow and add features it is easier said than done to ensure that every piece of code you release is so perfect that you will not rock the boat at all, since one minor slip affects EVERYTHING. At that point your choices are custom sharding (expensive, painful, error prone) or horizontally scalable NoSQL. Personally, I would just choose NoSQL from the start. It is not harder to use, and you have a lot more wiggle room to respond when you want to release features quickly and iterate over them to improve performance if you decide to keep them. And yes, you can use functional sharding and multiple relational databases, but sooner or later if you are successful, you will hit the same problem.
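
    For a sense of why hand-rolled sharding gets painful, here is a minimal Python sketch of the naive approach: route each key to a shard by hashing it modulo the shard count. The function names and the `user-N` keys are hypothetical. This is only one simple scheme, and real deployments often use consistent hashing or lookup tables instead.

```python
import hashlib


def shard_for(key, num_shards):
    """Map a key to a shard number using a stable hash.
    Python's built-in hash() is randomized per process for strings,
    so it is deliberately not used here."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def moved_keys(keys, old_count, new_count):
    """Return the keys whose shard changes when the shard count changes."""
    return [k for k in keys
            if shard_for(k, old_count) != shard_for(k, new_count)]


user_ids = [f"user-{i}" for i in range(1000)]
relocated = moved_keys(user_ids, old_count=4, new_count=5)
```

    With modulo routing, going from 4 to 5 shards relocates most keys, so adding capacity means a large data migration while the application keeps running. That is part of the cost being described.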

  23. It's not just about being a document store by dvoecks9011 · · Score: 1

    When we were looking into new options to supplement our MSSQL servers, we settled on Mongo. We were aware that Postgres will act as a document store in addition to being a traditional RDBMS, but our decision was largely based on 2 things. First, we acknowledge that we'll likely never be able to completely eliminate our use of MSSQL. So, if we need an RDBMS, it will still be there. The other main factor was that Mongo makes replication, failover, and sharding a snap, relative to other systems. We don't have a DBA to implement replication for us. So, the simplicity was a huge factor. There are a billion ways to store your data, and they all come with positives and negatives. Postgres can pretty much be everything to everyone, but like any system where that's the case, it can be harder to configure (for me, a developer, anyhow... I'm sure DBAs are laughing at me).

  24. Re:Nice touch but too late! by Curunir_wolf · · Score: 1

    I still need a list of these big sites that moved from SQL. No MySQL, no Postgre, no Hadoop...

    Hadoop uses SQL? I don't think so.

    --
    "Somebody has to do something. It's just incredibly pathetic it has to be us."
    --- Jerry Garcia
  25. Re: Nice touch but too late! by bhcompy · · Score: 1

    Pick and its descendants are NoSQL by definition. It is not a relational database, it has no schema. It has a master dictionary and a subset of dictionaries for each account defined. Pick databases are accessible over SQL with an API/wrapper. By default, Reality implementations are generally an OS running in a VM process on a host operating system (ADP uses Digital UNIX/Tru64), while the wrapper allows it to communicate with the outside world.

  26. Re:Nice touch but too late! by Anonymous Coward · · Score: 1

    You can use Hive on Hadoop. It supports a subset of SQL.

    You can't do everything with it, however. No in-equijoins, not much in the way of analytic support. But you can do joins, aggregations and some other stuff.

    Still, it's no replacement for a proper RDBMS when you want to get frisky with the queries. Hand-writing map-reduce jobs to do exactly what you need can be time consuming, and even more so to change with evolving requirements.

  27. Not really by bigsexyjoe · · Score: 1

    "NoSQL" is a pretty bad name actually. They should be called non-relational databases. In many cases you can use SQL or something like SQL on them.

    People never use NoSQL to get away from the SQL language (although I don't like SQL at all). They use it to change the trade-offs in ACID compliance and to not have to keep their data completely relational.

    1. Re:Not really by ultranova · · Score: 1

      "NoSQL" is a pretty bad name actually. They should be called non-relational databases.

      A typical file system is a non-relational database. Does that make it NoSQL?

      --

      Forget magic. Any technology distinguishable from divine power is insufficiently advanced.

    2. Re:Not really by acscott · · Score: 1

      Martin Fowler discusses the NoSQL moniker and seems to agree with you: https://www.youtube.com/watch?... It's NoSQL Distilled in an hour, by Martin Fowler, from the NoSQL Matters Conference.

  28. Re:Missing the point entirely... by fuzzytv · · Score: 1

    Nothing. Because the postgres community didn't mean this to be "aim at the NoSQL market." The fact that angry tapir puts that into an abstract on /. does not make it 'community opinion'.