New PostgreSQL Guns For NoSQL Market
angry tapir (1463043) writes "Embracing the widely used JSON data-exchange format, the new version of the PostgreSQL open-source database takes aim at the growing NoSQL market of nonrelational data stores, notably the popular MongoDB. The first beta version of PostgreSQL 9.4, released Thursday, includes a number of new features that address the rapidly growing market for Web applications, many of which require fast storage and retrieval of large amounts of user data."
Next, NoSQL databases will add schema and ACID support and the circle will be complete.
References, please.
I have a feeling you can't produce anyway, because relational databases are still widely used.
By "industry" you mean the 0.001% of websites that could actually benefit from NoSQL?
How many sites you visit use NoSQL? Do most webshops, blogs, news sites and forums? Does Slashdot?
Slashdot social media options: AIM, ICQ, Yahoo, Jabber and Mobile Text. Why no MySpace?
NB: I love PostgreSQL with all my heart. I always upgrade to the most recent version, because they implement features that I really need. Added to the existing features of Postgres, it's totally awesome.
But as I have moved toward "Big Data" and the market segment that these new-fangled (non-relational) databases target, I find myself wishing that Postgres would be able to run my vanilla query (*singular*) using all processors. As it is now, I have to either write some awful functions that query manually-partitioned subtables, or simply wait while it plods through all billion or so rows.
It would be nice if noSQL databases adhered to the promise in the name. They replace the query language with something sane and secure.
I should use this sig to advertise my book ISBN-13 : 978-1501515132.
A hallmark of NoSQL is horizontal scalability. The lack of schema in NoSQL was a brief rebellion against ivory tower DBAs that has since been regretted once it was realized that merely transferred the schema and schema versioning implicitly into the source code, and spread throughout it. Sounds like PostgreSQL got the bad part of NoSQL but not the good part.
Is Postgres now web scale?
Some people get overly optimistic about their start-ups or new projects. It's like planning on where to park all the beemers before you even get your first sale.
Table-ized A.I.
No HalfSQL movement?
Table-ized A.I.
Well, not the industry being referenced, but many enterprise accounting and inventory databases are PICK based because they're blazing fast and completely reliable. ADP is an example of a company that sells PICK based solution. OpenQM is the open source descendant of PICK
I am actually *using* this thing. Implemented a database with ~100K XML records, access them by arbitrary XPATh expression.
Of course "normal" access is slow, but once I agree with the customer on an access pattern, I can set up a functional index. Then we are at a couple millisecs per access (on very low-end hardware). And with GIN indexes, I can even set up things like "find all records where tag A or tag B or tag C equal one of "foo" or "bar". All for a handful millisecs. No database tuning whatsoever -- plain vanilla PostgreSQL 9.1 as it comes to us with Debian.
Of course you can't compare it with -- say -- Elastic Search, but as soon as I finish uttering "Java" my box is out of memory :-)
OK, on a more serious note: the usage patterns still are different: if you plan to have 100M biggish records, you'll probably want to throw a lot of boxes at the problem (unless the problem has a very specific structure). Then you'll probably be better off with Elastic Search or some such. OTOH if you want transactions, an SQL database it is. If you need both, you are in a tough place (cf. CAP theorem), so you gotta think hard.
I don't fucking care whether it's called SQL or noSQL if it's well-done. And PostgreSQL is damn well done. The community rocks too.
Actually PostgreSQL performs a lot better than MySQL/MariaDB, and looks after your data better. PostgreSQL has far fewer gotcha's (no database is perfect!), and is not the mess that MySQL is.
People started migrating from Oracle to PostgreSQL at version 8.0 or earlier, and now PostgreSQL is at 9.4 beta. Companies like Digg initially went to MySQL because Master/multislave was in core for MYSQL - now as of 9.0, PostgreSQL has that too.
With PostgreSQL supporting JSON, it allows people to use the NoSQL paradigm were it appears to be useful, without locking themselves out of a fully fledged relational database.
If your data is important to you, and you might need some serious DB querying stuff, then PostgreSQL should be considered.
As a big fan of SQL database, I've been watching this NoSQL hype for a few years now, and I'm still not impressed.
No doubt, there are a few scenarios where a conventional database isn't the best solution. But quite honestly, 90% of the people jumping on the bandwaggon would be served just as well with an SQL database - except that like so many things, you need to do it right.
I'm no database expert (but I know a couple), so my SQL isn't perfectly optimized and stuff, but even with a little bit of interest I see that putting some effort into your database and query design pays off massively.
And I've seen enough cases where someone scraped their SQL database and went NoSQL for absolutely no good reason. You think you're huge and SQL is too slow? Unless you just sold to FB or Google for a couple billions, you very likely are not as huge as you think. I'm running a PostGIS database doing fairly complex geography calculations on non-trivial datasets, and it's blazing fast, and whenever it isn't one hour with an SQL expert and some experimenting makes it so, because it always, with no exception, turns out that my SQL or my database design is at fault, not the database itself.
If you've got a billion users, I will grant you that you have special needs. But every NoSQL use I have seen has been a case of people moving database work to software code instead, mostly because programmers are plenty and cheap, while experienced database experts are not.
So I'm still amused and very little impressed, and I'm certain NoSQL will go the way of Java or every other hype ever - for a while it's everyone's darling, then people realize it still doesn't give us AI and it can't make coffee, and will start to figure out where it actually is the best solution and stop using it for everything else.
Assorted stuff I do sometimes: Lemuria.org
A small minority of companies, with very special needs, are using NoSql databases for a small proportion of their operations. Those companies do include some big ones, such as Google and Twitter, but still in absolute terms the numbers are small. A tiny minority of companies have moved away from relational databases altogether. But the numbers are statistically insignificant and are likely to remain so for decades. And the relational model does have some real and enduring benefits which will make it the right tool for many jobs far into the future.
Remember this is an industry that advances very slowly indeed. Your bank, and your utility companies, are still using programs written in COBOL - technology which is fifty years behind the curve.
I'm old enough to remember when discussions on Slashdot were well informed.
I prefer to use Mongo when prototyping - when the data structures are still in flux, and I want something quick, flexible, and - most importantly - right now.
That, and setting up MongoDB is a breeze: `aptitude install mongodb`, and you can start using it straight away.
"NoSQL" is a band-aid while programmer competence catches up with resource constraints. This old cartoon continues to be relevant, but mostly it's, "We don't know how to use this wheel, so we're going to reinvent it poorly. Eventually our duplication of effort will be complex enough that we're going to need to move something indistinguishable from the traditional system, but nobody needs to say that yet."
It's like the programming language rule that everything eventually looks like a reimplementation of LISP with syntactic sugar.
Well, yes and no. PostgreSQL had a text-only JSON data type since long time, and was able to index keys using expression indexes. That's nothing new.
The 9.4 improvements are that the (a) JSONB is stored in a binary form, and (b) a lot of ideas from HSTORE data type, plus new ones were implemented. That means that you can create "universal" index without prior knowledge of what keys will be interesting. So then you can ask for data containing arbitrary keys, sets of keys, values, documents etc. See http://www.postgresql.org/docs...
Sure, it's not perfect and the index may get somehow big, but well ...
I don't know whether angry tapir knows what relational means, but I see nothing in his post IMHO suggesting he has no clue. JSON is great for storing non-relational data (hierarchies, data without fixed set of columns, ...). Not all data are purely relational, it's often a mix.
(Todd Codd anyone?)
Close... acutally his name was Edgar Frank "Ted" Codd
Need to type accents and special characters in Windows? Use FrKeys
Actually, more than you think should probably use NoSQL. It isn't really any harder if you build it that way from the start and if your startup happens to get gigantic you won't have a relational database to migrate away from as one of your problems. You'll still have problems though, and even with NoSQL you need to "do it right" or it will still have issues when it gets huge.
And I might add that one of the most painful parts of migrating away from relational databases after you are already huge and bursting at the seams is that usually folks will have relied on the transactional consistency they provide for all the app logic and business processes. Suddenly wanting to change all that code to handle eventual consistency is not trivial at all, but if you were doing it all along because you started out that way... fewer pains.
No matter how much you optimize your schema and your queries there are limits to what one machine can handle. Depending on what your application or business needs are, this may happen MUCH sooner than a billion users. For many, merely tens of millions of really active users are enough to exceed these limits, and when you are a startup trying to grow and add features it is easier said than done to ensure that every piece of code you release is so perfect that you will not rock the boat at all, since one minor slip effects EVERYTHING. At that point your choices are custom sharding (expensive, painful, error prone). Or horizontally scalable NoSQL. Personally, I would just choose NoSQL from the start. It is not harder to use, and you have a lot more wiggle room to respond when you want to release features quickly and iterate over them to improve performance if you decide to keep them. And yes, you can use functional sharding and multiple relational databases, but sooner or later if you are successful, you will hit the same problem.
When we were looking into new options to supplement our MSSQL servers, we settled on Mongo. We were aware that Postgres will act as a document store in addition to being a traditional RDBMS, but our decision was largely based on 2 things: We acknowledge that we'll likely never be able to completely eliminate our use of MSSQL. So, if we need an RDBMS, it will still be there. The other main factor was that Mongo makes replication, failover, and sharding a snap, relative to other systems. We don't have a DBA to implement replication for us. So, the simplicity was a huge factor. There are billion ways to store your data, and they all come with positives and negatives. Postgres can pretty much be everything to everyone, but like any system where that's the case, it can be harder to configure (for me, a developer, anyhow... I'm sure DBAs are laughing at me).
I still need a list of these big sites that moved from SQL. No MySQL, no Postgre, no Hadoop...
Hadoop uses SQL? I don't think so.
"Somebody has to do something. It's just incredibly pathetic it has to be us."
--- Jerry Garcia
Pick and it's descendants are NoSQL by definition. It is not a relational database, it has no schema. It has a master dictionary and a subset of dictionaries for each account defined. Pick databases are accessible over SQL with an API/wrapper. By default, Reality implementations are generally an OS running in a VM process on a host operating system(ADP uses Digital UNIX/Tru64), while the wrapper allows it to communicate with the outside world.
You can use HIVE on Hadoop. It supports a subset of SQL.
You can't do everything with it, however. No in-equijoins, not much in the way of analytic support. But you can do joins, aggregations and some other stuff.
Still it's no replacement for a proper RDBMs when you want to get frisky with the queries. Hand writing map reduce jobs to do exactly what you need can be time consuming to write, and even more so to change with evolving requirements.
"NoSQL" is a pretty bad name actually. They should be called non-relational databases. In many cases you can use SQL or something like SQL on them.
People never use NoSQL to get away from the SQL language (although I don't like SQL at all). They use it to change the trade-offs in ACID complacence and to not have to keep their data completely relational.
Democracy Now! - your daily, uncensored, corporate-free
Nothing. Because the postgres community didn't mean this to be "aim at the NoSQL market." The fact that angry tapir puts that into an abstract on ./ does not make it 'community opinion'.