Slashdot Mirror


Why Some Devs Can't Wait For NoSQL To Die

theodp writes "Ted Dziuba can't wait for NoSQL to die. Developing your app for Google-sized scale, says Dziuba, is a waste of your time. Not to mention there is no way you will get it right. The sooner your company admits this, the sooner you can get down to some real work. If real businesses like Walmart can track all of their data in SQL databases that scale just fine, Dziuba argues, surely your company can, too."

12 of 444 comments

  1. Hardware is cheap. Developers aren't. by Anonymous Coward · · Score: 5, Interesting

    It's really that simple. A standard dual-socket server with the latest CPUs from Intel or AMD can handle hundreds of requests per second. If one isn't enough, just add more hardware: one month of salary can buy you another node, and a year can buy you a whole cluster of rackable systems or a chassis full of blades. If it takes a few extra months for a team to solve the problem the NoSQL way, that's a few months of extra salary costs and missed sales.

    Slashdot runs on SQL. I run a site serving 1M pages daily (about a third of Slashdot's traffic, according to Alexa) on a single system with 2x Xeon E5420, running Django/PostgreSQL at 10% load. Unless you attract enough attention to require scaling past 10M pages a day, you're wasting your time reinventing the wheel with NoSQL. Just stick with a standard ORM, launch your site, and start convincing customers and generating sales. You can survive a slashdotting just fine without spending so much time on those exotic tools.
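
    As a rough sanity check on those figures, the daily page counts can be converted to an average request rate. This is only a back-of-the-envelope sketch: it assumes traffic is spread evenly over the day, and the function name and the peak factor are illustrative assumptions, not anything measured on the site described.

```python
# Back-of-the-envelope conversion from pages per day to pages per second.
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def pages_per_second(pages_per_day: float) -> float:
    """Average rate, assuming traffic is spread evenly over the day."""
    return pages_per_day / SECONDS_PER_DAY


def peak_pages_per_second(pages_per_day: float, peak_factor: float) -> float:
    """Hypothetical peak rate: the average times an assumed peak factor."""
    return pages_per_second(pages_per_day) * peak_factor


if __name__ == "__main__":
    for daily in (1_000_000, 10_000_000):
        rate = pages_per_second(daily)
        print(f"{daily:>10,} pages/day is about {rate:.1f} pages/s on average")
```

    On average, 1M pages a day is under a dozen pages per second, and even 10M a day is a little over a hundred, which is in line with the "hundreds of requests per second" a single server is said to handle above. Real traffic is bursty, so the peak factor matters more than the average.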

  2. Re:Article summary by RedMage · · Score: 4, Interesting

    We're using both - about five days from our "go-live", and things look good. We just use what makes sense for each part of our application.
    For us, this means PostgreSQL for the parts that must be transactional ACID, and Amazon's S3 and SimpleDB for parts that don't. In practice, for the 1.0 release, this means things like notes, user accounting, and documents are in S3 and SDB. The rest is plain ole SQL.

    Not that there wasn't a learning curve with our developers - we're a bunch of old-time enterprise type developers, so "letting go" and moving out of the traditional SQL world took a little thought and proving time. We'll use the first few months to learn more about doing architecture this way.

    We've had the language wars - let's avoid the SQL/NoSQL wars please. I'm tired.

    --
    }#q NO CARRIER
  3. Re:Can't wait it to die? by Anonymous Coward · · Score: 3, Interesting

    Facebook.com, the highest-traffic site on the Internet, serves more than 95% of its data out of memcached. Twitter, Wikipedia, etc are major users too. And of course, Google serves its web index out of memory.

  4. Re:Article summary by squiggleslash · · Score: 5, Interesting

    There's a fairly obvious reason for NoSQL vs Pro-SQL, and it's this: SQL is absolutely the worst database query language ever invented... apart from all the others.

    Virtually no-one who's spent any time analyzing and working with large amounts of data has a good word to say about SQL. It was designed from the start as a language that would be integrated into others, and yet simple real-world realities make that impossible, with 99% of implementations being of the "build a large string and pass that string to the SQL connector to be parsed and interpreted" form. Its handling of null and the empty string is incomprehensible and useless, in part because nobody involved ever had the cojones to do what needed to be done with both. There is no standardized set of data types in the real world. Simple issues with unstandardized case dependencies can make an application that works with Oracle and only uses standard "select" statements not work under, say, PostgreSQL. And these are the surface-level technical issues: talk to any relational database guru and they'll come up with numerous philosophical issues too.
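
    Two of these complaints are easy to see in a few lines. The following is a minimal sketch using Python's bundled sqlite3 module; the notes table and the function names are made up for illustration. It contrasts the "build a large string" style with parameter binding, and shows that in SQLite NULL and the empty string are different values (Oracle, by contrast, is documented as treating a zero-length VARCHAR2 as NULL, which is exactly the kind of cross-vendor difference being described).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT)")
conn.executemany(
    "INSERT INTO notes (body) VALUES (?)",
    [("hello",), ("",), (None,)],  # a normal string, an empty string, NULL
)


def find_by_body_unsafe(body: str) -> list:
    # The "build a large string" style: the value is spliced into the SQL
    # text, so quoting mistakes and injection become the caller's problem.
    return conn.execute(f"SELECT id FROM notes WHERE body = '{body}'").fetchall()


def find_by_body(body: str) -> list:
    # Parameter binding: the value travels separately from the SQL text.
    return conn.execute("SELECT id FROM notes WHERE body = ?", (body,)).fetchall()


hostile = "x' OR '1'='1"
injected = find_by_body_unsafe(hostile)  # the OR clause matches every row
bound = find_by_body(hostile)            # treated as a literal; matches nothing

empty_rows = find_by_body("")  # only the row holding the empty string
null_eq = conn.execute("SELECT id FROM notes WHERE body = NULL").fetchall()
null_is = conn.execute("SELECT id FROM notes WHERE body IS NULL").fetchall()
```

    Note that `body = NULL` never matches anything, because a comparison with NULL yields NULL rather than true; only `IS NULL` finds the row.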

    To this you add another component that's always an issue: the entirely haphazard way in which relational databases are implemented on most operating systems, whereby the DBMS is another application, that manages its own files, and needs to be coached with kind words and a happy smile in order to get anything done. Does your app use a database for something back-endy, like, for example, MythTV does for its settings and lists of channels and TV programs? Well, either forget it, or be prepared to put your users through hell as they have to ensure that the entirely separate DBMS is installed and that usernames and passwords are set up for your application's use.

    And so, naturally, people hate them. With a passion. To the point that anyone sane is going to put it low on the list for any application, even when it's entirely appropriate. Of course your multiuser databases in your enterprise environment should be stored using an enterprise grade RDBMS, and as nobody's come up with anything better, you should be talking to it using SQL.

    ...and you should be talking to it carefully. Ideally, those writing the application core should be handing over the database access to someone who can abstract each query properly. Because SQL sucks. It just sucks less than anything else designed to do the same thing.
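
    One way to read "abstract each query properly" is a thin data-access layer that owns all the SQL, so the application core never builds query strings itself. The sketch below assumes SQLite and a hypothetical channels table; the Channel and ChannelStore names are illustrative only.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Channel:
    id: int
    name: str


class ChannelStore:
    """All SQL for the hypothetical 'channels' table lives in this class."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS channels "
            "(id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
        )

    def add(self, name: str) -> Channel:
        cur = self._conn.execute("INSERT INTO channels (name) VALUES (?)", (name,))
        self._conn.commit()
        return Channel(id=cur.lastrowid, name=name)

    def get(self, channel_id: int) -> Optional[Channel]:
        row = self._conn.execute(
            "SELECT id, name FROM channels WHERE id = ?", (channel_id,)
        ).fetchone()
        return Channel(*row) if row else None


store = ChannelStore(sqlite3.connect(":memory:"))
created = store.add("BBC One")
```

    Callers only ever see `add` and `get`, so swapping the underlying RDBMS or fixing a vendor-specific quirk means touching one class rather than every place a query was assembled.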

    --
    You are not alone. This is not normal. None of this is normal.
  5. Re:Article summary by ducomputergeek · · Score: 4, Interesting

    I don't have mod points, but I've found the same thing. It's the perfect development database if you think that your program is ever going to need to support Enterprise class stuff. On the small scale, I've found that it's fast enough. Is MySQL faster? Yes, but where I've tested it's not been enough to really matter compared to the other advantages of PostgreSQL. Primarily that it's ACID compliant. What we've found is that it works well until you start getting into databases that are GB in size. But then you can easily port the datatables to DB2 or Oracle and go. Especially if you designed the rest of the software to do this from the get go.

    In production, we moved all but one of our databases from MySQL to PostgreSQL. We were having problems with InnoDB getting corrupted once every couple of months. When it was announced that Oracle was bidding on Sun, we ported over to PostgreSQL, spent a couple of weeks rewriting code, and we've not touched the Postgres database since. It hasn't been corrupted or even hiccuped once since we deployed. We run regular vacuuming and maintenance and that's it. It's been humming for well over a year and is now getting 400x the use we ever had with MySQL.

    The only thing that PostgreSQL has been lacking is HA support. There are a number of 3rd party tools that run well (PGCluster, Slony, GridSQL), but it looks like PostgreSQL is going to support native replication, clustering, and HA with hot-standby...

    --
    "The problem with socialism is eventually you run out of other people's money" - Thatcher.
  6. I'm Still Fuzzy on NoSQL by RAMMS+EIN · · Score: 4, Interesting

    I'm still fuzzy on what NoSQL is supposed to be and what it is supposed to bring to the table.

    From what I've understood, it's basically a common banner for various different databases that all share the common property of not being relational databases and not providing ACID guarantees.

    If so, it seems to me that the whole NoSQL vs. RDBMS debate is about a false dichotomy. There are some applications where a relational database is the right tool for the job, and there are some where a relational database is not the right tool for the job. In some of those latter cases, one of the NoSQL databases may be the right thing.

    This is nothing new. Non-relational databases have been used on Unix for a long time, and are even a standard part of POSIX (see for example the manpage for dbm_open). It's also long been known that, for example, Berkeley DB can be a lot faster than an RDBMS - as long as your application doesn't make use of all the features an RDBMS provides. Lots of programs even don't use one of these database systems, but invent their own, custom format. Git is a very successful example of this.
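
    For a feel of what that older style looks like, here is a small sketch using Python's standard dbm module, which fronts the same DBM family of key-value stores (which backend it picks depends on the platform). The "settings" file name and the keys are invented for the example.

```python
import dbm
import os
import tempfile

# Work in a throwaway directory so the sketch leaves nothing behind.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "settings")

    # "c" opens the database for reading and writing, creating it if needed.
    with dbm.open(path, "c") as db:
        db["theme"] = "dark"  # str keys and values are stored as bytes
        db["volume"] = "11"

    # Reopen read-only. Lookups are by key only: no joins, no query language.
    with dbm.open(path, "r") as db:
        theme = db["theme"]  # values come back as bytes
        has_missing = "missing" in db
```

    That is the whole interface: store a value under a key, fetch it back by key. Anything resembling a join, a constraint, or a transaction across keys is the application's job.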

    To me, it seems that what we are seeing here is loads of people who had learned to use relational databases for all their storage needs discovering that there are other ways to store data, and that one of those methods may work better than an RDBMS for a particular application. Well, yes. Does that surprise anyone? It sure doesn't surprise me. Does it mean that RDBMSes are now useless? Not at all. Does it mean you should use a non-relational storage system where this makes more sense? Of course! Now, can we please get back to work? I don't see the point of having a holy war over whether RDBMS or NoSQL is better, when common sense says that they both have their uses.

    --
    Please correct me if I got my facts wrong.
  7. Re:Article summary by TheLink · · Score: 4, Interesting

    The syntax might be crap, but it's far easier to get everyone to standardize on SQL to talk to DBs.

    "NoSQL" stuff is fine if your company is simple in structure - very few products/services, and it has to write most of that stuff itself anyway.

    When you have many different departments with their own different apps (in house and 3rd party), and they all want to access the same bunch of databases, SQL just becomes the "standard API or language" you use to talk to them. In contrast say you have some custom "NoSQL" DB, it's going to be harder to find stuff that talks to it (you might have to write your own connectors).

    It's just like "English", the syntax might be crap, but it's far easier to get 3rd parties and other departments to use it. In contrast if you use Lojban, despite its supposed advantages you're probably going to have to get translators (or worse - train your own translators) whenever you need to deal with outsiders who don't speak it.

  8. Re:Article summary by Vancorps · · Score: 3, Interesting

    Given that Oracle has a Java client and Java is supported on OS/2, how did Oracle drop OS/2? Even with 10 and 11g you can still connect from an OS/2 box, although I would say your application has some fundamental design flaws if workstations are directly connecting to a database.

    Also, some of the biggest general ledger applications deployed are running on MS SQL; that includes Great Plains and Navision.

    As for Oracle Power Objects, you have the same situation: Oracle has another product that achieves the same functionality and more, and Power Objects evolved into that. It's much like Oracle Forms and Reports 10g having no 11g version. Oracle didn't drop support for Forms and Reports services, though; they came out with a new product and have a clear and rather easy transition path, provided you have a good amount of Oracle infrastructure.

    The MSSQL timestamp is a really weak argument as well, as there is nothing that forces you to use its timestamp, which we'll agree is different from what you get with Oracle, MySQL, and PostgreSQL. We get around that by converting to strings since we work with multiple platforms. Each of them has serious strengths and, of course, serious weaknesses. I personally believe that the only product worthy of such animosity is MySQL, because the developers clearly knew nothing about databases when they designed it. Naturally they even admit that. They learned along the way and have created a flexible product, but it has all the problems that Oracle had 20 years ago and that MSSQL had 15 years ago. When you rely on your application for data integrity you will run into problems again and again and again.
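
    The string workaround can be as simple as agreeing on one fixed-width UTC format and storing that in an ordinary text column. This is a sketch under that assumption, not a description of the poster's actual code; the function names and the format choice are illustrative.

```python
from datetime import datetime, timezone

# Fixed-width ISO 8601 in UTC, so values sort the same as text and as times.
PORTABLE_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def to_portable(ts: datetime) -> str:
    """Render a timezone-aware datetime as a UTC string."""
    return ts.astimezone(timezone.utc).strftime(PORTABLE_FORMAT)


def from_portable(text: str) -> datetime:
    """Parse a string produced by to_portable back into an aware datetime."""
    return datetime.strptime(text, PORTABLE_FORMAT).replace(tzinfo=timezone.utc)


stamp = datetime(2010, 3, 27, 12, 30, 15, 250000, tzinfo=timezone.utc)
encoded = to_portable(stamp)
decoded = from_portable(encoded)
```

    The cost is that the database no longer knows the column is a time, so date arithmetic and range checks move into the application.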

    Sounds to me like you weren't happy being forced off dying platforms. Given how long Oracle extended support for both, it seems you were quite stubborn. EOL for Power Objects was in 1995 and support actually ended in 2000. That is one seriously long transition period.

  9. Re:Article summary by jc42 · · Score: 4, Interesting

    ... the syntax of PROLOG, for example, seems much simpler, more powerful, and makes more sense to me.

    Yeah, wouldn't it be wonderful if instead of all the complex cruft usually needed to find the data you need in that morass, you could just write a prolog expression and let the interpreter resolve it? But when I mention this to Team Leaders, they inevitably look at me like I'm from Mars. They have no idea what prolog is or does. (And I'm actually from a planet much farther away than Mars. ;-)

    But when all is said and done, you can get familiar with most of SQL in a couple weeks.

    True, perhaps, and I did that years ago. But that doesn't deal with the major problem with SQL: In my experience, every relational database I've ever worked with was in the grips of a set of professional RDB priests, and you didn't do anything in SQL without their blessing. If they didn't approve of what you were trying to do (typically because they couldn't be bothered to listen to you), it wouldn't get done during your lifetime.

    So I've learned to cultivate them as an acolyte. I write my "prototype" to use flat files, typically small files full of name:value pairs, sometimes with the name part the file name and the value the contents, and a directory tree of multiply-linked files to classify stuff. I agree with their criticism of this, and say that I'd be happy to convert the code to use their DB when they have the time to help me get those subroutines working right. While they chew on that, I get the project working with the flat files, and get some users using it. When the priests finally face the fact that the project works without their help, they finally deign to help.
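
    A flat-file store of this kind takes very little code. The sketch below is one possible reading of it, with an invented directory layout and function names: each file holds name:value lines, and the directory tree does the classifying.

```python
import tempfile
from pathlib import Path


def parse_pairs(text: str) -> dict:
    """Parse 'name:value' lines; blank lines and lines without a colon are skipped."""
    pairs = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")  # split on the first colon only
        if sep and name.strip():
            pairs[name.strip()] = value.strip()
    return pairs


def load_tree(root: Path) -> dict:
    """Map each file's path (relative to root) to its parsed pairs."""
    return {
        p.relative_to(root).as_posix(): parse_pairs(p.read_text(encoding="utf-8"))
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "users").mkdir()
    (root / "users" / "alice").write_text(
        "email: alice@example.org\nrole: admin\n", encoding="utf-8"
    )
    tree = load_tree(root)
```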

    But I've never seen them actually get the SQL working to the point that it can supplant the flat files. The parts that do work are always so slow that turning on the "useDB" switch makes it too sluggish to actually use. In some cases, I can get around this by writing "pre-pass" code to extract the common data sets from the DB and write them to flat files, which the interactive software can read through quickly.
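
    The "pre-pass" idea amounts to running the slow query once, offline, and leaving a snapshot where the interactive code can read it cheaply. Here is a minimal sketch, assuming SQLite as a stand-in for the real database and a hypothetical settings table; none of these names come from the original setup.

```python
import sqlite3
import tempfile
from pathlib import Path


def prepass_export(conn: sqlite3.Connection, out_path: Path) -> int:
    """Dump commonly read rows to a name:value flat file; return the row count."""
    rows = conn.execute("SELECT name, value FROM settings ORDER BY name").fetchall()
    out_path.write_text(
        "".join(f"{name}:{value}\n" for name, value in rows), encoding="utf-8"
    )
    return len(rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE settings (name TEXT PRIMARY KEY, value TEXT)")
conn.executemany(
    "INSERT INTO settings VALUES (?, ?)", [("timeout", "30"), ("region", "eu")]
)

with tempfile.TemporaryDirectory() as tmp:
    snapshot = Path(tmp) / "settings.txt"
    exported = prepass_export(conn, snapshot)
    contents = snapshot.read_text(encoding="utf-8")
```

    The trade-off is staleness: the flat file is only as fresh as the last pre-pass run, so this suits data that changes rarely.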

    It has long seemed to me that SQL and RDBs in general are Good Ideas. But unless we can find a way to end the stranglehold of the DB priesthood in an organization, it's all sorta hopeless for a mere "developer" to even consider jumping into the mess. It's better to just develop stuff that works, and let the DB experts handle the task of porting it to the DB. That way, we developers can keep our hands clean of all the theology, and actually develop stuff that works.

    Of course, this is all heresy to the True Believers ...

    --
    Those who do study history are doomed to stand helplessly by while everyone else repeats it.
  10. Re:Article summary by BitZtream · · Score: 4, Interesting

    Consider that by the time you 'need' Oracle, the price of Oracle is a drop in the bucket.

    The only people that ever complain about the price of Oracle are the people who will never need to use it, because they'll never have the traffic to require it.

    Sorry you haven't got to play with the big boys, but in general if you spend your time worrying about how much 'software costs' your business sucks. Software costs, even for Oracle, are trivial compared to the other costs that go into it.

    An Oracle DB serving internet facing customers for instance is going to cost an order of magnitude more for bandwidth in the first year than the cost of an Oracle license to deal with it.

    But you go ahead, keep pretending you have some sort of clue and are witty by pointing out it's expensive. If you ever make it to that scale, the last thing on your mind will be the price of an Oracle license.

    --
    Persistent Volume manager for Kubernetes - https://github.com/dwimsey/openshift-pvmanager
  11. Re:Article summary by squiggleslash · · Score: 3, Interesting

    Define irony: A guy who clearly has no experience with large scale database system telling others how bad SQL is while using a tiny fringe asstastic software package as an example.

    Nah, irony is someone writing a petulant rejoinder to a comment, claiming the author doesn't know what they're talking about, when they haven't actually spent any time trying to understand the original comment.

    Virtually every assertion you've made is based upon a failure to even make an attempt to understand what you're responding to:

    1. "Define irony: A guy who clearly has no experience with large scale database system telling others how bad SQL is while using a tiny fringe asstastic software package as an example." - There's absolutely nothing in my comment suggesting I have no experience with large scale database systems, and the only "example" I give that involves a "tiny fringe asstastic software package" was a passing reference to MythTV to make a point about... why SQL is unpopular for non-enterprise work.
    2. "Perhaps you should investigate the various SQL standards out there before you talk out your ass. I have a large web app that runs on Oracle, PostgreSQL and MSSQL, with the same queries. Slightly different scripts to create the database to deal with the differences in stored proceedures, so theres a little bit of truth there, but I could have moved the stored procedures to a different location if I wanted to." - You're responding to a claim that it's impossible to write queries that work under all three implementations of SQL. In fact, my first paragraph says no such thing. One of the things it says is that case dependencies mean that it's very easy to write standard SQL that doesn't work on different platforms. That's absolutely true, and it's one of the reasons why you'll be hard pressed to find any enterprise development shop that develops and tests under anything other than the target RDBMS(es). In addition, the first paragraph also points out a range of other issues with SQL that your response doesn't cover, such as the handling of nulls and blanks. Your response does not, as you claim, prove that "Your entire first paragraph is based on 100% factually incorrect statements." That's basically a lie.
    3. "Your second paragraph is clearly written by ... well, again someone who has never used a high end database. Any high end database worth its salt is designed to deal with raw disk space for its tables..." At this point, I'm not even sure what crack you've been smoking to think that I implied anything contradicting that. I didn't even address how high end RDBMSes store data physically on a disk. You then go off on a tangent about how crap MythTV is without ever actually addressing the point being made.
    4. "As for the last two paragraphs ... why bother, you're clearly disconnected and the rest is just you talking out your ass. Perhaps one should consider that its not SQL that sucks since so many people are capable of doing things with it just fine. Perhaps you should look a little closer to home and consider that your inability to use it is what sucks." So you actually are under the impression that abstracting the underlying RDBMS and ensuring that the HLL is kept separate is... a bad thing? That someone proposing it is "disconnected" and "talking out of (their) ass"?

    It doesn't matter much I guess, but I've been working in an Oracle shop for about fifteen years now, and done my best to push free software alternatives such as PostgreSQL in recent years - and, more importantly, seen our application support people push SQL Server with much the same results. I'm directly familiar with the ability of, for example, PostgreSQL to "run the same queries as" Oracle when no thought has been put into the differences. And I'm directly familiar with the type of software that you start to get when 50 or more developers all "think" they're SQL Gods, writing their bits of our applications according to what they think is best pra

    --
    You are not alone. This is not normal. None of this is normal.
  12. Re:Article summary by Chitlenz · · Score: 3, Interesting

    Ummm FTFL?

    Timestamp equivalent * Eventually, MS will convert the current timestamp of a unique row number, to an actual date and time. * Use ROWVERSION instead of timestamp. Row version provides the same functionality and the same value as the current timestamp.


    MSSQL 2008 and above is fine, and we use timestamps almost to an atomic precision in medical imaging... eventually came right after that post ... in 2007. SQL Server Vs. Oracle/MySQL is the only fight worth wasting time on. Here's the thing about RDBMS. Not only has it been the standard for 20 years, virtually assuring their own persistence because by very nature they grow.. a LOT, but it is one of the few standards that actually has a solid foundation. You see, in this age of marketing driven products, there are still a few things out there quietly running the world. And I assure you it's not XML pages.

    my 2cents.

    --chitlenz

    --
    Imagination is the silver lining of Intelligence.