Slashdot Mirror


Moving From CouchDB To MySQL

itwbennett writes "Sauce Labs had outgrown CouchDB and too much unplanned downtime made them switch to MySQL. With 20-20 hindsight they wrote about their CouchDB experience. But Sauce certainly isn't the first organization to switch databases. Back in 2009, Till Klampaeckel wrote a series of blog posts about moving in the opposite direction — from MySQL to CouchDB. Klampaeckel said the decision was about 'using the right tool for the job.' But the real story may be that programmers are never satisfied with the tool they have." Of course, then they say things like: "We have a TEXT column on all our tables that holds JSON, which our model layer silently treats the same as real columns for most purposes. The idea is the same as Rails' ActiveRecord::Store. It’s not super well integrated with MySQL's feature set — MySQL can’t really operate on those JSON fields at all — but it’s still a great idea that gets us close to the joy of schemaless DBs."
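
As a rough illustration of the "TEXT column that holds JSON" idea quoted above, here is a minimal sketch. It assumes SQLite in place of MySQL so it runs standalone, and the `Job` class, the `jobs` table, and the `extra` column are hypothetical names, not anything from Sauce Labs' code.

```python
import json
import sqlite3


class Job:
    """Hypothetical model: real columns and JSON-backed fields read the same way."""

    REAL_COLUMNS = ("id", "name")  # assumed schema, not Sauce Labs' actual tables

    def __init__(self, id, name, extra="{}"):
        # Write through __dict__ so the __setattr__ override below is bypassed.
        self.__dict__["_real"] = {"id": id, "name": name}
        self.__dict__["_extra"] = json.loads(extra)

    def __getattr__(self, key):
        # Only called when normal attribute lookup fails.
        if key in self._real:
            return self._real[key]
        if key in self._extra:
            return self._extra[key]
        raise AttributeError(key)

    def __setattr__(self, key, value):
        # Known columns go to the real-column dict, everything else to the JSON blob.
        target = self._real if key in self.REAL_COLUMNS else self._extra
        target[key] = value

    def save(self, conn):
        conn.execute(
            "INSERT OR REPLACE INTO jobs (id, name, extra) VALUES (?, ?, ?)",
            (self.id, self.name, json.dumps(self._extra)),
        )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (id INTEGER PRIMARY KEY, name TEXT, extra TEXT)")

job = Job(1, "smoke-test")
job.browser = "firefox"  # no ALTER TABLE needed; the value lands in the JSON blob
job.save(conn)

row = conn.execute("SELECT id, name, extra FROM jobs WHERE id = 1").fetchone()
loaded = Job(*row)
print(loaded.name, loaded.browser)
```

The trade-off the summary mentions shows up here too: the database only sees `extra` as an opaque string, so any filtering or indexing on `browser` has to happen in application code.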

55 of 283 comments

  1. Not getting RDMS by Anonymous Coward · · Score: 5, Insightful

    And in another three years they will switch to whatever is the coolest up-and-coming storage solution. Incompetent developers will always be incompetent developers.

    1. Re:Not getting RDMS by gbjbaanb · · Score: 5, Insightful

      true, just reading their blog

      Things like SQL injection attacks simply should not exist.

      HTTP API. Being able to query the DB from anything that could speak HTTP (or run curl) was handy.

      so sql injection is real bad, bad design of SQL... yet allowing any old HTTP javascript queries is somehow ok. Yes, incompetent developers indeed.

      They also say

      Why are we still querying our databases by constructing strings of code in a language most closely related to freaking COBOL, which after being constructed have to be parsed for every single query?

      apart from the concepts of query caches - and stored procedures - so what if the language is related to COBOL, javascript is closely related to C which is almost as old. And that has plenty of relations to Algol which is even older.

      So yes, it sounds like they haven't really got a clue. Great advert for their business!

    2. Re:Not getting RDMS by gorzek · · Score: 4, Insightful

      I think the main problem is application developers not understanding anything about database theory. The vast majority of databases I encounter are not normalized at all, and it's almost always because they were designed by a developer with no database background.

      Granted, I didn't come into this field with that background, either, but I made a point to learn it, and now I'm very cognizant of implementing sound database designs. This whole idea of throwing random strings of structured text into a database column, and then relying entirely on the program code to parse and use it... well, why the hell even use a relational database, then?

      Relational databases aren't suitable for every application, nor are "bigtable" and other NoSQL implementations. The problem is that developers use a particular kind of database without really understanding how to use it properly. If they can get data in, and get data out, that's basically all they care about. Never mind if they make it a maintenance nightmare in the process.

    3. Re:Not getting RDMS by arth1 · · Score: 2

      so sql injection is real bad, bad design of SQL... yet allowing any old HTTP javascript queries is somehow ok.

      HTTP isn't a subset of javascript - no javascript queries are needed for HTTP. Even for JSON and other javascript objects.

      That said, yes, the developers don't seem to "get it". An object/method based database query language, which they seem to want, has already been tried. Look where Informix is right now.

      Yes, parsing can be a bitch, which is why using a structured database isn't always the right choice to start with. If you're just using it for data storage, it rarely makes sense.

    4. Re:Not getting RDMS by Xest · · Score: 5, Insightful

      "Why are we still querying our databases by constructing strings of code in a language most closely related to freaking COBOL, which after being constructed have to be parsed for every single query?"

      I couldn't agree with you more, this quote makes me want to vomit. Is this really how low the average competence of today's web developer has stooped? Between PHP developers not getting why PHP is a pretty shitly designed and developed language and stuff like this, I barely get how the web even runs anymore.

      To answer the original quote, the reason we're "still querying our databases by constructing strings of code in a language most closely related to freaking COBOL, which after being constructed have to be parsed for every single query?" is because SQL is a language based on mathematically sound principles, and which is supported widely, and known widely, and is processed by database engines across the globe that have literally decades of stability behind them, data in them and so forth.

      There's absolutely no reason to change SQL, because if you build a new query language that is based on the same mathematically sound principles of relational algebra then it will er... look just like SQL. The fact the kiddie (I can only assume he's a kiddie due to his blatant lack of knowledge and/or experience in the field) who wrote that blog post doesn't get this suggests he should absolutely not be trusted with your data as he'll only lose it.

      This is a classic example of someone bitching about something not because it's bad, but because they simply don't understand it and believe that rather than learn about it properly, it's better to bitch and hope you can somehow effect change by bitching.

      The advantage of most SQL/RDBMS is that they do adhere to the ACID principles, and for people who want to be able to have some degree of trust in their data source that's pretty fucking important. It's no surprise that they've moved over to MySQL though as it's one of the few RDBMS that is completely shit at adhering to the ACID principles and keeping up to date with solid, stable implementations of modern database functionality.

    5. Re:Not getting RDMS by SQLGuru · · Score: 4, Insightful

      I completely agree. A lot of non-DB centric people think that they can do more in the app tier, effectively using their databases as glorified file stores. Why even have a database server in those instances? I'm not saying that everything should be done in the database, either, but take advantage of every tool you have.

      NoSQL has a place, so does relational. Learn their strengths and determine which is the best fit for your project. Then, learn how to use the tool to its fullest.

    6. Re:Not getting RDMS by serviscope_minor · · Score: 4, Insightful

      so sql injection is real bad, bad design of SQL...

      SQL injection actually has nothing to do with SQL.

      Exactly the same attacks happen in any system where you build up a string from user data and pass it off to an interpreter. SQL has nothing to do with it.

      Exactly the same thing used to happen with sudo shell scripts.

      Exactly the same thing happened with javascript injection in very early webmail systems.

      There are plenty of opportunities for code injection on poorly written PHP, too.
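
The point about string-building can be sketched in a few lines. This is a minimal example, assuming Python's standard `sqlite3` module; the `users` table and its contents are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "a-secret"), ("bob", "b-secret")],
)

# Attacker-controlled input (hypothetical).
user_input = "nobody' OR '1'='1"

# Vulnerable: the input is pasted into the string handed to the interpreter,
# so the quote characters change the structure of the query itself.
unsafe_sql = "SELECT secret FROM users WHERE name = '" + user_input + "'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safe: the query text is fixed and the input travels separately as a value.
safe_rows = conn.execute(
    "SELECT secret FROM users WHERE name = ?", (user_input,)
).fetchall()

print(len(unsafe_rows), len(safe_rows))
```

The concatenated version returns every row, while the parameterized version returns none, because no user is literally named `nobody' OR '1'='1`. The same pattern applies to shell commands or any other interpreter fed a string assembled from user data.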

      --
      SJW n. One who posts facts.
    7. Re:Not getting RDMS by Lisias · · Score: 2

      COBOL can be a bad language, but the best paid jobs around here are for COBOL programmers.

      It's hard to find a position (someone must die in order to open up a position), but once you get it, it's for life. =]

      --
      Lisias@Earth.SolarSystem.OrionArm.MilkyWay.Local.Virgo.Universe.org
    8. Re:Not getting RDMS by K.+S.+Kyosuke · · Score: 4, Interesting

      There's absolutely no reason to change SQL, because if you build a new query language that is based on the same mathematically sound principles of relational algebra then it will er... look just like SQL.

      False. First of all, SQL is NOT based on mathematically sound principles of relational algebra. SQL took the mathematically sound principles of relational algebra and fucked them up. There should be no NULLs, there should be no natural ordering of "columns", there should be no possibility of having duplicate rows, there should be no possibility of inconsistent intermediate states in transactions (no deferred checking) etc. SQL has them all, and then some. Why? Because SQL simply ignores the relational model and "does what IBM and Oracle always did". That's not the same thing as "implementing the relational model".

      Second, there is a separation between the surface structures of a language and its foundations. I really don't think that a language based on relational algebra has to look like SQL. That's like saying that a language with nouns having singular and plural and verbs having tenses has to look like English. Nope, it doesn't have to at all. Just look at VB.NET and C#. Basically two front-ends to a virtually identical language semantics, only one of them does not avoid non-alphabetic structural delimiters like the plague (and is so much more pleasant for it).

      --
      Ezekiel 23:20
    9. Re:Not getting RDMS by gmack · · Score: 2

      That is a common reason for firing. A couple of years ago some programmers wanted me to support them with the boss on switching a project written in python to Java. Their justification? The python programmer called them a bunch of monkeys. No technical arguments at all.

      Unfortunately the boss sided with the monkeys and I was next on the chopping block for pointing out that a 200 Bingo player max using 3 machines (1 web 1 db, 1 backup db) was a design flaw.

    10. Re:Not getting RDMS by Nadaka · · Score: 3, Insightful

      COBOL can be a bad language, but the best paid jobs around here are for COBOL programmers.

      It's hard to find a position (someone must die in order to open up a position), but once you get it, it's for life. =]

      In the end, there can be only one.

    11. Re:Not getting RDMS by TheRealMindChild · · Score: 5, Interesting

      There should be no NULLs
      Then how do I, say, indicate the date of death for someone who hasn't died? An IsDead field? Really? (Yes, a NULL in a field is a shortcut for proper relationship, but a lack of relationship when using a linking table will still be represented by NULL)

      there should be no natural ordering of "columns"
      Does it really matter? The natural ordering of columns is the order in which you added them to the table. Ignore it. It isn't important, and not in need of a "solution"

      there should be no possibility of having duplicate rows
      Firstly, get to know your DISTINCT SQL keyword. Secondly, data in real life sometimes IS duplicate. What the hell should people do? Have a DuplicatedThisManyTimes field? Ugh.

      possibility of inconsistent intermediate states in transactions
      That is a property of the database engine, not SQL.

      Because SQL simply ignores the relational model and "does what IBM and Oracle always did". That's not the same thing as "implementing the relational model".
      Where do you get this shit? Are you telling me the function of foreign key constraints and referential integrity, and the good ol INNER/RIGHT/LEFT join keywords are just smoke and mirrors and everything is really just a chaotic bowl of soup? References please.

      --

      "When life gives you lemons, don't make lemonade. Make life take the lemons back!" -- Cave Johnson
    12. Re:Not getting RDMS by Anonymous Coward · · Score: 4, Insightful

      That's not how debate works. If you can't take a position and defend it against questioning, without resorting to "go away and learn more", then you have no position and shouldn't have posted in the first place.

    13. Re:Not getting RDMS by plopez · · Score: 2

      SQL is nothing like COBOL. Once again they show how they are clueless rookies.

      --
      putting the 'B' in LGBTQ+
    14. Re:Not getting RDMS by Joey+Vegetables · · Score: 2

      GP is correct, and your understanding of the relational model appears to be - no offense - a bit lacking. To address your first example: people and deaths are different, though related, concepts. Ideally, they should have separate tables, plus a view. If someone died, he or she has a row in a Deaths table, which joins to the People table; otherwise, not; no NULLS necessary. When interacting with the data from outside the database, you use a view, which can be engineered to appear to contain NULLs, duplicate rows, and so forth. The views can be updateable, using triggers and whatnot, so you can treat them as if they were tables wherever it is convenient to do so, and they will behave the way you appear to believe they should (or the way your ORM tool believes they should); but, behind the scenes, the data will be stored in 3NF and therefore will be far less subject to insert, update and delete anomalies than they might be otherwise. Now, no one is holding a gun to your head and saying you *must* use the relational model. But I do advise you to understand it, and its benefits, and to use it where it makes sense, and, if you don't use it, to understand the tradeoffs you are making.
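
A minimal sketch of the layout described above, assuming SQLite via Python's `sqlite3` module. The table, column, and view names (`People`, `Deaths`, `PeopleWithDeaths`) and the sample rows are invented for illustration, and the updateable-view triggers are left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE People (
        PersonId INTEGER PRIMARY KEY,
        Name     TEXT NOT NULL
    );
    -- A row exists here only if the person has died, so no column is nullable.
    CREATE TABLE Deaths (
        PersonId    INTEGER PRIMARY KEY REFERENCES People(PersonId),
        DateOfDeath TEXT NOT NULL
    );
    -- The view is where NULLs appear, for callers that expect one flat row.
    CREATE VIEW PeopleWithDeaths AS
        SELECT p.PersonId, p.Name, d.DateOfDeath
        FROM People p
        LEFT JOIN Deaths d ON d.PersonId = p.PersonId;

    INSERT INTO People VALUES (1, 'Alice'), (2, 'Bob');
    INSERT INTO Deaths VALUES (1, '2001-05-17');
    """
)

view_rows = conn.execute(
    "SELECT Name, DateOfDeath FROM PeopleWithDeaths ORDER BY PersonId"
).fetchall()

# "Who is alive?" is answered by absence of a Deaths row, not by a NULL test.
living = conn.execute(
    """
    SELECT p.Name FROM People p
    WHERE NOT EXISTS (SELECT 1 FROM Deaths d WHERE d.PersonId = p.PersonId)
    """
).fetchall()

print(view_rows)
print(living)
```

The base tables never store a NULL; it only shows up in the view's output, where the outer join has no matching `Deaths` row.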

    15. Re:Not getting RDMS by tgd · · Score: 2

      I think the main problem is application developers not understanding anything about database theory. The vast majority of databases I encounter are not normalized at all, and it's almost always because they were designed by a developer with no database background.

      Or a developer who is experienced enough to know how bad an idea an overly normalized database is for most applications.

    16. Re:Not getting RDMS by Xest · · Score: 2

      "False. First of all, SQL is NOT based on mathematically sound principles of relational algebra."

      No, you've completely missed the point - I'm not saying SQL is an implementation of, and only of the relational model and nothing more, and nothing less, merely that those are its foundations. SQL absolutely IS based on the principles of relational algebra - it's still ultimately based on much of the important set theory that underlies that when it comes down to it. The point being that sure, whilst SQL is far from perfect, it at least stems from far more solid principles than many alternative offerings nowadays which don't even consider looking at fairly sound mathematical principles as a starting point and so just end up a mess. This isn't to say you can't use them for anything - it's the same thing as PHP, sure these things "work", but you can't come crying when you inevitably encounter bugs that stem from the fact many such alternatives are poorly designed and have an abysmal foundation to their existence. At the end of the day, SQL RDBMS are still pretty much the most foundationally solid persistent data platforms we have, that are also practical to use and that's my key point here.

      It's also worth noting that because of these differing foundations, you can, if you choose to, use the subset of SQL features that do allow you to adhere to the relational model which genuinely does have a mathematically sound foundation. Thus, any issues you have with SQL allowing you to stray away from the relational model are entirely optional. Compare this to the alternatives, and many of these just don't even give you the option of ensuring your data is sound on mathematically sound foundations.

      I'd hence argue that you're being rather dishonest in saying "Because SQL simply ignores the relational model and "does what IBM and Oracle always did". That's not the same thing as "implementing the relational model"." as that's simply not true, SQL doesn't ignore the relational model it's entirely based on it, the only difference being that IBM/Oracle et al. have extended it.

      "I really don't think that a language based on relational algebra has to look like SQL. That's like saying that a language with nouns having singular and plural and verbs having tenses has to look like English."

      Well, okay, you're right, if we're being pedantic then yes of course you could change it, but fundamentally my intention was that the SQL language is the way it is because it's intended to map closely to the mathematical operations that define relational algebra, precisely because those are its foundations. If you start to move away from that you lose those foundations, and whilst not entirely the same thing as replacing the SQL language itself, you only have to look at the problems that arise from ORM implementations to see what happens when you move away from those principles - it's okay for some projects, but in many other cases ORM just gets in your way and forces you to mangle your data into a form that no longer makes sense. So yeah, move things around a bit, change a bit of syntax if that's really what you want, but you're still going to need your selection, projection, and your joins and then what? You've got a language that no one knows, isn't supported anywhere, and that ultimately doesn't change anything of any value because it's still just mimicking those core relational operations.

      I guess if you want to get right to the crux of my argument it's this - people like the guy who wrote the blog we're talking about seem to think that the only choices between SQL and alternatives are a few features here and there, and whether you like the syntax - they completely miss the point that there's far more to it than that, that there's fundamental differences in the confidence you can have of the underlying data storage methods, in the data retrieval methods and so forth. Many NoSQL implementations basically just do away with one or more of the ACID principles to achieve their speed benefits and so forth, yet many people usin

    17. Re:Not getting RDMS by Joey+Vegetables · · Score: 4, Informative

      From a purely pragmatic point of view, it may not seem unreasonable to model it that way. But you should be aware that you are trading one form of complexity for another, probably bigger one. For instance, now, if you want to know who was alive on some specific date, you have to write something like "WHERE DateOfDeath IS NULL OR DateOfDeath > @date." You also will not know for certain whether a NULL means "person is still alive" versus "person is dead but we do not know his or her date of death." When you try to compare different people's death dates any comparison to NULL will yield NULL and you will need special case logic in every such comparison. You will need tristate logic throughout any part of your application that does logical tests based on the date of death. Nullable values will sometimes require special treatment in your code, depending on the language (e.g., whether date/time values are considered to be nullable in that language). I could go on. I also could build you both tables, an updateable view, and a set of SPs to do your basic CRUD stuff on both tables plus "show me living people" and "show me dead people", in a LOT less time than it would take to handle all the code problems that would result from breaking 1NF. I am not an extremist on this subject, but I wear both DBA and developer hats, and when I'm acting as a DBA or in any other situation where I have control over the DB, I do try to get into 3NF, and then denormalize only if there are demonstrated reasons to do so. As a developer, I will sometimes take shortcuts if it's genuinely necessary, but, more often than not, I end up regretting them.
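
The NULL pitfall described above can be reproduced with a small sketch. It assumes SQLite via Python's `sqlite3` module, ISO-8601 date strings (which compare correctly as text), and made-up sample rows; the `@date` parameter from the comment becomes a `?` placeholder.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE People (Name TEXT NOT NULL, DateOfDeath TEXT);
    INSERT INTO People VALUES
        ('Alice', '2001-05-17'),
        ('Bob',   NULL),
        ('Carol', '1990-01-01');
    """
)

as_of = "1995-06-01"

# Naive query: any comparison with NULL yields NULL, which WHERE treats as
# "not true", so the still-living Bob silently drops out of the result.
naive = conn.execute(
    "SELECT Name FROM People WHERE DateOfDeath > ? ORDER BY Name", (as_of,)
).fetchall()

# The special case has to be spelled out in every such query.
correct = conn.execute(
    "SELECT Name FROM People "
    "WHERE DateOfDeath IS NULL OR DateOfDeath > ? ORDER BY Name",
    (as_of,),
).fetchall()

# The comparison itself evaluates to NULL, surfaced in Python as None.
null_cmp = conn.execute("SELECT NULL > '1995-06-01'").fetchone()[0]

print(naive, correct, null_cmp)
```

Note that even the corrected query cannot tell "still alive" apart from "dead, date unknown", which is the ambiguity the comment points out.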

    18. Re:Not getting RDMS by complete+loony · · Score: 2

      I've worked on quite a few large-ish database applications (eg 800 - 2000 tables, some with multi-million rows), and I'd say I'm fluent with SQL. But the thing that annoys me most about SQL, from a maintenance perspective, is how much of the database structure ends up strewn around in your code base. SQL is *not* good at encapsulation.

      When a new requirement comes in that should cause you to change some of the primary relationships in your database, you have a look at how much code you'd need to change to do it properly, and end up just hacking in something really ugly instead. Maybe you should be able to use a foreign key, or many-to-many join table relationship, just by using its name (or something like that). Instead, every query must list the entire set of columns involved. And often provide hints to the database engine on which indexes to use first.

      And the problem gets much worse if you build your schema "properly" and heavily normalise the structure of everything. Because then any "simple" query easily involves 7 tables. And you probably need to code it into a stored procedure so you can build the data manually in a temporary table, since the database engine can't choose the right indexes and processing order and ends up scanning through millions of records to find the half dozen you actually wanted.

      Sure you could write a set of classes to build the SQL query strings for you, so you can encapsulate these details in one place. But then you'll probably end up with an ugly inner-platform that only the original designers really understand and your skills with using it won't translate directly into any other application or language.

      I'm not saying the NO-SQL folks have reduced any of the complexity of working with large complicated data sets. But SQL is not the silver bullet you make it out to be either.

      --
      09F91102 no, 455FE104 nope, F190A1E8 uh-uh, 7A5F8A09 that's not it, C87294CE no. Ah! 452F6E403CDF10714E41DFAA257D313F.
    19. Re:Not getting RDMS by HornWumpus · · Score: 2

      You've got it backwards. The highly normalized database is connected to transaction processing. Highly normalized databases have few lock issues and are optimized for transaction processing. Also TPS is narrow so you have good coders dealing with the relatively little code that bangs on it hard.

      The read-only database denormalized for simplicity and query performance is the data warehouse. That's where the report monkeys work.

      --
      John McAfee 'It was like that time I hired that Bangkok prostitute; to do my taxes, while I fucked my accountant'
  2. Re:The decision the simple by Sarten-X · · Score: 4, Interesting

    That's actually a rather insightful point...

    If your application fits well with the methodologies of a traditional RDBMS, use a traditional RDBMS, and hire people who are trained and experienced in using those methodologies to their full potential.

    If you're dealing with the latest Big Data paradigms and designs, where you can sacrifice some of the rigidity of a RDBMS to gain some flexibility and cheaper scalability, use a NoSQL database, and hire people who aren't stuck in their old RDBMS ways.

    --
    You do not have a moral or legal right to do absolutely anything you want.
  3. Why not PostgreSQL? by JamesA · · Score: 5, Interesting
    1. Re:Why not PostgreSQL? by squiggleslash · · Score: 5, Funny

      Because it's an urban myth.

      The reality is there are only two SQL databases in the entire universe: MySQL and Oracle. You might have been told others exist, hell, you might even have worked on something called "SQL Server" in your .NET shop, but in reality: they don't. They're all figments of your imagination. Your imagination is SO determined to find better, more robust, faster, powerful, alternatives to MySQL and Oracle that an entire fantasy world comprised of "a successor to Ingres that makes MySQL look like a piece of crap" and "A Microsoft product that doesn't feel like a thirty year old mainframe product hacked onto a modern platform" develops in your head.

      C'mon, if these mythical products actually existed, sites like Slashdot wouldn't ignore them, right? Right?

      --
      You are not alone. This is not normal. None of this is normal.
    2. Re:Why not PostgreSQL? by vadim_t · · Score: 2

      That's all fine until you need to actually write to that table. With MyISAM any write needs a table lock, and that makes performance drop like a rock.

  4. Nosql in Postgres by rla3rd · · Score: 4, Interesting

    You can get json support using the PLV8 extension http://code.google.com/p/plv8js/wiki/PLV8

    or alternatively you can use the hstore data type.

  5. Re:Has to be said by Anonymous Coward · · Score: 5, Funny

    MongoDB is Webscale. MySQL is not Webscale, because it uses joins. SQL also has impetus mismatch.

  6. programmers don't know how to store data by vlm · · Score: 2

    But the real story may be that programmers are never satisfied with the tool they have.

    Ah typo

    But the real story may be that programmers don't know how to store data

    They may not know because no one knows the business needs, but more often because they have no idea what they're doing WRT data storage.

    IT training tends to cover data manipulation pretty well "how to add two numbers"
    IT training gets shaky on data structures "So, in junior level class we will talk about data structures, which is too bad because you've already developed at least two years of bad habits first"
    IT training tends to pretty much skip data storage "In a senior level class, you might talk about scalability, maybe in an optional class. Or maybe you'll take a semester of COBOL instead"

    --
    "Science flies us to the moon. Religion flies us into buildings." - Victor Stenger
    1. Re:programmers don't know how to store data by Zocalo · · Score: 2

      But the real story may be that programmers are never satisfied with the tool they have.

      Ah typo

      Possibly, but given how quick many programmers are to get into a fruitless pissing match over their favourite language it's quite apropos, no?

      --
      UNIX? They're not even circumcised! Savages!
  7. Re:The decision the simple by Anonymous Coward · · Score: 5, Insightful

    That's actually a rather insightful point...

    If your application fits well with the methodologies of a traditional RDBMS, use a traditional RDBMS, and hire people who are trained and experienced in using those methodologies to their full potential.

    If you're dealing with the latest Big Data paradigms and designs, where you can sacrifice some of the rigidity of a RDBMS to gain some flexibility and cheaper scalability, use a NoSQL database, and hire people who aren't stuck in their old RDBMS ways.

    The real key is for the person doing the hiring to understand which of those methodologies fits their application.

  8. Re:The decision the simple by gstoddart · · Score: 2

    And most importantly, make sure you know the difference.

    Because I should think someone who thinks you should ditch your RDBMS when it's the thing you need to keep using is going to cause you more problems than they're worth. Of course, the opposite is true ... I remember someone who insisted on writing ER diagrams to describe our system, despite it not being an RDB, and not being accurately described by ER diagrams -- but to him everything was an ER diagram.

    It's not uncommon for geeks to push to use the latest stuff simply because it's the latest. (Or, as you point out, use something because that's what they've always used)

    I've actually seen someone suggesting we scrap an architecture to go with something he'd read recently -- despite having insisted we switch to the current architecture after reading about that.

    After a certain point, you just realize they're a technology magpie and tell them to STFU if they're not providing solid reasoning for why this is better in this context. After a while "because it's newer and better" becomes code for "shiny and pretty". Especially if these whims happen in shorter periods than your development lifecycle.

    --
    Lost at C:>. Found at C.
  9. Wikipedia and Slashdot use MySQL by tepples · · Score: 3, Insightful
    Anonymous Coward wrote:

    MySQL is not Webscale, because it uses joins.

    Then how does a non-webscale database power popular web sites such as Wikipedia and Slashdot? If you don't do joins in the database, you'll probably end up doing the equivalent of joins (using one value as the key in another table) in your application.
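
To make "the equivalent of joins in your application" concrete, here is a minimal sketch comparing the two approaches. It assumes SQLite via Python's `sqlite3` module; the `authors` and `posts` tables and their rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO authors VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO posts VALUES (10, 1, 'Hello'), (11, 2, 'World'), (12, 1, 'Again');
    """
)

# Join done by the database: one query, one round trip.
db_join = conn.execute(
    """
    SELECT p.title, a.name
    FROM posts p JOIN authors a ON a.id = p.author_id
    ORDER BY p.id
    """
).fetchall()

# The same join done by hand: fetch each post, pull out the key,
# then look up the matching author with a second query per row.
app_join = []
posts = conn.execute(
    "SELECT id, author_id, title FROM posts ORDER BY id"
).fetchall()
for _post_id, author_id, title in posts:
    (name,) = conn.execute(
        "SELECT name FROM authors WHERE id = ?", (author_id,)
    ).fetchone()
    app_join.append((title, name))

print(db_join == app_join)
```

Both produce the same rows; the hand-rolled version simply moves the work, and one extra lookup per post, into application code.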

    1. Re:Wikipedia and Slashdot use MySQL by SuiteSisterMary · · Score: 5, Informative
      --
      Vintage computer games and RPG books available. Email me if you're interested.
    2. Re:Wikipedia and Slashdot use MySQL by Hognoxious · · Score: 4, Funny

      MongoDB can write its data to /dev/null for extra performance.

      --
      Confucius say, "Find worm in apple - bad. Find half a worm - worse."
    3. Re:Wikipedia and Slashdot use MySQL by Anonymous Coward · · Score: 3, Funny

      If /dev/null is webscale then I will use it.

  10. Re:The decision the simple by sycodon · · Score: 3, Insightful

    RDBMS systems can be flexible also. It just takes a bit of planning, a good understanding of your data and a well designed application...which you should do/have regardless of your storage solution.

    Call me set in my old RDBMS ways, but if I'm supporting it then I want to know what the hell is going on with the data.

    --
    When Fascism comes to America, it will call itself Anti-Fascism, and tell you to give up your guns.
  11. Re:The decision the simple by TheSpoom · · Score: 4, Informative

    The real key is for the person doing the hiring to understand which of those methodologies fits their application.

    This is insightful. I've worked extensively with RDBMS solutions and now quite a bit with NoSQL technologies. They each have their place. An entire article could be written on where each fits most naturally, but in general if you don't need to join between tables, need to throw data to your store at a high velocity (e.g. logging), and/or need a loose schema, a NoSQL solution works best. If what you're doing can be naturally modeled (i.e. users HAVE AND BELONG TO stations, stations HAVE MANY playlists, etc. etc.), use an RDBMS.

    One can see in the subtext of the GP that they may not get this, with their comment that people using RDBMS solutions are "stuck in old ways". It seems like they are saying that NoSQL is effectively always best. I'm curious why they think that. Nail, hammer, etc...

    --
    It's better to vote for what you want and not get it than to vote for what you don't want and get it.
    - E. Debs
  12. Native JSON fields by Xanni · · Score: 2
    --
    http://www.glasswings.com/
  13. Then what's it called instead of a join? by tepples · · Score: 5, Insightful

    In NoSQL systems such as MongoDB and CouchDB, what do you call the operation where you retrieve one document, pull an identifier out of that document, and use that identifier as the key to retrieve another document?

    1. Re:Then what's it called instead of a join? by mj1856 · · Score: 3, Interesting

      Not sure about MongoDB or CouchDB, but I have experience with RavenDB, which is absolutely fantastic. Instead of "joins" you have "includes" or "live projections". See http://ravendb.net/docs/client-api/querying/handling-document-relationships

    2. Re:Then what's it called instead of a join? by Anonymous Coward · · Score: 5, Funny

      Witchcraft.

  14. PICK by kibbey · · Score: 2

    Hop into the wayback machine and fire up any flavor of PICK. The database where schema is applied on use, not on storage. No length limits on fields and very fast on old hardware (really fast on new). Storing bits of XML and code are no problem. And for those users who simply must have SQL, many versions will support that too (UniData and UniVerse are two examples). It's not cool, not new, but it does work.

  15. Re:Is a DB even needed sometimes? by rtaylor · · Score: 4, Informative

    A CSV or XML or JSON file is a db (a DB is just structured data).

    Are relational DBs always required? Certainly not.

    The big benefit to a relational DB with lots of enforcement at the data layer is that you can have one or more applications reading/writing to it with minimal concern of data corruption.

    What isn't obvious is that second application is often aggregate reporting for management. "How many customers are using $foo and where do they live geographically". With a relational DB, I might knock that query out in a few minutes across millions of customers.

    With a flat XML file per customer spread across a number of servers, this could take days to assemble, particularly if $foo is nested deep in the structure.

    Having spent far too much time writing one-off scripts to gather customer data because the middleware didn't support that type of query, I've actually gone the other way and started shoving some business logic into the DB.

    Functions such as isCustomerPaymentOverdue are now in the relational DB with a very thin model in the middleware to allow for much easier and faster reporting.
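    A rough sketch of that idea, assuming SQLite via Python's sqlite3 module. The table layout, the `customer_payment_status` view, and the fixed cutoff date are all made up for illustration; a real system would more likely compare against the current date:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        region TEXT NOT NULL
    );
    CREATE TABLE invoices (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        due_date TEXT NOT NULL,
        paid INTEGER NOT NULL DEFAULT 0
    );
    -- The "is this customer overdue?" rule lives in the database,
    -- so every application and every report shares one definition.
    CREATE VIEW customer_payment_status AS
    SELECT c.id AS customer_id,
           c.region AS region,
           EXISTS (SELECT 1 FROM invoices i
                   WHERE i.customer_id = c.id
                     AND i.paid = 0
                     AND i.due_date < '2013-01-01') AS is_overdue
    FROM customers c;
""")
conn.executemany("INSERT INTO customers VALUES (?, ?, ?)",
                 [(1, "Ann", "East"), (2, "Bob", "East"), (3, "Cy", "West")])
conn.executemany("INSERT INTO invoices VALUES (?, ?, ?, ?)",
                 [(1, 1, "2012-11-01", 0),   # unpaid and past due
                  (2, 2, "2012-11-01", 1),   # paid
                  (3, 3, "2012-12-15", 0),   # unpaid and past due
                  (4, 3, "2013-02-01", 0)])  # unpaid but not yet due

# The management report is then a single aggregate query.
report = conn.execute("""
    SELECT region, SUM(is_overdue)
    FROM customer_payment_status
    GROUP BY region
    ORDER BY region
""").fetchall()
```

    The middleware only needs a thin model over the view, and an ad hoc report does not need a one-off script.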

    --
    Rod Taylor
  16. Re:The decision the simple by h4rr4r · · Score: 3, Insightful

    Or use a better DB like Postgres. How MySQL is still popular I will never know. I think it is a conspiracy to prove FREE DBs suck.

  17. Urban Airship by jjohnson · · Score: 3, Interesting

    Urban Airship went PostgreSQL to MongoDB to Cassandra to PostgreSQL. http://wiki.postgresql.org/images/7/7f/Adam-lowry-postgresopen2011.pdf

    It's a good presentation because they're in love with none of them and are moving for specific reasons each time, handling different issues. It's not coders chasing the new hotness.

    --
    Anyone who loves or hates any language, platform, or manufacturer, doesn't know what they're talking about.
  18. Not quite true by Viol8 · · Score: 4, Informative

    If all your application is ever going to do is read and write fixed-size, record-structured data with few (or no) relational attributes, then COBOL will suit you fine, as that's what it was designed for. Unfortunately those sorts of apps are few and far between these days, but in its ever-decreasing niche COBOL is still good.

  19. Re:Normalisation isn't a panacea by Xest · · Score: 2

    It depends on the task though. I'd wager 90% of the SQL work developers do day to day isn't in such a performance-sensitive environment that it needs to favour performance over normalisation. I agree with the GP: there are far too many developers out there who just don't normalise and don't have the performance excuse. It really is just bad database design as a result of incompetence most of the time.

  20. Re:Normalisation isn't a panacea by gorzek · · Score: 2

    I can definitely see the value in making an informed tradeoff, but like you said, a lot of the time it's not an informed decision--they just do it to make it work and don't really have the expertise to know which is the right way to go. I've definitely seen enough bad database designs to know that most developers just have no clue how to design them. The worst I've seen had bad designs and poor performance, and were built in a completely ad hoc manner without any eye toward maintainability, performance, or data integrity/consistency. The philosophy was just "make it work."

    I think developers need to realize that databases are a lot like code: first you prototype, then you throw away the prototype and do it right. (Then again, plenty of developers just keep the prototype and use it for production.)

  21. Re:Normalisation isn't a panacea by gorzek · · Score: 4, Insightful

    Yeah, it really depends on what you are doing. But any time you break normalization there should be a good reason. Performance is certainly a valid reason. "I'm too lazy to make a well-designed database," however, is not.

    If you find yourself breaking normalization all the time, then you've probably found a use case where a relational database isn't the best tool for the job.

    While there is a "right" way to use a given tool, there is no one tool that is right for every situation. People who get this backwards are zealots and will often make poor decisions.

  22. Re:Is a DB even needed sometimes? by serviscope_minor · · Score: 4, Insightful

    The big benefit to a relational DB with lots of enforcement at the data layer is that you can have one or more applications reading/writing to it with minimal concern of data corruption.

    Not just that, but good use of relations and normalization makes whole classes of bug impossible.
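    One such class is the orphaned reference. A minimal sketch, assuming SQLite through Python's sqlite3 module; the `stations` and `playlists` tables are hypothetical, and SQLite needs foreign key enforcement switched on explicitly:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite leaves foreign key enforcement off unless asked.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute(
    "CREATE TABLE stations (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE)"
)
conn.execute("""
    CREATE TABLE playlists (
        id INTEGER PRIMARY KEY,
        station_id INTEGER NOT NULL REFERENCES stations(id),
        title TEXT NOT NULL
    )
""")
conn.execute("INSERT INTO stations (id, name) VALUES (1, 'Example FM')")
conn.execute("INSERT INTO playlists (station_id, title) VALUES (1, 'Morning Mix')")


def try_insert_orphan():
    """Try to store a playlist that points at a station that does not exist."""
    try:
        conn.execute(
            "INSERT INTO playlists (station_id, title) VALUES (999, 'Orphan')"
        )
    except sqlite3.IntegrityError:
        return False  # the database refused the row
    return True


orphan_accepted = try_insert_orphan()
playlist_count = conn.execute("SELECT COUNT(*) FROM playlists").fetchone()[0]
```

    No application, however buggy, can leave a playlist pointing at a missing station, because the bad row never gets stored.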

    --
    SJW n. One who posts facts.
  23. Re:Oh, the joy! by The+Moof · · Score: 2

    not using arcane tools

    I know database concepts are difficult for some people, but it's by no means magic.

  24. CouchDB just didn't work by Animats · · Score: 2

    a majority of our unplanned downtime was due to CouchDB issues

    Nowhere on the CouchDB home page is reliability even mentioned. And that's the real issue. Developing a reliable database system is a difficult design and programming task. It requires real software engineering. The hacks who write PHP and use JSON aren't up to a job like that. The "aw, we'll fix it in the next release" attitude doesn't cut it in databases.

  25. Re:Normalisation isn't a panacea by siride · · Score: 2

    And in many databases, there'd be more performance gains from proper normalization than premature optimization. I'm working with a legacy database that has this problem. Proper normalization would probably make it lightning fast, but instead it's slow as fuck because too many concerns are put in one table when they should be put in several tables. Also, it uses functions to retrieve values, which is just...so wrong.

  26. COBOL is cool! by NotesSensei · · Score: 4, Funny

    In what other language would this statement compile without error:

    PERFORM makemoney UNTIL rich.

    (Note the full stop at the end)

  27. Re:Has to be said by Alex+Zepeda · · Score: 2

    So the thing is, traditional joins (on, say, Postgres or MySQL) aren't blocking operations. You can run more than one at a time. MapReduce (as well as writes, any aggregation, and any use of JavaScript) are blocking operations on Mongo. They block the entire Mongo process. The MapReduce case gets around this with a bit of cooperative multitasking (yielding every few hundred or thousand rows), but writes, aggregation, and other uses of JavaScript do not. So there's already a much bigger need to distribute MapReduce on Mongo than there is to distribute a JOIN on an SQL database.

    Plus, MapReduce on Mongo is painfully slow, so you'll need to break things down into really small partitions to scale at all. Aggregating 800,000 documents (group by + sum) took me about 20 seconds with Mongo using the existing aggregation framework (which is universally credited with being faster than the MapReduce case). Porting the whole thing over to an SQL database allowed me to a.) not block the entire freaking process and b.) run the query in about 800ms.
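    For reference, the "group by + sum" shape of query looks like this in SQL. A toy sketch, assuming SQLite through Python's sqlite3 module; the `events` table and its columns are invented, and it says nothing about the timings quoted above:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (account TEXT NOT NULL, bytes INTEGER NOT NULL)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [("a", 10), ("b", 5), ("a", 7), ("b", 1), ("c", 3)],
)

# One declarative statement: group rows by account and sum a column per group.
totals = conn.execute(
    "SELECT account, SUM(bytes) FROM events GROUP BY account ORDER BY account"
).fetchall()
```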

    So, sure, we could have partitioned the data and spread it across multiple nodes. Maybe that would have been faster (but you can only run MapReduce across multiple nodes, using the existing aggregation framework you can only operate on one node). Dunno. But it would have been a lot more expensive since we would have required more hardware to accomplish the same thing that an SQL database is optimized for.

    The reason that "they" mock JOINs is because you simply can't do that efficiently with Mongo.

    --
    The revolution will be mocked
  28. Re:The decision the simple by Grishnakh · · Score: 2

    Also, I know what I'm doing.

    This line really doesn't count for anything. How many people are really going to say "I don't know what I'm doing", or "I'm incompetent"? Everyone thinks they know what they're doing.

    You may or may not really know what you're doing, we have little way to know for sure, but you saying it about yourself is meaningless.