Slashdot Mirror


Making Sense of the NoSQL Standouts

snydeq writes "InfoWorld's Peter Wayner provides an overview of the more compelling NoSQL data stores on offer today in hopes of helping IT pros get started experimenting with these powerful tools. From Cassandra, to MongoDB, to Neo4J, each appears geared for a particular set of application types, providing DBAs with a wealth of opportunity for experimentation, and a measure of confusion in finding the right tool for their environment. 'There are great advantages to this Babelization if the needs of your project fit the abilities of one of the new databases. If they line up well, the performance boosts can be incredible because the project developers aren't striving to build one Dreadnought to solve every problem,' Wayner writes. 'The experimentation is also fun because the designers don't feel compelled to make sure their data store is a drop-in replacement that speaks SQL like a native.'"

152 comments

  1. Also by Anonymous Coward · · Score: 0, Insightful

    From the Anything-Better-Without-Oracle department.

  2. One page by just_another_sean · · Score: 2

    less ads.

    Print version

    --
    Creationist Textbook Stickers Declared Unconstitutional by CowboyNeal
    1. Re:One page by drpimp · · Score: 1

      One word ...
      adblock
      or better
      lynx

      --
      -- Brought to you by Carl's JR
    2. Re:One page by just_another_sean · · Score: 1

      Well sure, of course, but this is one page!

      --
      Creationist Textbook Stickers Declared Unconstitutional by CowboyNeal
    3. Re:One page by Anonymous Coward · · Score: 0

      less ads.

      Print version

      No ads.

      I'm always dumbfounded to see someone on Slashdot not running ABP, yet I see it fairly often. You'd think the one thing that actually makes browsing the internet bearable to begin with would be at the top of every "nerd's" list.

    4. Re:One page by just_another_sean · · Score: 1

      I do use adblock, and noscript, but not everyone does and the main reason I prefer these links was stated in my subject, one page.

      --
      Creationist Textbook Stickers Declared Unconstitutional by CowboyNeal
    5. Re:One page by imthesponge · · Score: 1

      Some people are less comfortable with stealing.

    6. Re:One page by praxis · · Score: 1

      Do you also never get up during commercial breaks in ad-supported television shows lest you feel a pang of guilt that you stole?

      Here's how HTTP works:

      1) My browser, an agent on my behalf, requests a document on my behalf from you.
      2) Your server, an agent on your behalf, returns to me the document data.
      3) My browser parses the document, choosing to run or not run code that is included in your document as well as load or not load referenced elements in the document. For a variety of reasons, I may or may not be interested in requesting all referenced documents.

      How is choosing not to download all supporting documents stealing the one document that I requested and you gave me?
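
      Step 3 above can be sketched in a few lines of Python. This is only an illustration of a user agent deciding which referenced documents to request; the page content, the host names and the `resources_to_fetch` helper are all made up for the example, and no network access happens.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class ReferenceCollector(HTMLParser):
    """Collects the URLs of sub-resources referenced by a document."""

    def __init__(self):
        super().__init__()
        self.references = []

    def handle_starttag(self, tag, attrs):
        src = dict(attrs).get("src")
        if tag in ("img", "script", "iframe") and src:
            self.references.append(src)


def resources_to_fetch(document, blocked_hosts):
    """Return the referenced URLs the agent chooses to request."""
    collector = ReferenceCollector()
    collector.feed(document)
    return [url for url in collector.references
            if urlparse(url).hostname not in blocked_hosts]


# Hypothetical page: one figure and one ad script.
PAGE = """<html><body>
<p>Article text</p>
<img src="http://example.com/figure.png">
<script src="http://ads.example.net/banner.js"></script>
</body></html>"""

wanted = resources_to_fetch(PAGE, blocked_hosts={"ads.example.net"})
print(wanted)
```

      The server has already handed over the document by the time this code runs; the only decision left is which follow-up requests the agent makes.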

    7. Re:One page by Anonymous Coward · · Score: 0

      less ads.

      You really meant to say, FEWER ads. Right?

    8. Re:One page by batkiwi · · Score: 1

      Do you ever go to the toilet during the commercials?

      Do you ever skip over the classifieds in a traditional newspaper?

      If so, you're a thief.

    9. Re:One page by Eponymous+Hero · · Score: 1

      no, and you deserve to be put in your place for attempting to do the same with the parent.

      http://thesaurus.com/browse/less

      the 7th synonym offered for "less" is "fewer."

      less vs fewer

      fewer = things you can count. less = things that are too many to count. it is pretty easy to win this argument by stating that these ads cannot be counted, despite there being a finite number of ad space in the web page mentioned. refreshing said page will show you a different ad in the same space. at any given time there may be a finite number of ads in rotation, but throughout the course of time overall, the number of unique ads being displayed in that space is unknowable. less ads. fewer ad spaces.

      anyone can be a detail-oriented asshole if they really try, see? you're not that special. don't bother critiquing my response for grammar or spelling. i give not a fuck.

      --
      insensitive clod overlords obligatory xkcd car analogy russian reversals whoosh pedant fanbois ftfy in 3...2...1..PROFIT
    10. Re:One page by Z34107 · · Score: 1

      A grammar nazi set you off? You must really be new here. Take a deep breath and cherish this opportunity you were given to learn to speak English, since you evidently don't know what a "mass noun" is.

      --
      DATABASE WOW WOW
    11. Re:One page by captain_sweatpants · · Score: 1

      AdBlockPlus + AutoPager = problem solved. And you call yourself a nerd. Shame on you!

    12. Re:One page by Eponymous+Hero · · Score: 1

      no, moron, i just haven't done any trolling in a while. i also explained in retard terms to the retard i was responding to what a mass noun is. but you knew that, nice counter-troll attempt... sooo... go fuck yourself zealot. stupid-ass starcraft reference for a handle, what an idiot. thanks for stopping by! try not to suck any dick on the way to the parking lot!

      --
      insensitive clod overlords obligatory xkcd car analogy russian reversals whoosh pedant fanbois ftfy in 3...2...1..PROFIT
    13. Re:One page by RockDoctor · · Score: 1

      Do you also never get up during commercial breaks in ad-supported television shows lest you feel a pang of guilt that you stole?

      No ; I generally don't waste my eyeball time with the lowest-common-denominator retarded shit that advertising-paid TV puts on, in order to get the largest number of the stupidest people to buy the most unappealing crap for the highest price ... that advertising can support.

      I wish the wife would let me get rid of the TV and go back to using the radio like I did before we married. Or, failing that, pay for the damned thing herself.

      --
      Birds are not dinosaur descendants;birds are dinosaurs, for all useful meanings of "birds", "are" and "dinosaurs"
  3. Bend Over ... by Anonymous Coward · · Score: 0

    "providing DBAs with a wealth of opportunity for experimentation"? They need to start targeting the right audience group: developers.

    DBA : We should use mongoDB
    DEV : Bend over ...

    1. Re:Bend Over ... by telekon · · Score: 2

      More typically, it goes:

      Dev: We should use MongoDB.
      DBA: THE END IS UPON US!!! The Beast and his armies shall rise from the Pit and make war against God!!! ZALGO!!! HE COMES!!!

      --

      To understand recursion, you must first understand recursion.

    2. Re:Bend Over ... by C0vardeAn0nim0 · · Score: 2

      actually:

      Dev: We should use MongoDB.
      DBA: BWAHAHAHAHAHAHA!!! NO !!! Oracle. get used to it or GTFO!

      --
      What ? Me, worry ?
    3. Re:Bend Over ... by FlyingGuy · · Score: 1

      No, it should go...

      DEV: We should use MongoDB
      DBA: Really? Here, have a nice big frosty glass of shut the fuck up. Now go back to your toy scripting languages and leave the data to those of us who actually understand data storage.

      That should be the end of the discussion right then and there. The problem with these script kiddies is that 99.5% of them don't fucking have a clue about data. They are the ones who still embed SQL statements, log in credentials and the like in their php/python/rails/whatever.scripting.language.is.popular.this.week code. They have never even heard of stored procedures and views and wouldn't know a constraint from a hole in the ground. Sadly, it is not really their fault. MySQL ruined many a dev because it was so utterly primitive for so many versions that they never had to take the time to learn a proper database like Postgres, Oracle, DB2, MS-SQL which would have forced them to actually learn about data storage and retrieval.

      MongoDB is one of those fine databases that have managed to turn simple into complex, e.g.:

      --- simple ---
      insert into users values('bob','123 Main Street','Springfield','NY');

      --- a mess of curly braces, colons, commas and quotes ---
      {
          "username" : "bob",
          "address" : {
              "street" : "123 Main Street",
              "city" : "Springfield",
              "state" : "NY"
          }
      }
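
      For anyone who wants to try both forms side by side, here is a minimal sketch in Python. It assumes an in-memory SQLite database for the relational side and uses a plain JSON round trip as a stand-in for a document store (no MongoDB driver is involved); the table layout is a guess at what the `users` table above might look like.

```python
import json
import sqlite3

# Relational form: columns are declared once, values are positional.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (username TEXT, street TEXT, city TEXT, state TEXT)"
)
conn.execute(
    "INSERT INTO users (username, street, city, state) VALUES (?, ?, ?, ?)",
    ("bob", "123 Main Street", "Springfield", "NY"),
)
row = conn.execute("SELECT username, city FROM users").fetchone()

# Document form: field names travel with the data and may nest.
document = {
    "username": "bob",
    "address": {
        "street": "123 Main Street",
        "city": "Springfield",
        "state": "NY",
    },
}
stored = json.dumps(document)    # what a document store would persist
loaded = json.loads(stored)      # what a client would read back

print(row, loaded["address"]["city"])
```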

      --
      Hey KID! Yeah you, get the fuck off my lawn!
    4. Re:Bend Over ... by Anonymous Coward · · Score: 0

      Right on!! But you are preaching to the wrong audience. Most of the 'online' developer community (including /.) deals exclusively with OOO SHINY front-end applications and has no grasp of the basic concepts of an RDBMS. So when their shitty queries start taking a long time as data starts piling up, they turn into crybabies instead of learning some basics.

      I have hardly seen any C / C++ developers complaining about database. Java/Ruby/Python/whatever on the other hand are just pussies.

    5. Re:Bend Over ... by Anonymous Coward · · Score: 0

      God what a drama queen. Hope you can back that up.

      Not too long ago the nosql DBs were curious exotic toys only useful in very specific circumstances by some very big players.
      Today the situation is similar but the toys have evolved and more applications with more 'players' are on the field.
      I'd suggest that the current trend of turning every bit and word ever in to some sort of database might make these things even more common in the near future.

      I remember reading a blurb from a Google developer that said something to the effect of, "Yes, we could do it with SQL. Yes, it would work fine. But fuck it, it would be too expensive."

    6. Re:Bend Over ... by edumacator · · Score: 1

      Ok, I'm bracing for a crayon comment or some flaming, but I'm one of those script kiddies trying to move onwards and upwards.

      I've read a lot about data, but the stuff I've found is all over the place. Can you point me in a good direction to start understanding data better?

    7. Re:Bend Over ... by Anonymous Coward · · Score: 0

      I guess it's just his ego. (i.e. read the Postgres manual and experiment.) However, I'm not a fan of YAML either, and that's also the invention of these script kiddies.

    8. Re:Bend Over ... by epiphani · · Score: 3, Insightful

      No, it should go...

      DEV: We should use MongoDB

      DBA: Really? Here, have a nice big frosty glass of shut the fuck up. Now go back to your toy scripting languages and leave the data to those of us who actually understand data storage.

      That should be the end of the discussion right then and there. The problem with these script kiddies is that 99.5% of them don't fucking have a clue about data. They are the ones who still embed SQL statements, log in credentials and the like in their php/python/rails/whatever.scripting.language.is.popular.this.week code.

      Congrats. You're the reason we get devs storing images in databases.

      Either you have to educate your developers on what is appropriate to go into a relational database, or you need to get out of the way. Your attitude is exactly the reason NoSQL is picking up steam. I'm not a dev, but I've done dev work - nor am I a DBA, but I've done DBA work. And I can tell you, DBAs are often folks running around with a hammer: everything looks like a nail.

      Devs, on the other hand, are looking for a solution, and thinking like devs: I'll build the solution to my problem! Of course, they usually end up reimplementing stuff other people have done.

      If devs understood how full RDBMS's worked, database use would drop like a stone. If DBAs took the time to understand requirements, database use would drop like a stone. NoSQL makes a _huge_ amount of sense. While you maintain your "script kiddies" attitude, the rest of the world will happily glide past you.

      RDBMS's are 90% misused, and a massive waste of money. NoSQL is an overreaction to that fact. Sometime in the future people will swing back to the middle and realize that files in directories are a surprisingly good way of storing data -- and each will have its place.

      --
      .
    9. Re:Bend Over ... by Anonymous Coward · · Score: 2, Insightful

      insert into users values('bob','123 Main Street','Springfield','NY');

      I want to punch you in the head for not specifying the columns you're inserting into!

    10. Re:Bend Over ... by dido · · Score: 2

      The MongoDB record looks no more complex to me than the insert statement. In fact, the MongoDB record looks more readable, but what do I know, I'm probably one of the "script kiddies" you so like to disparage. I like to have my column names next to the data that actually goes into them, rather than some mess like insert into users (username, street, city, state) values('bob','123 Main Street','Springfield','NY'); that the true equivalent SQL would have been. By the way, I wonder why SQL uses such a syntax, when the SQL UPDATE statement is much more readable; an update statement would look not much different from the MongoDB record, with equals signs instead of colons, and a few keywords added.

      As is always in the world of software, there are some jobs for which NoSQL is in fact a very good idea, and others for which relational databases are better. If the fine folks at Google thought as you did and believed a traditional RDBMS was the only tool they could use then I doubt that Google would have grown to the size it has. They knew and understood that their problem did not map well into the concept of a standard relational database and acted accordingly. Of course, you also need to recognize when such an approach is warranted, as more often than not you'd be better off using a real RDBMS, and it would not be wise to shift to NoSQL databases just because you're driven by buzzword compliance.

      --
      Qu'on me donne six lignes écrites de la main du plus honnête homme, j'y trouverai de quoi le faire pendre.
    11. Re:Bend Over ... by Tenareth · · Score: 1

      The proper audience is BOTH.

      Devs doing data structures is generally less than optimal (Disks have to spin? Just buy faster ones); DBAs doing logic flows is generally bad (This is the optimal data structure, so let's just change the business logic a bit). Both working together will build a much better application because it broadens the number of concepts that can be taken into account.

      --
      This sig is the express property of someone.
    12. Re:Bend Over ... by Anonymous Coward · · Score: 0

      actually: Dev: We should use MongoDB. DBA: BWAHAHAHAHAHAHA!!! Tits or GTFO!

    13. Re:Bend Over ... by Anonymous Coward · · Score: 0

      The SQL insert statement allows bulk data adds, not just one at a time. Hence the format.

      see : http://en.wikipedia.org/wiki/Insert_(SQL)

    14. Re:Bend Over ... by mcvos · · Score: 1

      MongoDB is one of those fine databases that have managed to turn simple into complex, e.g.:

      --- simple ---
      insert into users values('bob','123 Main Street','Springfield','NY');

      -- a mess of curly braces, colons, commas and quotes ---
      {
            "username" : "bob",
            "address" : {
                "street" : "123 Main Street",
                "city" : "Springfield",
                "state" : "NY"
            }
        }

      Is that what MongoDB code looks like? Looks perfectly readable to me. Cleaner and more structured than the SQL version. More verbose, yes, but highly usable, unlike SQL which always requires a couple of layers of abstraction and conversion and mapping in order to make it usable.

      You might have just converted a SQL user to MongoDB.

    15. Re:Bend Over ... by mcvos · · Score: 1

      I have hardly seen any C / C++ developers complaining about database. Java/Ruby/Python/whatever on the other hand are just pussies.

      C/C++ developers are used to cumbersome and arcane rituals. Ruby and Python (Java less so, but still more than C/C++) are supposed to make programming faster and more natural. A more natural way to store and access data makes a lot of sense there. You could call them pussies, but you could also say they're more focused on the goal itself rather than the arcane stuff around it.

    16. Re:Bend Over ... by mcvos · · Score: 1

      For me, bulk inserts only seem to work well in MySQL, not in any other DB system. (I'm probably doing something wrong; I know little about databases. I just want my data stored.)

    17. Re:Bend Over ... by fredan · · Score: 1

      You don't understand the benefits of Mongo DB.

      Please add a column to your users table. In your SQL DB you probably need to convert the table and hold a global lock while doing so.

      In Mongo DB on the other hand, you just add whatever you want to include:

      {
      "username" : "bob", "address" : { "street" : "123 Main Street", "city" : "Springfield", "state" : "NY" }, "an" : "new", "column" : "with some new data"
      }
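
      A rough sketch of the difference, in Python. SQLite stands in for the SQL side and a plain list of dicts stands in for a document collection, so this says nothing about locking behaviour in any particular server; the `nickname` field is invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT, city TEXT)")
conn.execute("INSERT INTO users VALUES ('bob', 'Springfield')")

# Relational side: the schema has to change before the new field can be stored.
conn.execute("ALTER TABLE users ADD COLUMN nickname TEXT")
conn.execute("UPDATE users SET nickname = 'bobby' WHERE username = 'bob'")
columns = [info[1] for info in conn.execute("PRAGMA table_info(users)")]

# Document side: a new document simply carries the extra field.
collection = [{"username": "bob", "city": "Springfield"}]
collection.append({"username": "alice", "city": "Albany", "nickname": "al"})

# The price: every reader must cope with documents that lack the field.
nicknames = [doc.get("nickname") for doc in collection]

print(columns, nicknames)
```

      The flexibility is real, but the schema does not disappear; it moves into every piece of code that reads the documents.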

    18. Re:Bend Over ... by justforgetme · · Score: 1

      Thanks for the laughs mate!
      But yes, it is really a matter of using your tools correctly...

      I once forced myself to take a day off because I saw an Intranet DB 20GB in size with just 20k rows in a total of 14 tables...
      The 'hotshot' C++ programmer who wrote the scripts for the DB had not only told the app to upload every image to a table field, but also to do it multiple times for the same image instead of keeping some relational record, taxing the Intranet server with many, many DB retrieves and PHP overhead (because obviously that's the way you deliver an image, right???)!

      The effect of this? Just a week after the app was delivered, its lookups slowed to a crawl and I got called in because 'Your Intranet has gone slow'. Obviously the 'hotshot' was just a contractor, had taken off for newer heights and was (of course) unreachable. After logging in to the DB server I noticed the blob types in every single table and went home to laugh my buttocks off.

      Long story short, I just expanded the Intranet app to accommodate the new functionality, adding a public interface for doing project uploads from outside the company network, which actually was what the managers should have done in the first place. This is what they used the C++ binary for, btw... (like SSL hasn't been invented yet)

      D@mn do I get angry when I think of incompetent management

      --
      -- no sig today
    19. Re:Bend Over ... by FlyingGuy · · Score: 1

      A fair question so here is a fair answer.

      I am assuming ( yes, yes I know... ) of course you have some knowledge of basic tables and indexes, etc.

      Start by reading and understanding Database Normalization

      Someone much wiser than myself once said, "You have to completely understand a set of rules before you can break them". I mention this in reference to data normalization.

      The people who are the best coder / data monkey combination have the innate ability to think in structures. This is not to say it cannot be learned, but it really is a way of thinking that is left & right brain.

      Realize that data is NOT TRIVIAL. Data is why we write code. Data drives code, not the other way around. If you need further proof other than my word, go look at the source code for Linux. There are tens of thousands of data structures that make it work, and the code is designed to keep them updated and provide access to that data.

      Build a non-trivial set of linked lists, then write the code that manipulates it without modifying the data structures. This will be illustrative of the importance of data. Keep working until you can't get any farther. When you have reached that point, throw away all the code and then go and find the errors in your data structures because that is where the fault will be located and then perhaps the light will turn on.

      --
      Hey KID! Yeah you, get the fuck off my lawn!
    20. Re:Bend Over ... by FlyingGuy · · Score: 1

      Nope you are wrong. In Oracle you can alter a table while the entire system is in full use and transactions are flying like mad, no locks, no break in service no sweat. Oracle simply updates the data dictionary and as statements come through older records are modified on the fly and new records are simply, well, inserted.

      --
      Hey KID! Yeah you, get the fuck off my lawn!
    21. Re:Bend Over ... by FlyingGuy · · Score: 1

      Nope, now all that has to be converted into some sort of escaped string due to all the bloody text.

      --
      Hey KID! Yeah you, get the fuck off my lawn!
    22. Re:Bend Over ... by FlyingGuy · · Score: 1

      While this is somewhat proper according to SQL-92 and later, what is the failure mode? Personally, I believe it violates the atomic nature of an SQL statement.

      insert into users (fname,lname) values ('bill','smith'),('larry','jones'),('sally','brown'),('3lmer','fudd');

      Now given a constraint that specifically only allows 'a'..'z' and 'A'..'Z' into the column fname, what part of this transaction fails? All of it or only ('3lmer','fudd')? And if the entire transaction does not fail, how does one determine which insert failed? "Insert" is an inherently atomic transaction and should, at least in my opinion, not be overloaded in this manner.
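
      The failure-mode question can at least be tested on one engine. The sketch below uses SQLite through Python's `sqlite3` module, with a CHECK constraint written with GLOB as a stand-in for the letters-only rule; other engines may behave differently, so treat this as an experiment on SQLite rather than a statement about SQL in general.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users ("
    " fname TEXT CHECK (fname NOT GLOB '*[^a-zA-Z]*'),"
    " lname TEXT CHECK (lname NOT GLOB '*[^a-zA-Z]*'))"
)

error = None
try:
    # One statement, four rows; only the last row breaks the constraint.
    conn.execute(
        "INSERT INTO users (fname, lname) VALUES"
        " ('bill','smith'), ('larry','jones'),"
        " ('sally','brown'), ('3lmer','fudd')"
    )
except sqlite3.IntegrityError as exc:
    error = str(exc)

# SQLite backs out the whole statement, so none of the four rows remain.
row_count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
print(error, row_count)
```

      On SQLite, at least, the multi-row insert is all or nothing. The error reports that a CHECK constraint failed but does not identify the offending row.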

      --
      Hey KID! Yeah you, get the fuck off my lawn!
    23. Re:Bend Over ... by FlyingGuy · · Score: 1

      A perfect illustration why good DBAs are worth having on staff.

      --
      Hey KID! Yeah you, get the fuck off my lawn!
    24. Re:Bend Over ... by badkarmadayaccount · · Score: 1

      The "mess" seems like YAML to me. Not perfectly certain what it is - but I can parse it right now perfectly despite the two vodkas since the morning. What kind of shit are you on, anyway?

      --
      I know tobacco is bad for you, so I smoke weed with crack.
  4. not worth reading by rla3rd · · Score: 5, Informative

    Don't bother reading this fluff. Wikipedia offers a better overview. http://en.wikipedia.org/wiki/NoSQL. Oh I forgot, this is slashdot, no one here reads the articles :).

    1. Re:not worth reading by houstonbofh · · Score: 2

      I just read it for the centerfolds.

    2. Re:not worth reading by Anonymous Coward · · Score: 0

      Not trying to be negative, but the wikipedia page basically tells nothing about NoSQL. Yeah, I know it is not "relational", but what does that really mean? Why would you see a performance increase? What are the pros/cons and why?

    3. Re:not worth reading by doublebackslash · · Score: 4, Informative

      The abridged version:
      Atomicity: actions or sets of actions complete or they don't. No half states. Ever.
      Consistency: The database has rules. Rules like, "this can only be X when X exists in this other table" or "You cannot put a picture of a jabberwocky in this column." The rules are always obeyed even if one transaction fails. The DB itself will still be clean.
      Isolation: Everything accessing the DB views it as if it were the only thing accessing the DB.
      Durability: If the DB tells you it happened that means that you could yank the network jack, axe the power, or any other Bad Thing(tm) and so long as the disks are still there and intact your data also will be.

      That is SQL. NoSQL: Pick three, or two.
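
      A small sketch of what atomicity and consistency buy you, using SQLite from Python. The `accounts` table, the names and the `transfer` helper are invented for illustration; the point is only that a failed transaction leaves no half state behind.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER CHECK (balance >= 0))"   # the consistency rule
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                 [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money between accounts; both updates happen or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst))
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, "alice", "bob", 30)       # fits within alice's balance
failed = transfer(conn, "alice", "bob", 500)  # would drive alice negative
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(ok, failed, balances)
```

      In the second call the credit to bob is applied first and then undone when the debit violates the rule, so the application never has to clean up a half-finished transfer itself.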

      Is it faster? You bet your ass it is. The limitations are, generally, that the DB won't do things like JOINs for you, or perhaps you have to deal with the idea of a half state, etc. Aside from ACID guarantees being, generally, broken the DB might act more as a key->value lookup (think a dictionary or encyclopedia, but with data). It might not have rigid fixed columns (some SQL databases do this too, but it is not a standard feature and generally comes with more cost vs a NoSQL that offers it).

      NoSQL is useful, though, if you have a tremendous (REALLY REALLY huge, I mean it has to be worth it!) data set or some strange demanding special need. Some things don't need isolation because the actions are intrinsically isolated (Slashdot comments, for example, are just appended and only one column needs to be mutated (the moderation)). Durability might not need to be met at the disk level; you might be comfortable with writing it to two nodes' memory (Cassandra even lets you return after it is in the target node's memory and after it has been flushed to the network send buffer. You know, to kill those pesky nanoseconds of latency). If your nodes are good and isolated this might be fine. Atomicity might not be a big deal.... though I can't think of any that don't provide THIS. Atomicity is really rather important almost everywhere. Getting rid of fixed tables or "relations" (foreign keys) makes consistency a non-issue. Consistency is one of the first things to be tinkered with in most of these NoSQL things, though it is not 100% gone (still can't put that jabberwocky in that int column!)

      So by trading off some guarantees for a more simplistic DB one can gain speed and some degree of burden can be lifted from the programmer to work within the confines of that guarantee system. However, an ACID SQL system is universal (can store anything and meet any guarantees you require, but not necessarily quickly). NoSQL systems only work for some workloads and requirements. Almost (but not quite) anything can be shoehorned into them, but whether it is a good idea remains a question to ask before you dive right in. If you can see gain from NoSQL then it might be a good idea, but don't paint yourself into a corner where you trade a working system of moderate speed for a blazingly fast system that has subtle (or blatant!) flaws which affect your company or customers.

      Hope that helps!

      --
      md5sum /boot/vmlinuz
      d41d8cd98f00b204e9800998ecf8427e /boot/vmlinuz
    4. Re:not worth reading by nabsltd · · Score: 1

      That is SQL.

      No, that is a relational database with a lot of extras that protect the data.

      SQL is a query language that could be used against any collection of data (with the correct parser and engine).

    5. Re:not worth reading by doublebackslash · · Score: 1

      I abridged too far =p
      I do know it isn't SQL, I should have said "This is a standard ACID Compliant database like MySQL, PostgreSQL, Oracle, and SQLite".

      --
      md5sum /boot/vmlinuz
      d41d8cd98f00b204e9800998ecf8427e /boot/vmlinuz
    6. Re:not worth reading by shutdown+-p+now · · Score: 1

      Yes, but in the context of "NoSQL" this is a reasonable simplification, since that term itself really means "non-relational".

    7. Re:not worth reading by angel'o'sphere · · Score: 1

      That is SQL. NoSQL: Pick three, or two.

      NoSQL has absolutely nothing to do with ACID.

      It is true that most NoSQL DB systems blur the points they enforce in ACID ... so you can put the emphasis on one or the other.

      However you also can strictly enforce ACID in NoSQL DBs.

      So what is the difference? They don't use tables, they don't define types per column and they don't support SQL as a query language ... that's it.

      --
      Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.
    8. Re:not worth reading by Anonymous Coward · · Score: 0

      MySQL until recently used a non-ACID storage type as its default database store. It's still available and unfortunately used a lot (people like the wrong answer as long as it's fast, apparently). MSSQL would have been a better example of an ACID-compliant database.

    9. Re:not worth reading by Tenareth · · Score: 1

      The key with ACID was that it allowed applications to offload a lot of basic logic of data consistency to the database, and that was a great thing.

      But I think NoSQL is coming out from the fact that there are still times you don't need/want to do that, usually involving massive amounts of data that can be processed in chunks that can just be "done again" if something goes wrong (sort of like Map/Reduce recovery).

      --
      This sig is the express property of someone.
    10. Re:not worth reading by oldhack · · Score: 1

      The parent is a troll. There is no centerfold.

      --
      Fuck systemd. Fuck Redhat. Fuck Soylent, too. Wait, scratch the last one.
    11. Re:not worth reading by Anonymous Coward · · Score: 0

      damn... how have I wasted so much of my time and missed that?

  5. Re:NoSQL is garbage, plain and simple. by Anonymous Coward · · Score: 5, Interesting

    If you view it as a SQL replacement, then yes, utter garbage. But if you take it for what it is, then no.

    The problem is there is a fad surrounding NoSQL, and young, ignorant, inexperienced developers think RDBMSs are for old farts who refuse to get with the times rather than viewing it as a different tool for solving a different problem. If you want/need ACID properties, you go with SQL. If you don't, NoSQL may be appropriate.

  6. Neo4j Information incorrect by Anonymous Coward · · Score: 0

    As Michael Hunger points out in the comment on the article, it seems like the article author Wayner did almost no research on the Neo4j graph database. Some of his points are flat-out incorrect.

  7. In b4... by Anonymous Coward · · Score: 3, Informative

    This discussion is likely to lean towards "OMG NoSQL IS SO RETARDED!". So let me just say that if you don't care about NoSQL, then fine. If MySQL/Postgres/Oracle/MS-SQL fit your needs, then fine.

    That doesn't mean "NoSQL" databases are useless.

    I've had exposure to both MongoDB and CouchDB so far. CouchDB is the newest experience, as part of a Chef installation. Yes, it is a very immature product, and yes it has a long way to go, but it's very simple to configure and it does its job with very few resources. I don't personally have a need for CouchDB myself, but I can see why people use it for certain specific needs (i.e. I can understand why Chef uses it).

    MongoDB is a little marvel for certain applications. In my current and previous jobs we've used MongoDB for Syslog collection and SMTP mail logging. MongoDB is excellent for this sort of thing: each log entry is a single entry in the collection, the data is NOT relational in any interesting way and the insertion rate is far beyond anything a traditional relational database engine could manage on the same hardware at the same resource utilisation. Even better you can write some quite clever Map/Reduce functions on top that allow you to do some amazingly deep inspections of the log data, so you can produce on-demand data as well as graph out long term trends.
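
    To give a feel for the Map/Reduce style of log inspection, here is a toy version in plain Python. It does not use MongoDB or its API at all; the log entries, field names and the `map_reduce` driver are invented, and a real store would run the same two functions across far more data.

```python
from collections import defaultdict

# Each log entry is one self-contained document, with no joins needed.
LOG_ENTRIES = [
    {"host": "mail1", "severity": "err", "msg": "connection refused"},
    {"host": "mail1", "severity": "info", "msg": "delivered"},
    {"host": "mail2", "severity": "err", "msg": "timeout"},
    {"host": "mail1", "severity": "err", "msg": "relay denied"},
]


def map_entry(entry):
    """Emit one (key, value) pair per log entry."""
    yield (entry["host"], entry["severity"]), 1


def reduce_counts(key, values):
    """Combine all values emitted for one key."""
    return sum(values)


def map_reduce(entries, mapper, reducer):
    """Group mapper output by key, then reduce each group."""
    grouped = defaultdict(list)
    for entry in entries:
        for key, value in mapper(entry):
            grouped[key].append(value)
    return {key: reducer(key, values) for key, values in grouped.items()}


counts = map_reduce(LOG_ENTRIES, map_entry, reduce_counts)
print(counts)
```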

    NoSQL is NOT a replacement for traditional SQL databases, but it sure is useful for stuff where SQL databases struggle.

    1. Re:In b4... by Anonymous Coward · · Score: 1

      NoSQL may not be retarded, but the article is. To start with, they didn't cover two of the major offerings, HBase and Project Voldemort. From everything I've read, Voldemort is one of the few that will actually scale really well, so ignoring it makes me really suspicious of the research that went into the article and makes me think that they're just trying to capitalize on the NoSQL buzzword by writing an article with a brief summary of the first few options they found.

    2. Re:In b4... by Anonymous Coward · · Score: 0

      Needs more buzzwords and didn't tell me how it would provide synergy to my project.

    3. Re:In b4... by bhcompy · · Score: 1

      Can't talk about NoSQL databases without including PICK. Hell, it predates SQL by years. And scaling is what it does best.

    4. Re:In b4... by mcmonkey · · Score: 1

      What's an example of where NoSQL is useful? I'm not a DBA or SQL guru, but I do work with traditional relational databases, and I'm having trouble thinking of a scenario where I'd want NoSQL.

      I did a little research and the example I found was Twitter, and it sounded like a mess. You have a list of feeds with their followers, and a list of followers with the feeds each follows. It sounds nice for finding who follows a feed or for finding which feeds someone is following.

      The issue I see is the duplication of information. Every time someone changes which feeds they follow, the number of data updates to make is doubled. And what happens when the 2 lists get out of sync? How much extra resources are spent making sure the feeds-to-followers list is consistent with the followers-to-feeds? Any time you store a piece of information in 2 places, it's just a matter of time until the 2 don't agree.
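The concern about two lists drifting apart can be sketched in a few lines of Python. This is a hypothetical in-memory model, not how Twitter or any particular NoSQL store works; the names `followers_of`, `feeds_of`, `follow` and `is_consistent` are invented for illustration.

```python
from collections import defaultdict

followers_of = defaultdict(set)  # feed -> users following that feed
feeds_of = defaultdict(set)      # user -> feeds that user follows


def follow(user, feed):
    """Update both lists together; every change costs two writes."""
    followers_of[feed].add(user)
    feeds_of[user].add(feed)


def unfollow(user, feed):
    """Remove the relationship from both lists."""
    followers_of[feed].discard(user)
    feeds_of[user].discard(feed)


def is_consistent():
    """Check that both lists describe the same set of (user, feed) pairs."""
    forward = {(u, f) for f, users in followers_of.items() for u in users}
    backward = {(u, f) for u, feeds in feeds_of.items() for f in feeds}
    return forward == backward


follow("me", "gaga")
consistent_after_follow = is_consistent()

# Simulate a write that reached only one of the two lists.
feeds_of["me"].add("paul")
consistent_after_partial_write = is_consistent()
```

The check passes while every update goes through `follow`, and fails as soon as one list is written without the other, which is exactly the failure mode described above.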

      If that's the poster child for NoSQL, I can see why some people are skeptical.

    5. Re:In b4... by fortyonejb · · Score: 1

      Well technically you're not allowed to mention Project Voldemort by name, so they couldn't really cover it.

    6. Re:In b4... by shutdown+-p+now · · Score: 1

      NoSQL does not provide synergy, it's optimized for performance. If you need synergy, you should use a traditional full-featured RDBMS. ~

    7. Re:In b4... by angel'o'sphere · · Score: 1

      And what happens when the 2 lists get out of sync? How much extra resources are spent making sure the feeds-to-followers list is consistent with the followers-to-feeds?

      Who would care about that?

      If you and I follow Lady Gaga and Paul McCartney, and both of them publish a new tweet, you see Lady Gaga's new message before Paul's, and I see Paul's before Lady Gaga's .... who the fuck cares?

      Big volume NoSQL DBs have one goal: they are eventually consistent.

      It does not matter if you and I see the exact same result at the exact same time, as long as we see the same "big picture".

      --
      Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.
    8. Re:In b4... by sacridias · · Score: 0

      Retarded, no. Most people like to use the tools that best fit the job, and SQL has a lot going for it. 1) Well known and established: support is everywhere, and people who can manage it cost less and can be found at all skill levels. 2) Universal: SQL fits into so many places, whereas NoSQL fits into a few small areas and tends to perform well in just those areas, meaning you need multiple databases for each section of your code base (not always a bad thing). 3) Stable: SQL has been around, optimization is stable, and years of theory are built on the core. 4) Non-experimental: SQL is known, NoSQL is a gamble. Go to the owners and say, "Hey, we want to experiment; we are not sure that it will be better, but it may be, and it will only cost you a lot of time as we learn this new technology." Professionally, NoSQL will be used only where it can be fully proven to present a great advantage, not just possibly give us an advantage.

    9. Re:In b4... by Anonymous Coward · · Score: 0

      I just gave you three examples of where NoSQL is used in the real world. Your mental model of how Twitter works is also badly incorrect: you're trying to apply a relational model to a non-relational dataset.

    10. Re:In b4... by angel'o'sphere · · Score: 1

      What kind of synergy do you refer to?

      The only thing where SQL excels in my opinion is the ability to write ad hoc queries. Random queries no one thought of before and they work ... in NoSQL you need to know the layout / hierarchy and have a good idea where to peek in and navigate from.

      --
      Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.
    11. Re:In b4... by shutdown+-p+now · · Score: 1

      Re-reading the comment I replied to in this context might help with that whooshing sound.

    12. Re:In b4... by mcmonkey · · Score: 1

      If you and I follow Lady Gaga and Paul McCartney, and both of them publish a new tweet, you see Lady Gaga's new message before Paul's, and I see Paul's before Lady Gaga's .... who the fuck cares?

      If the list of feeds I follow includes Lady Gaga, but the list of Lady Gaga followers does not include me, then when I check my account, it looks like I should get Gaga's tweets. But when Gaga tweets, it won't get sent to me.

      Big volume NoSQL DBs have one goal: they are eventually consistent.

      Ah, I get it now. It's perfect for something like Twitter, where your users are your product and your only goal is to maximize your number of users. This allows Twitter to handle the maximum number of feeds and subscribers by ignoring quality.

      But for a service where the users are the customers, that is if I don't deliver quality data I don't get paid, this doesn't work at all.

      And I don't mean that sarcastically. For something like Facebook or Twitter, "eventually consistent" is good enough. Of course, that's only if I don't think about how Twitter is becoming part of the emergency warning system. If the campus PD are sending out an alert because of a Columbine or Virginia Tech type situation, I'd like to know sooner than "eventually".

      But if my 'friend' needs to be at the gym in 26 minutes, yeah I can wait for that news flash.

    13. Re:In b4... by An+Onerous+Coward · · Score: 1
      --

      You want the truthiness? You can't handle the truthiness!

    14. Re:In b4... by angel'o'sphere · · Score: 1


        Of course, that's only if I don't think about how Twitter is becoming part of the emergency warning system. If the campus PD are sending out an alert because of a Columbine or Virginia Tech type situation, I'd like to know sooner than "eventually".

      You know it soon enough. That is not the point.
      In this situation let's assume 1000 people give a warning. Let's assume the cluster has 100 nodes, and the "persistence rule" is: if ten nodes have it stored, it is considered persistent (quorum).
      Now ten thousand or more users fetch data from the cluster. Every write transaction is pending until 10 nodes confirm persistence. But every read transaction fetches from a random node (or usually a not-so-random master node), so you might get old data.
      In other words: thousands of concurrently running write transactions are all partially committed. The reader randomly picks up data fragments from more or less random nodes.
      That means a rough idea of what is going on reaches every reader.
      Ofc, as you stated above: you might miss the tweet/message of your sister. But millions of other people will see your sister's message, long long long before a traditional RDBMS could do that.

      And in the end, you will see your sister's message as well. However, the whole point was: "get the fuck out of the danger zone!!!" For that you don't need to wait for your sister's tweet; the thousands and millions of other tweets already told you so.
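The quorum scenario described in this comment can be modelled with a small Python sketch. Everything here is a simplifying assumption: the `Cluster` class, its method names, and the "first N nodes acknowledge" behaviour are hypothetical, and no real store's replication or conflict resolution is reproduced.

```python
class Cluster:
    """Toy model of a cluster where a write counts as persisted after `quorum` nodes store it."""

    def __init__(self, node_count, quorum):
        self.nodes = [dict() for _ in range(node_count)]
        self.quorum = quorum

    def write(self, key, value):
        """Store the value on nodes until `quorum` of them hold it, then report success."""
        acked = 0
        for node in self.nodes:
            if acked == self.quorum:
                break
            node[key] = value
            acked += 1
        return acked >= self.quorum

    def read(self, key, node_index):
        """Read from a single node, which may not have received the write yet."""
        return self.nodes[node_index].get(key)

    def replicate(self):
        """Background sync: copy every known key to every node (no conflict handling)."""
        merged = {}
        for node in self.nodes:
            merged.update(node)
        for node in self.nodes:
            node.update(merged)


cluster = Cluster(node_count=100, quorum=10)
persisted = cluster.write("alert", "leave the building")
fresh = cluster.read("alert", 0)       # a node that took part in the quorum
stale = cluster.read("alert", 99)      # a node that has not seen the write yet
cluster.replicate()
eventual = cluster.read("alert", 99)   # same node after background sync
```

A reader who hits node 99 early sees nothing, while a reader on node 0 already has the alert; after the sync both agree, which is the "eventually consistent" behaviour being discussed.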

      --
      Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.
    15. Re:In b4... by angel'o'sphere · · Score: 1

      LoL your sarcasm is astonishing ... /bow

      --
      Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.
    16. Re:In b4... by Anonymous Coward · · Score: 0

      I'm sorry: did you seriously just invoke "post Columbine" as an argument against NoSQL databases? I must lol and I have no words.

    17. Re:In b4... by julesh · · Score: 1

      Any time you store a piece of information in 2 places, it's just a matter of time until the 2 don't agree.

      create table users (
            uid int not null primary key,
            username varchar(255) not null,
            passwordhash varchar(255) not null,
            unique (username)
      )

      When I insert into this table, a reference to the generated row is stored in two places: the primary key index and the unique username constraint index. Is it just a matter of time until the two don't agree?

      Why would this be different for a NoSQL system that stores information in two different ways to allow it to be found more efficiently?
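The point about one row being reachable through two engine-maintained indexes can be checked with a short Python sketch. It uses SQLite through the standard `sqlite3` module purely as a stand-in (the comment names no specific engine), and the sample row is invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    create table users (
        uid int not null primary key,
        username varchar(255) not null,
        passwordhash varchar(255) not null,
        unique (username)
    )
    """
)
conn.execute("insert into users values (1, 'alice', 'not-a-real-hash')")

# List the indexes the engine created on its own for this table.
index_names = [row[1] for row in conn.execute("pragma index_list(users)")]

# The same row is found through either access path.
by_uid = conn.execute(
    "select uid, username from users where uid = ?", (1,)
).fetchone()
by_name = conn.execute(
    "select uid, username from users where username = ?", ("alice",)
).fetchone()
```

In SQLite this table gets two automatic indexes, one for the primary key and one for the unique constraint, and both lookups return the same row because the engine updates the indexes together with the insert.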
           

    18. Re:In b4... by vegiVamp · · Score: 1

      Well, the very basic use case is a key-value store. No relational overhead (and no SQL parsing!) means it can be blitheringly fast.

      In general, if your data is highly structured and internally consistent, you'll be well off with relational databases. If you want very fast lookups, your best option used to be a hierarchical database (LDAP, for instance), but that's a bit of a bugger for updates. NoSQL can also fit that bill, but there's quite a few very different implementations that make it more or less suited for specific purposes. Cassandra, for one, is a column store - think of it as a multidimensional matrix, up to five levels deep, I believe; while you can obviously also do that in SQL, you're going to have rather interesting joins and related slowdowns, while this will be a lot faster for some types of lookup.

      I'm convinced that there is a place for NoSQL, just as I'm convinced that there's the usual shitload of blithering idiots who are going to use their one trick to solve every problem they encounter in exactly the wrong way.

      The idea predates the name, though. Things like Memcached have always been useful. eAccelerator and similar projects have also always had their own built-in key-value store.

      I also think NoSQL is a bloody stupid name that's costing them a hell of a lot of points with proper DBAs, even if they now retroactively pretend it to mean "Not only SQL".

      --
      What a depressingly stupid machine.
  8. Use an IMDG as a front by SpaceCracker · · Score: 1

    Read Nati Shalom's blog for an interesting article (http://natishalom.typepad.com/nati_shaloms_blog/2011/07/real-time-analytics-for-big-data-an-alternative-approach.html) about how to implement an application using an In-Memory Data Grid as a front for the data and for real time or near real time analytics. The data can be persisted to a SQL or NoSQL database of your choice, depending on what best suits your application's needs.

    --
    sigo ergo sum
  9. Really, wikipedia? by Anonymous Coward · · Score: 3, Funny

    Key-value store

    Key-value stores allow the application to store its data in a schema-less way. The data could be stored in a datatype of a programming language or an object. Because of this, there is no need for a fixed data model. This is generally of interest to friendless sperglords only.[16] The following types exist:

    Crowdsourcing at its finest.. Although, I suppose the comment is accurate?

    1. Re:Really, wikipedia? by indeterminator · · Score: 1

      It has to be accurate, there's even a citation for it.

      Btw it's not there anymore (if it ever was).

    2. Re:Really, wikipedia? by MikeyC01 · · Score: 1

      Btw it's not there anymore (if it ever was).

      I can vouch for him, it most definitely was there ... I shoulda grabbed that screenshot I was gonna make

    3. Re:Really, wikipedia? by Anonymous Coward · · Score: 0

      Btw it's not there anymore (if it ever was).

      I can vouch for him, it most definitely was there ... I shoulda grabbed that screenshot I was gonna make

      You guys do know that wiki has history, right?

    4. Re:Really, wikipedia? by Anonymous Coward · · Score: 0

      I, for one, welcome our friendless sperglords.

    5. Re:Really, wikipedia? by MikeyC01 · · Score: 1

      Nope, but I do now :)

      I guess you do learn something new every day!

    6. Re:Really, wikipedia? by Anonymous Coward · · Score: 0

      history... http://en.wikipedia.org/w/index.php?title=NoSQL&diff=prev&oldid=440705295

      seems the (other) AC added it just for the /. comment

  10. Mysql ITSELF is a "NoSQL" solution by mcrbids · · Score: 4, Interesting

    Sure, some solutions are faster than MySQL out of the box by skipping much of the language parsing and stuff that any SQL solution has to do. But that's not to say that they are actually more efficient at key retrieval.

    For example, one developer found that the best no-sql solution was.... MySQL, which excels at simple key retrieval. He was able to best MemCached by a factor of almost 2.

    Use the right tool for the job.

    --
    I have no problem with your religion until you decide it's reason to deprive others of the truth.
    1. Re:Mysql ITSELF is a "NoSQL" solution by Billly+Gates · · Score: 2

      The issue with SQL is with joins particularly. MySQL is not a NoSQL solution to this problem. If you do not use them and just need a single database you will be fine with traditional SQL. NoSQL won't be a benefit. If you host a simple website you won't run into that scalability problem.

      Now imagine you're a systems analyst who needs joins to do things, such as comparing a pricing database with a sales order database to see if a discount worked and by how much. This is where you need a join. Now imagine the size of both databases is 1 terabyte. Also imagine you have to pull this data over a regular ethernet connection shared by 100 other users, offering only 10 Mbps. Also imagine the database is distributed among a cluster of computers and the RDBMS needs to wait on the other servers to pull the whole table. See the performance problem?

      Can noSQL offer a solution where I could do this?

      My above example is why companies love Oracle, and why people doing analysis or statistics, or even accountants, need joins. The problem with many databases, big iron, and probably fiber optic connections and switches is that it gets very, very expensive and is too much for a startup. The licensing costs then come up as well. ZDNet (don't have a link) did an article showing it would cost $650,000,000 for Google to use Oracle to host YouTube. I can see why they went with their own solution.

    2. Re:Mysql ITSELF is a "NoSQL" solution by MemoryDragon · · Score: 1

      Given how badly some versions of MySQL support SQL itself, you can qualify them as NoSQL as well :-)

    3. Re:Mysql ITSELF is a "NoSQL" solution by Anonymous Coward · · Score: 0

      Can noSQL offer a solution where I could do this?

      well can it?

    4. Re:Mysql ITSELF is a "NoSQL" solution by Anonymous Coward · · Score: 0

      People love Oracle because even if it's running on a local harddrive it's as slow as a 10mbps connection shared by 100 other users so no one notices the difference? Well worth the 650 mil for sure.

    5. Re:Mysql ITSELF is a "NoSQL" solution by Anonymous Coward · · Score: 0

      Can noSQL offer a solution where I could do this?

      For the most part, no. The majority of the NoSQL solutions are not relational at all and do not provide for joining data. So you break your query into sub requests and pull the data together in the application, and this will not be quicker than a decent SQL solution.
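Breaking a join into sub-requests and combining the results in the application might look roughly like the Python sketch below. The two in-memory "collections", their contents, and the helper names are all hypothetical; in a real deployment each `fetch_*` call would be a separate network round trip to the store.

```python
# Two independent collections; neither store knows about the other.
PRICES = {"sku-1": 10.0, "sku-2": 4.5}
ORDERS = [
    {"order_id": 1, "sku": "sku-1", "qty": 2},
    {"order_id": 2, "sku": "sku-2", "qty": 4},
    {"order_id": 3, "sku": "sku-1", "qty": 1},
]


def fetch_orders():
    """Sub-request 1: pull the orders."""
    return list(ORDERS)


def fetch_prices(skus):
    """Sub-request 2: pull only the prices the orders refer to."""
    return {sku: PRICES[sku] for sku in skus}


def order_totals():
    """Application-side join: combine the two result sets by SKU."""
    orders = fetch_orders()
    prices = fetch_prices({order["sku"] for order in orders})
    return {order["order_id"]: order["qty"] * prices[order["sku"]] for order in orders}


totals = order_totals()
```

The join logic, and the cost of moving both result sets to the client, now lives in the application, which is why this approach is not expected to beat a decent SQL engine on join-heavy work.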

      MySQL is a great, nay amazing, hobby database for people that are just starting out in the SQL world. But it has problems with scalability, particularly when using anything other than innodb. The latest versions of PostgreSQL will be faster for all but the simplest queries, and simple SELECT COUNT(*) FROM ... queries, and it is much much more scalable when you throw more concurrent connections and hardware at it. Couple that to the more advanced features it offers and for serious projects it should be a no brainer.

      Where NoSQL excels is in truly massive scalability with relatively simple datasets, where consistency and reliability of data are lesser concerns than overall concurrent performance. The example given by someone else earlier in the thread, where they were using MongoDB for logging, is a perfect example of a good use case. If you lose a small amount of log data, for most websites it's not going to be the end of the world, and the performance boost, lack of strict schema, and ability to write complex map reduce queries against that dataset outweigh the drawbacks. Session handling is another good example where speed is likely to be more important than reliability.

      As ever it's a case of selecting the right tool for the job, and weighing up the pros and cons of each.

    6. Re:Mysql ITSELF is a "NoSQL" solution by Anonymous Coward · · Score: 0

      Seeing your nick, I can detect trolling...

      First, your point with regard to Google is ridiculous: Google is one of the richest companies on earth, in war chest, market cap, revenues and net income. If Google cannot afford Oracle, then not enough companies can to keep Oracle afloat.

      Google has chosen to build its own technologies because other technologies simply couldn't cut it given its special needs.

      Now regarding your queries: it's fine if that's what *your* job is about and what *your* company is about. But you have to realize most businesses are SMEs, not Fortune 500. And DBs of 1 terabyte needing SQL are not that common in SMEs.

      Nobody gives a shit if noSQL can solve your particular query on your particular dataset.

      What's important are the problems that NoSQL can solve faster and cheaper than SQL. And guess what? There are such problems. And that's why companies like Google and Facebook are using, amongst others, NoSQL DBs.

    7. Re:Mysql ITSELF is a "NoSQL" solution by Aceticon · · Score: 1

      I read his article.

      DON'T DO WHAT HE DID!

      Although his conclusion is sane, the way he went about to make it happen is overly complex.

      In his specific scenario (always running the same SQL queries by primary key but with different parameters) he found out that CPU time spent in SQL parsing and query cost estimation was resulting in CPU-bound throughput for MySQL.

      He then proceeded to "fix" this by getting some library that allows direct access to MySQL's underlying database, bypassing the SQL layer, and rewriting his code to use this SQL-less way.

      This is WAY too complex.

      Simple solution: Use SQL Prepared Statements with bound parameters (i.e. SELECT bla, bla2 FROM blatbl WHERE id=?). Prepared Statements only pay the SQL parsing and query cost estimation once, when the statement is created, thus using them would've removed the CPU cost without needing to add yet another layer to the software (which adds maintenance costs) or rewrite it.
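A minimal sketch of the bound-parameter pattern, using Python's standard `sqlite3` module as a stand-in for MySQL (an assumption made so the example is self-contained). The table layout and sample rows are invented; only the query text comes from the comment. How much parsing work is actually saved depends on the driver and server, so treat this as an illustration of the calling pattern rather than a benchmark.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE blatbl (id INTEGER PRIMARY KEY, bla TEXT, bla2 TEXT)")
conn.executemany(
    "INSERT INTO blatbl (id, bla, bla2) VALUES (?, ?, ?)",
    [(1, "a", "b"), (2, "c", "d"), (3, "e", "f")],
)

# The SQL text never changes; only the bound parameter does.
QUERY = "SELECT bla, bla2 FROM blatbl WHERE id=?"


def fetch(row_id):
    """Run the fixed query with a bound parameter instead of building a new SQL string."""
    return conn.execute(QUERY, (row_id,)).fetchone()


results = [fetch(row_id) for row_id in (1, 2, 3)]
```

Because the statement text is identical on every call, a driver that caches compiled statements can reuse the earlier one, and binding the value also avoids string concatenation of user input into SQL.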

      A lot (if not most) of performance issues with SQL databases that I have seen are issues of programmer ignorance rather than inherent problems of the DB engine or SQL itself.

  11. some comments by Anonymous Coward · · Score: 1

    first off, you have to really really understand your dataset before committing to either an sql or no-sql solution. this is because of the main theoretical difference as i see it: in sql, one basically generates a result set and the "game" is to find a particular record (or records) within that result set, whereas with nosql, you basically already have your "object" (or key) and the "game" is to find what the object connects to. a subtle yet extremely important difference.

    im towards the end of an 8 month project that i started with mysql, and switched to mongodb about a month in. why? because we are dealing with facebook data, and with facebook data, you start with the "id" (profile id) and nothing else. so it made sense to use nosql, or else the end result would have been implementing a hash table in mysql.

    one totally amazing aspect of mongodb is the embedded documents. i use them to create an embedded "connection" for each key, and i can query an object (hash)'s connections and figure out what relates to what. it is extremely powerful. the key is deciding what is a connection and what is an object. for example, a user is an object, a community is an object, but a user's role within that community is a connection. so you can group your objects (or "documents") into "collections" that have connections. in facebook's case, a page is an object but a "like" is a connection.
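The object/connection split described above could be modelled roughly as follows. This is a plain-Python sketch with dictionaries standing in for documents; the field names (`connections`, `kind`, `target`, `role`) and the ids are hypothetical and do not reflect the poster's actual schema or any MongoDB API.

```python
# A "community" object with its connections embedded in the same document.
community = {
    "_id": "123456789",
    "type": "community",
    "connections": [
        {"kind": "member", "target": "user-1", "role": "admin"},
        {"kind": "member", "target": "user-2", "role": "reader"},
        {"kind": "like", "target": "page-9"},
    ],
}


def connections_of(document, kind):
    """Return the embedded connections of one kind for a single object."""
    return [c for c in document["connections"] if c["kind"] == kind]


members = connections_of(community, "member")
admin_ids = [c["target"] for c in members if c.get("role") == "admin"]
```

Starting from the object's id, everything it relates to is reachable without a second lookup table, which is the access pattern the comment argues fits this kind of data.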

    but, really, dont use nosql just because its cool (its not even a new idea, is it!?). its certainly a really neat and novel way to program a database, but could be your downfall if you dont understand your data set first.

    1. Re:some comments by Anonymous Coward · · Score: 0

      and one final thought: i have yet to find a framework that can fully support querying objects based on its connections; for example, trying to fit a URL request structure like /123456789/users

      where 123456789 is your community id, into your favorite MVC framework is a giant headache because the mapping to controllers is not straightforward (do the controllers act as objects or connections!?). while weve already built our own queryable interface to handle this, it was all custom work. keep this in mind when deciding on a nosql solution.

  12. Re:NoSQL is garbage, plain and simple. by fusiongyro · · Score: 3, Interesting

    Yeah, the problem is that you want and need ACID, even if you don't know what it means. Very, very rarely, you may find yourself in a situation where availability demands are too great for systems with the ACID property, and then you should consider using one of these non-relational systems. The problem from where I'm sitting, is that too many young, ignorant, inexperienced developers think that their shitty little website needs to be prepared for handling millions of hits per second, and jump to two conclusions: one, that the problem is their database (and not the way they're using it), and two, that ACID should be thrown out the window to fix it.

    All other things being equal, you are much more likely to be implicitly depending on ACIDity than in a situation where demand is great enough that choosing NoSQL is worth the trouble you're going to get into.

  13. Funny story by Anonymous Coward · · Score: 0

    So, Netflix won't work on my Roku. Get "internal services error" messages. Google gets me to this two month long thread. Been going on since mid June and still isn't fixed. There is some problem with the Netflix "instant queue"; looks like the server has a cache that is out of date somehow. Can be fixed by altering or deleting entries from a web browser. Problem pops up with several different clients. Thinking to myself; this is a caching problem in the Netflix web services stack; probably some multi-tier coherency problem and reckless programming. Things like NoSQL come to mind; Digg and Twitter learned the hard way in public too.

    Then this story appears. More muddled thinking about databases. I decided to make the effort to see if my blue sky guesswork about Netflix and their screwups have any basis in fact. Result of Google query #1 ("netflix nosql"):

    This is Yury Izrailevsky, Director of Cloud and Systems Infrastructure here at Netflix. As Netflix moved into the cloud, we needed to find the appropriate mechanisms to persist and query data within our highly distributed infrastructure ... move beyond the constraints of the traditional relational model ... high availability ... trumps strong consistency ... we have found ourselves braving the new frontier of NoSQL distributed databases.

    22 weeks from that blog post to first damage.

    This is just the sort of unthinking buzzword driven nonsense I have come to associate with all things NoSQL, the technology of celebrity wannabe PHBs. The results speak for themselves.

    1. Re:Funny story by kiatoa · · Score: 1

      I've thought I'd seen a problem with our Netflix queue. I just assumed my wife had messed it up somehow :)

      --
      90% of the wealth is in 2% of the pockets. Bummer to be in the majority.
  14. Chicken/Egg Problem (with NoSQL) by Manip · · Score: 2

    We want to jump on the NoSQL ship. I won't bore you with all of the details but briefly put SQL databases and tables are too restrictive for our work. Unfortunately because there are SO many NoSQL solutions, and none of them are backed by big names nobody here has the balls to sign off on one. Unfortunately, and ironically, NoSQL's biggest downside is the lack of cross compatibility. Once you make that call you're stuck with it good or bad.

    The other issue, is that because all of these solutions are relatively young the toolsets simply don't exist for many of them. No libraries, backup solutions, third party support, etc. I wish we'd see someone like Microsoft, Oracle, IBM, or any big name roll out some kind of complete solution (in particular XML compatible). I know a few big Cloud solutions exist but again we come back to being locked into a solution.

    1. Re:Chicken/Egg Problem (with NoSQL) by bhcompy · · Score: 3, Informative

      Not every solution is young. PICK is a NoSQL db that predates SQL. Its descendants are supported and cross-compatible to a degree. NoSQL is a generic term. You need a specific database. For a PICK based solution, I'd look at Reality. Reality has been around for decades and is highly supported and has many features for compatibility with modern databases and modern operating systems. OpenQM is GPL licensed and of the same class. jBASE might be a more recognizable descendant.

    2. Re:Chicken/Egg Problem (with NoSQL) by Anonymous Coward · · Score: 0

      We want to jump on the NoSQL ship. I won't bore you with all of the details but briefly put SQL databases and tables are too restrictive for our work.

      I find it odd that you're considering NoSQL because you find relational databases too restrictive. You'll find post-modern databases far more restrictive.

    3. Re:Chicken/Egg Problem (with NoSQL) by PCM2 · · Score: 1

      I won't bore you with all of the details but briefly put SQL databases and tables are too restrictive for our work.

      Care to make a case for that?

      Perhaps your work is too chaotic and disorganized for SQL tables?

      (BTW, C. J. Date would take issue with anyone who thinks "tables" are part of the relational model, but I digress...)

      --
      Breakfast served all day!
    4. Re:Chicken/Egg Problem (with NoSQL) by Anonymous Coward · · Score: 0

      every copy of windows comes with an industrial strength journaling nosql database: the Extensible Storage Engine for NT, the engine used by exchange and AD for their backends

    5. Re:Chicken/Egg Problem (with NoSQL) by bhcompy · · Score: 1

      what about a premodern one?

    6. Re:Chicken/Egg Problem (with NoSQL) by shutdown+-p+now · · Score: 1

      ESENT is just an ISAM database, no? What's "NoSQL'y" about it? I don't think it allows much schema flexibility (extra fields for individual records), for example.

    7. Re:Chicken/Egg Problem (with NoSQL) by angel'o'sphere · · Score: 1

      I doubt it is easy to make a case for that, as you put it. However, everyone I have met in the last 10 years who is using NoSQL DBs made a gut decision. When I visited the shop they could always show me a few things where I agreed that it is not really possible to do with a traditional SQL DB.

      E.g. when you have to write gigabytes per second to the DB you are out of luck with any of today's RDBMSs.

      Keep in mind, NoSQL DBs are usually optimized for write performance and for the "exact retrieval path". There is no join involved. Wherever you would join in an RDBMS, you use redundancy and a precomputed hash to pick "exactly" the data you want. That is light-years faster than ordinary DBs, but your data storage is a mess. NoSQL is write once, never update, but read often. SQL is read, write, update all the time.

      --
      Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.
    8. Re:Chicken/Egg Problem (with NoSQL) by MemoryDragon · · Score: 1

      It really depends on what kind of data you have. In the 90s, when OODBs were the next big thing, I was in a project where they tried to shoehorn tabular data and operations into an OODB. The project failed utterly, thanks to the lack of well-working query languages etc... even simple SQL ops became a major pain.
      Schema updates? Forget about them: every second one broke the existing db and data etc...
      I assume with nosql the situation is rather similar, blazingly fast for certain use cases but utterly unusable for others.

    9. Re:Chicken/Egg Problem (with NoSQL) by PCM2 · · Score: 2

      NoSQL is write once, never update but read often. SQL is read, write update all the time.

      And yet most MySQL installations (Web apps, anyway) are: read all the time; write some; update seldom. That's why MySQL became a popular database for Web apps -- it was faster for that model than Oracle (on the same hardware). SQL or the relational model wasn't the problem. The implementation was the problem.

      I'm sure there are some cases where NoSQL is absolutely game-changing -- but those cases seem rare, and where they have occurred, the companies that really need NoSQL seem to be the ones who invented it (as you might expect). But "Google uses it so I should" is a poor argument; you are not Google, no matter what your VP of sales likes to think.

      E.g. when you have to write gigabytes per second to the DB you are out of luck with any of today's RDBMSs.

      I suppose that's true, but can you really process gigabytes of data per second? Maybe this is a case for data warehousing, and you don't even use a traditional database to capture the data. Er, wait -- maybe I just gave a case for using NoSQL. But in this case, NoSQL isn't a replacement for a RDBMS, it's an adjunct to one, so I guess all I'm really saying is that it gets tiresome to read discussions of NoSQL this, NoSQL that, when most folks seem to have a poor understanding of the dimensions of their own problem spaces and they've chosen a tool before they've figured out how they'll use it.

      --
      Breakfast served all day!
    10. Re:Chicken/Egg Problem (with NoSQL) by angel'o'sphere · · Score: 1

      Well, you basically got the point what NoSQL is all about.

      As I mentioned in a different post, "NoSQL" does not necessarily mean "no" SQL; mostly it is taken to mean "not only" SQL, meaning you mix your storage strategies.

      Imagine Facebook with 100 million users concurrently online. 1 million of them write a 100-character comment on "something" per hour. That is 100 MB of data to store per hour. And no one cares if he reads it just in time, 1 min after posting or 10 mins after posting. In other words: everything a traditional DB has to offer (and that has nothing to do with the query language) is not relevant. Why should all my friends get an XXX when they load my page, just because some random comment is not "acid committed"???

      Everyone of them will see my *now* comment tomorrow anyway ...

      --
      Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.
    11. Re:Chicken/Egg Problem (with NoSQL) by Anonymous Coward · · Score: 0

      Have you looked at the existing solutions from the major vendors? I've become a big proponent of DB2's "PureXML", which allows you to mix SQL and NoSQL rather seamlessly. You can query subelements via XPath, set indexes on attributes, have constraints, etc. All within a column of a traditional table.

      Our need is to persist an object graph that, in 3NF, is something like 100 tables (many with 1-many and many-many relationships). The DB does an ok job of this relationally, but it's a no brainer that getting your data from one table is going to have advantages vs. going to a lot of tables.

    12. Re:Chicken/Egg Problem (with NoSQL) by tcr · · Score: 1

      We want to jump on the NoSQL ship

      These comparisons might be of interest...

      --
      Information wants to be beer.
    13. Re:Chicken/Egg Problem (with NoSQL) by Anonymous Coward · · Score: 0

      You should check out Versant (www.versant.com). It is a public company. Its database is running stock exchanges and global airline reservation systems. It is ACID capable. In benchmarking, it is slightly slower than MongoDB when single threaded (non-concurrent), which means 10X+ faster than something like Hibernate/Oracle; when concurrent it is faster than MongoDB (presumably because of their single lock model). You don't need to do ANY mapping. It has standard interfaces so it is plug and play with some ORM tools. It has an ODBC/JDBC driver for traditional tools like Crystal Reports... slow but useful.

  15. learn something useful first by roman_mir · · Score: 3, Interesting

    First you need to learn something useful, like understand a normal database, like PostgreSQL, SQLite, DB2 or whatever your heart desires (not MySQL, that's just not right.) Once you really understand the normal databases and you understand your requirements, only then can you make a statement by going 'nosql' something; otherwise it's most likely counterproductive for most scenarios, you are not all FBs out there.

    1. Re:learn something useful first by roman_mir · · Score: 1

      Oh, and before you get on my case, I know that FB uses MySQL. The point is you are not all in need of huge quick data caches, and if you are serving static pages from a dynamic source, you are doing something else wrong altogether.

    2. Re:learn something useful first by Billly+Gates · · Score: 1

      I don't know.

      This guy made a compelling case to use MongoDB over MySQL.

    3. Re:learn something useful first by roman_mir · · Score: 1

      excellent argument. It wins the Internets.

    4. Re:learn something useful first by angel'o'sphere · · Score: 1

      First you need to learn something useful, like understand a normal database,
      First you need to learn something challenging, like implementing your own data base.

      like PostgreSQL, SQLite, DB2 or whatever your heart desires (not MySQL, that's just not right.)
      like VMS on VAX with its built-in DB, or Mumps or PICK (not SQL, that is not right).

      Once you really understand the normal databases and you understand your requirements only then you can make a statement by going 'nosql' something,

      Once you really understand the true databases and you realize how much easier they fit your requirements, you can justify going for 'SQL'.

      otherwise it's most likely counterproductive for most scenarios, you are not all FBs out there.
      Otherwise you pay big money to Oracle for no reason.

      Sorry, FTFY ...

      If you have never used, or never learned in school / university, what "other DB paradigms" exist (coming back en vogue under the label 'NoSQL'), then you don't qualify at all to give a hint about databases.

      Claiming that only SQL (and RDBMSs) is right is like claiming only Windows is the right OS. It simply shows you never saw any other OS and have no clue at all.

      --
      Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.
    5. Re:learn something useful first by Billly+Gates · · Score: 1

      I am not a database programmer nor a DBA.

      However I have worked with database software that needed to
      1. Compare several tables in different databases
      2. Do relational logic to analyze relationships, hence an RDBMS.

      A typical business task at work would be to figure out if a discount worked and by how much with certain stores in only a certain section of the country. Gee, I would need a SQL join (I hear the booing of the noSQL evangelists on that) to look at the orders database, the pricing database, as well as the inventory and store database which are on 2 different tables. Explain how I could do this without an RDBMS? I could be full of it, but for all the *lack of scalability* arguments the noSQL crowd uses, it can not produce the same results without joins. Frankly, I need to compare several terabytes of data and write software that triggers and records a log about it so I can let the executives know what is going on in business. Joins are a necessary evil, and Oracle manages very large data sets quickly and quite well, as much as slashdotters hate them.

      All I see noSQL databases are good for is storing and retrieving data. I need to compare it, view it, perform logic, and most importantly compare several tables. I also have to work with MS Access or Crystal reports so my boss can pay me. What is the point of storing data if you are not going to use it? Until someone can tell me that a noSQL database can do these things it is all hogwash.
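      To make the discount question concrete, here is a minimal sketch of the kind of join I mean, using Python's built-in sqlite3 module. The table names, columns and sample rows are all made up for illustration and are nothing like a real schema:

```python
import sqlite3

# In-memory database with three hypothetical tables and invented sample rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stores (store_id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE prices (product_id INTEGER PRIMARY KEY,
                         list_price REAL, discount_price REAL);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, store_id INTEGER,
                         product_id INTEGER, quantity INTEGER,
                         discounted INTEGER);
    INSERT INTO stores VALUES (1, 'northeast'), (2, 'northeast'), (3, 'west');
    INSERT INTO prices VALUES (10, 20.0, 15.0);
    INSERT INTO orders VALUES
        (100, 1, 10, 2, 0),
        (101, 1, 10, 5, 1),
        (102, 2, 10, 4, 1),
        (103, 3, 10, 9, 1);
""")


def units_by_discount(conn, region):
    """Units and revenue for one region, split by whether the discount applied."""
    return conn.execute("""
        SELECT o.discounted,
               SUM(o.quantity) AS units,
               SUM(o.quantity * CASE WHEN o.discounted = 1
                                     THEN p.discount_price
                                     ELSE p.list_price END) AS revenue
        FROM orders o
        JOIN stores s ON s.store_id = o.store_id
        JOIN prices p ON p.product_id = o.product_id
        WHERE s.region = ?
        GROUP BY o.discounted
        ORDER BY o.discounted
    """, (region,)).fetchall()


result = units_by_discount(conn, "northeast")
print(result)
```

      Two joins, one filter, one GROUP BY. That is the bar a non-relational store has to clear for this kind of reporting work.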

    6. Re:learn something useful first by angel'o'sphere · · Score: 1

      NoSQL DB does not imply you can not join ....

      Until someone can tell me that a noSQL database can do these things it is all hogwash.

      That is your misconception. There are countless more reasons for NoSQL than for SQL. In every situation where you can calculate your key, and that means in an extended way "can calculate the exact disk address" of the data to retrieve, NoSQL is several orders of magnitude faster.

      Typically NoSQL is not there to REPLACE your SQL/RDBMS solutions, it is there to COMPLEMENT them. However, with today's hardware you can put everything into NoSQL DBs, with joins etc., and you are orders of magnitude faster than Oracle on a high-end system.

      After all NoSQL does not mean no SQL but not only SQL.
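      A rough sketch of what "calculate your key" and "join anyway" can look like. Plain Python dicts stand in for a key-value store here; the key layout, the function names and the sample data are all hypothetical, and nothing below measures speed:

```python
# Plain dicts stand in for a key-value store; no real NoSQL client is used.
orders_by_key = {}
stores_by_id = {1: {"region": "northeast"}, 3: {"region": "west"}}


def order_key(store_id, day, seq):
    """Compute the key from values the caller already knows,
    so a read is one direct lookup instead of a search."""
    return f"order:{store_id}:{day}:{seq:06d}"


def put_order(store_id, day, seq, quantity):
    orders_by_key[order_key(store_id, day, seq)] = {
        "store_id": store_id,
        "quantity": quantity,
    }


def get_order(store_id, day, seq):
    """Return the stored order, or None if that key was never written."""
    return orders_by_key.get(order_key(store_id, day, seq))


def units_for_region(region):
    """Application-side join: resolve each order's store by id, filter on region."""
    return sum(
        order["quantity"]
        for order in orders_by_key.values()
        if stores_by_id[order["store_id"]]["region"] == region
    )


put_order(1, "day-1", 1, 5)
put_order(1, "day-1", 2, 4)
put_order(3, "day-1", 1, 9)

print(get_order(1, "day-1", 2))
print(units_for_region("northeast"))
```

      The join still happens, it just lives in your code instead of in the query planner. Whether that is a win depends on how often you can compute the key up front.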

      --
      Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.
    7. Re:learn something useful first by roman_mir · · Score: 1

      Claiming that only SQL (and RDBMSs) is right is like claiming only Windows is the right OS. It simply shows you never saw any other OS and have no clue at all.

      - while my advice is actually something that's useful to people who may otherwise be going in the wrong direction, yours is just stupid and pretentious and doesn't even apply to me, since I did enough work for AT&T, Bell Canada, Symcor, IFDS to have worked with some things, you may not even recognize as databases.

      Yes, for the majority of people RDBMS is correct, both from their business perspective and the skill sets necessary.

    8. Re:learn something useful first by angel'o'sphere · · Score: 1

      Yes, for the majority of people RDBMS is correct, both from their business perspective and the skill sets necessary.

      I doubt that.
      Either there are no DBAs like there used to be 20 - 30 years ago, or business demand increased far far far more than DBs could follow.
      In the last 20 years I never saw any DB that could meet the demands of the business.
      That includes a "cluster" of 4 M4000 servers and lots of attached terabytes of storage. In this case based on Sybase, not Oracle.
      The majority of people are just storing records into one single table. Fetching them with date and/or key. There is no join involved or anything.
      The situation I'm in is: we have 4 main servers, 4 fail backs, and as they are at the edge of keeping up with the load, all "analyzing" is done on read only copies that are kept up to date with some DB magic.
      Frankly: you could invest $300,000 into a competent DBA and ditch 2 (+ back ups) $4M systems (not including maintenance); however, the true solution is: don't abuse the DB for stuff a single file can do.

      --
      Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.
    9. Re:learn something useful first by Anonymous Coward · · Score: 0

      The majority of people are just storing records into one single table. Fetching them with date and/or key. There is no join involved or anything.

      Care to back that up with some evidence?

  16. Re:NoSQL is garbage, plain and simple. by Vanders · · Score: 1

    I need to dump millions of lines of syslog output to a structured datastore. I don't give a toss about ACID: I just need to know that the write succeeded. A NoSQL like MongoDB does the job brilliantly.
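    Roughly what the "structured" part looks like, as a sketch only. The regex assumes classic BSD-style syslog lines, the sample lines are invented, and a plain Python list stands in for the datastore, so no actual MongoDB call appears here:

```python
import json
import re

# Assumed line shape: "Mon DD HH:MM:SS host program[pid]: message"
SYSLOG_RE = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<program>[^\[:\s]+)(?:\[(?P<pid>\d+)\])?: "
    r"(?P<message>.*)$"
)


def parse_line(line):
    """Turn one syslog line into a schema-less document (a dict)."""
    match = SYSLOG_RE.match(line)
    if match is None:
        # Keep unparseable lines rather than dropping them.
        return {"raw": line}
    doc = match.groupdict()
    if doc["pid"] is None:
        del doc["pid"]  # documents may simply omit missing fields
    else:
        doc["pid"] = int(doc["pid"])
    return doc


def store(documents, line):
    """Append one parsed document; return True to acknowledge the write."""
    documents.append(json.dumps(parse_line(line)))
    return True


documents = []
acked = [
    store(documents, "Aug  1 12:00:01 web01 sshd[4242]: Accepted publickey for deploy"),
    store(documents, "Aug 11 12:00:02 web01 kernel: eth0 link up"),
]
print(acked, len(documents))
```

    No schema migration when a new field shows up, and the only thing I check is the acknowledgement.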

  17. Don't forget lotus notes by acomj · · Score: 1

    I was a notes programmer a decade ago... (wow...) I went to a talk on CouchDB and It all seemed strangely familiar.

    Basically Lotus Notes is a NoSQL database with an email and calendar program attached. Of course anything was better than "lotus script" but I can see why this stuff is very appealing. I think some of the CouchDB developers are former Notes developers who are now involved in the NoSQL movement.

    1. Re:Don't forget lotus notes by kiatoa · · Score: 1

      Whilst a captive user of Lotus Notes at IBM I frequently grumbled about it. In retrospect I really didn't appreciate how good it was and how much easier it made my life. I regularly synced my mail to Linux and to Windows and was able to seamlessly work offline. If it was an easy install on Linux I'd seriously consider dropping the $100 or so for a copy and I don't own *any* commercial software.

      The "slosh data around model" has a strong appeal and Notes seemed to mostly do it pretty well. In a similar way the ideas behind freenet appeal to me also. Well, except the lossy bit.

      --
      90% of the wealth is in 2% of the pockets. Bummer to be in the majority.
  18. Re:NoSQL is garbage, plain and simple. by Anonymous Coward · · Score: 0

    Exactly, assuming that everybody needs ACID for every datastore is not a valid assumption and if you think it is, you have a very limited imagination.

  19. Re:NoSQL is garbage, plain and simple. by NoOneInParticular · · Score: 1

    If your data is worth something out of this single application, you need relational and ACID. Syslog records might not qualify as being worth something. I heard flat file works fine for those.

  20. Breaking the backs of DBAs by EmperorOfCanada · · Score: 4, Interesting

    One of the many reasons that programmers that I know are adopting these technologies is that it breaks the back of the in-house DBA. Often there are a few in-house DBAs with certifications up the wazoo who squeeze themselves into every project that has to store data (all projects). But somehow their word becomes the final word. Getting a table added to a schema can take days or even weeks and might not be approved at all. Suddenly with MongoDB or whatever the DBA has no possible input. One can make all kinds of arguments for and against relational systems and how valuable a DBA is to the long term health of a datastore, but from many developers' / project managers' perspective a modern DBA often acts as a brick wall to on time, on budget.

    1. Re:Breaking the backs of DBAs by Billly+Gates · · Score: 1

      Yeah and when data is lost and the middleware app crashes then who is to blame? I doubt the DBA as he/she did not implement it. The manager would have some explaining to do to IT on why he thought he could circumvent the DBA and corporate policies.

    2. Re:Breaking the backs of DBAs by Tenareth · · Score: 2

      One of the main reasons for this is that the DBAs are the ones that keep the production environment functioning. Devs get to put in whatever random thought that crosses their mind and when it breaks in production and data is lost, or clients are impacted they just shrug and say "Odd, didn't expect that".

      A 'modern' DBA should be trained in whatever development cycle that dev is using, which may include Scrum/Agile, in which case the process would be integrated and the delay of implementation would be greatly reduced, but not eliminated. It really isn't a bad thing to stop and think about the big picture from time to time.

      The issue is when the management sets up a reward system for DBAs to be roadblocks (this is usually done by crucifying a DBA for a database failure, even if it is proven to be a poor design from Development) that creates the type of environment you are talking about. It is a perfectly valid response to management to be protective of their job. The issue isn't the DBA, it is the structure around the technology groups.

      --
      This sig is the express property of someone.
  21. nHibernate by Synerg1y · · Score: 0

    anybody use this? I believe it falls under this category... we're about to get a large system that uses it, the developers say it's not too bad, but I've used ADO.NET since 1.1.

    1. Re:nHibernate by Anonymous Coward · · Score: 0

      nHibernate is an ORM.

    2. Re:nHibernate by Xest · · Score: 1

      NHibernate and ADO.NET are tools for interacting with a data store, this article is about the data stores themselves, not the tools of interaction.

      NHibernate is an ORM, more of a competitor to Microsoft's Entity Framework than to ADO.NET.

  22. Re:NoSQL is garbage, plain and simple. by shutdown+-p+now · · Score: 1

    I think what GGP is saying is that it is reasonable to assume that ACID is needed by default, with proof required that it is not the case. Which makes sense for the same reason why assuming that something can be written in some high-level programming language makes sense, before deciding that, no, it really has to be in C.

  23. but MongoDB is the web scale by improfane · · Score: 1

    Interesting article.

    This is a funny Q&A session on Mongo DB which raises a good point.

    --
    Slashdot needs Geekcode | Can anyone recommend any good SCIFI? My tastes: Foundation, Startide Rising, CITY, Ringworld,
  24. MongoDB by Spykk · · Score: 1

    If you are curious about the benefits of using MongoDB there is a good explanation here.

    1. Re:MongoDB by Anonymous Coward · · Score: 0

      That was amazing.

    2. Re:MongoDB by Anonymous Coward · · Score: 0

      thank you!

      that was very informative =)

  25. faster, cheaper... by decora · · Score: 1

    but what about that third pillar? the quality thing?

    1. Re:faster, cheaper... by Tenareth · · Score: 1

      He did state he was coming from the view of a Developer. Quality is a QA/IT/DBA concern, not Development.

      --
      This sig is the express property of someone.
  26. Amazon SimpleDB by aclarke · · Score: 1

    The article didn't cover Amazon SimpleDB (http://aws.amazon.com/simpledb/). SimpleDB is part of Amazon AWS, so it's cloud-only. However, if you're planning to deploy on AWS anyway, it makes for a formidable option.

    1. Re:Amazon SimpleDB by julesh · · Score: 1

      Only a total nut would intentionally choose a solution that ties you to a single hosting provider who have acquired a reputation for kicking off clients they don't like.

  27. Why do they all have retarded names? by Anonymous Coward · · Score: 0

    Really - why do they all have retarded names?

    They should have called it fuck-you-SQL instead of noSQL

    1. Re:Why do they all have retarded names? by mcvos · · Score: 1

      At least they have a name, rather than merely a generic description, like MS SQL Server.

  28. No mention of RavenDB by cjjjer · · Score: 1

    No mention of RavenDB http://ravendb.net/ or does it not fall under the NoSQL category?

  29. Cassandra @ ClubCompy by BeforeCoffee · · Score: 1

    We use Cassandra for all the user management and virtual file system storage at ClubCompy. It is so blazing fast compared to SQL for both reads and writes, and it is very scalable. I've had a node of my storage cluster go down and the whole system stays up with no data loss, and it can repair itself once I bring the downed node back up.

    Coding to Cassandra is pretty challenging: you have to do all of your data modeling in code or use the new CQL to access the cluster. I wrote about my experiences recently, where I have started using Google's Protocol Buffers to give me more flexibility in how I store my data and describe my column families: Coding to Apache Cassandra with Google's Protocol Buffers

    Dave

  30. MV databases (PICK etc) by Anonymous Coward · · Score: 1

    DO NOT use PICK. I've been using it for 3 years, and the kindest thing I can say about it is that it is a cool idea implemented by an ugly hack. Library & inter-communications options just suck.

    1. Re:MV databases (PICK etc) by Anonymous Coward · · Score: 0

      This. Please DO NOT use PICK. Having had to use/maintain/migrate from systems based on it, I can honestly say I would rather sweep streets than use it.

  31. wouldn't fly in NASA by decora · · Score: 1

    probably wouldn't fly in the linux kernel!

  32. "Something OLD, vs. something new..." by Anonymous Coward · · Score: 0

    "And, NoSQL sounds like something done long ago by 'Big Blue' (IBM)" (ISAM).

    E.G.-> ISAM uses hash tables rather than B-Trees (as relational DB's do), & "Lo and Behold", so do NoSQL based databases!

    Now, IF I am "off" here, please... feel free to correct me!

    (However, imo @ least? Well... NoSQL DB engines don't really sound "all that new" to me, & sounds like a "return to yesteryear" in ISAM methods, or a variation thereof mostly!)

    APK

    P.S.=> BOTH types of DB engines (relational, or ISAM (or this "new" NoSQL stuff that sounds an AWFUL LOT like ISAM to myself @ least)) have their places... so, use the one that fits your data-processing requirements model(s) best - "right tool for the job" type thinking...

    ... apk
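    For what the hash-versus-tree distinction means in practice, here is a small sketch. It does not settle which structure ISAM or any particular NoSQL engine actually uses; a dict stands in for a hash index, a sorted list with bisect stands in for an ordered (tree-like) index, and the sample rows are made up:

```python
import bisect

# Invented sample rows: (key, value)
rows = [(17, "q"), (3, "a"), (42, "z"), (8, "c"), (23, "m")]

hash_index = dict(rows)                 # key -> value, no ordering guarantees used
sorted_keys = sorted(k for k, _ in rows)  # ordered index over the same keys


def point_lookup(key):
    """Exact-match read: the case a hash index is built for."""
    return hash_index.get(key)


def range_scan(low, high):
    """All rows with low <= key <= high, found by binary search on the
    ordered index. A pure hash index would have to scan every key."""
    start = bisect.bisect_left(sorted_keys, low)
    end = bisect.bisect_right(sorted_keys, high)
    return [(k, hash_index[k]) for k in sorted_keys[start:end]]


print(point_lookup(42))
print(range_scan(5, 25))
```

    Which one fits depends on whether your reads are mostly "give me key X" or "give me everything between X and Y".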

  33. tweeeeeet! by ewertz · · Score: 0

    Penalty -- use of "sales" and "think" in the same sentence. 15 yards!

  34. Careful by KalgarThrax · · Score: 1

    Using such a DB can be a two edged sword. Especially when wielded by bored CTOs who have nothing to do but try new tech without "sweating the details."

    The key thing I have taken away from the experience of using such a DB is that typically, software architects will start migrating or building new functionality in earnest, only to succumb to a relational schema in the end.

    Except the schema is backed by a non-relational database now. Which causes very, very high amounts of pain.

    This is not to say correct use of NoSQL DBs is not possible. I just have yet to see it.