Slashdot Mirror


Cassandra 0.7 Can Pack 2 Billion Columns Into a Row

angry tapir writes "The cadre of volunteer developers behind the Cassandra distributed database has released the latest version of its open source software. The newly added Large Row Support feature of Cassandra version 0.7 allows the database to hold up to 2 billion columns per row. Previous versions had no set upper limit on the number of columns, though the maximum amount of material that could be held in a single row was approximately 2GB. That 2GB limit has been eliminated."

5 of 235 comments

  1. If you have more than 30 columns by loufoque · Score: 1, Insightful

    ... then you're doing it wrong

    1. Re:If you have more than 30 columns by Anonymous Coward · Score: 2, Insightful

      - Not everyone answers every question. There is skip logic involved and there are loops, sometimes nested. These would lend themselves well to a multi-table relational approach, but the data does not come out of the data collection systems like that (most of them, anyway). It would be nice to normalize, but as mentioned, there are new datasets every week, most of them having 1000s of columns. Good luck with normalizing all that before your deadline.

      - Normalized data is not as easy to use in statistical applications. SPSS, the 800-pound gorilla in stats land, only supports flat data, for example.

      - There are things called multiple response questions, sometimes having 100s of options, sometimes 1000s. Ergo 100s to 1000s of columns per question. "Which car models have you ever owned?" plus every single car model produced in the last 40 years is a good example. Of course there are alternatives, such as blob fields and bit shifting, or storing only a maximum of 20 answers (first car, second car, etc.), but it costs time to convert the data to them. And these formats are also harder to use in statistical analysis, even in flat data.
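To make the trade-off in the last point concrete, here is a minimal sketch of the two alternatives to one-column-per-option: a normalized "long" layout and a packed bitmask. The four car models, the `id` field, and the helper names are all made up for illustration; a real survey would have 100s or 1000s of options and would likely go through SPSS or similar rather than plain Python.

```python
# Hypothetical survey data: one dict per respondent, one 0/1 column per car model.
CAR_MODELS = ["golf", "civic", "corolla", "mustang"]  # illustrative only


def wide_to_long(rows, options):
    """Normalize: turn flat 0/1 columns into (respondent_id, option) pairs."""
    return [
        (row["id"], opt)
        for row in rows
        for opt in options
        if row.get(opt) == 1
    ]


def pack_bits(row, options):
    """Bit shifting: pack the same 0/1 columns into a single integer."""
    mask = 0
    for position, opt in enumerate(options):
        if row.get(opt) == 1:
            mask |= 1 << position
    return mask


def unpack_bits(mask, options):
    """Recover the list of selected options from a packed integer."""
    return [opt for position, opt in enumerate(options) if (mask >> position) & 1]


wide_rows = [
    {"id": 1, "golf": 1, "civic": 0, "corolla": 1, "mustang": 0},
    {"id": 2, "golf": 0, "civic": 0, "corolla": 0, "mustang": 1},
]

long_rows = wide_to_long(wide_rows, CAR_MODELS)
masks = [pack_bits(row, CAR_MODELS) for row in wide_rows]
```

Both conversions are short, but each is an extra step per dataset, and the packed form has to be unpacked again before a flat-data statistics tool can use it, which is the cost the comment is pointing at.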

      In a world where you have complete control over the provided input, and the required output, you are right. In the real world, not so much.

  2. Why? by Xoc-S · Score: 3, Insightful

    Only a completely de-normalized flat-file database would need anything like that number of columns. That would mean many duplicate pieces of information, and a complete maintenance nightmare. The only purpose I can see is to have views of existing normalized data for fast searching, but that would be read-only data.

    This is a feature in need of an application and I can see very few applications.

  3. Yes and the funniest thing about all this is by Giant Electronic Bra · Score: 4, Insightful

    That we had all of this stuff 30 years ago. It was called 'network' databases, which were pretty much the standard sort of technology before RDBMS came along and everyone realized how incredibly much better relational algebra was for the vast majority of problems. As with many other things, older ideas eventually resurface with new names and a few more features. There are times when this kind of facility is useful. Nothing wrong with it. The vast majority of cases where I've seen people using something like Cassandra or Bigtable, though, were ill-advised. A properly optimized RDBMS with a correctly designed schema can handle all but a few edge cases. Most of the hype these tools are generating is based on a lack of real understanding of how to properly use databases, combined with people believing myths about other technologies, and helped along by the industry's short memory span. The best part, though, is that when something turns into a giant mess, guys like me can make nice money fixing the mess. lol.

    --
    "Malo periculosam libertatem quam quietam servitutem." -- Jefferson

  4. Re:Typical applications? by AlXtreme · Score: 3, Insightful

    Dear $DEITY, the number of times I've seen (mostly) PHP crapplications use CREATE DATABASE and CREATE / ALTER TABLE, often with ingenious naming schemes, instead of simply inserting new rows. Certain people shouldn't be allowed to touch databases.
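For anyone unsure what "simply inserting new rows" looks like in practice, here is a minimal sketch using Python's built-in sqlite3 module. The `orders` table, its columns, and the `add_order` helper are hypothetical; the point is only that the thing that varies (a customer, say) goes in a column as data rather than into a new table name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The schema is created once. The varying part (a hypothetical customer name)
# is stored in a column instead of being encoded in a table name.
conn.execute("CREATE TABLE orders (customer TEXT NOT NULL, item TEXT NOT NULL)")


def add_order(customer, item):
    # Parameterized INSERT: a new customer needs a new row, not new DDL.
    conn.execute(
        "INSERT INTO orders (customer, item) VALUES (?, ?)", (customer, item)
    )


add_order("alice", "widget")
add_order("bob", "gadget")
add_order("alice", "sprocket")

alice_items = [
    row[0]
    for row in conn.execute(
        "SELECT item FROM orders WHERE customer = ? ORDER BY rowid", ("alice",)
    )
]

# However many customers are added, the number of tables stays at one.
table_count = conn.execute(
    "SELECT COUNT(*) FROM sqlite_master WHERE type = 'table'"
).fetchone()[0]
```

The same shape applies in PHP or any other language: application code issues INSERT and SELECT, and CREATE / ALTER TABLE stay in migrations.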

    If anyone needs me I'll be sobbing over my coffee.

    --
    This sig is intentionally left blank