Horizontal Scaling of SQL Databases?
still_sick writes "I'm currently responsible for operations at a software-as-a-service startup, and we're increasingly hitting limitations in what we can do with relational databases. We've been looking at various NoSQL stores and I've been following Adrian Cockcroft's blog at Netflix which compares the various options. I was intrigued by the most recent entry, about Translattice, which purports to provide many of the same scaling advantages for SQL databases. Is this even possible given the CAP theorem? Is anyone using a system like this in production?"
It would be a lot easier to talk about solutions if you said which limitations you run into.
Is your dataset too large (large tables), do you have too many joins, or too many transactions per second? In short, what is the problem we're trying to solve here?
My money is on "No one here likes SQL" and "There aren't any experts on RDBMSs to help us get things set up properly".
Small startups are using NoSQL because there is, more and more, a push in the web app market to store data which does not fit into any schema.
There is no such thing as "data which does not fit into any schema", just like there is no such thing as data which cannot be encoded into binary. All data necessarily has a schema. However much or little of the schema you may choose to model in your (SQL or other type of) schema is, like the rest of software engineering, a design tradeoff.
The various NoSQL approaches do not solve the full generality of data management problems the way SQL databases do. They are narrower in scope, and as is generally the case, they can achieve better performance by virtue of doing less. They can be much faster along certain data access paths, but at the cost of making other data access paths prohibitively expensive.
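To make the access-path point concrete, here is a rough sketch, not a benchmark. It assumes a toy key-value store modeled as a Python dict and an in-memory SQLite table; the `USERS` records and every function name are made up for illustration. Lookup by key is a single hash probe in the key-value store, but a lookup by any other attribute has to scan every value, whereas the relational version can serve the second path from a secondary index.

```python
import sqlite3

# Hypothetical records; the fields and values are illustrative only.
USERS = [
    {"id": 1, "name": "ann", "city": "Oslo"},
    {"id": 2, "name": "bob", "city": "Lima"},
    {"id": 3, "name": "cy", "city": "Oslo"},
]

# Key-value style: exactly one access path (by id), served by a hash lookup.
kv_store = {u["id"]: u for u in USERS}

def kv_get(user_id):
    """Fast path: direct lookup by the one key the store was designed around."""
    return kv_store[user_id]

def kv_find_by_city(city):
    """Slow path: no secondary access path exists, so every value is scanned."""
    return sorted(u["name"] for u in kv_store.values() if u["city"] == city)

# Relational style: the same data, with a secondary index on city.
def make_db():
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
    db.execute("CREATE INDEX users_city ON users (city)")
    db.executemany("INSERT INTO users VALUES (:id, :name, :city)", USERS)
    return db

def sql_find_by_city(db, city):
    """Declarative query; the engine is free to use the users_city index."""
    rows = db.execute(
        "SELECT name FROM users WHERE city = ? ORDER BY name", (city,)
    )
    return [r[0] for r in rows]
```

With three rows the scan costs nothing, of course. The difference only shows up once the store is large enough that touching every value per query is no longer acceptable.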
The frustrating thing for many of us is that the NoSQL spin on data management is about where mainstream data management was in the 1960s. As the field matured, it learned many important lessons, all of which are now being tossed out the window by people saying "oh, we don't need that" when, of course, they just haven't needed it yet. As these problems become apparent to them, they will spend the next decades of their lives reinventing what the data management field figured out in the 80s and 90s. Until then, they'll be making beginner mistakes, like thinking that their data somehow doesn't fit into any schema.