Enthusiasts Convene To Say No To SQL, Hash Out New DB Breed
ericatcw writes "The inaugural NoSQL meet-up in San Francisco during last month's Yahoo! Apache Hadoop Summit had a whiff of revolution about it, like a latter-day techie version of the American Patriots planning the Boston Tea Party.
Like the Patriots, who rebelled against Britain's heavy taxes, NoSQLers came to share how they had overthrown the tyranny of burdensome, expensive relational databases in favor of more efficient and cheaper ways of managing data, reports Computerworld."
Just use flat text files --- no need for expensive db's .... think of the freedom!
"i lost my dignity on a slippery wiener"
There is a time and place for SQL. There is a time and place to avoid SQL.
SQL is great for financial data. SQL is terrible for genetic data.
This is a boring sig
When you get a lot of morbidly obese nerds with no life to program for you.
Meanwhile SQL users get laid.
Go fork yourself!
Seems to be a silly thing to be against. Relational databases and the structured query language may not be perfect, but I bet these people could die in their 90s and people will still be using relational dbs and sql.
If you want to tout open or cheap dbs and more lightweight types of storage/db servers, then they might have some points, but being against sql is just plain dumb.
I've seen strong reactions from various camps with regard to concern over saying no to SQL. I'm not sure why people freak out over it. First, you have to strike out toward new things if you want to progress the world. Second, SQL hasn't caused people to stop using spreadsheets or Access databases. Third, there are groups that get together to dispute that the earth is round; insisting that it is flat. Or that gray aliens are visiting earth regularly and probing our anuses.
Bring on the next fascinating data technology. SQL will continue to have a major place for many years to come, no matter what happens.
The problem is the performance of transactions and persistence and distribution of data techniques, not
whether we are using a logic-like STRUCTURED QUERY LANGUAGE to ask for data matching certain conditions.
The latter is still, and will continue to be, very useful.
It's just that now that we can assume local clusters and WANs worth of co-operating data stores, there
are probably better, more performant ways of implementing persistence, replication, distribution of data
than traditional RDBMS implementations.
The two concerns: The logical model of how we QUERY for data (or combine it in bulk), which is the core of SQL,
and how we persist it and retrieve it quickly, now have more options for being separated.
Where are we going and why are we in a handbasket?
It's pretty easy to say "yes" to alternatives without saying "no" to SQL.
Just because a crowbar can pull out a stubborn nail better doesn't mean they should replace all the hammers. Then what would we put nails in with? Different tools for different jobs.
Pourquoi?
If I was to read the article, I bet somewhere someone would be wittering on about Key Value Datastores.
The brainchild of a generation brought up on high level collections, they learn one (in this case Map) and apply it to everything.
Sadly SQL, and RDBMS, works for most people. It maps object data well (oh whaaaa, i have to do foreign keys - GROW SOME FUCKING BALLS YOU LAZY GRADUATE!) and it is well understood. And with abstractions like LINQ to query them, even the lazy dumb Windows .NET programmer doesn't have to strain their brain to learn SQL.
And when you have terabytes of specific unique data, you clearly should go away and work out how best to store it. Even an RDBMS/SQL solution is too generic for all problems.
I'm not seeing anything that offers a real advantage over using advanced features like one finds in postgres combined with memcached. Some parts of my program like to think of their data as structured objects while other parts like seeing that data as rows in a table (they even link up to other tables through foreign keys!).
Saying no to SQL and relational databases is just fine if you've got something better to replace it with. However I know of no such thing. The reason they're popular is that they are so powerful for data storage. If something better came along you wouldn't even need to say no to SQL. You'd just say yes to the newer better rival.
These posts express my own personal views, not those of my employer
SQL is not a database, it is a standard interface to a feature set commonly associated with relational models. Before everyone standardized on SQL, there were other relational query languages. The "No" part of "NoSQL" refers to the fact that some basic elements of relational implementations cannot be usefully expressed using a much simpler distributed hash table model.
All the "NoSQL" does is eliminate all the parts of traditional relational databases that do not scale -- discarding the bottleneck rather than fixing it. These are things like joins and external indexing. Unfortunately, discarding those things means you discard a lot of very important functionality as a practical matter, notably the ability to do fast, complex analytics. Adopting the NoSQL architecture runs contrary to the trend toward more real-time, contextual analytical processing. There are a great many analytical applications that are not amenable to batch-mode pattern-matching, and the NoSQL model is a lot less applicable than I think some people want to acknowledge. In its domain, it is a great tool but it has many, many prohibitive limits. We are essentially trading power for scale.
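To make the "discarded joins" point concrete, here is a minimal Python sketch of what a join looks like once all you have is key lookups. The `users` and `orders` maps and the `total_by_user_name` helper are made up for illustration and do not come from any particular NoSQL product; real stores distribute the keys across machines, which is where the scale comes from.

```python
# Hypothetical data: two "tables" held as plain key-value maps,
# standing in for a hash-table style store with no join support.
users = {1: {"name": "alice"}, 2: {"name": "bob"}}
orders = {
    101: {"user_id": 1, "total": 30},
    102: {"user_id": 1, "total": 12},
    103: {"user_id": 2, "total": 7},
}


def total_by_user_name(users, orders):
    """Application-side join: scan every order, look up its user by key."""
    totals = {}
    for order in orders.values():
        name = users[order["user_id"]]["name"]
        totals[name] = totals.get(name, 0) + order["total"]
    return totals


result = total_by_user_name(users, orders)
print(result)
```

Every new analytical question means another hand-written loop like this one, which is roughly the "trading power for scale" bargain.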
That said, do not take this as an endorsement of traditional SQL relational databases either, as they have a number of serious limitations themselves. As just mentioned, a number of the core analytical operations those models support are based on algorithms that scale poorly. The SQL language itself has mediocre support for many abstract data types (e.g. spatial) and data models (e.g. graph), which in part reflects the inadequacies of the assumed underlying database algorithms (e.g. B-trees) that are implicit in SQL. The inability to efficiently do event-driven/real-time applications is also more a reflection of the access methods used in databases than any intrinsic weakness in SQL; SQL may be clunky for that purpose, but that is not the real limiter.
A truly revolutionary deviation from SQL would usefully implement a superset of the features SQL supports, not take them away. Of course, we would need access methods more capable than hash tables and B-trees to usefully implement those features, which is a lot more work than discarding features that scale poorly. NoSQL is a stopgap technical measure for that small subset of applications where the serious tradeoffs are acceptable.
Note that most of these solutions come from the interwebs, social networks, etc. And it isn't so much anti-sql as it is anti-relational database (sql != rdb).
The basic premise is that we need different solutions that: can scale very high for very narrowly scoped reads & writes, don't need to perform ranged queries / reporting /etc, and don't need ACID compliance. And that may be the case. Sites like slashdot, facebook, reddit, digg, etc don't need the data quality that ebay needs.
On the other hand, ebay achieves scalability AND data quality with relational databases. And when I've worked with architectures that scale massively and avoid the relational trap for better solutions - they inevitably later regret the lack of data quality and complete inability to actually get trends and analysis of their data. It *always* goes like this:
Me: So, is this thing (msg type, etc) increasing?
Developer: No idea.
Me: Ok, so let's find out.
Developer: How?
Me: I don't know - typical approach - let's query the database.
Developer: It'll take four+ hours to write & test that query and then days to run. And when it's done we might find that we wrote the query wrong.
Me: What?!?
Developer: We had to do it this way, you can't report on 10TB databases anyhow
Me: What?!? Are you on crack? there are dozens of *100TB* relational databases out there that people are reporting on
Developer: well, we probably don't need to know what that trend is anyhow
Me: I'm outta here
Check out "Window Aggregates" etc in Oracle and PostgreSQL 8.4
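For anyone who hasn't seen window aggregates, here is a rough sketch of the "is this msg type increasing?" question. It uses Python's bundled SQLite (which needs SQLite 3.25 or later for window functions) rather than Oracle or PostgreSQL, and the `messages` table and its numbers are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: one row per day per message type, with a count.
conn.execute("CREATE TABLE messages (day TEXT, msg_type TEXT, n INTEGER)")
conn.executemany(
    "INSERT INTO messages VALUES (?, ?, ?)",
    [("2009-06-01", "a", 10), ("2009-06-02", "a", 15), ("2009-06-03", "a", 25)],
)

# LAG gives the day-over-day change; SUM ... OVER gives a running total.
rows = conn.execute(
    """
    SELECT day,
           n,
           n - LAG(n) OVER (PARTITION BY msg_type ORDER BY day) AS change,
           SUM(n)     OVER (PARTITION BY msg_type ORDER BY day) AS running_total
    FROM messages
    ORDER BY day
    """
).fetchall()

for row in rows:
    print(row)
```

The `change` column is NULL for the first day because there is no previous row to compare against.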
Over-the-top Response Guy! Giving "Over-the-Top Responses" since 1970.
First: my mantra: Data belongs to the organization, not the application... if the app fails and data is accessible then we all go on - if the data fails or is locked away - what was the point of the app again?
In a SQL database, the data is understood by the organisation, DBAs and data architects. If it is left to app developers taking an app-centric approach to data... I get nervous quickly.
So long as the data is just as definable and accessible as current SQL databases then all good - give me an app with some odd-ball storage then it is bye-bye.
Epic Fail. You're wrong. It in no way results in a "Cartesian Product". That would be a "Cross Join", not an "inner join". From my experience, people who complain about SQL and relational databases are, for the most part, ignorant. They really don't even understand what they are saying or what they are talking about. I've seen so much abuse and misunderstanding of relational data and SQL in my career that I just have to laugh at this sort of thing.
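If anyone wants to see the difference for themselves, here is a small sketch using Python's bundled SQLite. The `authors` and `posts` tables are invented for illustration; the point is only the row counts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: three authors, three posts, one author with no posts.
conn.executescript(
    """
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO authors VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
    INSERT INTO posts VALUES (10, 1, 'first'), (11, 1, 'second'), (12, 2, 'third');
    """
)

# CROSS JOIN pairs every author with every post: the Cartesian product.
cross_count = conn.execute(
    "SELECT COUNT(*) FROM authors CROSS JOIN posts"
).fetchone()[0]

# INNER JOIN keeps only the pairs that satisfy the join condition.
inner_count = conn.execute(
    "SELECT COUNT(*) FROM authors "
    "INNER JOIN posts ON posts.author_id = authors.id"
).fetchone()[0]

print(cross_count, inner_count)
```

With 3 authors and 3 posts the cross join returns 9 rows, while the inner join returns only the 3 rows where a post actually belongs to an author.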
Over-the-top Response Guy! Giving "Over-the-Top Responses" since 1970.
That is one view. It's nice and all, but incomplete. The issue is performance.
Any time you're dealing with a large quantity of data, it's always easiest to process or filter where it's located. Transmitting it, processing it, and transmitting back changes adds an unreasonable amount of overhead. Hence, SQL is a "Query" language. In other words, you have the RDBMS do reasonable data processing and filtering of records for you. Your application should only need to specify the operations performed, and should only process data if your computation is particularly unusual. This makes feasible computations that would otherwise be entirely unreasonable. (note that an application working on the same machine generally has the same issue as one working on a separate system. SQL servers present the application with a stream of data - pipe, socket, etc)
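A small sketch of the difference, using Python's bundled SQLite and an invented `readings` table. With an in-memory database there is no real network cost, so this only illustrates the shape of the two approaches: the first ships every row to the application, the second ships only the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table of sensor readings.
conn.execute("CREATE TABLE readings (sensor TEXT, value REAL)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [("s1", 1.0), ("s1", 3.0), ("s2", 10.0), ("s2", 20.0)],
)

# Approach 1: pull every row into the application and average there.
all_rows = conn.execute("SELECT sensor, value FROM readings").fetchall()
sums = {}
for sensor, value in all_rows:
    total, count = sums.get(sensor, (0.0, 0))
    sums[sensor] = (total + value, count + 1)
app_avg = {sensor: total / count for sensor, (total, count) in sums.items()}

# Approach 2: describe the computation and let the database do it.
db_avg = dict(
    conn.execute("SELECT sensor, AVG(value) FROM readings GROUP BY sensor")
)

print(len(all_rows), "rows transferred vs", len(db_avg))
```

Both give the same averages, but the second transfers one row per sensor instead of one row per reading, which is the whole argument once the table has billions of rows.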
My opinion: SQL is horrendous. It's a pain to use, and many basic data transforms cannot be described in that language (at least without some huge, awful, convoluted command == maintenance nightmare).
I won't join Slashcott. OTOH, If Beta goes live, I just won't be back until it's fixed. Sorry Dice.
So a bunch of Excel users got together for dinner in San Francisco - why is this news?
#DeleteChrome
...What democracy is to methods of government.
The worst ever devised excepting everything else that has ever been tried.
Any insufficiently advanced magic is indistinguishable from technology.
Use the appropriate tool. Always. There are tons.
Don't use a relational database to try to represent hierarchical data. Don't try to use LDAP to do analytics. Think of the performance implications before you have more than two users accessing your system. Data storage is a very different animal: you are often (though not always) I/O bound. This is very different from being limited by the number of instructions you can deal with per unit of time. Don't think otherwise because it will bite you in the ass.
And still I see people making the same stupid mistakes over and over. But it's pretty simple really:
A solution designed to be generic will ALWAYS be slower than a solution that is customized. This shouldn't be surprising. If you have serious performance requirements (ESPECIALLY if they are coupled with huge amounts of data) then a custom solution is definitely something you should look into. At some point you will run into a brick wall and find out that there is stuff you can't do with the solution you have in place. This is natural. Custom solutions to hard problems always lead to restrictions in terms of future features. Always. You will NEVER be able to anticipate all features that you would like to have. (Yes, this is true for Google as well. No they don't have any special kind of magic dust that they sprinkle on their things there, they do the best they can and then they get bitten in the ass too, just like everybody else.)
I've had a wonderful time, but this wasn't it -- Groucho Marx
See, I don't think there is ever a good time or place for SQL.
SELECT text FROM mild_introductory_statements WHERE id=random();
Anyone who says so has never had to use it.
SELECT text FROM statements_indicating_superior_experience WHERE id=random();
I like to compare it with JavaScript.
SELECT text FROM unrelated_tool WHERE id=random();
It's a language that is difficult to refactor, maintain, and while it's a standard, the standard is so vague that it's useless.
SELECT text FROM seemingly_valid_yet_unsubstantiated_objections WHERE id=random();
Like JavaScript, people are trying to build other languages on top of it to hide its shortcomings -- for javascript you have tools like GWT, and for SQL you have HQL, Linq, etc.
SELECT text FROM wrongheaded_causal_analysis WHERE id=same_one_as_two_queries_ago();
Not to say that there is anything wrong with relational databases, we just lack a good tool to interface with them.
SELECT text FROM reasonable_sounding_parthian_shot_to_obscure_trolling WHERE id=random();
Get thee glass eyes, and, like a scurvy politician, seem to see things thou dost not.--King Lear
Trash SQL in favour of coding all your data access needs. Welcome back to 1973, guys.
It's not like we could do parallel SQL in the 1980's. Or that you can't do parallel SQL in a compute cloud today.
No, it basically seems like they don't want to pay software vendors any money for database technology. That's mostly what the arguments boil down to. Oracle RAC is very scalable, arguably easier to do at massive scale than MySQL - but you have to pay Oracle money. For an Internet startup, I can understand why you'd take your chances with "roll your own". For an enterprise... I think not.
-Stu
You can use SQL with flat files.
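One way to do it, sketched in Python with the standard `csv` and `sqlite3` modules: load the flat file into an in-memory SQLite table and query that. This copies the data rather than querying the file in place, and the file contents and `scores` table here are made up for illustration (an `io.StringIO` stands in for a real file on disk).

```python
import csv
import io
import sqlite3

# Stand-in for a flat text file on disk; the contents are hypothetical.
flat_file = io.StringIO("name,score\nalice,90\nbob,72\ncarol,85\n")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (name TEXT, score INTEGER)")

# Read the CSV rows and convert the score column to an integer on the way in.
reader = csv.DictReader(flat_file)
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    ((row["name"], int(row["score"])) for row in reader),
)

high_scores = conn.execute(
    "SELECT name FROM scores WHERE score > 80 ORDER BY score DESC"
).fetchall()
print(high_scores)
```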
SQL is going to be around for a long time, because it's useful as an "API" - as a protocol or layer of abstraction.
Programmers can write all sorts of programs in all sorts of programming languages and then use SQL to talk to the DB. If the DB changes a bit, they can often use the same SQL or modify it slightly.
You often see lots of grumbling and cursing in various companies because people actually end up doing that and companies end up with lots of stuff hooked to the DB - MS Access, perl, python, ruby, java, radius servers, openvpn, accounting and finance stuff...
They grumble, but the fact is the database is being used. The data has become more useful.
If you have your database locked up behind some new fangled protocol that only 20 people in the world know, it's not going to be as easy to do that - and often each bunch will start creating their own databases and you end up with a different mess, and a mess that's not as useful.
Having everyone use SQL to talk to the DB is not actually a bug it's a feature.
One man's impedance mismatch is another man's layer of abstraction.
How many Googles or Yahoos are there? Like, 5. Let them do whatever broken things they want -- it works for them... for now. It's still expensive, probably just as much as "big iron". Not to mention the countless engineer hours and hosting/electricity costs for their "scale out" systems. It's what happens when you let a bunch of ivory tower PhDs solve real engineering problems.
In the end, the rest of us serious enterprise engineers will allow Oracle, Microsoft, and the people who have been doing this for 30 years to optimize their code to run on multicore mainframes ... which is where massive computing belongs. Then we query it with a few lines of SQL instead of convoluted algorithms in some "Map Reduce" environment, and we move on with our lives.
SQL syntax sucks, is inconsistent, and just non-standard enough at its corners that it's completely annoying to write anything for more than one DB. Also lacks various features which logically _should_ be there, because of the relational back-end. SQL is a toy, and though I'm the guy everyone in the office turns to if they want to write a query that does more than SELECT * FROM sometable, that doesn't mean I have to like it.
But that's not the fault of relational databases. The relational logic makes sense, and we'll be seeing it referenced in countless "new ideas" that come along for years, just as ideas which Lisp already had in 1970 will be touted as new features for the next millennium (you hear? PHP can do Lambda functions as of yesterday!)
SQL sucks, but SQL is NOT what makes something relational.
-- 'The' Lord and Master Bitman On High, Master Of All
While you're here, can you fill in the following form.
I would like my pony to be:
[ ] female
[ ] male, entire
[ ] male, gelded
with coat colour:
[ ] white
[ ] black
[ ] brown
[ ] piebald
[ ] other, please specify _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
Confucius say, "Find worm in apple - bad. Find half a worm - worse."
Derby 10.5, meanwhile, still has a tiny footprint, and can do most if not all of the SQL you will ever want for a typical Java application, along with features like the ability to do live backups, live table compaction from within the application while running, and now at last the ability to do cursoring in SELECT statements. Installation and configuration are simple.
I actually think that the actual problem is that we old C programmers actually learned programming and data structures, and as a result know a lot about the kind of problems for which SQL is well suited, while a lot of modern programmers learn a lot of theory about OO, but don't actually learn to program. Therefore, they have to try to reinvent wheels that were in fact designed in the 70s, and have no idea of what tools are available and how they map onto typical real-world application level problems.
From scarped cliff or quarried stone she cries "A thousand types are gone, I care for nothing, no not one."
SELECT text FROM thank_you_for_sharing_your_views_but_you_have_not_seen_the_schema_my_friend
UNION
SELECT text FROM same_goes_for_point_two__if_you_lack_the_source_code_what_then_do_you_really_know
UNION
SELECT text FROM besides_it_got_plus_five_funny_so_neener_neener_neener;
Get thee glass eyes, and, like a scurvy politician, seem to see things thou dost not.--King Lear