SQL and NoSQL are Two Sides of the Same Coin
An anonymous reader writes "NoSQL databases have become a hot topic with their promise to solve the problem of distilling valuable information and business insight from big data in a scalable and programmer-friendly way. Microsoft researchers Erik Meijer and Gavin Bierman ... present a mathematical model and standardized query language that could be used to unify SQL and NoSQL data models."
Unify is not quite correct; the article shows that relational SQL and key-value NoSQL models are mathematically dual, and provides a monadic query language for what they have coined coSQL.
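To make "dual" a bit more concrete: as I read the article, in the relational model child rows point at their parent through a foreign key, while in the key-value model the parent holds (points at) its children. Here is a loose Python sketch of reversing those arrows in both directions. The table and field names are my own invention for illustration, and this is only an informal picture, not the formal construction from the paper.

```python
# Relational style: each child row points at its parent through a foreign key.
books = [{"id": 1, "title": "The Right Stuff"}]
keywords = [
    {"book_id": 1, "keyword": "Book"},
    {"book_id": 1, "keyword": "Hardcover"},
    {"book_id": 1, "keyword": "American"},
]


def to_key_value(books, keywords):
    """Reverse the arrows: each parent value now holds its children."""
    store = {}
    for book in books:
        store[book["id"]] = {"title": book["title"], "keywords": []}
    for row in keywords:
        store[row["book_id"]]["keywords"].append(row["keyword"])
    return store


def to_relational(store):
    """Reverse them back: children point at their parent again."""
    out_books, out_keywords = [], []
    for key, value in store.items():
        out_books.append({"id": key, "title": value["title"]})
        for keyword in value["keywords"]:
            out_keywords.append({"book_id": key, "keyword": keyword})
    return out_books, out_keywords


store = to_key_value(books, keywords)
round_trip = to_relational(store)
```

Going one way and then the other gets you back the rows you started with, which is the informal sense in which neither shape holds more information than the other.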
microsoft research rocks but the product division usually sucks !
Jehovah be praised, Oracle was not selected
coSQLInjection (cSQLI)
Has a nice ring to it.
An inverse tachyon pulse would disperse the relational quantum silica into a focused warp field, thus purging all forms of slipstream space based SQL databases from subspace.
Resistance is futile. Your technological distinctiveness will be added to our own. You will become one with the morgue
...is that SQL sucks as a language. It's not terribly expressive, the ordering of arguments is inconsistent, and whoever designed the way JOIN works should be in jail.
Frankly, I'd like to see SQL die and get replaced with something more modern. We don't program in Cobol anymore, so why the hell are we still using SQL?
There's no -1 for "I don't get it."
No, but when you are working with objects it is cumbersome to have to write into an unchecked string.
Nothing new in computer engineering since 1980. Prove me wrong.
Epic M$ hate bro. There's a new web board forming called "Slash-dot" and I hear they're going to open up a comment section. You should check it out, I think you'd do well there. BTW, do you think there's anything to this "Y2K" problem or what? Either way I'm going to "party like its 1999"! :)
Or those developers never heard of a reporting database (for lack of a better term) or a data cube. I remember having a reporting database server. The hardware was even set up differently. The reporting server was mostly reads, while the transactional database system was a hell of a lot more writes. We had different RAID systems on each server: one for better write performance, the other favoring reads. Every night the last day's data was copied to the reporting system. All the data cubes were updated. All the reporting tables were also updated. The reporting tables often summed up the data to make it easier for the marketing people to understand.
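A rough sketch of that nightly copy-and-summarize step, using two in-memory SQLite databases to stand in for the transactional and reporting servers. The table names, columns, and sample rows are all made up for illustration; a real setup would have separate machines and a proper ETL tool.

```python
import sqlite3

# Stand-in for the write-heavy transactional system (hypothetical schema).
oltp = sqlite3.connect(":memory:")
oltp.executescript("""
CREATE TABLE sales (sold_on TEXT, product TEXT, amount REAL);
INSERT INTO sales VALUES
  ('2011-04-06', 'widget', 5.0),
  ('2011-04-06', 'widget', 7.0),
  ('2011-04-06', 'gadget', 3.0),
  ('2011-04-07', 'widget', 9.0);
""")

# Stand-in for the read-heavy reporting system, holding pre-summed rows.
reporting = sqlite3.connect(":memory:")
reporting.execute(
    "CREATE TABLE daily_sales ("
    "sold_on TEXT, product TEXT, total REAL, "
    "PRIMARY KEY (sold_on, product))"
)


def nightly_load(day):
    """Summarize one day's sales and copy the totals to the reporting side."""
    rows = oltp.execute(
        "SELECT sold_on, product, SUM(amount) FROM sales "
        "WHERE sold_on = ? GROUP BY sold_on, product",
        (day,),
    ).fetchall()
    with reporting:  # one transaction; re-running the load is harmless
        reporting.executemany(
            "INSERT OR REPLACE INTO daily_sales VALUES (?, ?, ?)", rows
        )


nightly_load("2011-04-06")
summary = reporting.execute(
    "SELECT product, total FROM daily_sales ORDER BY product"
).fetchall()
```

The marketing people then only ever query `daily_sales`, never the raw transaction table.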
Think about it: SQL is proven to be sufficient to store any relationships between data (though it may be ugly). SQL can be captured onto a spinning disk by using one of those fancy DB engines. noSQL gets rid of the fancy DB engine, so why is it surprising that SQL and noSQL can be shown to be equivalent? And coSQL then amounts to adding an interface layer that can be translated down to either representation. Jeeesh -- what passes for interesting these days!
They don't need the 3 e's any more - they just make up bullshit like this, spam it everywhere until people start to use it, then surface their submarine patents.
would be NoConsistency. Oh wait, they have eventual consistency... pretty good for twitter where no one cares to get stuck in a partition missing that gorgeous (fake) blond posting that she's cleaning her teeth.
Greenplum supports both SQL and MapReduce (NoSQL) native.
no but when you are working with objects it is cumbersome to have to write into unchecked string.
Then you're doing it wrong.
Truth is like the sun. You can shut it out for a time, but it ain't goin' away. - Elvis Presley (source: imdb.com)
I really like NoSQL. It's a great tool to use when deciding whether I should hire a given software developer, or whether I should move on to the next candidate. All I have to do is ask the person what he thinks about NoSQL. If he gives a positive response, I send him on his way. If he points out its many flaws, I'm often tempted to hire him instantly. After all, those who dislike NoSQL the most generally know how to write good SQL queries, and they know how to use relational databases properly. They're the kind of people I want to hire, even if the position doesn't involve databases much. It just goes to show that they care about quality, that they care about knowing how to use their technology well, and that they care about doing the job properly.
"Relational SQL and key-value NoSQL models are mathematically dual, and provides a monadic query language for what they have coined coSQL."
I'm glad somebody clarified that! (Time to RTFA.)
main() {1;}
There are only 2 types of languages:
- those people bitch about, and
- those no one ever used
An SQL statement walks into a bar. He sees 2 tables and asks "May I join you?"
When you have to support legacy data and applications any way you do it, you are doing it wrongly.
I have seen the pipe used as a field marker since adding a column was too costly at the time. Try your fancy ORM for a join on this.
The quantity is overwhelming and also conveniently broken into an overwhelming flood of users with relatively trivial needs, localized data access patterns, and little need for synchronization. These consumer mass-media aspects are what allow the embarrassingly parallel solution that is "scale out".
If I am running a NoSQL solution it is because I need every bit of speed I can muster. Putting an additional layer on top of that does nothing to reach that goal.
Got Code?
I thought the whole reason why NoSQL is "better" than SQL is that it's based on column-based storage, while most SQL databases use row-based storage. Couldn't you make a column-based database that uses SQL as a query language? There is nothing wrong with SQL as a language; there are just some workloads where column-based storage is faster (mostly data analytics).
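For what it's worth, storage layout and query language are separate choices, and column-oriented engines that speak SQL do exist. A toy Python sketch of the two layouts, with made-up rows, shows why an analytic query like "sum one column" only needs to touch a single list in the columnar form:

```python
# Row-oriented layout: one record per entry, all fields stored together.
rows = [
    {"id": 1, "region": "east", "amount": 10.0},
    {"id": 2, "region": "west", "amount": 25.5},
    {"id": 3, "region": "east", "amount": 4.5},
]


def to_columns(rows):
    """Pivot the same data into one list per column."""
    columns = {name: [] for name in rows[0]}
    for row in rows:
        for name, value in row.items():
            columns[name].append(value)
    return columns


columns = to_columns(rows)

# Same answer either way; the columnar version reads only the 'amount' list,
# while the row version has to walk every whole record.
total_from_rows = sum(row["amount"] for row in rows)
total_from_columns = sum(columns["amount"])
```

Nothing here says anything about which query language sits on top; the same SUM could be asked in SQL against either layout.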
+1
Weary, or wary? (or maybe both?)
:-) The team is actually in a product group and we are doing this for real :-))
Maybe I'm just being a nit-picky coder here, but I don't get why they call it noSQL when they are really referring to moving away from relational databases?
When I first heard of "NoSQL", I thought, "Great! SQL is a terrible syntax with all its six-letter words and easy, dangerous mistakes. I would love to have a superior syntax for interacting with the relational databases that are central to my work!" But "NoSQL" should be called "NoRelational." It is kind of strange that you are changing the whole paradigm of the database around and you are describing it as changing a superficial feature. It would be like calling emails "no pen" writing.
Democracy Now! - your daily, uncensored, corporate-free
Rich!
See my sig. There are some parts of SQL that bug the snot out of me.
Every time I start to have faith in humanity, I ruin it by driving to work between 7 and 8 am.
"Microsoft researchers Erik Meijer and Gavin Bierman ... present a mathematical model and standardized query language that could be used to unify SQL and NoSQL data models."
Maybe we should add...
"in an [impossible] exploratory effort to embrace, extend and extinguish..."
at the end of the sentence as they have done in the past.
TFA
CONCLUSION
The nascent noSQL market is extremely fragmented, with many competing vendors and technologies. Programming, deploying, and managing noSQL solutions requires specialized and low-level knowledge that does not easily carry over from one vendor's product to another.
Notes:
1. nothing to embrace in this phase... actually too many to embrace, thus not yet a "standard"
2. can't stop to note that all the examples are LINQ-based. Is this an attempt to grow LINQ in a "standard"?
Questions raise, answers kill. Raise questions to stay alive.
SQL and noSQL have so little in common other than the letters S Q and L that their paper is just nonsense.
Dressing nonsense in math doesn't make it any more respectable than otherwise -- funny? As in the following? Probably.
I read this so long ago in print, I thought I may not find it on the web, but here it is. Enjoy gobbledygook shrouded in math (the first of the many URLs I found).
http://komplexify.com/epsilon/2009/02/06/imperturbability-of-elevator-operators/
MS Research could mainly be a tax shelter and may, as a side effect, occasionally do some good stuff.
That's a very interesting article, and I'm going to have to look up the research and read it a lot more carefully. But I'm worried that a lot of their analysis just assumes too strongly that relational model = SQL.
For example, their claim that SQL is "not compositional." They define "compositionality" like this:
What we observe here is SQL's lack of compositionality—the ability arbitrarily to combine complex values from simpler values without falling outside the system.
Leaving aside that "compositional" is an odd word to use for this, the first problem here is that the relational model is in fact agnostic about this so-called "compositionality" of column's value's types. The relational model, strictly speaking, doesn't forbid you from having composite-typed columns.
One proposed purely relational solution to the problems tackled by outer joins is to allow non-base columns to have relations (i.e., sets) as their values. To put it in more SQL-like terms, you could have queries whose result sets had columns whose values were themselves multi-row result sets. This sort of thing solves the Figure 4 problem from TFA: you would have one row in the result, with Title="The Right Stuff" and Keywords={"Book", "Hardcover", "American"} (a set-valued Keywords column in the result). We can even sketch a SQL-like query for this (not actually valid SQL):
Or this, with a fictional "SET" aggregate function: (again, not actually valid SQL):
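The nesting itself is easy to state outside SQL. A minimal Python sketch, assuming the flat join result arrives as (title, keyword) pairs; the function name `nest` and the input shape are my own choices for illustration:

```python
from collections import defaultdict

# Flat join result: one row per (title, keyword) pair.
rows = [
    ("The Right Stuff", "Book"),
    ("The Right Stuff", "Hardcover"),
    ("The Right Stuff", "American"),
]


def nest(rows):
    """Collapse the flat rows into one row per title with a set-valued column."""
    grouped = defaultdict(set)
    for title, keyword in rows:
        grouped[title].add(keyword)
    return [{"Title": title, "Keywords": kws} for title, kws in grouped.items()]


result = nest(rows)
```

The result has a single row for the book, and its Keywords value is a genuine set rather than three repeated rows.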
Are you adequate?
Per my subject-line above, ISAM database methods were in use before SQL, & used COBOL as an accessing language to said data (along with other programming languages as well):
"Actually COBOL predates SQL by about 10 years" - by garyebickford (222422) on Thursday April 07, @06:12PM (#35751050)
Indexed Sequential Access Method was in use with COBOL before SQL ever was...
APK
P.S.=> Just some "FYI", but I am subject-to correction too, so IF anyone knows differently, let me know... apk
Could I pick your brain since you have a bit of NoSQL experience?
How does indexing work in NoSQL? Are there EXPLAIN-type tools available? (EXPLAIN in MySQL tells you whether your query is using indexes or table scans, and can help you understand why your query is slow.)
I'm pretty flexible with SQL. Can you do just about any query you could with SQL? ("Find all customers who have bought at least $100 of stuff over the last year, but who haven't bought anything this year.")
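For reference, here is roughly what that example query looks like on the SQL side, run against SQLite from Python. The `orders` table, its columns, and the sample rows are hypothetical, and I am reading "last year" and "this year" as calendar years passed in as parameters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (customer TEXT, placed TEXT, amount REAL);
INSERT INTO orders VALUES
  ('alice', '2010-03-01', 80.0),
  ('alice', '2010-09-15', 40.0),
  ('bob',   '2010-05-20', 150.0),
  ('bob',   '2011-02-02', 10.0),
  ('carol', '2010-07-07', 30.0);
""")

# Customers with at least $100 of orders last year and no orders this year.
QUERY = """
SELECT customer
FROM orders
WHERE placed >= :last_start
  AND placed < :this_start
  AND customer NOT IN (
      SELECT customer FROM orders WHERE placed >= :this_start
  )
GROUP BY customer
HAVING SUM(amount) >= 100
ORDER BY customer
"""

big_lapsed = [
    row[0]
    for row in conn.execute(
        QUERY, {"last_start": "2010-01-01", "this_start": "2011-01-01"}
    )
]
```

Any NoSQL answer to the question has to reproduce both halves: the grouped sum with a threshold, and the "has no matching rows" exclusion.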
I'm not a lawyer, but I play one on the Internet.
I'm thinking about wading into the noSQL waters. Help me out:
If authors aren't normalized, does that basically mean you don't have a separate datastore (table, whatever) for authors? E.g., a publisher might want to keep track of author name, address, etc.
Here's another classic example: country codes vs. country names: (ca, Canada), (us, United States).
If you want to be able to use both, you would (classically) store "ca" in your User table (for what country he lives in), and then have a separate Countries table that tells you what "ca" stands for.
How do you approach that in NoSQL (assuming you want to make use of both codes and full country names)?
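Two common approaches, sketched with plain Python dicts standing in for a document store (no particular product's API is implied, and the user record is made up): either keep a small lookup collection and resolve the code in application code, or copy the name into each user document and accept the duplication.

```python
# Lookup collection, playing the role of the classic Countries table.
countries = {"ca": "Canada", "us": "United States"}

# Option 1: store only the code and resolve it when needed.
users_by_reference = {"u1": {"name": "Ann", "country": "ca"}}


def country_name(user):
    """Application-side 'join': look the code up in the countries collection."""
    return countries[user["country"]]


# Option 2: embed both code and name in the user document (denormalized).
def embed_country(user):
    """Return a copy of the user document with the country expanded in place."""
    doc = dict(user)
    doc["country"] = {"code": user["country"], "name": countries[user["country"]]}
    return doc


embedded = embed_country(users_by_reference["u1"])
```

Option 2 makes reads a single fetch but means a renamed country has to be rewritten in every user document; option 1 keeps one copy at the cost of a second lookup.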
For people who have worked with NoSQL (assuming you've worked extensively with SQL before):
1. For someone wanting to either scratch an itch, or come up with the Next Great Thing, would you recommend NoSQL-type solutions to do the standard save data coming in over the web, and later retrieve it, possibly rejigger and summarize it, and feed it back over the web when a user needs it thing?
2. Is NoSQL generally considered faster than SQL equivalents? At runtime or development time?
3. Is there a concept of DB design? Or is it just made up as you go? By doing additional .insert()'s?
I.e., you start off by .create(product). Then add fields: .addAttribute('name', 'Magic Rock') .addAttribute('manuf', 'Rock Emporium')
Then you add in detail for the manufacturer table: .addAttribute('name', 'Rock Emporium') .addAttribute('st', '123 Main St') .addAttribute('state', 'NY')
Is that how it works?
4. And can you change field names later (ALTER TABLE)?
5. What about aggregate functions (MAX, GROUP BY, HAVING)?
The whole thing seems awfully gooey.
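On questions 3 and 5, a hedged sketch of how it tends to look when documents are just schemaless records: each document carries whatever fields it has, and something like MAX / GROUP BY / HAVING becomes a small piece of application (or map/reduce-style) code. Plain Python lists and dicts stand in for the store here; the extra products and all prices are invented for illustration.

```python
from collections import defaultdict

# Schemaless documents: the third one has a field the others lack.
products = [
    {"name": "Magic Rock", "manuf": "Rock Emporium", "price": 12.0},
    {"name": "Plain Rock", "manuf": "Rock Emporium", "price": 3.0},
    {"name": "Pebble", "manuf": "Gravel Co", "price": 1.0, "colour": "grey"},
]


def max_price_by_manuf(docs, min_count=1):
    """Roughly: SELECT manuf, MAX(price) ... GROUP BY manuf HAVING COUNT(*) >= min_count."""
    groups = defaultdict(list)
    for doc in docs:
        groups[doc["manuf"]].append(doc["price"])  # the "map" / grouping step
    return {
        manuf: max(prices)  # the "reduce" / aggregate step
        for manuf, prices in groups.items()
        if len(prices) >= min_count  # the HAVING filter
    }


all_manufs = max_price_by_manuf(products)
busy_manufs = max_price_by_manuf(products, min_count=2)
```

So yes, it is as gooey as it sounds: the schema lives in the code that reads the documents rather than in the database.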
WTB Signed tie-dye
From what I have learned about the uses for and abilities of NoSQL, it's a compromise you make when affordable scalability is required to stay in business. It is nowhere near as powerful as the RDBMS/SQL combination; however, it is much cheaper to run. Don't believe anyone who tries to tell you there are things you can do with NoSQL that you can't do with SQL. That is complete bunk. Maybe it makes speed cheaper, and scaling easier, but those decisions should be forced by application demand and budget constraints, not application design. I am most interested in NoSQL as a way to store denormalized data in a pre-cache for light-write, heavy-read applications. Any other use would probably be due to desperation to scale to keep up with demand.
Having a bookmark to Google does not make you an expert on everything.
A "Reporting Database" huh.
Try "Data Warehouse"... Sounds like classic ETL to me.
dnuof eruc rof aixelsid
OLTP vs DSS. Yep. Normalize and De-Normalize based on purpose and performance. NoSQL is just another tool in the toolbox. If there were one single magic tool, they wouldn't keep inventing new ones.
Why? MongoDB is web scale, we don't need anything else!
"16MB (fuck off, MiB fascists)" - The Mighty Buzzard
Modern commercial RDBMS systems have extremely complex intelligence to manage execution cost including self optimization cost, concurrency, partitioning, escalation, versioning, distributions, index selection, data caches, auto parallelization..etc.
When I see people talking about alternatives, be it object stores, key/value, log, etc., I ask where the intelligence is. Where are the billions spent on R&D in these new systems? They all appear to be dumb imitations that always ask you to sacrifice something, be it consistency, concurrency, model restrictions, or for a human to exactly define semantics or access patterns to enable a specific solution.
Is there really a way to create a new **general purpose** data system at least as powerful and useful as the RDBMS without spending on the order of a billion dollars on optimizer design?
For some applications csv flat files run circles around the RDBMS... In the general case flat files suck.
I believe generally the same applies across the board. Yes, given a specific problem you can provide a specific solution that is faster, better, cheaper; however, the shortcut would not enjoy general applicability.
Reading the introduction is kind of bizarre. Apparently the motivation for the work is to reduce the NoSQL market to a few very profitable suppliers.
select if (understands=1, 'Proceed', 'Learn more') from programmer_knowledge where topic = 'ACID';
select 'Many NoSQL advocates don\'t get it. Yes it really is *fundamental* for many applications.' where 'Eventually consistent' not like 'Consistent';
select concat(name, ' is not a bank nor processes orders, trades, stock, enterprise data etc') from applications where type = 'Social Media';
select if (essential=1, 'Use relational DB', 'Consider NoSQL') from system_requirements where requirement = 'ACID';
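For anyone who lands in the 'Learn more' branch, here is a small sketch of the atomicity part of ACID, using SQLite from Python. The `accounts` table, the names, and the balances are hypothetical; the point is only that a transfer either fully happens or leaves no trace.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("a", 100), ("b", 0)])
conn.commit()


def transfer(src, dst, amount):
    """Move money in one transaction; any failure undoes both updates."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # the CHECK constraint refused an overdraft


def balances():
    return dict(conn.execute("SELECT name, balance FROM accounts"))


overdraft_ok = transfer("a", "b", 150)
after_overdraft = balances()
normal_ok = transfer("a", "b", 60)
after_normal = balances()
```

In the failed transfer the credit to "b" had already been applied when the debit was rejected, and the rollback takes it back out. That is the guarantee a bank cares about and a status feed mostly does not.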
Composite value types violate 1NF (as defined by some people in the field.) since they are isomorphic to relations.
*reads up*
Ah. This is the story of how every now and then the kids rediscover DBMs, that's it.
Religion is what happens when nature strikes and groupthink goes wrong.
Examples. Your post is worthless and will be considered bullshit until you give us actual examples.
SQL is widely used because it works very well. It is powerful, mature, and found everywhere. I'm a database programmer by profession, and I believe SQL is the perfect tool for the job. Of course it has limitations, but that's why you have procedural SQL (which is another great tool for its job).
Let me guess: the only experience you've had with SQL is database 101. I was bored in that class too, but when you're working in the real world, it becomes a lot more interesting. So please, stop spouting off like a teenager crying to his parents to buy a shiny new sports car, when the minivan suits the job perfectly because it was actually designed for the purpose.
"here, here" "hear, hear"
The first is an outing of belittling sarcasm, whilst the second is more like twisting your mustache approvingly but loudly.
Seeing as this is /., chances are he was posting sarcasm.
Hivemind harvest in progress..
I'm also loving this thread. My solution: stored methods.
Inserting data should (only!) be done via a method that sits inside the database. This method also writes a crosstable matching Client_Id with LatestTransaction_Id. Voila... and for any existing data it's just a one-time batch conversion. Doesn't get any faster. Also, there is NOTHING stored twice, and with proper stored methods the chance of this crosstable getting out of sync is zero.
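A minimal sketch of the idea using SQLite from Python. SQLite has no stored procedures, so I am substituting an AFTER INSERT trigger for the stored method; the effect is the same in that the crosstable is maintained inside the database rather than by the caller. Table and column names are my own guesses at what the schema might look like.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE transactions (
    id INTEGER PRIMARY KEY,
    client_id INTEGER NOT NULL,
    amount REAL NOT NULL
);
-- Crosstable: one row per client, pointing at that client's newest transaction.
CREATE TABLE latest_transaction (
    client_id INTEGER PRIMARY KEY,
    transaction_id INTEGER NOT NULL REFERENCES transactions(id)
);
-- Stand-in for the stored method: runs inside the database on every insert.
CREATE TRIGGER track_latest AFTER INSERT ON transactions
BEGIN
    INSERT OR REPLACE INTO latest_transaction (client_id, transaction_id)
    VALUES (NEW.client_id, NEW.id);
END;
""")


def add_transaction(client_id, amount):
    """Insert one transaction; the trigger updates the crosstable in the same transaction."""
    with conn:
        conn.execute(
            "INSERT INTO transactions (client_id, amount) VALUES (?, ?)",
            (client_id, amount),
        )


add_transaction(7, 10.0)
add_transaction(7, 20.0)
add_transaction(8, 5.0)
latest = dict(
    conn.execute("SELECT client_id, transaction_id FROM latest_transaction")
)
```

Looking up a client's latest transaction is then a primary-key read on `latest_transaction` instead of a MAX over the whole history.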
If you want data, store data. If you want information, extract / combine it from data.
Your sig has nothing to do with SQL. That alone tells me more about your skill and understanding than any other opinion you might hold.
2. can't stop to note that all the examples are LINQ-based. Is this an attempt to grow LINQ in a "standard"?
No, it's an attempt to promote understanding and usage of monads. LINQ is arguably the most widely used implementation of monads, it's just that many people don't realize it.
Brian Beckman's Don't fear the Monads
An excellent article explaining how LINQ is extensible to work with any monad
A video by Erik Meijer explaining the duality of IEnumerable/IObservable and IQueryable/IQbservable, as stated in the original article
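If the monad claim sounds abstract, here is the list version spelled out in Python. `bind` plays the role of LINQ's SelectMany, and the usual query operators fall out of it. The function names `unit`, `bind`, `select`, and `where` are mine, chosen to mirror the LINQ operators; this is a sketch of the idea, not how LINQ is implemented.

```python
def unit(value):
    """Wrap a single value in the monad (here: a one-element list)."""
    return [value]


def bind(items, fn):
    """Apply fn to each item and flatten the results (LINQ: SelectMany)."""
    return [result for item in items for result in fn(item)]


def select(items, fn):
    """Map, expressed through bind (LINQ: Select)."""
    return bind(items, lambda item: unit(fn(item)))


def where(items, predicate):
    """Filter, expressed through bind (LINQ: Where)."""
    return bind(items, lambda item: unit(item) if predicate(item) else [])


# Nested binds behave like a query with two 'from' clauses (a cross join).
pairs = bind([1, 2], lambda x: bind(["a", "b"], lambda y: unit((x, y))))

evens_squared = select(where(range(6), lambda n: n % 2 == 0), lambda n: n * n)
```

Swap the list for some other container with its own `unit` and `bind` and the same query shapes keep working, which is the extensibility the linked article is about.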
All my liberal friends think I'm a conservative, all my conservative friends think I'm a liberal.
SQL - Structured Query Language
Seeing that his sig is talking about the language (granted, it's a subset), I'd say it's your understanding.
Meijer and Bierman think in terms of modeling from an OO, hierarchical perspective. Midway through TFA they say "starting with a natural hierarchical object model". They've missed the whole point of the relational data model, which is proven to be more general than a hierarchical model. The first comment in TFA succinctly drives the point home:
"The relational model was shown by Codd to be capable of efficiently expressing any kind of knowledge that can be expressed with graphs. In fact, it was his analysis that paved the way for the near-extinction of the network and hierarchical models that prevailed at the time. "
If you've enough gray hair to have ever built a large OO database, you realize that schema migration is a nightmare that rapidly becomes intractable as the data volume increases.
The closed-world problems mentioned by Meijer and Bierman are real, and apply as much to a hierarchical data model as to a relational one. In the relational world, there is no such thing as a distributed foreign key, and in a hierarchical world, the vector may reference something in no-man's land as well.
Meijer and Bierman also focus on normalized data models and fail to mention other relational modeling techniques such as the star schema that is commonly used in data warehouses. They seemingly equate the SQL language, a normalized data model, and an RDBMS, while ignoring the R(elational) in RDBMS.
Problems such as tackling unstructured data are real issues and are not solved by either a hierarchical or a relational model.
In the late 70's IBM created a relational query language called BS-12 (Business System 12) that looks to be more flexible than SQL. Perhaps IBM felt it was too "mathy" to sell to executives and went with SQL instead.
There is also the "Tutorial D" family of query languages based around Chris Date's popular textbook. I personally don't like its syntax structure and don't think it fits well with dynamic typing, and thus created my own draft relational query language called SMEQL (LOR fans will like the name). It borrows from BS-12 and functional programming.
Here's an example that returns the top 6 earners in each department
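To show the shape of that query without relying on any particular query language, here is a plain Python sketch of "top N earners per department". This is not SMEQL syntax, and the employee records are invented for illustration.

```python
from collections import defaultdict

# Hypothetical employee records.
employees = [
    {"name": "Ann", "dept": "eng", "salary": 120},
    {"name": "Bob", "dept": "eng", "salary": 100},
    {"name": "Cid", "dept": "eng", "salary": 90},
    {"name": "Dee", "dept": "ops", "salary": 80},
    {"name": "Eli", "dept": "ops", "salary": 85},
]


def top_earners(rows, n):
    """Group by department, sort each group by salary descending, keep n names."""
    by_dept = defaultdict(list)
    for row in rows:
        by_dept[row["dept"]].append(row)
    return {
        dept: [
            r["name"]
            for r in sorted(members, key=lambda r: r["salary"], reverse=True)[:n]
        ]
        for dept, members in by_dept.items()
    }


top_two = top_earners(employees, 2)
```

The steps are group, order within each group, then cut off at a sequence number, which is the kind of thing a per-group sequence column is for.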
And a brief guide to the primary operators:
* calc(table, columnTable) // similar to SELECT clause in SQL
* filter(table, expression) // similar to WHERE clause in SQL
* group(table, columnTable) // roughly similar to GROUP BY in SQL
* join(table_1, table_2, expression)
* leftJoin(table_1, table_2, expression)
* orderBy(table, columnTable, [sequenceColumn]) // sorts or produces sequence numbers
* union(table_1, table_2)
Table-ized A.I.
Citation needed
The Tao of math: The numbers you can count are not the real numbers.
No, it's an attempt to promote understanding and usage of monads. LINQ is arguably the most widely used implementation of monads, it's just that many people don't realize it.
Beg to differ. Emacs with heaps of lisp macros, XPath