The NoSQL Ecosystem
abartels writes 'Unprecedented data volumes are driving businesses to look at alternatives to the traditional relational database technology that has served us well for over thirty years. Collectively, these alternatives have become known as NoSQL databases. The fundamental problem is that relational databases cannot handle many modern workloads. There are three specific problem areas: scaling out to data sets like Digg's (3 TB for green badges) or Facebook's (50 TB for inbox search) or eBay's (2 PB overall); per-server performance; and rigid schema design.'
Microsoft Access is here!
So... every time I open my inbox in Facebook, it has to search through 50TB of data? That sounds like a design problem. What has always floored me is why people think everything needs to be stuffed into a database. Terabyte sized binary blobs? You know, there's a certain point where people need to stop and actually think about the implementation.
#fuckbeta #iamslashdot #dicemustdie
With regard to scalability, it strikes me that the problem isn't so much SQL but the fact that current SQL-based RDBMS implementations are optimized for smaller data sets.
The performance claims will probably be disputed by Oracle whizzes. However, the "rigid schema" claim bothers me. RDBMS can be built that have a very dynamic flavor to them. For example, treat each row as a map (associative array). Non-existent columns in any given row are treated as Null/empty instead of an error. Perhaps tables can also be created just by inserting a row into the (new) target table. No need for explicit schema management. Constraints, such as "required" or "number" can incrementally be added as the schema becomes solidified. We have dynamic app languages, so why not dynamic RDBMS also? Let's fiddle with and stretch RDBMS before outright tossing them. Maybe also overhaul or enhance SQL. It's a bit long in the tooth.
More at:
http://geocities.com/tablizer/dynrelat.htm
(And you thought geocities was de
Table-ized A.I.
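A minimal Python sketch of the "dynamic relational" idea from the comment above: rows are maps, missing columns read as null, tables appear on first insert, and constraints are added later. The class and method names (`DynamicTable`, `DynamicDatabase`, `add_constraint`) are hypothetical and come from no real product; this is an in-memory toy, not a proposal for how an engine would implement it.

```python
class DynamicTable:
    """Toy table whose rows are plain dicts (maps) with no fixed schema."""

    def __init__(self):
        self.rows = []
        self.constraints = {}  # column name -> predicate, added incrementally

    def add_constraint(self, column, predicate):
        # Constraints can be bolted on later, once the schema settles.
        self.constraints[column] = predicate

    def insert(self, **row):
        for column, predicate in self.constraints.items():
            if not predicate(row.get(column)):
                raise ValueError(f"constraint failed on column {column!r}")
        self.rows.append(row)

    def select(self, columns, where=lambda row: True):
        # Columns a row never defined come back as None instead of raising.
        return [{c: r.get(c) for c in columns} for r in self.rows if where(r)]


class DynamicDatabase:
    def __init__(self):
        self.tables = {}

    def insert(self, table, **row):
        # Inserting into an unknown table creates it on the fly.
        self.tables.setdefault(table, DynamicTable()).insert(**row)

    def select(self, table, columns, where=lambda row: True):
        # Selecting from an unknown table yields no rows rather than an error.
        return self.tables.get(table, DynamicTable()).select(columns, where)


db = DynamicDatabase()
db.insert("people", name="Ann", age=41)
db.insert("people", name="Bob")  # this row has no "age" column
result = db.select("people", ["name", "age"])
```

Here `result` holds both rows, with `None` standing in for Bob's missing age. A "required" constraint could be added afterwards with `db.tables["people"].add_constraint("name", lambda v: v is not None)`, after which an insert without a name is rejected.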
I think I've heard of non-relational databases before. There's a particularly famous one, in fact. What could it be? Let's see: first started shipping in 1969, now in its eleventh major version, JDBC and ODBC access, full XML support in and out, available with an optional paired transaction manager, extremely high performance, and holds a very large chunk of the world's financial information (among other things). It also ranks up there with Microsoft Windows as among the world's all-time highest grossing software products.
....You bet non-relational is still highly relevant and useful in many different roles. Different tools for different jobs and all.
I'm a huge PostgreSQL fan and took classes in formal database theory in college. I'm saying this as someone who understands and thoroughly appreciates relational databases: I'm starting to love schema-less systems. I've only been playing with CouchDB for a few weeks but can certainly see what such stores bring to the table. Specifically, a lot of the data I've stored over the years doesn't neatly map to a predefined tuple, and while one-to-one tables can go a long way toward addressing that, they're certainly not the most elegant or efficient or convenient representation of arbitrary data.
I'm certainly not going to stop using an RDBMS for most purposes, but neither am I going to waste a lot of time trying to shoehorn an everchanging blob into one. Each tool has its place and I'm excited to see what niche this ecosystem evolves to fill.
Dewey, what part of this looks like authorities should be involved?
In the olden days you didn't have centralized message stores. That's largely a relic of PC-based networking schemes like Novell, Lotus Notes and Exchange. The Unix model used individual mailboxes (in fact, the whole breakdown was for all of a user's data being in their own hierarchy). Obviously the Unix mailbox scheme wasn't that great as we started saving many megabytes of data, so you create indexed systems, but each user's mail is still effectively independent. I've used Pine to navigate my old mbox archives and it can move through even unindexed email at speeds that put bloated monsters like Exchange to shame.
Clearly the issue with scalability in general is simply one of optimization. If you're returning relatively small pieces of information, then an RDBMS is the way to go. If all your databases are basically blobs, well then it's probably not going to be that effective. I still feel that blobs are heavily abused.
I think part of the problem with RDBMSs is simply that a lot of people don't use them properly, and create the bottlenecks through bad design.
The world's burning. Moped Jesus spotted on I50. Details at 11.
There was a similar story on Slashdot a few months ago:
http://tech.slashdot.org/story/09/07/02/219247/Enthusiasts-Convene-To-Say-No-To-SQL-Hash-Out-New-DB-Breed
Table-ized A.I.
We didn't start with relational databases. RDBMSes were responses to the seductive but unmanageable navigational databases that preceded them. There were good reasons for moving to relational databases, and those reasons are still valid today.
Computer Science doesn't change because we're writing in Javascript now instead of PL/1.
That is indeed suspicious. But if they want to sell clouds, then make an RDBMS that *does* scale across cloud nodes instead of bashing SQL. (SQL as a language doesn't define implementation; that's one of its selling points.) It may be that since there's not one out yet, they instead hype the existing non-RDBMS that can span clouds.
(I agree that SQL could use some improvements, such as named sub-queries instead of massive deep nesting to make one big run-on statement. Some dialects already have this to some extent.)
Table-ized A.I.
Collectively, these alternatives have become known as NoSQL databases. The fundamental problem is that relational databases cannot handle many modern workloads.
I'm sceptical. Why is the problem worse now than in the past? Relational theory in practice is abstracting the data such that a human/application can understand it as logical constructs. How the data is PHYSICALLY organised is a matter of implementation - the relational theory doesn't place any constraint (!) on how the data is organised/retrieved/updated - except that by giving a broad design pattern, duplication is minimised, and so then is processing overhead. MPP (Massively Parallel Processing) lends itself quite neatly to any large set of data - many implementations will continue to scale linearly above the PB size (e.g. Teradata). Looks to me like a sales pitch.
I was an admin on a system that spread the data across 10 database servers. Each server had a complete set of some data, like accounts, but the system was designed so that ranges of accounts stored their transaction type data on a specific server, and each server held about the same number of accounts and transactions. As data came in, it was temporarily housed on the incoming server until a background process picked it up and moved it to the 'correct' one. This is a very simplistic view, but the reality was that it worked quite well. Occasionally, there was a re-balancing that had to be done. But it was very scalable. The incoming data wasn't so time sensitive that if it took a few hours to get moved, everything was still OK. When an 'online' session needed data, it knew which server to connect to to get it. Processing was done overnight on each server, then summarized and combined as needed.
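The layout described above (account ranges pinned to servers, with a staging area and a background mover) might be sketched roughly like this in Python. The class name, the range boundaries, and the use of dicts as stand-ins for database servers are all assumptions for illustration; the original system's details are not given.

```python
import bisect


class RangeShardedStore:
    """Routes each account to one of several servers by account-number range."""

    def __init__(self, upper_bounds):
        # upper_bounds[i] is the exclusive upper account id for server i (sorted).
        self.upper_bounds = upper_bounds
        self.servers = [dict() for _ in upper_bounds]  # stand-ins for DB servers
        self.incoming = []  # staging area on the "incoming" server

    def server_for(self, account_id):
        index = bisect.bisect_right(self.upper_bounds, account_id)
        if index >= len(self.servers):
            raise KeyError(f"no server owns account {account_id}")
        return index

    def receive(self, account_id, transaction):
        # New data lands in staging first; it is not yet on its home server.
        self.incoming.append((account_id, transaction))

    def run_mover(self):
        # Background job: move staged rows to the server that owns the range.
        while self.incoming:
            account_id, transaction = self.incoming.pop(0)
            home = self.servers[self.server_for(account_id)]
            home.setdefault(account_id, []).append(transaction)

    def transactions(self, account_id):
        # An online session knows exactly which server to ask.
        return self.servers[self.server_for(account_id)].get(account_id, [])


store = RangeShardedStore([1000, 2000, 3000])
store.receive(1500, "deposit 20")
store.receive(42, "withdraw 5")
before = store.transactions(1500)  # still in staging, so nothing yet
store.run_mover()
after = store.transactions(1500)
```

The delay between `receive` and `run_mover` mirrors the point that the incoming data was not time sensitive. Re-balancing would amount to changing `upper_bounds` and moving the affected accounts, which this sketch leaves out.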
...people have been coming up with innovative ways to solve these problems for a very long time.
So yes
And they will continue to do so.
I rarely read replies, it's my opinion and if you thought about your opinion a little more, I'm OK with that.
Let's not forget where the bottleneck is - the I/O. It's expensive but once you build a fast and solid storage system, correctly configure it and partition your data properly over a sufficiently large number of hard drives, RAIDs, LUNs etc., you might be able to use SQL. We run a database of 10TB on MS SQL with hundreds of millions of records with an equal rate of reads and writes and could not be happier.
That's very clever and all (and I'm sure quite effective), but it doesn't address the original issue: RDBMSs suck at scaling. We should be able to throw a rack of servers with a load balancer and a SAN at the problem and have it go away. We shouldn't have to rewrite our application logic to scale it out any more than we currently have to write special code because our hard drives are in RAID5 (read: not at all).
The storage engines and their indexing should take care of all of this nonsense automatically. You might have to help them out by being a bit more specific than key `user_id` (`user_id`) (your stock tickers are a good example), but fundamentally the code that helps scale out a database should be part of the database and not the application that's using it.
But, life isn't so kind to us. Oh well, maybe in time.
How are sites slashdotted when nobody reads TFAs?
Worse, sharding and other such solutions usually end up requiring the application to know way, way too much about the back end structure, how tables are split, where they are split, and so on.
And your solution to improving the storage engine doesn't help. At some point in a RDBMS you need to do joins and so forth, and that assumes that the machine doing the join is capable of doing so AND of handling the load and the number of transactions being tossed at it. Hence we start getting into clusters and other solutions that again need to be understood and managed.
The NoSQL solution lets you toss your request out to the "cloud" and get an answer without needing to know clusters, shards, tables, or really anything on the physical implementation side of the fence.
Any sect, cult, or religion will legislate its creed into law if it acquires the political power to do so.
Most RDBMS implementations on the web are generally only used to store data and perform very basic queries such as get and store operations. Personally I don't really see the issue of using one for a web application since they are proven to work well and with the right design and caching solution are more than capable of handling a popular website such as Digg or Facebook. The only real issue with these sites is that to prevent bottlenecks you would generally need to throw more hardware at it than may be necessary (although memory is very cheap these days so it's a non-issue for most companies).
Memcached has been shown to really help solve many performance issues for relational databases since the database won't constantly perform complex queries to grab data, it will just pull the result from a hashed index stored in memory. MemcacheDB http://memcachedb.org/memcachedb-guide-1.0.pdf is looking very promising to use to get rid of an RDBMS altogether for certain data such as user sessions since it focuses on performance rather than functionality. Even then I think it all really boils down to choosing the right tool for the job: if there's data that you know is going to be a performance bottleneck in the database, you look for more creative solutions to store and process that data. There's nothing stopping you from running two or more different types of databases for the task at hand.
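The caching pattern described above is usually called cache-aside. A minimal Python sketch follows, with a plain dict standing in for a memcached client and a lambda standing in for the expensive database query; `CacheAside`, `db_hits`, and the sample `profiles` data are hypothetical names used only for illustration.

```python
class CacheAside:
    """Check the cache first; only fall through to the database on a miss."""

    def __init__(self, slow_query):
        self.slow_query = slow_query  # function that hits the database
        self.cache = {}               # stand-in for a memcached client
        self.db_hits = 0              # counts how often the database is asked

    def get(self, key):
        if key in self.cache:
            return self.cache[key]
        self.db_hits += 1
        value = self.slow_query(key)
        self.cache[key] = value
        return value

    def invalidate(self, key):
        # Call this after a write so the next read refetches fresh data.
        self.cache.pop(key, None)


profiles = {"alice": {"karma": 50}}
cached = CacheAside(lambda key: profiles[key])
first = cached.get("alice")
second = cached.get("alice")  # served from the cache, no database hit
```

After the two reads, `cached.db_hits` is 1. The hard part in practice is invalidation on writes, which the sketch only gestures at with `invalidate`.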
I bet it can't find old messages at the speed I do with X1 + 10 years of Exchange-based email (more than 250,000 messages). I stuck with Pine through the end of the 90s, when everybody else I worked with was switching to Netscape Communicator. I wouldn't go back now though.
Hmm... Before 1979, market share for RDBMS was TINY. It really didn't begin to "serve us well" until the mid 80's.
Why is there an "insightful" mod and why isn't it "-1"? If I wanted insight, I wouldn't be reading
Wow a "object oriented" database discussion again. I've never read one of these :P I've only been doing this 15 years and I've lost count of these talks a long time ago.
What is the difference between schema less and schema rigid anyways? I don't see what that has anything to do with performance. The real issue is uptime and transaction support. People want to add a column or index without taking the system down. That is different than dealing with PBs of data. Most table structures can easily deal with that much data.
If you have a DB that is big you have lots of outs. Pay...get Enterprise version of whatever. Break it into many DB/tables and merge together. Archive. Archive I bet will get most people by. Does eBay really need all that bidding info for items over a few weeks old...only for analysis maybe. Move that old stale data out of the active heavily hit data tiers.
The fact remains that MySQL should be able to scale to TBs of data. The fact that it can't is a failure of the product. All the others have been for a while. Why can't it...I don't know...the fact that it uses a F'in different file for each index on a table. If you don't understand how old school that is start using Paradox. Just because it is open source doesn't mean it has to be so damn out of date. Please for the love of god save multiple tables/indexes in the same pre sized file...god.
Google has all the power to go and use something different. Google gets to cheat. Google is a collection of pretty static data. They scan the internet a lot, but imagine if every time you did a search Google had to scan every web page on the planet, index them, and then give you search results. That would be impractical for sure. So for now they just store big collections of blobs and a big fast index for searching keywords and links to pages. Impressive none the less, but it's not like your typical app. GMail is...funny that it is one system they've had problem with. Even then EMAIL DOESN'T CHANGE. It's user specific, but it's still f'in static. GoogleTastic if you ask me.
The fact is people are using RDBMS right now to solve real world problems. Some start up is finding a way to tweak MySQL to do something cool and then posting it on a blog...then all of a sudden RDBMS is dead. RDBMS is fine, it will be fine for at least 10 years if not longer. In that time it will evolve as well so that it will be around for even longer. MySQL in 5 years will have online index addition, performance hitless online column addition, partitioning, geo indexing, XML columns, BigASS table support, Oracle RAC like support, and a thousand other features that some RDBMSs have today and some will not see for even longer. Then developers that spent all that cash developing custom shit will revert and post comments like this one.
That's the way it goes in software development. The middle tier gets bigger, gets inept, custom shit comes out, it gets integrated into the middle tier shit....continue;
Instead of pronouncing death start talking about how dated a 2 dimensional result set is. JOINs should return N dimension result sets similar to XML with butt loads of meta data. ODBC/JDBC are dated...so update them.
select u.login, ul.when from users u join user_logins ul as logins.login ON ul.user_id = u.user_id where u.name = 'me' should equal something like a nested XML packet instead of duplicated crap when there is more than one user_logins.
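Until drivers return nested results, the nesting asked for above is typically done client-side. A small Python sketch of that step: it takes the flat, duplicated rows a join returns and folds them into one record per user with a list of logins. The sample rows and the `nest` helper are made up for illustration.

```python
from itertools import groupby
from operator import itemgetter

# Flat join output: the parent value repeats once per child row.
flat_rows = [
    {"login": "me", "when": "2009-11-01"},
    {"login": "me", "when": "2009-11-05"},
    {"login": "you", "when": "2009-11-03"},
]


def nest(rows, parent_key, child_key, child_name):
    """Fold duplicated parent values into one record with a list of children."""
    ordered = sorted(rows, key=itemgetter(parent_key))  # groupby needs sorted input
    return [
        {parent_key: parent, child_name: [r[child_key] for r in group]}
        for parent, group in groupby(ordered, key=itemgetter(parent_key))
    ]


nested = nest(flat_rows, "login", "when", "logins")
```

`nested` then contains one entry for "me" with both login times and one entry for "you", which is roughly the shape a nested XML packet would carry.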
Can we agree that SQL is a high level language for capturing the set theory query logic and is COMPLETELY INDEPENDENT of the engine and physical storage that actually generates the query plan and makes the heads fly to cache and return data?
Structured
Query
Language
not
Stupid
Quixotic
Layout
(Of tables, pages, indexes, drives, heads,spindles, SANs, etc...)
Right?
"Knowing everything doesn't help..."
Has this ever occurred to you: Maybe people just choose not to answer you? :)
I've seen OLAP systems in the 100TB range which work fantastically well on Oracle.
Object databases could be a nice idea, but not for performance or scaling reasons. An object oriented database would be beneficial as a method to sidestep ORM, so you can effortlessly, and without any significant amount of extra work, persist the state of your objects.
Then you can build POxOs to represent your objects and just implement a few lines of code to have them persisted.
Not sure if anything like that already exists. I certainly don't know of anything in the C# world, but I expect there's some funky named java project which does it.
MS-Access had some really great features: it could be accessed with both SQL and with a blazingly fast (because almost running on the bare OS) ISAM-style library. I am still missing anything like it on Linux. SQLite is a file-system database, but why on earth should it parse full-blown SQL at runtime and why on earth should my program write another program in SQL at runtime just to load some data? Get serious. Parsing and building SQL is just overhead, and especially parsing SQL is no easy and light task.
Since I switched to OO programming, most (95%) of my queries are "This table/index. Number 5 please." In essence that is the get/put method, or the ISAM style method. I really would like something like that to exist on Linux. The closest thing around is MySQL's HANDLER statement, but that can only be used for constant data (because it does dirty reads) and for reading only.
SQLite could even be faster if it just accepted some basic "get row by index" and "put row by index" commands that do not try to parse, optimize or outsmart anything. The problem with "modern" databases is that they are either "SQL" or "NoSQL". That's awful. Some programs speak SQL (because of compatibility, because it is a reporting program or just because the programmer does not know anything else) and some programs are better off with direct row management. That does not mean that the data should not be accessible by both programs. I really wish that the regular SQL databases would develop ISAM-style access methods. Programming would be a hell of a lot easier then, and the programs themselves would speed up significantly as well.
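For the get/put half of that wish, keyed file stores in the dbm family already work this way on Linux. A rough Python sketch using the standard library's `dbm.dumb` module; the `put_row` and `get_row` helpers, the JSON row encoding, and the sample data are assumptions for illustration. It does not cover the other half of the wish, which is SQL access to the same file.

```python
import dbm.dumb
import json
import os
import tempfile


def put_row(table, key, row):
    # "Put row by index": no SQL text is built or parsed.
    table[str(key)] = json.dumps(row)


def get_row(table, key):
    # "Get row by index": a direct keyed lookup.
    raw = table.get(str(key))
    return None if raw is None else json.loads(raw)


with tempfile.TemporaryDirectory() as folder:
    with dbm.dumb.open(os.path.join(folder, "customers"), "c") as customers:
        put_row(customers, 5, {"name": "Ann", "city": "Utrecht"})
        number_five = get_row(customers, 5)  # "Number 5 please."
        missing = get_row(customers, 6)      # no such row
```

`dbm.dumb` is the slowest but most portable member of the family; it is used here only because it ships with every Python install.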
This is no idle remark. I worked a lot with MS-Access and most rants about it being slow come from the fact that most programmers treat the file-system database as a server. So it must emulate itself as a server and do a lot of household parsing and does not even have a physical server to relieve its load.
But if you know how to program a file-system database with ISAM-style methods, MS-Access is by far the fastest database I ever encountered. No Joke. Really. It can be fast because there is no need to do all these household jobs to just dig up a row.
Nae king! Nae laird! Nae yurrupiean pressedent! We willna be fooled again!
It's simpler to switch to a different RDBMS when your queries are already in SQL.
It's mostly just human ignorance and laziness.
You are aware of PostgreSQL's hstore: a type representing basically a name-value mapping (think Perl hash or Python dictionary). You can put an index on it answering queries like "find all records where the field has a mapping "foo => bar", or contains mappings {foo => bar, baz => grumble} and more.
Cool stuff.
E-Mail servers associate data with only one index: the e-mail address.
...Valid points, except for your use of the word "one". My email can be retrieved by my email address, but also selected by the folder that it's in, sorted by sender, subject, date or priority, and searched by keyword.
There are only a couple of handfuls of things that need to be indexed, but certainly more than 1.
I work on a very large db2 system. Enterprise systems cost money because they work. There still seems to be this ignorant self absorbed counter culture which believes big iron and similar (anything above "look what I can build in my basement") isn't cool so it cannot work.
Between radix, sparse, derived, encoded vector indexes I can pretty much serve up anything my partners want, whether they are native or foreign db2, jdbc or odbc connected. With the tools I have at my disposal I can analyze statements presented by developers to ensure I have the access paths needed for their work and guide them to better data retrieval. I can tell if their choices result in full table scans, index probes, hash tables, rrn tables, etc. If I need support it's a phone call away.
I do not care who my client is, data is my job. As such I need tools which are so reliable that the only concerns I have are just what my customer is doing and how I can make their request better. When they query 5tb tables and don't even notice a delay I think I am doing just fine.
* Winners compare their achievements to their goals, losers compare theirs to that of others.
SQL is hardly a hammer - a hammer only has one general use. It's not a Swiss army knife either - a lot of fairly low-grade tools that are convenient in a pinch. If anything, a Swiss army knife is a spreadsheet.
An RDBMS is more like a well equipped workshop that you build and equip at the site of likely problems. It will take far more work to set up than buying a leatherman tool. However, it will solve almost any unanticipated problem you throw at it, once it is built. That is the beauty of an RDBMS, and why businesses and governments like to build both workshops and relational databases.
Of course, there are circumstances where a RDBMS is not called for. If you are doing anything that needs to be highly optimized for just one thing, and will only ever be used for just that one thing, then you do not use an RDBMS. (e.g. an FPS). Much like you wouldn't use a workshop if all you are going to be doing is manufacturing widgets - then you need a factory.
I guess the analogy kind of breaks down there, because a workshop isn't efficient enough to run as a business compared to a factory - it is support infrastructure. For many, many things though, an RDBMS can be the core of a business information system and can also quickly and conveniently answer questions that weren't thought of at design time. Their RDBMS problem domain will only increase as computing power grows, unlike more specialized systems. I would not be surprised at all if SQL is still dominant in a hundred years time.
If I have seen further it is by stealing the Intellectual Property of giants.
SQL databases if designed properly DO handle enormous datasets. The problem starts when you have wits designing the database and then managers attempting to use the DB for purposes it wasn't meant for.
If you mod me down, I will become more powerful than you can imagine....
I mangled the first link.. http://tinyurl.com/ybepcqr/
Database size is usually not an issue for modern RDBMS, such as Microsoft SQL, Sybase ASE, Oracle, or IBM's DB2. I am running an ERP on Sybase with 3 TB worth of data, a datamart on Microsoft with 5 TB, a Patient Record System on Microsoft with 20 TB, a HR system with 2 TB, and a Patient Accounting system on Oracle with 8 TB of data. All of these systems talk with at least one other system, usually with the assistance of SSIS (Thank god for SSIS, our ETL is heavy lifting, approx. 5 TB a night of incrementals). With enough server hardware, we can scale up to very large levels easily. We forecast our data size needs out for the next three years and have been very accurate, not running across SAN issues.
The only systems we have had issues with in the area of data size are MySQL and Informix.
In God we trust, all others require data.
Reading your post, I was half-expecting to see "educated stupid" crop up. You're claiming that 40 years of accumulated wisdom managing petabytes worth of data is worthless, and that your "advanced programming techniques" can do a better job? Put a sock in it. Maybe it's you who has the problem here. After all, you can't grasp that people actually do model data relationally, do produce object-relational layers that work just fine, and do produce systems that process vast amounts of information relationally.
What's next? Will you advocate that people who advocate antibiotics don't understand your "advanced medicinal techniques" based on the four humors?
Oh, and by the way: "data" is already plural, you numbskull.
That works great until you decide to use the R in RDBMS and actually join some tables. Plus you'd be using all sorts of dynamic SQL to allow every query to pick the appropriate table, putting yourself at risk of SQL injection vulnerabilities. You don't want a bunch of interns coding dynamic SQL against a system big enough and important enough to warrant this kind of data partitioning.
If you really need to split a single table's data across multiple file/disk systems, use a DBMS that supports this at the physical storage level, rather than forcing you to do it logically with 256 tables. SQL Server, for example, allows creating file groups, which can contain multiple files on different file systems. Assign a table to a specific file group, and it will get spread across all those files. Or if you need finer control, use table partitioning which allows you to pick which file group each specific range of data is stored in. This works great, because the data is physically stored as though it were in multiple tables/indexes, allowing you to very quickly narrow your searches based on the partitioning key, and thus isolating all the I/O to a specific partition.
But 256 separate tables? Egad. It's irritating enough working with our ERP system, which splits most data into separate "open" and "historic" tables. If I had to deal with 256 of them, I'd probably quit.
Right. Don't forget PostgreSQL too. Really, the problem here is MySQL. Hell, look at the "tips and tricks" comments for this story: they all deal with ways to work around deficiencies in MySQL (and old versions of MySQL at that.)
The guy who recommends using the first two characters of the MD5 hash to select a table is particularly hilarious. Doesn't he realize that's what a database index already does, and that databases (even MySQL) will do that for him?
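For readers who have not seen the scheme being criticised, it looks roughly like the Python sketch below: the first two hex characters of a key's MD5 digest pick one of 256 tables. The `posts` base name and the `table_for` helper are hypothetical; the sketch shows the mechanism, not an endorsement of it.

```python
import hashlib


def table_for(key, base="posts"):
    """Pick one of 256 tables from the first two hex digits of the key's MD5."""
    prefix = hashlib.md5(key.encode("utf-8")).hexdigest()[:2]
    # The suffix is always two hex characters, so the table name itself
    # is never built from raw user input.
    return f"{base}_{prefix}"


# Every query must first compute the table name, then be assembled around it.
names = {table_for(f"user{i}") for i in range(10000)}
```

Two hex digits give at most 256 distinct table names, which is where the "256 tables" figure in the replies comes from. A hash index or a partitioned table performs the same bucketing inside the database, which is the point both replies make.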
I agree with you.
I've never understood why people like the GP have such attitudes. They scoff at pre-existing UNIX methods, and then insist that various forms of obscenely, excessively complex, perverted evil such as C++ or XML are somehow preferable and superior.
Although I love using Linux and FreeBSD, and also engaging in shell scripting and simple forms of programming, whenever I see these being discussed in a public forum recently, I've noticed that my immediate response is usually to become angry.
This is because there is now apparently an entire generation of chronically elitist, misinformed, horribly uneducated and misguided programmers, who engage in chronological snobbery and various other subjective logical fallacies while expressing derision towards the UNIX philosophy, (which is, as you rightly point out, the product of 40 years' worth of accumulated experience) and then advocate such horrors as the aforementioned C++ and XML as preferable solutions.
C++, XML, and "object oriented" programming are nothing other than facilitators of elitism. I've never come across a single advocate of C++ who was not a condescending elitist of the worst possible kind. They use its degree of needless complexity as a means of gratifying their ego, and they thus fight tirelessly to ensure that the means of said ego gratification is preserved.
Inflating epeens, however, is the only thing that C++ really accomplishes.
... that too many developers and integrators will just use an SQL database by default without considering whether or not it is appropriate for the task. I see so many databases where there is little or no hint of any relationships even being involved. Some forums, for example, store postings in a database where the message content is a blob and it is indexed by a number. To get a post, look by number. While an SQL database can do this, so can many other database types. There's no complex relational searching with this; it's just basic indexing (with maybe a tree of index relationships). I'd sooner do this with a B-tree based filesystem.
now we need to go OSS in diesel cars
Actually, the real problem is that MySQL sucks. Sure, you can patch over some of its suck with Memcache, but at some point you're still stuck waiting 30 seconds for a query to return, no matter how optimized you make it. Yes, it's trivial to get Oracle and MSSQL to scale to billions of rows, but those cost money no one is willing to spend. NoSQL is wonderful in that it scales easily and is free.
Sure, you have to denormalize your data, but you probably already were to try to squeeze the last bit of performance out of MySQL.
You want people to use RDBMS? Make a free one that doesn't suck donkey balls and they will.
CODASYL Hierarchical Databases are faster for large complex databases. I've supported extremely large databases and user bases with 3 second or better end-to-end response times for over 300,000 real-time customer service rep users with such software. These databases allow precise physical positioning; including the ability to group related child record rows on the same physical page. One I/O can retrieve the entire set. They also support hash or other custom indexing that directly yields the physical page address instead of wading thru relational index pages to get there. Tool support is not as good and it takes someone who understands them to get the best results. Functionality such as producing report output is more work. But they work great on large datasets.
I can quickly find anything on my desk using my index of food wrappers and containers.
I know that report was done about the time I ate that snicker's, ah found it.
I only look human.
My mother is a halfling and my dad is an ogre, so that makes me an Ogreling
At the ACM site Michael Stonebraker wrote an article titled "The "NoSQL" Discussion has Nothing to Do With SQL" where he discusses how the NoSQL group is solving real problems, but using a name.. that well.. really has nothing to do with the problems getting solved.
http://cacm.acm.org/blogs/blog-cacm/50678-the-nosql-discussion-has-nothing-to-do-with-sql/fulltext
For anyone not familiar with Stonebraker..
http://en.wikipedia.org/wiki/Michael_Stonebraker
Great article from someone who truly knows what he is talking about.
MongoDB starts as a service on my MacBook and on my local network I always keep services for Sesame (RDF data store, SPARQL endpoint), MongoDB, and CouchDB running.
It is easier to use NoSQL datastores (when they are appropriate) if you always have them running, have client libraries in place, etc.
If you want to use a relational database, you don't have to stop to install it, get client libraries, etc. I think the same 'ready at hand-ness' should apply to whatever NoSQL datastores meet your needs.
The world relies on IMS
Are all of those in Java? What about people who want something efficient and scalable without running JVMs everywhere? Have some of them been ported to Mono?
There's something to be said for using the right tool for the job. A general purpose database will be optimized for the general case, not for your specific problem. Large databases spanning multiple servers, taking extreme traffic, are sufficiently outside the scope of normal database operations, that a custom solution can be the only way to do it.
I suggest:
* figure out exactly what needs to be solved
* check if existing solutions solve it
* if not, then develop a solution (and if you're in Canada, claim IRAP and SR&ED for it ;-) )
Reasons why my suggestion would not always work:
* risk, both financial and project
* skills bias (preference to change the problem to match what skills are available)
* technology bias (preference to change the problem to match a specific technology)
* vendor bias (preference to change the problem to match a specific vendor)
SR&ED
One word, alpine. :)
-- SouNerd.com
The reason you can't just keep throwing more boxes at it boils down to CAP -
Consistency, Availability, and Partition Tolerance.
In designing any distributed system, you can only get two out of three.
http://www.julianbrowne.com/article/viewer/brewers-cap-theorem
I too get annoyed by database luddites, especially the ones who are in there because they have no social skills, no desire to co-operate with others, and who know all the MS latest terminology but don't, for instance, actually understand how indexes work because they have never really learnt system programming. But valuable corporate data does need to be protected; its loss or corruption costs profits and jobs. SQL is a proven language with a strong track record that is largely portable and, except when queries are generated by some hopeless automated query generation engine, can be made human readable and checkable. Way to go for corporate data.
If you had a nickel as you suggest, you probably wouldn't have enough to buy lunch in a decent restaurant. If you had a nickel for every swear word uttered by every dba or IT manager sweating blood trying to overcome data loss or corruption, you might be able to retire as you suggest.
Remember: social networking applications are not mission critical business processes, and they do not have significant SLAs to meet.
From scarped cliff or quarried stone she cries "A thousand types are gone, I care for nothing, no not one."
SQLite is a nice alternative for embedded systems. The whole distribution is less than half a meg. Works quite well for the opposite side of the spectrum covered by TFA. Smaller than Access, smaller than darn near anything, a fully self contained SQL environment expressed in a file. For the Big Huge scale (petabytes) look at Google's Bigtable.
Do not mock my vision of impractical footwear
No. The EAV model creates a row-centric view of attributes. My suggestion keeps the traditional column-centric view intact. Other than being more careful about implied types when comparing and asterisk usage, most SQL will look just like it does in a "static" RDBMS. This is not the case with EAV's; they completely change the way one queries.
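The difference in query shape can be seen with a small Python `sqlite3` sketch. The table and column names are invented for illustration, and the first table is an ordinary static one (SQLite has no dynamic-column mode); the point is only that a column-centric query stays a plain SELECT, while the same question against an EAV table needs a self-join per attribute.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Column-centric: one row per person, one column per attribute.
    CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT, city TEXT);
    INSERT INTO people VALUES (1, 'Ann', 'Utrecht'), (2, 'Bob', NULL);

    -- EAV: one row per (entity, attribute) pair.
    CREATE TABLE people_eav (entity INTEGER, attribute TEXT, value TEXT);
    INSERT INTO people_eav VALUES
        (1, 'name', 'Ann'), (1, 'city', 'Utrecht'), (2, 'name', 'Bob');
""")

# Column-centric query: reads like ordinary SQL.
column_centric = conn.execute(
    "SELECT name, city FROM people ORDER BY id"
).fetchall()

# EAV query: every extra attribute means another join on the same table.
eav = conn.execute("""
    SELECT n.value, c.value
    FROM people_eav AS n
    LEFT JOIN people_eav AS c
           ON c.entity = n.entity AND c.attribute = 'city'
    WHERE n.attribute = 'name'
    ORDER BY n.entity
""").fetchall()
```

Both queries return the same two rows, with a null city for Bob, but only the first one looks like the SQL people already write.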
Table-ized A.I.
PostgreSQL at least, and probably other databases, has a generic "key-value store" data type: http://www.postgresql.org/docs/8.3/interactive/datatype.html. With it, rows can contain some strictly-typed data (such as IDs, types, other metadata) and also contain a field (or many fields) which store all other loosely-typed data. And since it's PostgreSQL, all data is safe, can be replicated, you can have complex indexes, full text search, etc.
-- Sig down
...One database per registered user.
I am not devoid of humor.