Moving From CouchDB To MySQL
itwbennett writes "Sauce Labs had outgrown CouchDB and too much unplanned downtime made them switch to MySQL. With 20-20 hindsight they wrote about their CouchDB experience. But Sauce certainly isn't the first organization to switch databases. Back in 2009, Till Klampaeckel wrote a series of blog posts about moving in the opposite direction — from MySQL to CouchDB. Klampaeckel said the decision was about 'using the right tool for the job.' But the real story may be that programmers are never satisfied with the tool they have."
Of course, then they say things like: "We have a TEXT column on all our tables that holds JSON, which our model layer silently treats the same as real columns for most purposes. The idea is the same as Rails' ActiveRecord::Store. It’s not super well integrated with MySQL's feature set — MySQL can’t really operate on those JSON fields at all — but it’s still a great idea that gets us close to the joy of schemaless DBs."
And in another three years they will switch to whatever is the coolest up-and-coming storage solution. Incompetent developers will always be incompetent developers.
If you want to hire twentysomething engineers who have party during the week and have gadgets falling out of their pockets, go with CouchDB or Cassandra or MongoDB.
If you'd rather have people in their 30's and 40's with families, then go with MySQL or Oracle.
Couch DB is *not* webscale!
PostgreSQL 9.2 beta improves scalability, adds JSON
http://www.h-online.com/open/news/item/PostgreSQL-9-2-beta-improves-scalability-adds-JSON-1573815.html
You can get json support using the PLV8 extension http://code.google.com/p/plv8js/wiki/PLV8
or altenatively you can use the hstore data type.
This makes me want to avoid your suggestion. If you presently don't have the Larry problem it's best avoided.
Help stamp out iliturcy.
I haven't touched MySQL in years, but do they have an XML field type? MS SQL Server does (as does many other RDBMSes I bet, I mention SQL Server as that's where my current experience lies) which allows you to essentially keep the table schema-less but still allows you to perform complex queries on the contained data.
Wouldn't that be better than a TEXT field type?
But the real story may be that programmers are never satisfied with the tool they have.
Ah typo
But the real story may be that programmers don't know how to store data
They many not know because no one knows the business needs, but more often because they have no idea what they're doing WRT to data storage.
IT training tends to cover data manipulation pretty well "how to add two numbers'
IT training gets shakey on data structures "So, in junior level class we will talk about data structures, which is too bad because you've already developed at least two years of bad habits first"
IT training tends to pretty much skip data storage "In a senior level class, you might talk about scalability, maybe in an optional class. Or maybe you'll take a semester of cobol instead"
"Science flies us to the moon. Religion flies us into buildings." - Victor Stenger
"...affected by a lot of the same maturity issues that made CouchDB tough for us to work with"
The matury issues were sitting behind the keyboard.
It seems to be a knee jerk reaction amongst a lot of developers and designers that as soon as your app starts requiring persistent data beyond ini values a database is needed. Why? For large but simply structured data something like json or XML or even a flat csv file is perfectly adequate. Performance can be an issue during searches but if for example you have a fixed record size with key sorted data then finding a given key is simple (binary chop or similar).
It seems to me that reaching for a DB is the easy way out taken by a lot of oders and they end up paying for it in maintanability, bugs and support.
MySQL is not Webscale, because it uses joins.
Then how does a non-webscale database power popular web sites such as Wikipedia and Slashdot? If you don't do joins in the database, you'll probably end up doing the equivalent of joins (using one value as the key in another table) in your application.
do they have an XML field type? MS SQL Server does [...] which allows you to essentially keep the table schema-less but still allows you to perform complex queries on the contained data.
But how does it index the data in the XML or JSON fields? How does it, say, tell an element containing a number from an element containing text? Does it act like SQLite, which is dynamically typed (and thus can store text in any field) but can be told to prefer to compare and index certain columns as numbers, dates, text with Unicode collation, or binary data?
with big data comes great responsibility.
PostgreSQL 9.2 (now in beta) includes native JSON fields:
http://www.h-online.com/open/news/item/PostgreSQL-9-2-beta-improves-scalability-adds-JSON-1573815.html
It's also available as an extension for the current 9.1 release:
http://people.planetpostgresql.org/andrew/index.php?/archives/255-JSON-for-PG-9.2-...-and-now-for-9.1!.html
http://www.glasswings.com/
In NoSQL systems such as MongoDB and CouchDB, what do you call the operation where you retrieve one document, pull an identifier out of that document, and use that identifier as the key to retrieve another document?
Hop into the wayback machine and fire up any flavor of PICK. The database where schema is applied on use, not on storage. No length limits on fields and very fast on old hardware (really fast on new). Storing bits of xml and code are no problem. And for those users who simply must have SQL, many versions will support that too (UniData and UniVerse are two examples). It's not cool, not new, but it does work.
We're using a four node Riak cluster, in a production environment spanning two datacenters, and have zero downtime because we can lose two nodes (for example, a whole datacenter) and still read/write. Maybe CouchDB can do that; maybe it can't. Maybe our setup isn't even the best. But, as many people have already pointed out, there is a time and a place for a NoSQL solution, and you can have "near zero" (or, so far for us, zero) downtime. Here's the catch: We are using Secondary Indexing and MapReduce, in Riak, but for searchable analytics we fire off a stripped dataset to a smaller, less-redundant MySQL store.
Urban Airship went PostgreSQL to MongoDB to Cassandra to PostgreSQL. http://wiki.postgresql.org/images/7/7f/Adam-lowry-postgresopen2011.pdf
It's a good presentation because they're in love with none of them and are moving for specific reasons each time, handling different issues. It's not coders chasing the new hotness.
Anyone who loves or hates any language, platform, or manufacturer, doesn't know what they're talking about.
So they love NoSQL but had a bad experience with CouchDB. And the solution is... move away from NoSQL??? Ridiculous. CouchDB is certainly NOT the only solution out there, and they all have their strengths and weaknesses.
I've recently switched from MS SQL Server to RavenDB http://www.ravendb.net/ and I am never going back! Many of the things they found wrong with CouchDB and MongoDB are superior in RavenDB.
39,999 actually, sorry. Dave had an appointment. He sends his regrets and all that.
If all your application is ever going to do is read and write to fixed sized record structured data with little relational (or any) attributes then COBOL will suit you fine as that's what it was designed for. Unfortunatly those sorts of apps are few and far between these days, but in its ever decreasing niche COBOL is still good.
Yes it makes sense up to a point , but it starts to suffer from the law of diminishing returns and at some point having to do complicated multi-table joins actually slows down your queries so much that it becomes simpler and faster to suffer duplicate data than normalise to the Nth degree.
... the joy of schemaless DBs.
You mean working with a file system and not using a DB at all, not needing to pay a DBA, not dealing with corrupted databases, not using arcane tools, etc.?
I jest, but not entirely. Clearly there are purposes for which databases are the right toold for the job. I'm most definitely not convinced that big blob storage is one of them.
Put my fist through my alarm clock with its ding-dong death inside my ear. - The Blackjacks.
Clues to the source of some of Sauce Labs' problems can be gleamed from their list of Maintenance headaches:
Oh please doing an HTTP GET periodically is tricky ??? No it's not. With couchdb if your database is very dynamic you should either index periodically and very frequently. This creates a quantified and controlled performance demand on the server. Ideally read servers should never be the write server and replication should be filtered. Using Couchdb naively will lead to failure. Don't use javascript views, python views are 3-4 x faster, erlang views are 7-10 faster. Used in the right way and following the many tips that you can get from the Couchdb community will make Couchdb not just a great database but a great application platform.
And yes 1.2 is a great improvement.
Currently I'm using coucdb, mongodb, and MySQL all in one high profile project handling terabytes of data and millions of hits. Each has it's use. When it comes to reliability and performance all three DBs are NOT my problem.
a majority of our unplanned downtime was due to CouchDB issues
Nowhere on the CouchDB home page is reliability even mentioned. And that's the real issue. Developing a reliable database system is a difficult design and programming task. It requires real software engineering. The hacks who write PHP and use JSON aren't up to a job like that. The "aw, we'll fix it in the next release" attitude doesn't cut it in databases.
They do cross-browser testing. Right? I can't really figure out what they sell or why I would want it. The real question, "Why haven't I started a mismanaged company with ambiguous products?"
Sig. Sig. Sputnik
Consider mounting your "nul/" directory on a different partition.
Typing is just another constraint, like foreign keys, and various other domain constraints. I cannot see any valid argument for having foreign keys but not type constraints. It jest seems bizarre to me. It's not like they could be optional or anything.
They are optional. It appears you can enforce static typing for a column with constraints like CHECK(typeof(x)='integer'). I'd give more details, but the document that Wikipedia cites about such constraints is a printed publication of which I happen not to own a copy.
In what other language would this statement compile without error:
PERFORM makemoney UNTIL rich.
(Note the the full stop at the end)
And someone with experience in the entity-relationship modeling that underlies a relational database schema is likely to see everything as "a lot of related data". For example, a Slashdot comment is related to a parent (a story or another comment), the user who posted it, and the moderations done to that comment.
I thought the article referred to here particularly interesting because it puts forth the idea that, essentially, NoSQL and Relational DB's are just two ways of looking at the same thing, and thus are both equally valid. Right now they are used situationally (at best) but I think you will see a convergence at some point with something "new" that can do both equally well using a common core and different wrappers around it.
-- Senior Software Engineer, Attorney appearance services, locallawyerapp.com.
They simply won't have an entry in the hasDied relation (table), assuming that's relevant data for the business.
See or maybe DBdebunk
It is kind of a weird word, isn't it? Almost as shemales.
like not having to declare varchar sizes versus text versus CLOB versus BLOB.
What's the difference between a text column and a character large object column? MySQL doesn't appear to have distinct types for the two.
They're all stored and indexed and searchable in the same way (not that you'd want to index any sort of LOB.)
If CLOB is the same thing as text, then why wouldn't you want to support a full-text index on a CLOB? That said, I tend to write my own full-text indexing tools to work around the limitation in MySQL's full-text index: no words shorter than four characters unless you're root, no common English words unless you're root, etc.
XML field type [in] MS SQL Server [...] allows you to essentially keep the table schema-less
shutdown -p now wrote:
it also permits you to specify an XML Schema for the field
But what's the advantage of XML with a schema over just parsing the XML in the application and putting each value in a column? Perhaps I'm having trouble seeing it because my mind is trapped in a relational paradigm.