Why My Team Went With DynamoDB Over MongoDB
Nerval's Lobster writes "Software developer Jeff Cogswell, who matched up Java and C# and peeked under the hood of Facebook's Graph Search, is back with a new tale: why his team decided to go with Amazon's DynamoDB over MongoDB when it came to building a highly customized content system, even though his team specialized in MongoDB. While DynamoDB did offer certain advantages, it also came with some significant headaches, including issues with embedded data structures and Amazon's sometimes-confusing billing structure. He offers a walkthrough of his team's tips and tricks, with some helpful advice on avoiding pitfalls for anyone interested in considering DynamoDB. 'Although I'm not thrilled about the additional work we had to do (at times it felt like going back two decades in technology by writing indexes ourselves),' he writes, 'we did end up with some nice reusable code to help us with the serialization and indexes and such, which will make future projects easier.'"
They must run their company pretty differently from where I work.
Where I work, the most senior and backstabby developer saddles the worst tools he can find on the rest of the team, and then blames them (behind their backs of course) for the results of his poor decision making.
But MongoDB is web scale.
No one cares. Stop click-baiting the buzzword Slashdot sub-sites. If we wanted to go to them we would do so voluntarily.
ObjectRocket has a pretty awesome solution and the dudes there know their sh!t: http://www.objectrocket.com/
there are two kinds
the first creates a 10,000,000 row table with no indexes, no PK and then complains that the DBAs are dumb because the app is slow or the server is broken
the second kind i've seen has a 100 row table, with 10 columns and 15 indexes on it. sometimes half my day is spent on deleting unused indexes created by our BI devs
Fools! Everyone knows DynamoDB isn't Web Scale.
"Our client is paying less than $100 per month for the data. Yes, there are MongoDB hosting options for less than this; but as I mentioned earlier, those tend to be shared options where your data is hosted alongside other data."
I think someone failed to explain how "the cloud" actually works.
Having actually RTFA, it just reinforces how poorly most programmers understand relational databases, and why they shouldn't be let near them. It's so consistently wrong it could be just straight trolling (which, given it's posted to post-Taco Slashdot, is likely).
"However, the articles also contained data less suited to a traditional database. For example, each article could have multiple authors, so there were actually more authors than there were articles."
This is completely wrong; that's a textbook case of something perfectly suited to a traditional (relational) database.
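For what it's worth, the many-authors-per-article case maps onto a plain junction table. A rough sketch using Python's built-in sqlite3 module; the table names, column names, and sample rows are invented for illustration and are not taken from TFA:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE article (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE author  (id INTEGER PRIMARY KEY, name TEXT NOT NULL, bio TEXT);
    -- Junction table: one row per (article, author) pair.
    CREATE TABLE article_author (
        article_id INTEGER NOT NULL REFERENCES article(id),
        author_id  INTEGER NOT NULL REFERENCES author(id),
        PRIMARY KEY (article_id, author_id)
    );
""")

# Hypothetical sample data: two articles, three authors.
conn.executemany("INSERT INTO article VALUES (?, ?)",
                 [(1, "Paper A"), (2, "Paper B")])
conn.executemany("INSERT INTO author VALUES (?, ?, ?)",
                 [(1, "Alice", "bio"), (2, "Bob", "bio"), (3, "Carol", "bio")])
conn.executemany("INSERT INTO article_author VALUES (?, ?)",
                 [(1, 1), (1, 2), (2, 2), (2, 3)])

def authors_of(article_id):
    """Names of all authors of one article."""
    rows = conn.execute(
        "SELECT a.name FROM author a "
        "JOIN article_author aa ON aa.author_id = a.id "
        "WHERE aa.article_id = ? ORDER BY a.name", (article_id,))
    return [r[0] for r in rows]

def articles_by(author_name):
    """Titles of all articles a given author contributed to."""
    rows = conn.execute(
        "SELECT ar.title FROM article ar "
        "JOIN article_author aa ON aa.article_id = ar.id "
        "JOIN author a ON a.id = aa.author_id "
        "WHERE a.name = ? ORDER BY ar.title", (author_name,))
    return [r[0] for r in rows]
```

Having more authors than articles is no problem here: each table simply has as many rows as it needs, and the junction table carries the relationships.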
that is getting sick of this content-free, slashdot echo chamber, clickcrack stuff. Hey Slashdot, why do you need whole nuther site to post original articles? And why do those articles make such a deafening sucking sound?
Problem is that I would be interested in a reasoned look at MongoDB v Dynamo but my experience with http://slashdot.org/topic/bi/ is not to waste my time by reading TFA.
MongoDB would have been perfect based on the structure of the data, but the client didn't want to pay for setup and hosting costs. DynamoDB was the cheaper alternative, but more of a pain in the ass to implement. Makes me wonder if the hosting cost savings offset the additional development time.
Everyone should believe in something. I believe I'll have another beer.....
As someone whose work and thinking are firmly planted in traditional RDBMS, a few of those decisions did not make sense.
I understand what he's saying about normalized tables for author, keywords, and categories. But then when he has to build and maintain index tables for author, keyword, and categories, doesn't that negate any advantage of not having those tables?
I understand he's designed things for easy retrieval of articles, but it seems the trade-offs on other functions are too great. It's nice that an author's bio is right there in the article object, but when it's time to update the bio, doesn't that mean going through and touching every article by that author?
I've got a bunch of similar examples, and I would not be at all surprised if they all boiled down to 'I don't understand what this guy is doing.' But basically, isn't NoSQL's strength in dealing with dynamic content? In this example, serving static articles, isn't the choice between NoSQL and a traditional RDBMS essentially up to personal preference?
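To make the bio question concrete, here is a toy sketch in plain Python dicts. It is not DynamoDB code and not the article's actual data model; the record layout and names are assumptions for illustration only:

```python
# Denormalized: every article carries its own copy of each author's bio.
articles = [
    {"title": "Paper A", "authors": [{"name": "Alice", "bio": "old bio"}]},
    {"title": "Paper B", "authors": [{"name": "Alice", "bio": "old bio"},
                                      {"name": "Bob", "bio": "Bob's bio"}]},
    {"title": "Paper C", "authors": [{"name": "Bob", "bio": "Bob's bio"}]},
]

def update_bio_embedded(articles, name, new_bio):
    """Rewrite every article that embeds this author; return how many were touched."""
    touched = 0
    for article in articles:
        changed = False
        for author in article["authors"]:
            if author["name"] == name:
                author["bio"] = new_bio
                changed = True
        if changed:
            touched += 1
    return touched

# Normalized: the bio lives in one place and articles would refer to it by key.
authors = {
    "alice": {"name": "Alice", "bio": "old bio"},
    "bob": {"name": "Bob", "bio": "Bob's bio"},
}

def update_bio_normalized(authors, key, new_bio):
    """Update the single author record; always exactly one write."""
    authors[key]["bio"] = new_bio
    return 1

embedded_writes = update_bio_embedded(articles, "Alice", "new bio")
normalized_writes = update_bio_normalized(authors, "alice", "new bio")
```

With embedded bios the number of writes grows with the number of articles the author appears in; with a separate author record it stays at one, at the price of an extra lookup when rendering an article.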
Throughout the article the client says they don't want full-text search. The author says he can "add it later," then compresses the body text field. Metadata like authorship information is also stored in a nasty JSON format—so say goodbye to being able to search that later, too!
About that compression...
That compression proved to be important due to yet another shortcoming of DynamoDB, one that nearly made me pull my hair out and encourage the team to switch back to MongoDB. It turns out the maximum record size in DynamoDB is 64K. That’s not much, and it takes me back to the days of 16-bit Windows where the text field GUI element could only hold a maximum of 64K. That was also, um, twenty years ago.
Which is a limit that, say, InnoDB in MySQL also has. So, let's tally it up:
So what the hell is this database for? It's unusable, unsearchable, and completely pointless. You have to know the title of the article you're interested in to query it! It sounds, honestly, like this is a case where the client didn't know what they needed. I really, really am hard-pressed to fathom a repository for scientific articles where they store the full text but only need to look up titles. With that kind of design, they could drop their internal DB and just use PubMed or Google Scholar... and get way better results!
I think the author and his team failed the customer in this case by providing them with an inflexible system. Either they forced the client into accepting these horrible limitations so they could play with new (and expensive!) toys, or the client just flat-out doesn't need this database for anything (in which case it's a waste of money.) This kind of data absolutely needs to be kept in a relational database to be useful.
Which, along with his horrible Java vs. C# comparison, makes Jeff Cogswell officially the Slashdot contributor with the worst analytical skills.
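A quick sketch of the compression trade-off discussed above, using Python's standard zlib module. The 64K figure is the limit quoted from TFA; the sample text is made up and far more repetitive than a real article, so real bodies would compress less well:

```python
import zlib

ITEM_LIMIT = 64 * 1024  # record size limit as quoted from TFA

# Hypothetical article body, deliberately larger than the limit.
body = ("Mitochondria are the powerhouse of the cell. " * 4000).encode("utf-8")
packed = zlib.compress(body)

fits_raw = len(body) <= ITEM_LIMIT        # too big to store as-is
fits_packed = len(packed) <= ITEM_LIMIT   # small enough once compressed

# The catch: a substring search over the stored bytes no longer works.
needle = b"powerhouse"
found_in_raw = needle in body
found_in_packed = needle in packed
found_after_unpack = needle in zlib.decompress(packed)
```

The text only becomes searchable again after decompressing, which means any full-text search "added later" has to pull and unpack every record itself rather than asking the datastore.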
Bio questions? Ask me to start a Q&A journal. Computer analogies available for most topics!
TL;DR: Jeff Cogswell doesn't understand how relational databases work. Or "the cloud", for that matter.
http://travispbrown.com/post/43167533260/a-tale-of-two-databases-dynamodb-and-mongodb
Mongo has more punch
We decided that MongoDB was adequate but didn't leverage the synergies we were trying to harvest from our development methodologies.
We looked at GumboDB and found it was lacking in visualization tools to create a warehouse for our data that would provide a real-time dashboard of the operational metrics we were seeking.
Next up was SuperDuperDB which was great from a client-server-man-in-the-middle perspective but required a complex LDAP authentication matrix that reticulated splines within our identity management roadmap.
After that I quit. I hear they are using Access 95 with VBA.
Wearing pants should always be optional.
A Tale of Two Databases: DynamoDB and MongoDB
We decided we wanted something comforting, so naturally we chose a waffle maker.
Is it just me, or is anyone else tired of seeing authors trying to pass off stuff like that as reasoning?
This guy is seriously throwing all his data into one comma delimited field? What's the database for again?
Why does slashdot keep giving him exposure?
Re: "at times it felt like going back two decades in technology by writing indexes ourselves"
More like double that, to four decades. Custom-written index maintenance code? Really!? This is no kind of positive recommendation for DynamoDB, more like an indictment of it.
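For anyone who hasn't had the pleasure, this is roughly what hand-written index maintenance looks like. The dicts below are a stand-in for a key-value store, not the DynamoDB API, and none of this is the author's code; all names are hypothetical:

```python
from collections import defaultdict

articles = {}                 # primary store: article_id -> record
by_author = defaultdict(set)  # hand-rolled index: author name -> article ids

def put_article(article_id, title, authors):
    """Write the record and keep the author index in step, by hand."""
    old = articles.get(article_id)
    if old is not None:
        # Remove stale index entries first, or lookups return ghosts.
        for name in old["authors"]:
            by_author[name].discard(article_id)
    articles[article_id] = {"title": title, "authors": list(authors)}
    for name in authors:
        by_author[name].add(article_id)

def delete_article(article_id):
    """Delete the record and every index entry that points at it."""
    old = articles.pop(article_id, None)
    if old is not None:
        for name in old["authors"]:
            by_author[name].discard(article_id)

def titles_by(author):
    """Look up articles through the index instead of scanning everything."""
    return sorted(articles[i]["title"] for i in by_author.get(author, ()))
```

Every write path has to remember the index, and in a real distributed store the record write and the index write are separate operations that can fail independently. That is the part a database normally takes care of.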
MongoDB is free, if you don't care about "vendor" support. There's certainly a big enough community around MongoDB where 99% of your problems can be answered by simply googling it. Fail on the author's part.
So what he is saying is: the tools aren't mature, you have to re-invent the wheel with them, the wheel they invented is ok, but you will have to invent your own wheel. Somehow, that's all good though, you should try to follow what we did instead of using mature tools that already exist to build web infrastructure. We have religion about some of the products we use, and hope you will pick some of the tools we used for the same irrational reasons we used. It will take more time, cost more and won't get you any further ahead, but you might feel warm and fuzzy inside afterwards. On the other hand, you might not get as far as us, there is no code sharing so you are all on your own, and the time delays and extra costs might just kill your business/idea. This is news for nerds, just not good news for nerds. More like a cautionary tale.
FTFA:
Hello, I'm a time traveller from 1973 where I've been fondly imagining you folks in the future had written software to solve this kind of problem in a more generic fashion. Back in the past we have some visionary guy by the name of Codd, and in my wilder dreams I sometimes imagine by the year 2000 someone has created some kind of revolutionary database software which is based on his "SEQUEL" ideas and does fancy stuff like maintaining its own indexes.
Then I wake up and realise it was just a flight of fantasy.
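In case the time traveller wants proof, a small sketch with Python's built-in sqlite3 module. The table, index name, and rows are invented for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE article (id INTEGER PRIMARY KEY, title TEXT, author TEXT)")
# Declared once; the engine keeps it current on every write from here on.
conn.execute("CREATE INDEX idx_article_author ON article(author)")

conn.executemany("INSERT INTO article VALUES (?, ?, ?)", [
    (1, "Paper A", "Alice"),
    (2, "Paper B", "Bob"),
    (3, "Paper C", "Bob"),
    (4, "Paper D", "Bob"),
])
# No index bookkeeping needed for updates or deletes either.
conn.execute("UPDATE article SET author = 'Carol' WHERE id = 2")
conn.execute("DELETE FROM article WHERE id = 3")

def titles_by(author):
    rows = conn.execute(
        "SELECT title FROM article WHERE author = ? ORDER BY title", (author,))
    return [r[0] for r in rows]

# Ask the planner how it would run the lookup; the last column is the detail text.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT title FROM article WHERE author = ?", ("Bob",)
).fetchall()
plan_text = " ".join(row[-1] for row in plan)
```

The plan text names idx_article_author, so the lookup goes through the index, and not one line of application code had to maintain it.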
tens of thousands of articles
What a modest amount of data. Why not just import them into a robust open source CMS?
The client was a small organization with only six employees and a tight budget.
Why spend their budget researching and developing a custom DB solution, which will probably be impossible for anyone else to support or extend?
their Website received a good amount of traffic from a niche group of scientists and researchers
How much traffic? Since articles and metadata hardly change very often, why bother doing the optimizations for structural performance in your DB? Varnish or any other cache engine will handle that for you.
It's probably a very fast scalable DB solution, but I am under the impression that clients want robust, cheap, easy to maintain and use products and not databases or other similar technologies or tools we developers choose among.
I wonder if the team also hand coded modules for authentication, wysiwyg editors, metadata editors, transliteration and security features etc. within the tight budget.
I'm all for trying out all kinds of engines, but unless the client specifically asked for a hand coded database experiment I'm sure they are in for unpleasant surprises in the near future.