RethinkDB Gets Acquired By the Cloud Native Computing Foundation; Joins the Linux Foundation (techcrunch.com)

Posted by msmash on Monday February 6, 2017 @07:20AM from the new-life dept.

An anonymous reader writes: The Cloud Native Computing Foundation (CNCF) today announced that it has acquired the RethinkDB copyright and assets, including its code, and contributed them to The Linux Foundation. RethinkDB, which had raised about $12.2 million in venture capital for its open-source database, went out of business in October 2016. The CNCF says it paid $25,000 to complete this transaction. The code will now be available under the Apache license.

21 comments

No Thanks by Anonymous Coward · 2017-02-06 07:26 · Score: 4, Funny

I'll stick to Microsoft SQL Server. At least it doesn't cost $25K per transaction!
1. Re:No Thanks by JustAnotherOldGuy · 2017-02-06 07:36 · Score: 3, Funny
  
  > I'll stick to Microsoft SQL Server. At least it doesn't cost $25K per transaction!
  I know, even Oracle doesn't charge that much.
  
  --
  Just cruising through this digital world at 33 1/3 rpm...
2. Re:No Thanks by Anonymous Coward · 2017-02-06 19:47 · Score: 0
  
  Hey, don't give them any ideas!
3. Re:No Thanks by Anonymous Coward · 2017-02-12 07:09 · Score: 0
  
  >> I'll stick to Microsoft SQL Server. At least it doesn't cost $25K per transaction!
  > I know, even Oracle doesn't charge that much.
  WHAT???? Quick, Rose, get me the number of that SOB who sold the database to us...
Slower than MongoDB, has joins by bluefoxlucid · 2017-02-06 07:49 · Score: 3, Informative

The whole pitch seems to be "Polling your database is slow; push in real-time!" They can make a query and then continue to give results when there are updates. I guess that's fine if you have a WebSockets service providing that, instead of just polling; on the other hand, that's rarely really a design constraint or an engineering problem--frequently, WebSockets are the wrong way to do something, and polling is the right way. For example: WebSockets to have a notification pop up when you get a new reply on a forum while idling on the forum would be wasted additional complexity versus just polling every 15 seconds or so and indexing on status.
Word on the net is this can be slower than MongoDB (although RethinkDB has joins...). Likewise, you could always set up a Redis server for caching, and use the publisher-subscriber model to accomplish the same thing.
I want to say there are already adequate alternatives out there, but it's silly. MongoDB is the document store you want. CouchDB, Couchbase, and others are slow (although Couchbase is much faster than CouchDB alone). MongoDB is easy to configure (which is good, because apparently people can't get as far as enabling security on MongoDB when that's a single-step process--but an explicit one, meaning if you don't do it it isn't there). MongoDB has built-in replication and sharding, and handles write-concerns that require journaling or replication to 50%+1 nodes. It's just fairly peerless in the space within which it operates.
It's the same way with PostgreSQL: it's performant, easy-to-configure, capable of handling enormous amounts of data, standards-compliant, featureful, and stable. PostgreSQL comes by default set to asynchronous updates in clusters (same guarantees on consistency and data safety as MongoDB Majority write-concern), but can be configured to a slower Synchronous mode. If you need a relational database rather than a document store, PostgreSQL will out-scale MS SQL Server and can keep pace with Oracle; the RDBMS space actually has a few decent competitors.
This contrasts with something like git, which is great and all, but wins on popularity for the most part; bzr and a few other DVCS are just as capable. In that space, git trounces svn and cvs largely because centralized VCS is vastly-inferior to DVCS. You want to use git because it will give you access to everything around you instead of leaving you on your own special little island.

--
Support my political activism on Patreon.
1. Re: Slower than MongoDB, has joins by Anonymous Coward · 2017-02-06 08:00 · Score: 1
  
  I think you are a smart guy who doesn't build a lot of complex systems.
  Having used all of the solutions you discuss, the real advantages of RethinkDB were nothing like you think. Sure, it was way simpler than MongoDB to install and run. Its performance was good enough... etc etc.
  But its real advantage was that it had a model that eliminated several parts of the stack. It was a DB, message queue, and REST interface all rolled into one. It let you speak JSON anywhere. It made building performant, feature-rich web apps easy, and it could also be used as an engine in medium to high complexity apps. It let regular developers tackle much bigger problems than they normally could with way fewer resources.
2. Re:Slower than MongoDB, has joins by Anonymous Coward · 2017-02-06 08:32 · Score: 0
  
  push systems always outperform pull systems unless the system being pushed to can handle the traffic. period.
3. Re:Slower than MongoDB, has joins by larkost · 2017-02-06 08:36 · Score: 5, Informative
  
  As a former RethinkDB employee I am more than a little biased, but I don't think that you understand the competitive space around MongoDB. Everything you have sited as an advantage for MongoDB is done better by just about every one of their competitors (RethinkDB included). MongoDB's main advantage is that they were the first big one in the field, and no-one has been able to make something better enough to de-seat them. It is not enough to be better, you have to be noticeably better in order to de-seat a reigning competitor. Think of the phrase "no one gets fired for buying IBM".
  And I also don't think you understand the cost of polling, especially for non-trivial (e.g.: not key-lookup) queries. While RethinkDB's `join` queries are not included in `changefeeds`, just about everything else is. So for example if you wanted to keep a leaderboard, say the top 10 scores in a game, you would have to re-compute that every time in most databases (at a minimum scan the index). With RethinkDB it automatically gets modified based on writes in the database, and sent to you. The efficiency improvement is truly huge. And since those queries can be fairly complicated (say: top 10 scores within the week), that gets very expensive with polling.
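The leaderboard idea can be sketched in plain Python. This is not RethinkDB's implementation or API; `TopNFeed` and its methods are hypothetical names, and the sketch assumes insert-only writes (no updates or deletes of earlier scores). It only shows the principle: each write is compared against the current cutoff, and subscribers are notified only when the top N actually changes.

```python
import bisect


class TopNFeed:
    """Keeps the top `n` scores sorted and pushes the new board on each change."""

    def __init__(self, n, on_change):
        self.n = n
        self.on_change = on_change  # callback standing in for a pushed update
        self._top = []              # ascending list of (score, player)

    def write(self, player, score):
        entry = (score, player)
        # A write at or below the current cutoff cannot change the top N,
        # so nothing is recomputed and nothing is pushed.
        if len(self._top) == self.n and entry <= self._top[0]:
            return
        bisect.insort(self._top, entry)
        if len(self._top) > self.n:
            self._top.pop(0)  # drop the entry that fell off the board
        self.on_change(self.leaderboard())

    def leaderboard(self):
        """Current board as (player, score) pairs, highest score first."""
        return [(player, score) for score, player in reversed(self._top)]
```

A polling client would instead re-run the top-N query on every interval, whether or not any write had touched the board.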
  An example that is in usage right now from a major stock trader: their iOS app uses RethinkDB to get streaming stock-price updates. The app (indirectly through a server) just opens a changefeed on the list of stocks that you follow, and RethinkDB coordinates who needs to get what updates when they feed in the stream of changes of market prices. They don't have a ton of clients constantly polling in order to show them constantly changing feeds of numbers (some change every second, others not in hours), and they can push out changes as fast as they get them.
4. Re:Slower than MongoDB, has joins by Richard_at_work · 2017-02-06 09:12 · Score: 1
  
  If you want a good document store, then PostgreSQL and Marten are amazing...
5. Re:Slower than MongoDB, has joins by Anonymous Coward · 2017-02-06 09:34 · Score: 0
  
  In postgres, you could put NOTIFY in a trigger for when your table is updated and have your software LISTEN to be notified when the table has changed.
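A rough sketch of that trigger-plus-LISTEN setup follows. The `scores` table, the `scores_changed` channel, and the function names are assumptions made up for illustration. The listener is written against the notification interface documented for psycopg2 (`poll()`, `notifies`, and a connection usable with `select`), and it assumes the connection passed in is already in autocommit mode.

```python
import select

# Hypothetical table and channel names. The trigger fires once per statement
# and sends the operation name (INSERT, UPDATE or DELETE) as the payload.
TRIGGER_SQL = """
CREATE OR REPLACE FUNCTION notify_scores_changed() RETURNS trigger AS $$
BEGIN
    PERFORM pg_notify('scores_changed', TG_OP);
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER scores_changed_trigger
AFTER INSERT OR UPDATE OR DELETE ON scores
FOR EACH STATEMENT EXECUTE PROCEDURE notify_scores_changed();
"""


def wait_for_change(conn, channel="scores_changed", timeout=5.0):
    """Block until a notification arrives or the timeout expires.

    `conn` is assumed to be a psycopg2 connection in autocommit mode.
    Returns the list of payloads received (empty on timeout).
    """
    cur = conn.cursor()
    cur.execute(f"LISTEN {channel};")
    # Sleep in select() until the connection's socket becomes readable.
    if select.select([conn], [], [], timeout) == ([], [], []):
        return []
    conn.poll()  # let the driver read pending notifications
    payloads = []
    while conn.notifies:
        payloads.append(conn.notifies.pop(0).payload)
    return payloads
```

Note that the notification only says that something changed; the application still has to re-run its query to find out what.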
6. Re:Slower than MongoDB, has joins by bluefoxlucid · 2017-02-06 10:26 · Score: 1
  
  > Everything you have sited as an advantage for MongoDB is done better by just about every one of their competitors (RethinkDB included).
  Configuration takes about 10 minutes to figure out how to set up from scratch, as of 2.2 (although 2.4 is better), including getting a replica set working, setting up correct roles and access controls, and so forth. I still can't figure out how to make MySQL clusters actually work, mind you, so I'm not exactly the universal IT genius.
  
  > So for example if you wanted to keep a leaderboard, say the top 10 scores in a game, you would have to re-compute that every time in most databases (at a minimum scan the index). With RethinkDB it automatically gets modified based on writes in the database, and sent to you. The efficiency improvement is truly huge.
  That can be true without being important. For example: running a service with compressed memory (zram, in particular) makes memory access when swapping a pain. When 50% of working set is in swap, you have 26 instructions per access to read, and more to write--when swapping. This is at a minimum twice the performance hit as a worst-case CPU cache miss.
  The efficiency improvement of having enough RAM is truly huge; the practical performance impact... is approximately zero for time scales significantly smaller than one second. Largely, unless the CPU is pegged around or above 99% and you're accessing memory at random in a flat distribution continuously, your program spends a lot of time working on small chunks of the working set and then moving on. On a frequency of several times per second, you'll end up with time slices that would be CPU-idle for tens of ms at a time--which are larger than the time spent waiting for zram swapping; and that's before operating system memory access management (swap caching and prediction, as well as write-back scheduling) and multiple cores (which will have some idle time to perform further decompression before the program faults to swap) factor in.
  As well, a leader board would query for a list of documents based on a range of scores. To stream updates, every single update has to validate against the whole range of scores to determine who is the leader. That's an enormous amount of wasted work to update something when nobody's looking so you can push out an update that says to move #3 down one slot and insert #17 as the new #3. In this scenario, a DBA who isn't incompetent would place a key on the score field, and select for documents in descending order based on that score.
  If performance is a problem, then updates to scores could invalidate a cache made each time this is done, such that multiple updates between the polling frequency would not result in leader board recalculation or in repeat querying. That would allow polling at 0.2s intervals or so while updates occur 30 or 3,000 times in the interim, without wasting all that time recomputing leader boards fifteen thousand times per second to decide if a bunch of clients should get changefeeds updates.
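The invalidate-on-write idea can be sketched like this. `CachedLeaderboard` and its attributes are hypothetical names, and the in-memory `sorted` call stands in for what would be an index-backed query in a real database. Writes only set a dirty flag; the board is recomputed at most once per poll, no matter how many updates arrived in between.

```python
class CachedLeaderboard:
    """Top-n board that is recomputed lazily, only when polled after a write."""

    def __init__(self, n=10):
        self.n = n
        self._scores = {}     # player -> latest score
        self._cache = []      # last computed board
        self._dirty = False   # set by writes, cleared by poll()
        self.recomputes = 0   # counter kept only to show the saving

    def update_score(self, player, score):
        self._scores[player] = score
        self._dirty = True    # cheap: no sorting on the write path

    def poll(self):
        """Return the board, recomputing only if a write happened since last poll."""
        if self._dirty:
            ranked = sorted(self._scores.items(), key=lambda kv: kv[1], reverse=True)
            self._cache = ranked[: self.n]
            self._dirty = False
            self.recomputes += 1
        return self._cache
```

With clients polling every 0.2s, 3,000 updates inside one interval cost a single recomputation, and intervals with no updates cost none.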
  So I don't doubt that the specific problem of fetching leaderboard data in a certain type of scenario where the DBA is too incompetent to use indexes correctly is enormously faster in RethinkDB via changefeeds and continuously comparing new scores against a smaller set of cached data; I simply doubt that the amount of actual time spent doing that is anything but tiny compared to the amount of time the application spends doing literally everything else. It's also the specific type of problem you could optimize out using something like Redis as an intermediary cache, or by employing someone who knows how to manage a database in production environments.
  
  > The app (indirectly through a server) just opens a changefeed on the list of stocks that you follow, and RethinkDB coordinates who needs to get what updates when they feed in the stream of changes of market prices. They don't have a ton of clients constantly polling in order
  
  --
  Support my political activism on Patreon.
7. Re:Slower than MongoDB, has joins by Anonymous Coward · 2017-02-06 18:45 · Score: 0
  
  I fail to see how the argument on query complexity plays in favor of WebSockets. If you perform regular polling, I agree with you, but with long polling, the server just needs to answer the request when needed.
  No, I think the proper argument is with regard to resource management. Polling puts a load on your web server(s), WebSockets don't.
8. Re:Slower than MongoDB, has joins by Anonymous Coward · 2017-02-06 19:52 · Score: 0
  
  > As a former RethinkDB employee I am more than a little biased, but I don't think that you understand the competitive space around MongoDB. Everything you have sited as an advantage for MongoDB is done better by just about every one of their competitors (RethinkDB included). MongoDB's main advantage is that they were the first big one in the field, and no-one has been able to make something better enough to de-seat them. It is not enough to be better, you have to be noticeably better in order to de-seat a reigning competitor. Think of the phrase "no one gets fired for buying IBM".
  It has gone so far that I often meet people who absolutely think that the term NoSQL means MongoDB.
9. Re:Slower than MongoDB, has joins by Anonymous Coward · 2017-02-06 19:55 · Score: 0
  
  Unless you want more than one postgres node running distributed in different areas of the world. Postgres has no ability to do horizontal scaling, it's good for toy apps but not if you want ability to actually scale.
10. Re:Slower than MongoDB, has joins by slashrio · 2017-02-06 19:59 · Score: 1
  
  So if the system being pushed to can not handle the traffic, then that is good?
  
  --
  "Trump!!", the new Godwin.
11. Re:Slower than MongoDB, has joins by Anonymous Coward · 2017-02-06 22:26 · Score: 0
  
  This is Just Not True...
  Yes, it scales horizontally. And quite well. I am not claiming it scales horizontally *better* than whatever other DB, though. It is worse than some, and much better than others. It depends on whether you are using the right DB engine for the job, too... why PostgreSQL when you don't need ACID? and so on.
LOL how to vaporize $12.2 million by Anonymous Coward · 2017-02-06 11:21 · Score: 0

Stupid management spent VC money on transgender washrooms instead of marketing. Morons.
Re: sited by slashrio · 2017-02-06 20:00 · Score: 1

cited
Sorry, couldn't let that one go, it just hurt(ed) my eyes. :)

--
"Trump!!", the new Godwin.