Cassandra Rewritten In C++, Ten Times Faster
urdak writes: At the opening of Cassandra Summit today, Avi Kivity and Dor Laor (who had previously written KVM and OSv) announced ScyllaDB, an open-source C++ rewrite of Cassandra, the popular NoSQL database. ScyllaDB claims to achieve a whopping 10 times more throughput per node than the original Java code, with sub-millisecond 99th-percentile latency. They even measured 1 million transactions per second on a single node. The performance of the new code is attributed to writing it in Seastar, a C++ framework for writing complex asynchronous applications with optimal performance on modern hardware.
Because it was written in Seastar
Seriously. WTF?
That is a lie!
I think they mean the C++ port is 10X SLOWER than Java.
Java is faster than C and C++, everyone knows that!
Maybe if they ran the code on a java interpreter, written in java, running on a java interpreter...
More recursive use of java == more speed!
Why slow a system down with all that C++ bloatware?
Almost as fast as native! Maybe even faster for some tasks!
sure
This is the trademark reason why Java shouldn't be used in performance-sensitive environments in the first place.
As for whether it would have been any faster if written in C or straight ASM: it's probably not worth chasing down that extra 1%. Generally the justification for straight C or ASM is to remove runtime bloat, and you'd first have to give up using any frameworks to get there.
Just a reminder to potential programmers: learn C before you learn any other programming language, otherwise you will not understand why your code's performance is terrible.
Sans sarcasm I would've also accepted: "duh"
They also boosted performance by never freeing memory, too!
If you post it, they will read.
Oracle has just launched a new series of patent infringement lawsuits. Oracle's allegations include reverse engineering Java to improve the speed of applications like Cassandra and benchmarking Java without permission. They are seeking an immediate cease and desist order, in addition to immediate financial relief for sustaining PPS (more commonly known as Poopy Pants Syndrome).
Databases are usually I/O bound, and improving the storage structure or network protocol matters more than spot optimization of code. A more likely statement is that ScyllaDB performed ten times faster than Cassandra in one particular benchmark for which Cassandra has not yet been specifically optimized, and is ten percent faster in the average case.
In either case, good luck maintaining speed and stability after 5 releases when you implement every corner case of every feature and have to deal with legacy support.
I find it depressing that so little attention is paid to efficient computing. People now just throw memory and cycles at problems because they can, with passable results. But I wonder how much more we could get out of our machines if software were carefully crafted from bottom to top.
Databases used to be disk bound, sure. But these days we have huge RAM caches and SSDs - no spinning disks. It's very common for the vast majority of requests to be served entirely from cache. Read the guys' site - it looks like they know what they're doing.
Imagine if Redis was ten times slower or ten times faster. It would matter.
Wow, two years ago everyone here told us that NoSQL is evil and tried to convince us that we should stick to MySQL.
Now everyone tells us Java is evil, because a rewrite in C++ is faster.
What a surprise.
If I rewrote Cassandra from scratch, in Java, it would also be faster than the current code.
Why? Because I could reuse and improve on everything the original team learned over the course of a decade.
Keep in mind, the rewrite uses a new framework and new concepts for concurrency. Concurrency is one of the core areas where computing will certainly make lots of progress in the future.
For my part, I'm waiting for a Lucene rewrite, regardless of the language. It is probably the worst OSS code I have ever seen ... actually the worst code, regardless of whether it is OSS or closed source.
I will only use MongoDB because it is web scale.
Yes. It's now easy to scale to a million or more IOPS on a single server. That makes the CPU the bottleneck again.
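For a rough sense of why the CPU becomes the limit at that rate, here is a back-of-the-envelope sketch. The 1 million requests per second figure is the one from the summary; the core count and clock speed are made-up numbers for a hypothetical node, not anything ScyllaDB published, and the even spread across cores is an idealization.

```python
def cycles_per_request(requests_per_second: int, cores: int, clock_hz: float) -> float:
    """Average CPU cycles available per request, assuming work spreads evenly over all cores."""
    return cores * clock_hz / requests_per_second


# Hypothetical node: 16 cores at 3 GHz (assumed for illustration only).
budget = cycles_per_request(1_000_000, cores=16, clock_hz=3.0e9)
print(f"{budget:,.0f} cycles per request")
```

Under those assumptions each request gets a few tens of thousands of cycles for parsing, lookup, serialization and networking combined, so per-request software overhead starts to matter in a way it did not when every request waited on a spinning disk.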