Cassandra Rewritten In C++, Ten Times Faster
urdak writes: At the opening of Cassandra Summit today, Avi Kivity and Dor Laor (who had previously written KVM and OSv) announced ScyllaDB, an open-source C++ rewrite of Cassandra, the popular NoSQL database. ScyllaDB claims to achieve a whopping 10 times more throughput per node than the original Java code, with sub-millisecond 99th-percentile latency. They even measured 1 million transactions per second on a single node. The performance of the new code is attributed to writing it in Seastar, a C++ framework for writing complex asynchronous applications with optimal performance on modern hardware.
It comes from an old (15+ years) defense of Java. The claim was that Java was no longer slow thanks to JIT, with HotSpot making it possible for Java code to run faster than equivalent code written in C or C++.
OP is playing the part of a turn-of-the-century die-hard Java zealot cracking under the harsh light of reality, desperately clinging to their long-cherished beliefs.
Not all true. Over the years I have compared "slow" languages Lisp, Java, and .Net to the "fast" C. For various odd reasons the slow languages were faster each time.
Modern JIT compilers have a huge advantage over C because they can do whole-program optimization and inline aggressively. Sure, one can declare C functions inline, but I compared Java and .Net to real production code where the programmers forgot to. So in practice the slow languages really were faster. And inlined routines kill binary compatibility, particularly for access methods of opaque types -- not a problem for Java/.Net. Modern garbage collection is often actually faster than malloc/free, and certainly faster than any reference counting scheme.
Sure, there are some idiot restrictions in Java that make .Net faster, such as no structs and no way to turn off array bounds checking. But C is a technology that was out of date when it was first introduced more than 40 years ago. If any C compiler can beat .Net for some particular case it will be by a few tens of percent at most.
If this speed increase is real, it has nothing to do with C vs Java.
Yeah, it's more to do with using a framework that helps with the aggressive use of computer resources than being in one language over another.
Some of the latency gains might be down to C++ vs Java, but the throughput is probably because the CPU is less idle.
Really? Sounds a bit rich to claim that an interpreted language would be faster than a compiled one.
The reasoning is that any bottleneck in code will be in a loop (or recursion, or whatever).
Java is, roughly, only interpreted for the first iterations of a loop, until the JIT compiles it. After that, it's machine code, just like C.
Add to that, there are some optimizations that can be done at run-time by the JIT that can't be done at compile time.
These are typically the reasons people claim Java is faster than C or C++.
Also, it seems the Java creators at Sun were really competitive and got upset when people said their language was slower than C++, so they spent a lot of time optimizing the efficiency of their standard library, more than the C++ compiler writers of the time.
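The warm-up behaviour described above can be mimicked in a toy C++ sketch. This is only an analogy, not how any real JVM is implemented: `TieredSum`, `kHotThreshold`, `slow_sum` and `fast_sum` are hypothetical names, the threshold of 3 is made up (real JITs use much larger, tunable counters), and the "compiled" version is simply a hand-written faster function standing in for generated machine code.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical threshold: calls served by the slow tier before "compiling".
constexpr std::size_t kHotThreshold = 3;

// Stand-in for interpreted code: sums 1..n step by step.
long slow_sum(long n) {
    long total = 0;
    for (long i = 1; i <= n; ++i) total += i;
    return total;
}

// Stand-in for JIT-compiled code: same result via the closed form.
long fast_sum(long n) {
    return n * (n + 1) / 2;
}

// Counts calls and switches tiers once the function is "hot".
struct TieredSum {
    std::size_t interpreted_calls = 0;
    std::size_t compiled_calls = 0;

    long call(long n) {
        if (interpreted_calls < kHotThreshold) {
            ++interpreted_calls;      // still cold: take the slow path
            return slow_sum(n);
        }
        ++compiled_calls;             // hot: every later call is fast
        return fast_sum(n);
    }
};
```

The point of the sketch is only that the slow tier is paid for a bounded number of calls; a long-running loop spends almost all of its time in the fast tier.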
The headline is rather misleading. This isn't just a plain port of the code from Java to C++ to get a magical 10x speedup. Amongst other things they appear to be running an entire TCP stack in userspace and using special kernel drivers to avoid interrupts. This is the same team that produced OSv, an entirely new kernel written in C++ that gets massive speedups over Linux ... partly by doing things like not using memory virtualisation at all. Fast but unsafe. These guys are hard core in a far more advanced way than just "hey let's switch languages".
Why are you talking about an interpreted program? We're specifically talking about JIT-compiled Java. Modern JITs use profile-guided (and in some cases trace-based) optimisation, which means that they can generate straight-line binary code for hot paths that span multiple method calls and returns. This is something that an AoT-compiled implementation can't do without a lot of profiling information. A JIT compiler can also optimise based on assumptions that are true for one phase of the program, then throw away the result if it stops being true for a later phase.
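For anyone who hasn't seen the "optimise on an assumption, throw it away later" idea, here is a hand-written C++ analogue. It is a sketch only: a real JIT generates and discards machine code at run time, whereas this just flips a flag. `SpeculativeCallSite`, `Shape`, `Square` and `Rect` are hypothetical names invented for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <typeinfo>

struct Shape {
    virtual ~Shape() = default;
    virtual double area() const = 0;
};

struct Square final : Shape {
    double side;
    explicit Square(double s) : side(s) {}
    double area() const override { return side * side; }
};

struct Rect final : Shape {
    double w, h;
    Rect(double w_, double h_) : w(w_), h(h_) {}
    double area() const override { return w * h; }
};

// One call site that speculates "every receiver is a Square".
struct SpeculativeCallSite {
    bool assume_square = true;   // the current assumption
    std::size_t fast_hits = 0;   // calls served by the speculated path
    std::size_t slow_hits = 0;   // calls served by virtual dispatch
    std::size_t deopts = 0;      // times the assumption was discarded

    double area(const Shape& s) {
        if (assume_square) {
            if (typeid(s) == typeid(Square)) {   // cheap guard
                ++fast_hits;
                // Non-virtual, inlinable call: the speculated fast path.
                return static_cast<const Square&>(s).Square::area();
            }
            // Guard failed: drop the assumption for all later calls.
            assume_square = false;
            ++deopts;
        }
        ++slow_hits;
        return s.area();                          // generic virtual call
    }
};
```

While only `Square` objects reach the call site, every call takes the guarded fast path; the first `Rect` causes a one-time fallback, after which the site uses ordinary virtual dispatch. An ahead-of-time compiler would need profile data to justify emitting the guarded path in the first place.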
There are also other trade-offs. For example, if you're writing memory-safe C++ and sharing pointers across threads, then you're going to be using std::shared_ptr, which performs an atomic operation (MESI bus traffic) on every assignment. In a typical JVM, copying pointers doesn't require atomic operations, but the cost of this is the GC pass. Depending on your workload, the GC cost can be a lot cheaper, a lot more expensive, or about the same as doing it correctly with smart pointers.
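To make the std::shared_ptr cost concrete, here is a minimal sketch in which several threads copy the same pointer in a loop. `Node` and `hammer_refcount` are hypothetical names for this illustration. It does not measure anything; it only shows where the reference-count updates happen, and that they have to be thread-safe for the count to come out right.

```cpp
#include <atomic>
#include <cassert>
#include <memory>
#include <thread>
#include <vector>

// Hypothetical payload type, used only for this illustration.
struct Node {
    int value;
};

// Each worker thread copies `shared` `copies_per_thread` times.
// Every copy increments the shared reference count and every
// destruction decrements it; those updates must be thread-safe,
// which in practice means atomic read-modify-write instructions.
// Returns the sum of the values read through the copies.
long hammer_refcount(const std::shared_ptr<Node>& shared,
                     int threads, int copies_per_thread) {
    std::atomic<long> checksum{0};
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&shared, &checksum, copies_per_thread] {
            for (int i = 0; i < copies_per_thread; ++i) {
                std::shared_ptr<Node> local = shared;  // count goes up
                checksum += local->value;
            }                                          // count goes down
        });
    }
    for (auto& w : workers) w.join();
    return checksum.load();
}
```

After all workers have joined, `use_count()` on the caller's pointer is back to what it was before, which only works because the increments and decrements from different threads never get lost. In a garbage-collected runtime the equivalent of `local = shared` is a plain pointer copy, with the bookkeeping deferred to the collector.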
Unfortunately, a big part of the current 'Java is slow' claim is from idiots who don't understand that different GC implementations are all on a spectrum trading throughput for latency and who then build big distributed systems where tail latency in the edge nodes is important, then run a throughput-optimised stop-the-world collector on the edges and wonder why it sucks.
I am TheRaven on Soylent News