Cassandra Rewritten In C++, Ten Times Faster
urdak writes: At Cassandra Summit opening today, Avi Kivity and Dor Laor (who had previously written KVM and OSv) announced ScyllaDB — an open-source C++ rewrite of Cassandra, the popular NoSQL database. ScyllaDB claims to achieve a whopping 10 times more throughput per node than the original Java code, with sub-millisecond 99%ile latency. They even measured 1 million transactions per second on a single node. The performance of the new code is attributed to writing it in Seastar — a C++ framework for writing complex asynchronous applications with optimal performance on modern hardware.
Cassandra is nothing to sneeze at, since it outperforms other DB engines (which are written in C, like MySQL).
Anyhow, you use the right tool for the job, and the big question is: would ScyllaDB even exist if Cassandra wasn't written first?
Databases are usually I/O bound, and improving the storage structure or network protocol matters more than spot optimization of code. A more likely statement is that ScyllaDB performed ten times faster than Cassandra in one particular benchmark for which Cassandra has not yet been specifically optimized, and is ten percent faster in the average case.
In either case, good luck maintaining speed and stability after 5 releases when you implement every corner case of every feature and have to deal with legacy support.
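The I/O-bound argument above can be made concrete with Amdahl's law. This is a minimal sketch, not a measurement of either database: the function name `overall_speedup` and the example fractions are made up for illustration.

```cpp
#include <cassert>

// Amdahl's law: overall speedup when a fraction `cpu_fraction` of the
// run time is made `cpu_speedup` times faster and the remainder
// (for example, time spent waiting on I/O) is left unchanged.
double overall_speedup(double cpu_fraction, double cpu_speedup) {
    assert(cpu_fraction >= 0.0 && cpu_fraction <= 1.0);
    assert(cpu_speedup > 0.0);
    return 1.0 / ((1.0 - cpu_fraction) + cpu_fraction / cpu_speedup);
}
```

With a hypothetical workload that spends only 10% of its time in CPU-bound code, `overall_speedup(0.1, 10.0)` comes out just under 1.10, i.e. roughly the "ten percent" case; a fully CPU-bound workload, `overall_speedup(1.0, 10.0)`, gets the whole factor of ten. Which regime a given benchmark sits in is exactly the open question.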
I would say that 95% of the people I know in person who learned C first, rather than Assembler, Pascal, Smalltalk or Lisp, are extremely bad at advanced language concepts like functional or OO programming. Most of them shifted to scripting and operating servers and don't "code". A minority is doing embedded programming in C++ which mainly looks like C.
The idea that learning C first has any advantage is complete bollocks, a /. myth.
I started with C in 1987 ... on Sun Solaris (after 6 years of Assembler, Pascal and BASIC) ... in 1989 I switched to C++. I never looked back.
Only masochists would look back at C of that period.
ANSI C is much better ... but still: when I see a self-proclaimed C genius with 30 years' experience program Java or C++ ... shudder.
"C++ is my favorite garbage collected language because it generates so little garbage"
-Bjarne Stroustrup
1. C is not portable, it's tied to the architectures/OSs/APIs the programmer chose to target at write time.
2. Leanness and close-to-the-metal speed are irrelevant in most business scenarios (time to market rules, cores and memory are commodity, see ABAP and related monsters successfully running most of the world's transactions regardless of C).
3. C is not a language meant to implement business solutions, it's a wrapper for ASM for idiots who can't write ASM themselves. (rhetorical)
4. Writing string processing libraries is tough stuff, text can have different endings. (rhetorical)
5. You haven't done anything really complicated that requires your focus to shift away from "bare-metal" to "time-to-value". By your own logic, ASM is better than C.
High performance software requires several things, among them good native code generation and good libraries. Java used to have neither, then it got the JIT. Unfortunately, Java's semantics and built-in data types make writing high performance software in it really hard.
C++ started out with good native code generation, and its standard library and built-in data types make writing high performance software a bit easier if you know what you are doing. Most C++ programmers don't know what they are doing, though, so their software ends up bloated and inefficient anyway.
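One way to read the point about built-in data types is memory layout, though that is an assumption about what the poster meant. In C++, a `std::vector` of a plain struct stores the values back to back, whereas a Java `ArrayList<Point>` holds references to separately allocated objects. The sketch below uses a hypothetical `Point` type and helper names chosen for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical record type, used only for illustration.
struct Point {
    std::int32_t x;
    std::int32_t y;
};

// Sum the x fields by walking one contiguous block of values;
// there is no per-element pointer to chase.
std::int64_t sum_x(const std::vector<Point>& points) {
    std::int64_t total = 0;
    for (const Point& p : points) {
        total += p.x;
    }
    return total;
}

// Distance in bytes between the first two elements. For a std::vector
// this equals sizeof(Point): elements are stored inline, with no
// per-object header or indirection between them.
std::size_t stride_bytes(const std::vector<Point>& points) {
    assert(points.size() >= 2);
    const char* first = reinterpret_cast<const char*>(&points[0]);
    const char* second = reinterpret_cast<const char*>(&points[1]);
    return static_cast<std::size_t>(second - first);
}
```

For a vector such as `{{1, 10}, {2, 20}, {3, 30}}`, `stride_bytes` returns `sizeof(Point)` and `sum_x` returns 6. Whether this layout actually wins in practice depends on the access pattern, which is where "if you know what you are doing" comes in.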