MemSQL Makers Say They've Created the Fastest Database On the Planet
mikejuk writes "Two former Facebook developers have created a new database that they say is the world's fastest and it is MySQL compatible. According to Eric Frenkiel and Nikita Shamgunov, MemSQL, the database they have developed over the past year, is thirty times faster than conventional disk-based databases. MemSQL has put together a video showing MySQL versus MemSQL carrying out a sequence of queries, in which MySQL performs at around 3,500 queries per second, while MemSQL achieves around 80,000 queries per second. The documentation says that MemSQL writes back to disk/SSD as soon as the transaction is acknowledged in memory, and that using a combination of write-ahead logging and snapshotting ensures your data is secure. There is a free version but so far how much a full version will cost isn't given." (See also this article at SlashBI.)
It sounds cool, but we can get 200k IOPS on RAID 10 SSDs without degradation.
When the foot seeks the place of the head, the line is crossed. Know your place. Keep your place. Be a shoe.
Really? Accessing RAM is faster than accessing a disk? What a novel discovery!
It seems to me that MySQL can also be run in memory. Apparently that's how the clustered database works (or used to work). I've never tried it, but let's see some benchmarks between MemSQL and an entirely memory-based MySQL.
"You cannot simultaneously prevent and prepare for war." -- Albert Einstein
MemSQL is definitely good news, and hopefully it will encourage the MySQL team to play catch up with its performance. Maybe it will provide an improved web experience if it gets wide adoption and deployment. As a long time SysAd/webmaster/developer, I'm certainly interested, but for obvious reasons I'm not putting any business critical servers onto something this fresh and new, regardless of performance benefits. I think I'll download a copy and use it locally for testing, but like any software, there are going to be bugs, maybe even data loss or security issues that may emerge on certain server setups. I'll see how the changelog looks in 6 months or a year before considering it for my mission critical servers. Regardless, kudos to the developers. Grabbing my download before/if it gets /.'ed
When I think of fast databases to compare to, the first thing I think of is MySQL.
/Actually, I'd rather see a comparison to Pick or other lightning fast MV dbs
Show me benchmarks vs Oracle, PostgreSQL or SQLServer. Spare me the comparison with MySQL or some other toy.
Ok, so both the article and the video are extremely thin on details, the explanation for the massive performance is pretty much gibberish and their argumentation for ACID compliance is bullshit.
Just leaves me with the question, what are they trying to get out of this BS?
Give me fast enough, robust, easy to administer and standards compliant. Maybe a little less fast means you throw more hardware at a problem, but raw speed doesn't matter if the overall cost and risk are inflated. A platform decision boils down to three things: (1) is it good enough; (2) is it economical; (3) if we decide later this doesn't work for us, are we totally screwed.
In any case, there's no meaningful way you can make a claim that a database management system is the fastest on the planet. All you have is benchmarks, and different benchmarks apply to different use-cases.
Post may contain irony: discontinue use if experiencing mood swings, nausea or elevated blood pressure.
What you have there is (or may be) the fastest database management system.
I have the world's fastest database. One table, one record, and one field (NULL).
Have gnu, will travel.
I wouldn't run my toaster on software engineered by someone from Facebook, let alone a database. I'd have to spend ten minutes searching for my toast, and it would show up the following week.
The hosts file thing can make networking hugely efficient, given how much time is eaten up on slow DNS servers.
I can easily see a DNS solution using P2P Bittorrent transfers and cryptographic signing that both allows distributed DNS, eliminates spoofing, and creates massive performance gains by mirroring hosts files to clients.
And then the next week, your toast would have changed from white bread to wholegrain and you're just going to have to get used to it.
Or, by its current name, CACHE? The best, the fastest, and the most reliable commercial database on the planet? Come on, guys, get real.
Windows NT Magazine (now Windows IT Pro) April 1997 "BACK OFFICE PERFORMANCE" issue, page 61
(&, for work done for EEC Systems/SuperSpeed.com on PAID CONTRACT (writing portions of their SuperCache program increasing its performance by up to 40% via my work) albeit, for their SuperDisk & HOW TO APPLY IT, took them to a finalist position @ MS Tech Ed, two years in a row 2000-2002, in its HARDEST CATEGORY: SQLServer Performance Enhancement).
This episode of "Tech Tips With Timecube Guy" brought to you by the letters W, T, and F, and by CapsCORP: If Bizarro Caps Are Your Thing, cALl uS!
History
The ARPANET, the predecessor of the Internet, had no distributed host name database. Each network node maintained its own map of the network nodes as needed and assigned them names that were memorable to the users of the system. There was no method for ensuring that all references to a given node in a network were using the same name, nor was there a way to read the hosts file of another computer to automatically obtain a copy.
The small size of the ARPANET kept the administrative overhead small to maintain an accurate hosts file. Network nodes typically had one address and could have many names. As local area TCP/IP computer networks gained popularity, however, the maintenance of hosts files became a larger burden on system administrators as networks and network nodes were being added to the system with increasing frequency.
http://en.wikipedia.org/wiki/Hosts_(file)
Dilbert RSS feed
I would like to see the comparison against DB2. Midrange DB2 if you really want like for like, mainframe if you have guts :)
You have a sick, twisted mind. Please subscribe me to your newsletter.
Some clever tricks and cache management. All the speed improvement seems to come from read/write speeds rather than any fundamental breakthrough, parallel implementation, massively parallel database or any such thing. And the test was not a standard test but some hand-picked database and their own queries. Probably the original funders are planning to sell it down to the next set of chumps.
sed -e 's/Chuck Norris/Rajnikant/g' joke > fact
They're durable and synchronously log all changes to disk, so what makes them faster? They do say this, from: http://developers.memsql.com/docs/1b/durability.html
Reconfigure the server to use a faster disk. MemSQL exclusively relies on sequential (not random) disk writes, so using an SSD will dramatically improve durability write performance.
Are SSDs better at sequential writes? I thought their advantage was random reads, and they weren't any faster at writes than HDDs. Also, the data would become hopelessly out of order by only doing sequential writes, unless they're periodically re-writing all the data in order, which would mean lots more I/O than a typical DB.
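For what it's worth, here is a rough sketch of how write-ahead logging plus snapshotting usually fits together. This is a toy key-value store, not MemSQL's actual implementation; `TinyStore`, the file names and the JSON record format are all made up for illustration. The point is that the log is only ever appended to and only read back at recovery, so it does not matter that records land on disk "out of order"; the periodic snapshot is the only full rewrite.

```python
import json
import os
import tempfile


class TinyStore:
    """Toy in-memory key-value store with an append-only log and snapshots."""

    def __init__(self, directory):
        self.log_path = os.path.join(directory, "wal.log")
        self.snap_path = os.path.join(directory, "snapshot.json")
        self.data = {}
        self._recover()
        self.log = open(self.log_path, "a", encoding="utf-8")

    def _recover(self):
        # Load the last snapshot, then replay any log records written after it.
        if os.path.exists(self.snap_path):
            with open(self.snap_path, encoding="utf-8") as f:
                self.data = json.load(f)
        if os.path.exists(self.log_path):
            with open(self.log_path, encoding="utf-8") as f:
                for line in f:
                    key, value = json.loads(line)
                    self.data[key] = value

    def put(self, key, value):
        # Sequential write: append the record and force it to disk
        # before applying the change to the in-memory table.
        self.log.write(json.dumps([key, value]) + "\n")
        self.log.flush()
        os.fsync(self.log.fileno())
        self.data[key] = value

    def snapshot(self):
        # Write the full state to a temp file, atomically swap it in,
        # then truncate the log so it does not grow without bound.
        tmp = self.snap_path + ".tmp"
        with open(tmp, "w", encoding="utf-8") as f:
            json.dump(self.data, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, self.snap_path)
        self.log.close()
        self.log = open(self.log_path, "w", encoding="utf-8")

    def close(self):
        self.log.close()


def demo():
    with tempfile.TemporaryDirectory() as d:
        store = TinyStore(d)
        store.put("a", 1)
        store.snapshot()
        store.put("b", 2)   # exists only in the log, not in the snapshot
        store.close()       # simulate a restart
        recovered = TinyStore(d)
        result = dict(recovered.data)
        recovered.close()
        return result
```

Calling `demo()` rebuilds the state from the snapshot plus the log after the simulated restart. The `fsync` on every `put` is the expensive part, which is presumably why the docs recommend a faster disk.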
tomorrow who's gonna fuss
http://www.youtube.com/watch?v=b2F-DItXtZs
Is it written in php by any chance??
If you mod me down the terrorists will have won
Not only would it tell all your "friends" and relatives what you are eating and when but the control for turning notifications off would be deeply buried next to the mains power wire and mysteriously switch itself back on at random intervals.
The most over-the-top DB God I know started in Pick-land (ca 1972?). Although he does (is forced to?) use SQL nowadays, he thinks in ways that do not come out of any SQL DBA handbook. As a result he gets DBMSs to do things that are ... unnatural.
He is currently doing some data-cubing stuff for us that I didn't think could be done with something less than a DOD budget. He says his touchstone is thinking in Pick and then 'translating' to SQL.
I still think that the 2 missing courses from any CS degree program are 1) how to debug, and 2) history of computing.
Speed's fine, but what kind? Or more specifically, over what timeframe? High transaction rates are fine, but they don't do any good if you can only sustain them for a few seconds or minutes before the whole thing collapses. I want to know the transaction rate the thing can sustain over 24 hours of continuous operation. In the real world you have to be able to keep processing transactions continuously.
That long-time-period test also shows up another potential problem area: disk bottleneck. In-memory's fine, but few serious databases are small enough to fit completely in memory. And even if it will fit, you can't lose your database when you shut down to upgrade the software so eventually the data has to be written to disk. And that becomes a bottleneck. If your system can't flush to disk at least as rapidly as you're handling transactions, your disk writes start to lag behind. Sooner or later that'll cause a collapse as the buffers needed to hold data waiting to be written to disk compete for memory with the actual data. You can play algorithmic games to minimize the competition, but sooner or later you run up against the hard wall of disk throughput. And the higher your transactions rates are, the harder you're going to hit that wall.
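To put rough numbers on that wall, here is a back-of-the-envelope sketch. The function name and every figure below are hypothetical; it just assumes a constant incoming transaction rate, a constant flush rate, and a fixed amount of memory set aside for records waiting to be written.

```python
def seconds_until_buffer_full(txn_rate, flush_rate, buffer_capacity):
    """Return seconds until the write backlog exceeds capacity, or None.

    All quantities use the same unit (for example, records and
    records per second). The numbers in the example are made up.
    """
    if txn_rate <= flush_rate:
        return None  # the disk keeps up, so the backlog never grows
    return buffer_capacity / (txn_rate - flush_rate)
```

With 80,000 transactions/sec coming in, a disk that can flush 60,000/sec, and room for 72 million pending records, `seconds_until_buffer_full(80_000, 60_000, 72_000_000)` comes out to one hour. A benchmark that runs for a few minutes would never see it.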
Uh huh. MySQL is missing huge chunks of functionality by default, so this is not all that impressive. Wake me up when it's PostgreSQL-compatible.
They did have an ad to lure in "Top Coders" at http://developers.memsql.com/blog/
Apart from their ad, what they said about Top Coders was interesting - with the exception of top coders memorizing whole books filled with algorithms, because top coders do not memorize anything - top coders do not get to be top coders by memorizing.
Instead, top coders have that instinct to _know_ which algorithm to adapt and apply, and top coders know where (and how to) look for the algorithm (either from their own archive, from books, from old magazines, or from some strange corners on the Web)
Muchas Gracias, Señor Edward Snowden !
Newsflash: servers come with up to 2 TB RAM now.
Help stamp out iliturcy.
Oh but come on. Their engineers are super leet! To work at Facebook, you have to win a drunken speed-hacking contest just to be a PHP coder!
"You cannot simultaneously prevent and prepare for war." -- Albert Einstein
So what is the difference between MemSQL and TimesTen?
Other than the 16 years TimesTen has been out longer, the fact that Oracle now owns TimesTen, that it runs on both 32bit and 64bit Linux and Windows, that it can run in front of another database engine to give it a boost, and that it has customer installations up to the Terabyte range.
Just another lame attempt to reinvent the wheel.
Remember the good old days, when XYZ-db wasn't always available (or even desirable)? We used to use files.
Yeah, files. Novel concept. These days, mention ISAM to someone and they don't know what you're talking about!
If you really need speed, maybe a database isn't your best bet. Maybe, just maybe, you should consider structuring the data in a way that makes sense for your application using files.
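For anyone who hasn't seen it done, a minimal sketch of the fixed-length-record idea follows. This is only the "record N lives at offset N * size" part; a real ISAM file adds an index on top. The record layout, field names and helper functions are all hypothetical.

```python
import struct
import tempfile

# Hypothetical layout: 4-byte unsigned id, 16-byte name, 8-byte float balance.
RECORD = struct.Struct("<I16sd")


def write_record(f, index, rec_id, name, balance):
    # Seek straight to the slot; record N always lives at N * RECORD.size.
    f.seek(index * RECORD.size)
    f.write(RECORD.pack(rec_id, name.encode("utf-8"), balance))


def read_record(f, index):
    f.seek(index * RECORD.size)
    rec_id, raw_name, balance = RECORD.unpack(f.read(RECORD.size))
    # Short names are padded with NUL bytes by struct, so strip them.
    return rec_id, raw_name.rstrip(b"\x00").decode("utf-8"), balance


def demo():
    with tempfile.TemporaryFile() as f:
        write_record(f, 0, 1001, "alice", 12.5)
        write_record(f, 1, 1002, "bob", 99.0)
        return read_record(f, 1)
```

One seek and one read per lookup, with no query parser in the way. What you give up is everything else a database does for you.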
But not dislike the toast.
In memory database and ACID? More like ACIohshitpowerwentoff
"No more porridge". Right.
This thing is ACID at least?
Lisias@Earth.SolarSystem.OrionArm.MilkyWay.Local.Virgo.Universe.org
Oh but come on. Their engineers are super leet! To work at Facebook, you have to win a drunken speed-hacking contest just to be a PHP coder!
You just insulted coders everywhere.
The cesspool just got a check and balance.
I'm not so sure that this database would fly that fast if it was running on a beowulf cluster of Raspberry Pi with OSX.
c++;
MySQL: the world's most popular open source database
MemSQL: the world's fastest database
PostgreSQL: the world's most advanced open source database
SQLite: the most widely deployed SQL database engine in the world
I just wish people would dispense with their childish marketing bullshit already.
Actually, which license does this use in the first place? CDDL? GPL2? GPL3? Any other?
Are you thinking of Barry White? Or Barry Smith? White left the Metheus workstation group and went to work at Mt Xinu; Smith was at OMSI and later worked at Oregon Software on OMSI Pascal.
Those are the only Barrys I know.
This was intended to be humorous.
it very well may be the fastest and the bestest, my dad can beat up their dad to the googolplex infinitieth power!
I've had a love-hate relationship with MySQL for over ten years now, and have as much cause to hate it as anyone, but I have to point this out. Read the MemSQL docs carefully, and here's the killer - they only support single-query transactions, and only at isolation level READ COMMITTED.
Until those two facts change, it's hardly a fair comparison.
I guess they forgot about VoltDB and HBase.
On my laptop I can do 43.000 transactions/sec or about 90.000 SQL queries/sec.
http://www.sgi.com/pdfs/4238.pdf
It seems to me that MemSQL is just an implementation of MySQL with the commit behavior changed to something like "BATCH, NO WAIT." This would normally introduce a period of time when the transaction could be lost before it is written to disk, if there were a power outage or something, but with battery backups on enterprise RAID cards, the transaction should still be saved in the RAM on the RAID card. I think it would still be possible to lose the transaction if the server crashed mid-transaction, so perhaps this is a safe implementation of BATCH, NO WAIT on MySQL?
Is it web scale?
I read it on Wikipedia, so it must be true.
Actually, which license does this use in the first place? CDDL? GPL2? GPL3? Any other?
MemSQL uses the standard, Pay Us And STFU About Wanting Access To Source Code license.
Based on that it should be safe to assume the PUASTFUAWATSC license is an OSI-approved/GPL compatible license. Expect MemSQL to be bundled in the next major release of Trisquel!
dBASE/xBase also teaches one to think in ways that SQL may otherwise squelch. Rather than get into yet another argument about which is "better" or more "mathematically pure", let's just say that it stretches the mind to see how other database languages go about things. Some things that are difficult in SQL are easier in other database languages, and the reasons will expand your mind to other possibilities. (Database LSD? Oh my!)
While there is a lot of diversity in the app programming world, the database world has been pretty much monopolized by SQL such that people stopped thinking outside the SQL box.
I hope the memory of Pick, dBASE, APL, etc. stay alive to avoid a monoculture in database-land, and spark the database future to improve in part by assimilating the best ideas of the past. (BorgBase? Oh my!)
Table-ized A.I.
Seriously, why do people, and by that I mean Slashdot nerds, think 'fast database' and then think 'MySQL'? 'MySQL-compatible' equals 'bad' in my world and, in comparison with Oracle, 'not so fast at all'.
Religion is what happens when nature strikes and groupthink goes wrong.
That wasn't funny until I read "This was intended to be humorous." It made me smile. I didn't know Barry White could code.
first, we get a trunk full of pudding
then we get in the trunk full of pudding
then we start coding.
ohhh yeah
Why not put the whole of MemSQL inside a GPU with GDDR5 memory? At 4 GB per card, three cards would eat any DB cluster.
Liberty freedom are no1, not dicks in suits.
Data center power problems. They can happen, and WAN replication is slow.
Having said that, 99.999% of the DB backed applications out there can quite happily survive a 0.0001% chance of losing a transaction, and should spec their DB solutions accordingly. Honestly, even most financial systems below the banking level can - you reach a point where it's cheaper to just budget $10K per record to fix a problem should the once-in-a-lifetime perfect-storm error occur rather than to try to keep getting better at making sure they don't.
Kinda like the Ford Pinto, but with much better numbers (and less death when things go wrong).
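The arithmetic behind that trade-off is short enough to write down. This sketch assumes the 0.0001% figure is an independent per-transaction loss probability, which the comment above does not actually state, and the function name and transaction volume are made up.

```python
def expected_repair_cost(transactions, loss_probability, cost_per_lost_record):
    """Expected cost of fixing lost transactions by hand."""
    return transactions * loss_probability * cost_per_lost_record
```

For one million transactions at a loss probability of 0.000001 (that is, 0.0001%) and $10K per lost record, `expected_repair_cost(1_000_000, 0.000001, 10_000)` works out to roughly $10,000, or about one cent per transaction. Whether that beats the cost of stronger durability depends entirely on your volume.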
You're special forces then? That's great! I just love your olympics!
Apache, whoops, no more Fds, pipe cannot open, whoops, ok, go super slow until linux, ssh, everything is unresponsive.
Research has shown over and over again that we CANNOT hold 20 factors in our minds at once, regardless of the context. The most gifted can only hold 5-7 things. I believe that the people who are truly top-anything get there by giving up such egotistical nonsense and writing things out on paper or a whiteboard. Everything else is bluster.
Shamgunov has excellent credentials in the database world, in spite of having worked at Microsoft on SQL Server for six years.
FTFY
'The tyrant will always find pretext for his tyranny.' - Aesop's Fables
Does it really take a long time to parse "SELECT * FROM customer" or whatever?
Doesn't prepare() usually precompile selects?
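As an illustration of the same idea, here is a small sketch using Python's built-in sqlite3 module rather than MySQL. sqlite3 has no explicit prepare() call; it keeps a cache of compiled statements keyed by the SQL text, so reusing one parameterized string has a similar effect. The table and function names are hypothetical.

```python
import sqlite3


def lookup_many(ids):
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO customer (id, name) VALUES (?, ?)",
        [(1, "alice"), (2, "bob"), (3, "carol")],
    )
    # The SQL text is identical on every call and only the bound value
    # changes, so the driver can reuse the compiled statement instead of
    # parsing the query again.
    query = "SELECT name FROM customer WHERE id = ?"
    names = [conn.execute(query, (i,)).fetchone()[0] for i in ids]
    conn.close()
    return names
```

If you build the query by pasting the id into the string instead, every call is a different SQL text and gets parsed from scratch.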
Their methodology is like assembling a car, making a drag race against a civic, and claiming you have the fastest car because it beat it very badly.
Looking at the resources consumed on our production database (SQL Server) over the last few minutes... Highest I/O usage: 1 MBps, average is near zero. Nearly all requests are happening in cached memory, and any I/O is to an SSD.
People here are talking about configuring MySQL to run out of RAM, update to RAM, etc. But can't you accomplish much the same thing by not flushing the buffer cache? Or even tuning the machine so that the buffer cache gets lots of the RAM? Doesn't Linux support lazy writes to disk? Why isn't MySQL running mostly out of disk buffer cache?
Sure, man, whatever you say.
Databases didn't use dynamic compilation of queries originally, so that's nothing new. Of course that performs better than dynamically generating a query plan. Even IBM DB2 on the mainframe still works this way. They state they use Write Ahead Logging. So do all modern RDBMSs including SQL Server, Oracle, MySQL, Informix, PostgreSQL, etc. And the last claim is that they use SSD. Again, how is that specific to something they innovated? If you take any database and run it in memory (SSD really just being slower memory), it will run really fast. So what's new here? Maybe I'm missing something...
I can't speak for others, I can only speak for myself
I've been in this field for decades, and the one thing that I've found is this ---
If you're talking about performance - for most apps, it's almost always 10%-25% of the code that takes up 80%-90% of the data-crunching time
Or, to put it another way, when there is a need to really tune up the performance of an app, you need to understand how the code runs
And to do that, you need to profile the whole darn thing, find where it takes most of the time, and then concentrate your effort on fine-tuning those segments
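A minimal sketch of what "profile the whole darn thing" looks like in practice, using Python's standard cProfile and pstats modules. The `slow_part`, `fast_part` and `app` functions are made-up stand-ins for a real application.

```python
import cProfile
import io
import pstats


def slow_part(n):
    # Deliberately quadratic: this is the hot spot we expect to find.
    total = 0
    for i in range(n):
        for j in range(n):
            total += i * j
    return total


def fast_part(n):
    return sum(range(n))


def app():
    fast_part(1000)
    return slow_part(300)


def profile_app():
    profiler = cProfile.Profile()
    profiler.enable()
    app()
    profiler.disable()
    # Collect the report in a string, sorted by cumulative time.
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats(5)
    return out.getvalue()
```

Printing the result of `profile_app()` lists the most expensive calls first, and `slow_part` should sit near the top. That is the segment to spend your tuning effort on.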
But of course, if you are a perfectionist, such as Steve Gibson of GRC fame, you can opt to go all the way to assembly language (I love asm, btw) and do the tweaking
Disclaimer: I am far from being a great programmer, but I did have opportunities to learn from several great programmers that I had the fortune to work with, throughout my career
I generally hate it when people say things like this, but... if you have this kind of problem, you likely have this kind of money too. They sort of go together.
In other words, two guys spent just a year (wtf?) developing a database engine that manages to run queries in memory very fast, just like other database engines that can run in memory (that probably took teams of engineers more than a year to build). I wonder if they had time to write some transaction consistency code, because I'm pretty sure a 6yr old can make a program that inserts data into memory pointers.
Do they commit, or is "memory" a new form of "the cloud ... maan!"? Maybe they commit to the cloud? Fire and forget?
"Everyone knows that vi vi vi is the number of the beast" -- Richard Stallman
I am not sure how an SSD fails, and what happens to the data on that disk... can you RAID an SSD?
Having said this, if a failure on an HDD is imminent, you usually have it RAIDed, so even if one fails, the transaction logs can be recreated.
This sounds like they managed to find a way to write the DB log files onto the SSD itself, then transfer them over to an HDD, but I wonder if, during a failure, you would lose 100% of the transaction logs due to that SSD not working. Could anyone out there answer this question?
MySQL's HandlerSocket (included since 5.5) does NoSQL-style read and write operations bypassing the SQL engine. While it has some limitations, it will do >200,000 queries/sec on a low-spec server and there are benchmarks of it doing >750,000 on an 8-core Nehalem (faster than Memcached!), and it's not restricted to in-memory operations. The nice thing is that you can use that for the simpler parts of your app, then use transactional SQL on the same database for more complex operations.
Another one to look at is Tokutek's TokuDB, another InnoDB drop-in replacement, which is particularly good for inserts, low disk use and low-latency replication. They ran a demo doing 1 billion indexed inserts in 7 hours when InnoDB took a week.
For distributed 'cloudy' apps, one of the better choices is Drizzle, which retains the nice bits of MySQL (and MySQL client compatibility) and rewrites all the rest.
I don't think I'll believe MemSQL until Percona have benchmarked it...
Of the two options presented, I most believe that the MemSQL team has beaten you to producing anything functional. You seem too incoherent to actually produce a useful program. The extent of your contributions to the world appears to be a list of anonymous posts to Slashdot (the vast majority of which are either downmodded to oblivion or only marginally appreciated) and a few old programs in Delphi, including an alarm clock with "extensive hand optimizations". I find no evidence that you have created anything of professional quality in the software realm. You have, however, produced a good many laughs with your seemingly genuine incompetence.
I never claimed I was a professor. I have a research job at a university, used to teach (as an adjunct faculty, actually), and wrote a few papers describing a new model that hasn't been explored before. You're the one twisting words now.
-Mr. H
You do not have a moral or legal right to do absolutely anything you want.