Brian Aker On the Future of Databases
blackbearnh recommends
an interview with MySQL Director of Technology Brian Aker that O'Reilly Media is running. Aker talks about the merger of MySQL with Sun, the challenges of designing databases for a SOA world, and what the next decade will bring as far as changes to traditional database architecture. Audio is also available. From the interview: "I think there's two things right now that are pushing the changes... The first thing that's going to push the basic old OLCP transactional database world, which... really hasn't [changed] in some time now — is really a change in the number of cores and the move to solid state disks because a lot of the... concept around database is the idea that you don't have access to enough memory. Your disk is slow, can't do random reads very well, and you maybe have one, maybe eight processors but... you look at some of the upper-end hardware and the mini-core stuff,... and you're almost looking at kind of an array of processing that you're doing; you've got access to so many processors. And well the whole story of trying to optimize... around the problem of random I/O being expensive, well that's not that big of a deal when you actually have solid state disks. So that's one whole area I think that will... cause a rethinking in... the standard Jim Gray relational database design."
I couldn't... agree... more... I'd say that some... very valid... points... have been... raised.
Can we please have another loud, circular debate over which database is best? It's the only way your favorite database will ever win.
Thank you.
Solid state disks will never match with hard disks. The price point for a spinning metal platter array will always stay ahead of the price point for SSD. A microdrive even does 8GB on what I assume is 2 sides of 1 platter... this would make a 3.5 inch drive with 8 platters 98 times bigger (surface area wise), 784GB. Of course we can do 2TiB such drives, and 20GB microdrives are feasible... 1.96TiB. 5.25 inch drives (Bigfoot hard drives) clock in at 200.5 times the surface area, 4TiB in this model.
Next year the data density on those spinning disks will go higher, and so will NAND. Let's watch the price point, especially when nanotech gives us shiny new read-write heads and super dense NAND....
Support my political activism on Patreon.
Gotta love that link between the hardware limitations and the software concepts that may seem fancy but are essentially only built to get around them. I believe someone once called it "the law of leaky abstractions" - would be interesting to see what the new limitations would be if you start combining solid-state storage with pervasive multiprocessing, i.e. what can you do with a multi-processor multi-sdd server that you can not do with a single-processor single-hard drive server?
I think TFA is pretty right on the money that parallellization and massive use of SSD could cause some pretty fundamental changes in how we approach database optimization - if I were to imagine that rack that I'm staring at being filled with SSD drives and processors instead of with nothing but hard drives... locality of data takes on a whole new meaning if you don't require data to be on the same sector of the HD, but rather want certain sets of data to be stored on storage chips located around the same processor chips to avoid having to overload your busses.
Then again, I haven't been in this game for so long, so maybe I'm overestimating the impact. Oldtimer opinion would be very welcome.
Eagerly awaiting the gratuitous Sun trash talking! Every thread seems to devolve into Sun bashing (especially with regard to the MySQL acquisition). Everyone: Sun did not decide to close source MySQL. If anything they are pushing against it.
MySQL has people who are responsible for *designing* it? I'm shocked, Shocked.
If moderation could change anything, it would be illegal.
I downloaded and tried to run this MYSQL version 5.0.51a. MYSQL is a seriously flawed product in my opinion. Because it uses a web interface. Why not a PC based information box? Requiring a web browser adds to the requirements of this product.
I start the thing. OK DOS box.
I start the administrative console.
Which starts SeaMonkey which is NOT MY defalt web browser.
SeaMonkey reports it cannot get to http://localhost:4848/
Does anyone know how MYSQL might have a decent PC based administrative panel like Microsoft products do?
Does anyone know about aforementioned errors?
I would appreciate any suggestions!
Jim
I haven't read the article yet, but that summary terrifies me. I keep hearing how in the modern age we shouldn't think about optimal programming because people have more resources than they need.
Databases need to scale to disgusting large numbers. Memory and disk resources should always be treated as expensive, precious commodities, because they might be plentiful on a simple database on robust hardware, but there are plenty of people out there with massive friggin' databases.
In corporate America, Oracle and MSSQL sadly are king. MySQL has some interesting advantages, one of them is performance over MSSQL, but if they squander that, what will they be left with? And ffrankly, I don't think Sun paid a fortune for MySQL just to piss away opportunities at gaining ground in corporate America.
http://blindscribblings.com - Tasty pop-culture in conceptual fashion.
I recently blogged on this, but essentially, as long as your average PHP developer thinks of MySQL as a glorified flat file system to place their serialized PHP objects, an always-available, pay-as-you-go distributed database is going to revolutionize application development in the coming years. For those that want to keep control of their data, HBase is coming along quite nicely.
I'm actually reading the article now, and as he is talking about design for a database taking multiple cores into consideration, etc, I'm wondering if the traditional lock approach used in MySQL (and most SQL databases as far as I know) somewhat kills parallel operations. Wouldn't the interbase approach work better in a parallel environment?
Again, I'm sure this is a stupid question, but perhaps someone could clue me in.
http://blindscribblings.com - Tasty pop-culture in conceptual fashion.
If you.... edit an article.... enough.... you can make it.... say just.... about anything.
In the article, they say the multiple cores will have a big impact on the design of databases. I don't understand this. Databases are already threaded, are they not? Why does multi-core support require different code than multi-thread support?
i'd hit it so hard, if you pulled me out you'd be the king of britain [bash.org]
2 words....SQL Server
Oracle is going the way of UNIX, and by UNIX I mean HPUX, Solaris, & AIX. In 10 Years you'll see the trend towards MS SQL Server for all but the top-tier business critical apps. MySQL and friends will likely remain a popular choice for meta-repositories and web tech solutions. The dirty secret is that tons of Oracle 8 & 9 mid-range systems are getting replaced with SQL Server every day as they get technology refreshes. You really can't argue the economics, and Oracle, like legacy big-iron won't change fast enough to turn the tide.
Flame on -
OCP Certified Oracle DBA
It almost sounds like he is talking about MonetDB.
As far as MySQL, after doing a lot of development with it over several years during the late dotcom/early bust days, I haven't worked with it lately because there hasn't been much demand for it. It'll be interesting to see if that changes with Sun's involvement.
Run and catch, run and catch, the lamb is caught in the blackberry patch.
Remember kids this is the same guy who said that ACID wasn't important and the emulating foreign keys was as good as having support for them in the database. Face facts - he will do / say what ever it takes to get you to use / buy his product. History shows this to be true.
Main-memory databases in the early 90s were figured out the issues when all pages can be accessed in constant time. A solid state
disk characteristics is no different from main memory.
See http://www.bell-labs.com/project/dali/ for one of such projects.
http://tangent.org/
Sup Brian!
-Bryan (from apachetoolbox)
If one's product does not support advanced features or, sometimes, even basic common sense features, no wonder one's predictions call for the world that does not need those features.
In my mind as a database engineer for a wall street bank, the biggest change in the near term that we forsee is data locality.
Given the amount of computing power on hand today, it may surprise many how difficult it is to engineer a system capable of executing more than a few thousand transactions per second per thread.
Why? Latency. Consider your average SOA application which reaches out to 4-5 remote services or dataserver calls to execute its task. Each network/rpc/soap/whatever call has a latency cost of anything between one and at worst several hundred milliseconds. Lets say for example that the total latency for all the calls necessary is 10 milliseconds. 1000/10=100 transactions per thread per second. Oh dear.
The amount of memory an "average" server ships with today is in the 32-64GB range. Next year it will be in the 64-128GB range. The average size of an OLTP database is 60-80GB.
So, the amount of memory available to the application tier will very soon be greater than the size of the database, warehouses excluded. Moore's law is quickly going to give the application tier far more memory than it needs to solve the average business state, exceptions noted.
The final fact in the puzzle is that for transaction processing, read operations outnumber write operations by roughly 20 to 1. (This will of course vary on the system, but that *is* the average.)
This situation is strongly in favor in migrating read only data caches back into the application tier, and only paying for the network hop when writes are done in the interests of safety. (There is a lot of research into how writes can be done safely asynchronously at the moment, but its not ready yet IMHO.)
Challenges exist in terms of efficient data access and manipulation when caches are large, performant garbage collection and upset recovery - but they are all solvable with care.
Its my opinion that in the near future large data caches in the application tier will become the norm. What has to be worked out is the most effective way of accessing, manipulating and administering that data tier and dealing with all the inevitable caveats of asynchronous data flow.
Some (not complete) examples of implementing this:
Relational Caches (there are many more):
http://www.oracle.com/technology/products/coherence/coherencedatagrid/coherence_for_java.html
http://www.alachisoft.com/ncache/index.html
Object Caches:
http://www.ogf.org/OGF21/materials/970/GigaSpaces_DataGrid_OGF_Oct07.ppt
http://jakarta.apache.org/jcs/
PC-File (from good 'ol Jim "Button" Knopf of Buttonware). That and PC-Write (Bob Wallace, Quicksoft) are all everyone ever needed. That and 640K. Where's my box of floppies?
There exists no way of exchanging information without making judgments. --Bene Gesserit Axiom
This is just another case where Moores Law bypasses another Law, in this case the Micromechanical Law.
Microchips are developing faster than Micromechanic so by leaving behind harddrives we might see flashbased storage become bigger and bigger.
But even more interesting, I think we will see SATA-attached flash being a short lived exception. It is much cheaper, faster, more flexible and more direct to attach it directly into address space (doesn't matter if NAND or NOR, the memory controller will take care of that). With that done we will also see DRAM becoming some sort of cache for flash and flash replacing DRAM. And as memory and inner circuit bandwidth will increase even further there really is no reason to safe bandwidth or storage any more - who cares if your 100GB database uses 200GB because you deflate/multi-index it for faster access times? The only limit here is that 64bit address space may be a little narrow but then we just see 128bit architectures ten years earlier.
Ah yes, the future of SQL... well, I think a small SQL-to-DBM interface should do the trick until all applications map the whole db into real address space. There is little use for a overcomplicated interpreting language if you can just put your whole 200GB db into an associative array.
"Life is short and in most cases it ends with death." Sir Sinclair
Until recently, solid state storage devices have been treated as "disks". But they're not disks. They have orders of magnitude less latency.
For files, this doesn't matter all that much. For databases, it changes everything. Solid state devices need new access mechanisms; I/O based seek/read/write is oriented towards big blocks and long latencies. The optimal access size for solid state storage devices is much smaller, more like the size of a cache line. With smaller, lower latency accesses, you can do more of them, instead of wasting channel bandwidth reading big blocks to get some small index item. It's not RAM, though; these devices usually aren't truly random access.
It starts to make sense to put more lookup-type functions out in the disk, since getting the data into the CPU has become the bottleneck. Search functions in the disk controller went out of fashion decades ago, but it may be time to bring them back. It may make sense to put some of the lower-level database functions in the disk controller, at the level of "query with key, get back record". Cacheing at the disk controller definitely makes sense, and it will be more effective if it's for units smaller than traditional "disk blocks"
This could be the beginning of the end of the UNIX "everything is a stream of bytes" model of data storage. We may see the "database is below the file system" model, which has appeared a few times in mainframes, make it to ordinary servers.
Considering SSD drives won't be mainstream for a few years on production servers for critical data storage, that a shift from relational to "something else" won't happen overnight due to inertia and business conservatism, that this "something else" system of storage hasn't been developed yet, wake me up in 10 years and I will look into it then, and I probably still will be an early adopter.
I wouldn't count on it. The main problem is the tension between standardized, interoperable, commodity physical devices and the ever-changing needs of the application software. The byte-stream or block-based I/O models won because they are painfully general purpose!
Instead, consider "the system is a cluster"... your local processing will occur finally when the full OS and application stack can migrate tasks onto the storage-local coprocessors and do message-passing over the network that used to be the "system bus".
SSD hasn't changed the whole seek/read/write paradigm because the current batch are tailored to be retrofit to existing HBA/RAID storage systems.
Once the IO drive is here we'll have a true random access to nonvolatile storage with latency measured in nanoseconds.
bite my glorious golden ass.
IIRC that's already the way AS-400's try to work.
It never works out in practice.
You sometimes (often) need to manage and understand your memory usage.
If you think that won't be true in the future implicit in your thought is that the problems being solved (or at least crunched on) will not keep up with hardware in complexity.
John McAfee 'It was like that time I hired that Bangkok prostitute; to do my taxes, while I fucked my accountant'
Still up to your old tricks I see.
When you run out of columns just join to Participant2 and keep adding columns.
John McAfee 'It was like that time I hired that Bangkok prostitute; to do my taxes, while I fucked my accountant'
Had to comment, the reference to Jim Gray was a little weird? I was lucky to work with Jim and we were often talking about technology changes and enhancements. Now - see what for example Tandem did call "massively parallel" database! The system was already built to allow several cpus and several nodes to interconnect transparently, Jim did see how that could be used and how the database optimizer really could work. Of course making direct access to any disc faster will help, especially now when the SDD's are getting bigger but the theory is nothing new. Even SQLite can show you that and think systems where you have 32, 128 or even 256 bit flat, memory speed but storage backed world - will change the picture, or? But be careful, we have already gone through many iterations making part of the system faster, as fixed head disks and even indexing in solid state, and found that it may (will) create other problems, not always seen upfront (except by JG!)
http://revj.sourceforge.net
However sometimes the truth is stranger than fiction and sarcasm is superfluous.
Confucius say, "Find worm in apple - bad. Find half a worm - worse."
Not trying to start a war here, but seriously Databases != RDBMS. It seems like no one knows that object databases have been around a long time too. In the context of the article, many of the points can be applied to all types of databases, but it's so focused on the RDBMS (no shock considering the author).
There were a multitude of issues in the past with object databases from agendas, performance, complexity, etc that put relational databases at the forefront. Hardware and the quality of object databases has more than caught up, so why are object databases so rarely used still?
One answer why object databases are ignored to a large degree is that people don't like to stray from the norm and tend to implement what they know. Another possibility is many people simply have never even heard of the concept of object databases. Further, in academia we almost exclusively focus on relational databases in most courses. Finally, legacy data is perhaps the biggest hurdle.
A corollary to the issues above is that there is an entire industry of DBAs and developers that fight learning something new. There's also mega corps with billions invested in the concept of the relational databases. I don't blame MySQL and some of the things said in the article because they're just trying to improve, but on the user level, it's amazing how much effort goes into adapting the RDBMS into the online world and resulting crazy architecture/technologies/code.
Object relational mappers are a great example of our unwillingness to leave the RDBMS world(unless you're working with legacy/existing data of course, but even then, investigate the possibility to migrate). Why do we need ORMs in the first place? They are a product of using relational databases. When I'm programming, I want to work in objects and not bizarre mapping layers, complicated DALs, etc. We spend so much time on mapping and layers to build bridges between a relational and object world at the cost of productivity and performance simply to continue to hang on to our old RDBMSs.
I've found that in most cases, object databases are faster for my projects. I've also tried related databases like grid/network databases. There are definitely cases where relational databases are also better, but I would use one over the other on a case-by-case basis. I find for the average case I've seen, hardware and architecture tips the balance in favor of object databases because of the way how we want to model things using objects anyway. If we look at a popular type of app right now, a social network... why use a relational database? Typically the associations and structures we make are objects and hierarchies or networks. Relational databases are ill suited at both. Instead, we start to develop hack and wtf schemas, rely to heavily on the app to sort out the data, or introduce object database-like concepts like table inheritance. This also forces us to introduce and learn yet another language.
SQL is a huge discussion in itself. I find SQL brilliant and easy to use, but nonetheless ill-suited for many tasks. Once cursors, user defined functions, etc. were introduced, the nightmare got worse. I find procedural and object constructs instead of set based constructs in SQL created by clients all the time as a result. This ends up crippling performance and instead of fixing the issues, decision makers will just throw more hardware at the problem or ignore it all together. There's also this myth that somehow SQL creates a way for the layman to query data in the database. This is true to a small degree, but has mutated into something not unlike "human readable" for XML.
I'm certified in SQL Server and Oracle, and Postgres is my home RDBMS of choice, so certainly I have a lot invested, but if I'm offered something that is better I will gladly abandon all my intellectual and time investments in these systems. I use whatever works the best for the task. After building several apps using Gemstone over the years, I have to cringe every time I return to Oracle or even w
Memory and disks (solid state or platters) are *hardware*. Transactions are *software*. Transactions are implemented in code to insure the information gets written to disks to faithfully represent what happened with the information (updates, deletes, insertions, reads. With distributed applications, this is even more important than ever.
Also looking at the spec. of the Tokyo cabinet DB model and all I see is a return to network databases. Having suffered through IDMS, I can tell you this is the wrong approach.
Things haven't changed in a while because they work. Stop trying to reinvent the wheel!
putting the 'B' in LGBTQ+
Everybody knows that with spinning drives sequential writes are much faster than random. RAID and SAN technology has reduced the cost of random writes significantly (and in many cases made serial writes a lot more seeky than you'd assume). But in SSD the concept of physical distance between chunks is completely meaningless. Data can be Ghod-knows-where---and due to rewrite minimization algorithms on the drive, it probably is. The only advantage one large write has over many small ones is in the cost of setting up an atomic I/O operation.
Internally, most of the DBMSs I've worked with are architected around the idea of serial writes==fast. They have two basic phases for how they write to disk. Whenever you do an update---for every update that needs to be written to disk---the DBMS first writes it out to a serialized log file that contains transaction IDs, serial IDs within the transaction, and from- and to-images of the affected rows or pages or whatever. When it's time to flush that to disk, the DBMS starts doing a sequential read through this file, applying changes to the affected on-disk pages, probably going through an intermediate step where it sorts the writes so the random seeks require the least amount of head movement.
One of the problems with this technique is that the log file gets used like a circular buffer: An old, uncommitted write from this morning can bite you hours later when you've wrapped around to that point in the log.
With SSDs, all this clever thinking goes straight out the window: just tag the old disk pages with one ID, the new ones with another ID, and when the transaction is done you point to the new IDs. Done right it's wicked fast, and you no longer have circular buffer problems: The volume of uncommitted data you can handle is limited only by available disk space.
It's kinda neat, and I think it can revolutionize and simplify DBMS persistence even more than the introduction of RAID and SAN storage. I've done DBA work long enough to remember obsessing over which spindle I was putting what type of data on---never put an index on the same spindle as its table, sort of thing. I think about how much simpler RAID made my job and I can't help but wonder how much simpler SSD will make things for the DBMS itself.
This is not my sandwich.
Lets say for example that the total latency for all the calls necessary is 10 milliseconds. 1000/10=100 transactions per thread per second. Oh dear. Your numbers assume that you are blocking while waiting for the responses. It should be possible to process other transactions during that time. Some form of async messaging and a state machine, and you should be good to go.
Even if you don't want to go with asynch processing, why is it a problem if you can only handle 100 transactions per thread per second? Linux will happily run with thousands of threads.
I'm sure there's some reason why it's more complicated than it sounds, I'm curious what it is.
Saying MySQL/InnoDB or MySQL/ISAM gets complicated since you can access tables from different engines within a single query. Maybe there's no silver bullet engine for all scenarios. Ultimately, flexibility is the key feature. That way you can gather real performance data and choose the best database/engine for your specific application and data set.
Check out Informix and DB2 . Even Oracle does parallel scaling after a fashion.
This new fangled stuff like "row locking",
"parallel processing", etc. is only new to MYSQL
not to other RDBMS. So just copy what has been out there for a decade or two and stop telling us
MYSQL invented the Internet, everyone knows Al Gore did that.
I have always been a user of relational databases such as DB2 and Oracle. I am interested in the changes that would take place.
This is almost certainly a verbatim dump of Brian Aker's speech, and I think it makes him sound like a scatter-brained schizophrenic. Which isn't really fair, because most of us would sound this way when giving interviews. Read TV closed captions without audio on news programs, and even press secreteries speak in these fragments. Aker, or whoever manages media contacts at MySQL should insist on proper editing for interviews.