Enterprise Datacenter Hardware Assumptions May Be In For a Shakeup (acm.org)
For the entire careers of most practicing computer scientists, a fundamental observation has consistently held true: CPUs are significantly more performant and more expensive than I/O devices. The fact that CPUs can process data at extremely high rates, while simultaneously servicing multiple I/O devices, has had a sweeping impact on the design of both hardware and software for systems of all sizes, for pretty much as long as we've been building them. This assumption, however, is in the process of being completely invalidated.
Look at the I/O towers.
Every tower is lighting up.
Yes!
For the entire careers of most practicing computer scientists, a fundamental observation has consistently held true... and you won't believe what happens next!!!
This piece is citing articles written in 2005 as "ye olde world" and saying "OMG! something amaaaazing has happened."
Well, those 10 years represent 2 or 3 generations of datacentre hardware, depending on how you amortise your assets. So if the author has only just woken up to SSDs or SCMs then what have they been doing for the past decade?
In practice, the biggest bottleneck in the datacentre has been the network for a longish time. And the biggest bottleneck in most systems is the user's think-time. It is that last aspect which lies at the heart of multi-user systems.
However, the guy does have a point: the need for "olde worlde" performance management - designing the bottlenecks out of a system and diagnosing where the choke-points are (ans. the network) when things slow down - has largely disappeared. But as for the rest of his stuff? Yes, we know all that.
politicians are like babies' nappies: they should both be changed regularly and for the same reasons
Never seen the word "performant" until today. Must be an obscure five-dollar word that scientists love to toss around. Meanwhile, I'll stick with cheap performance as my word of choice.
https://en.wiktionary.org/wiki/performant
Can someone provide a more useful summary?
That's a lot of description for telling absolutely nothing.
This is certainly interesting for processing data after it's been captured but in my experience the vast majority of data comes from sources that operate at nowhere near these speeds.
I'll not be giving up on spindles as a cost effective cache for 'slow data' any time soon...
This issue has been known to anyone using SSDs. The CPUs are still fast enough, but the bandwidth between clients and servers (10Gbps is the average these days within a datacenter) can no longer use the full capacity of the disk subsystem (which is now connected at >10Gbps to each drive). Even with multiple disks in a single subsystem you can no longer use the capacity, not because of CPU issues but because of bandwidth issues between the CPU and the PCIe bus. That's why we're going away from large disk arrays and using 1 or 2U servers with 4-12 SSD drives and hooking them together with 'object storage' or other distributed storage mechanisms. That way you don't have a single point of failure and resource contention slowing you down.
But that's not the point. The point the article is making is that CPUs are getting too slow, and that's not true. The CPUs are plenty fast, and any sort of off-loading mechanism would result in RAID controllers with CPUs that have to be just as powerful, because if they aren't, you get the issues you have with current RAID controllers: they are slow and expensive (a single link to a 12Gbps chip is a bottleneck to an entire array of 12Gbps drives). Also you lose the scheduling, checksumming, hardware monitoring and all the other fancy things software-based solutions do these days.
Using CPUs as glorified RAID controllers is just fine and I don't foresee another solution as long as your software is fast and concise (e.g. ZFS). If you start handing off anything to dedicated CPUs then you're just losing the control and customization a software-based solution allows you to have.
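To put rough numbers on the bandwidth argument in this comment, here is a back-of-the-envelope sketch. The 4-12 drive counts, the 12Gbps per-drive links, and the 10Gbps network figure come from the comment; the function names are made up for illustration, and treating nominal line rates as usable bandwidth (no protocol or encoding overhead) is a simplifying assumption.

```python
# Rough check: what share of an SSD shelf's aggregate link bandwidth
# can a single upstream link actually carry?
# ASSUMPTION: nominal line rates only; protocol overhead is ignored.

def aggregate_gbps(drive_count: int, per_drive_gbps: float) -> float:
    """Sum of the per-drive link rates, in Gbps."""
    return drive_count * per_drive_gbps


def usable_fraction(upstream_gbps: float, drive_count: int,
                    per_drive_gbps: float) -> float:
    """Fraction of the aggregate drive bandwidth that fits through
    one upstream link (capped at 1.0 when the link is not the limit)."""
    total = aggregate_gbps(drive_count, per_drive_gbps)
    return min(1.0, upstream_gbps / total)


if __name__ == "__main__":
    links = (("10 Gbps network link", 10.0),
             ("single 12 Gbps controller link", 12.0))
    for drives in (4, 12):
        for name, gbps in links:
            frac = usable_fraction(gbps, drives, 12.0)
            print(f"{drives} drives behind a {name}: "
                  f"{frac:.0%} of drive bandwidth reachable")
```

Under these assumptions the upstream link, not the CPU, caps throughput well before the drives do, which is the comment's point about spreading drives across many small servers.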
Custom electronics and digital signage for your business: www.evcircuits.com
I found a single card that cost $14k to provide 5GB.
Until the price drops by a factor of at least 10 I doubt this will go anywhere fast.
storage is slower than processors until you consider caching things in ram, in which case it's magically faster.
Other points mentioned:
Balanced Systems: you can have lots of ram but make sure you have the network to serve it. CPUs were unavailable for comment.
Contention-Free I/O-centric Scheduling: uh, has been around for nearly 15 years since the invention of the X86_64 architecture...at least...formally in the domain of commodity hardware. CPUS could not be reached for comment.
Workload-aware Storage Tiering: remember all that crap we mentioned about memory caches for everything? well now we're drifting into the realm of object stores so sit tight. tiered storage has existed for 15 years as well...so we're a bit late to the party for this one.
The Future: RAM + Acronym + expensive support contract = Storage Class Memory! learn it, embrace it, and most importantly, make sure it's on the fucking purchase order this year*
*not applicable if you're using redis, memcached, ceph, couch, hadoop, hypercube, or any one of about 30 other different commodity hardware centric distributed data frameworks designed to purge the vendors from the budget as jesus purged the jews from the temple.
Good people go to bed earlier.
Now tell me something I don't already know..
OK, OK, so CPU speeds are not trending up at quite the same pace as nonvolatile storage. But it's not like this has gone unnoticed or we haven't been making hardware changes to take advantage of this over the last decade in the data center. Just like we've adjusted to new power, network and virtualization technologies in the data center.
The real story is that CPU speeds are not trending up as steeply as they were 10 years ago, but we've been seeing huge leaps in storage speeds. How this changes the optimum hardware and software configuration is no real mystery, as system designers and integrators have been effectively dealing with this for years now...
"File to fit, pound to insert, paint to match" - Aircraft Maintenance 101
What does it mean for a CPU to be "more performant" than an I/O device? They do totally separate things. You can't even measure them with the same units.
Is a drill "more performant" than a hammer?
Performant is actually a pretty useful word in place of "real" ones like "faster", because "performance" is a word that can change meaning depending on what you consider to be good (or desired) performance.
Maybe good performance means that it's using all of the cores on a CPU well. Maybe it means that it's not using much of the system at all, but is using the network very well, or work is spread out across a cluster in an extremely balanced fashion. "Faster" may be a by-product, but it may not, because people using the word "performant" often value stability over absolute speed.
I guess the closest concept "performant" comes to is being well-balanced, or perhaps meeting some goal you had set during design.
So don't be too dismissive of a new word, it can be the case a new word was made because old ones wouldn't really fit without a lot of verbosity.
"There is more worth loving than we have strength to love." - Brian Jay Stanley
At least, not totally correct. Memory bus non-volatile storage such as Intel's X-Point stuff still requires significant cache management by the operating system. Why? Because it doesn't have nearly enough durability to just be mapped as general purpose memory. A DRAM cell goes through trillions of cycles in its lifetime. Something like X-Point might be 1000x more durable than standard flash, but it is still 6 orders of magnitude LESS durable than DRAM. So you can't just let user programs write to it however they like.
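A rough sketch of why that durability gap matters for unmanaged writes. Only the "1000x more durable than flash" ratio comes from the comment; the flash cycle budget and the write rate below are illustrative placeholders, not vendor specifications, and the model assumes no wear levelling at all.

```python
# How long would one "hot" memory location last if a program kept
# rewriting it with no wear levelling or OS-level cache management?
# ASSUMPTIONS (placeholders, not measured or vendor figures):
FLASH_CYCLES = 1e4                 # assumed flash write-cycle budget
NVM_CYCLES = FLASH_CYCLES * 1000   # the comment's "1000x flash" ratio
HOT_WRITES_PER_SEC = 1e6           # assumed: tight loop updating one counter


def seconds_to_wear_out(endurance_cycles: float,
                        writes_per_second: float) -> float:
    """Time until a single location exhausts its cycle budget."""
    return endurance_cycles / writes_per_second


if __name__ == "__main__":
    for name, cycles in (("flash", FLASH_CYCLES), ("1000x-flash NVM", NVM_CYCLES)):
        secs = seconds_to_wear_out(cycles, HOT_WRITES_PER_SEC)
        print(f"{name}: hot location worn out after {secs:g} s")
```

With these placeholder numbers a single hot location is used up in seconds, which is why something has to sit between user programs and the raw cells.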
Secondly, in terms of data-center machines becoming obsolete: also not correct. SSDs make a fine bridge between traditional HDD or networked storage and something like X-Point. For two reasons: First, all data center machines have multiple SATA busses running at 6 Gbit/s. Gang them all together and you have a few gigabytes/sec worth of standard storage bandwidth. Secondly, you can pop NVMe flash (PCI-E based flash controllers) into a server and each one has in excess of 1 GByte/sec of bandwidth (and usually much more).
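The "few gigabytes/sec" figure for ganged SATA ports can be checked with simple arithmetic. This sketch relies on SATA's 8b/10b line encoding (10 bits on the wire per payload byte); the port count of 6 is an assumed example, not a number from the comment.

```python
# Aggregate payload bandwidth of several SATA 6 Gbit/s ports.
# SATA uses 8b/10b encoding, so every payload byte costs 10 line bits.
# ASSUMPTION: 6 ports per server is just an example value.

def sata_payload_mb_per_s(line_rate_gbit: float = 6.0) -> float:
    """Payload MB/s for one port: Gbit/s -> Mbit/s, then 10 bits per byte."""
    return line_rate_gbit * 1000 / 10


def ganged_gb_per_s(ports: int, line_rate_gbit: float = 6.0) -> float:
    """Combined payload bandwidth of `ports` SATA links, in GB/s."""
    return ports * sata_payload_mb_per_s(line_rate_gbit) / 1000


if __name__ == "__main__":
    print(f"one port : {sata_payload_mb_per_s():.0f} MB/s")
    print(f"six ports: {ganged_gb_per_s(6):.1f} GB/s")
```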
Third, in terms of memory management, paging to/from SSD or NVMe 'swap' space, or using it as a front-end cache for slower remote storage or spinny disks, already provides servers with a fresh new life that means they won't be obsolete for years to come.
And finally there is the cost. These specialized memory-bus non-volatile memories are going to be expensive. VERY expensive. To the point where existing configurations still have a pretty solid niche to play in. Not all workloads are storage-intensive and these new memory-bus non-volatile memories don't have the density to be able to replace the storage required for large databases (or anywhere near it).
So, the article is basically a bit too pie-in-the-sky and ignores a lot of issues.
-Matt
Hyper Convergence is possible now. VSAN is my favorite :)
a fundamental observation has consistently held true: CPUs are significantly more performant and more expensive than I/O devices
If I sold my computer now, I'd say 95% of what I get would come from my I/O devices. And here I'm counting motherboard, RAM, SSD & chassis as "CPU".
A good display, keyboard, mouse & headphones have been more crucial for my performance than CPU upgrades for nearly a decade.
Ceph is cool, and I just want non-RAID cards to link the backplanes to the system board. Hardware RAID was good in the past, but nowadays multi-node software is better, without the hardware RAID lock-in / losing 1-2 disks = data lost.
Latency is just one dimension of the problem.
1. 64 bit addressing offers huge address ranges perfect for addressing high speed devices.
2. density of high speed SSD storage has almost reached a point where it's cost effective.
The idea of having offboard 'controllers' for things will reach an end at some point. We've already seen consolidation here, with thunderbolt (PCIE on a cable), USB 3.1 (which is basically thunderbolt).
Ohhhh...
So that's why Intel bought out the FPGA company. Think in terms of a programmable data filter built into the drive itself.
Well. I think the authors do have some points, although at least some of them have existed in embedded systems (which execute directly out of Flash) for a long time:
* The most efficient disk caching algorithms are CPU-cycle hungry, and they are not that efficient anymore once "disk" (or rather Flash) access manages to catch up to the CPU. Less efficient but also less resource hungry algorithms might be advantageous then.
* Issuing lots of read accesses in advance to keep your worker threads busy might only help in occupying RAM but not speed up processing anymore if data arrives long before workers finished their previous job.
* Multi-core access to the Flash needs better (and less blocking) synchronization than with disk, where actual colliding accesses would have been more rare due to long time between them (being executed).
* If serial and random accesses show only small difference in access times (as they do for Flash: a few clock cycles for the Flash to throw away its read-ahead cache and get new data instead of the huge wait for head positioning and sector arrival of spinning disks), caching strategies might have to change, e.g. maybe caching leaf inodes is then not efficient anymore (just guessing here).
* And they seem to be talking about networking due to attaching disk clusters to servers via ethernet. But there I guess the authors are not radical enough: Why not connect the Flash devices directly to the server? They take much less space and power than spinning disks.
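The read-ahead point in the list above can be sketched with a very simple model: a worker that needs one item every `process_time_us` only stays busy if enough requests are in flight to cover the device latency, and anything beyond that depth just sits in RAM. The latency and processing figures are assumed round numbers for illustration, and the model assumes the device serves outstanding requests concurrently.

```python
import math

# Minimum number of outstanding read requests needed to keep one
# worker busy, given device latency and per-item processing time.
# ASSUMPTION: the device overlaps outstanding requests perfectly.

def min_prefetch_depth(io_latency_us: float, process_time_us: float) -> int:
    """Smallest queue depth d with d / latency >= 1 / process_time."""
    return max(1, math.ceil(io_latency_us / process_time_us))


if __name__ == "__main__":
    process_time_us = 100.0  # assumed work per item
    # Assumed round-number latencies, for illustration only.
    for name, latency_us in (("spinning disk", 10_000.0), ("flash", 100.0)):
        depth = min_prefetch_depth(latency_us, process_time_us)
        print(f"{name}: {depth} outstanding read(s) keep the worker busy")
```

In this model, a deep read-ahead queue that was necessary for a spinning disk buys nothing once latency approaches the processing time.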
But as I said, I would have expected much of this to have been explored with respect to embedded computing long ago, and with respect to servers since the advent of the first SSDs (by talking to the embedded guys). Now seems a bit late for an ACM magazine article about that (unless ACM is falling behind tech development).
Seems to me there's a simple and major error in this document. Figure 2 shows storage latency in nanoseconds for a spinning disk... I don't know what kind of spinning disks they're using... maybe alien technology?
The basic point of the article is dead on. The major assumption that I/O is extremely slow has driven the organization of computer architecture from the beginning. But, as the article noted, in the last few years, that equation can be changed drastically. The memory hierarchy is going to get more complicated: DRAM, NVDIMM, NVM, SSD, HD, Optical/Tape, and best using that hierarchy means that there are changes that need to be made.
For one, I think there will be a lot of research in this area. Just like modern network cards do a lot of processing before involving the CPU, it may be necessary to have similar abilities. For example, allowing the network and I/O devices to work with each other with much less CPU intervention. Or making the I/O controllers smarter so that computation can be moved to the disks. Another open question is how best to organize operating systems to use this new memory hierarchy well.