Enterprise Datacenter Hardware Assumptions May Be In For a Shakeup (acm.org)
For the entire careers of most practicing computer scientists, a fundamental observation has consistently held true: CPUs are significantly more performant and more expensive than I/O devices. The fact that CPUs can process data at extremely high rates, while simultaneously servicing multiple I/O devices, has had a sweeping impact on the design of both hardware and software for systems of all sizes, for pretty much as long as we've been building them. This assumption, however, is in the process of being completely invalidated.
Yes!
For the entire careers of most practicing computer scientists, a fundamental observation has consistently held true... and you won't believe what happens next!!!
This piece is citing articles written in 2005 as "ye olde world" and saying "OMG! something amaaaazing has happened."
Well, those 10 years represent 2 or 3 generations of datacentre hardware, depending on how you amortise your assets. So if the author has only just woken up to SSDs or SCMs then what have they been doing for the past decade?
In practice, the biggest bottleneck in the datacentre has been the network for a longish time. And the biggest bottleneck in most systems is the user's think-time. It is that last aspect which lies at the heart of multi-user systems.
However, the guy does have a point: the need for "olde worlde" performance management - designing the bottlenecks out of a system and diagnosing where the choke-points are (ans. the network) when things slow down has largely disappeared. But as for the rest of his stuff? Yes, we know all that.
politicians are like babies' nappies: they should both be changed regularly and for the same reasons
"Performant" is an invaluable word. It instantly identifies those who use it seriously as people who may be safely ignored.
Bugrit! Millennium hand and shrimp!
Performance is a noun. Performant is an adjective. I guess he could have said "faster"
"SSDs exist now"
Language is a tool. Just because you're not versed in its intricacies doesn't mean that someone who is is inferior to you.
Linux, you magnificent bastard, I read the fucking manual!
I/O tends to be the system bottleneck. Someone's trying to sell something.
So either the word is relatively new, or in niche use.
This issue has been known to anyone using SSDs. The CPUs are still fast enough, but the bandwidth between clients and servers (10Gbps is the average these days within a datacenter) can no longer use the full capacity of the disk subsystem (which is now connected at >10Gbps to each drive). Even with multiple disks in a single subsystem you can no longer use the capacity, not because of CPU issues but because of bandwidth issues between the CPU and the PCIe bus. That's why we're moving away from large disk arrays toward 1U or 2U servers with 4-12 SSDs, hooked together with 'object storage' or other distributed storage mechanisms. That way you don't have a single point of failure and resource contention slowing you down.
But that's not the point. The point the article is making is that CPUs are getting too slow, and that's not true. The CPUs are plenty fast, and using any sort of off-loading mechanism would result in RAID controllers with CPUs that have to be just as powerful, because if they aren't, you get the issues you have with current RAID controllers: they are slow and expensive (a single link to a 12Gbps chip is a bottleneck to an entire array of 12Gbps drives). You also lose the scheduling, checksumming, hardware monitoring and all the other fancy things software-based solutions do these days.
Using CPUs as glorified RAID controllers is just fine and I don't foresee another solution as long as your software is fast and concise (e.g. ZFS). If you start handing off anything to dedicated CPUs then you're just losing the control and customization a software-based solution allows you to have.
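To put rough numbers on the bandwidth argument above, here is a minimal Python sketch. The function name and the drive counts are assumptions for illustration; only the 10Gbps and 12Gbps link speeds come from the comment.

```python
def bottleneck_ratio(drive_count: int, drive_gbps: float, uplink_gbps: float) -> float:
    """Aggregate drive bandwidth divided by the bandwidth of the shared link.

    A result above 1.0 means the shared link, not the drives, is the limit.
    """
    if uplink_gbps <= 0:
        raise ValueError("uplink_gbps must be positive")
    return drive_count * drive_gbps / uplink_gbps


# RAID-controller case: an array of 12Gbps drives behind a single 12Gbps link.
# The drive count of 12 is an assumed example value.
raid_ratio = bottleneck_ratio(drive_count=12, drive_gbps=12.0, uplink_gbps=12.0)

# Network case: a small server with 4 SSDs behind a 10Gbps network port.
# The drive count of 4 is the low end of the 4-12 range mentioned above.
network_ratio = bottleneck_ratio(drive_count=4, drive_gbps=12.0, uplink_gbps=10.0)

print(f"RAID link oversubscribed {raid_ratio:.1f}x, network port {network_ratio:.1f}x")
```

In both cases the ratio is well above 1, which is the point being made: the shared link saturates long before the drives do.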
Custom electronics and digital signage for your business: www.evcircuits.com
"Performant" is an invaluable word. It instantly identifies those who use it seriously as people who may be safely ignored.
Speaking of invaluable, I have found that those who spew the most buzzwords in their vernacular also happen to control the budget.
In other words, tread lightly. The "PHB" wasn't born from pure fiction...
storage is slower than processors until you consider caching things in RAM, in which case it's magically faster.
Other points mentioned:
Balanced Systems: you can have lots of RAM but make sure you have the network to serve it. CPUs were unavailable for comment.
Contention-Free I/O-centric Scheduling: uh, has been around for nearly 15 years since the invention of the X86_64 architecture...at least...formally in the domain of commodity hardware. CPUs could not be reached for comment.
Workload-aware Storage Tiering: remember all that crap we mentioned about memory caches for everything? well now we're drifting into the realm of object stores so sit tight. tiered storage has existed for 15 years as well...so we're a bit late to the party for this one.
The Future: RAM + Acronym + expensive support contract = Storage Class Memory! learn it, embrace it, and most importantly, make sure it's on the fucking purchase order this year*
*not applicable if you're using redis, memcached, ceph, couch, hadoop, hypercube, or any one of about 30 other different commodity hardware centric distributed data frameworks designed to purge the vendors from the budget as jesus purged the jews from the temple.
Good people go to bed earlier.
What does it mean for a CPU to be "more performant" than an I/O device? They do totally separate things. You can't even measure them with the same units.
Is a drill "more performant" than a hammer?
Performant is actually a pretty useful word in place of "real" ones like "faster", because "performance" is a word that can change meaning depending on what you consider to be good (or desired) performance.
Maybe good performance means that it's using all of the cores on a CPU well. maybe it means that it's not using much of the system at all, but is using the network very well, or work is spread out across a cluster in an extremely balanced fashion. "Faster" may be a by-product, but it may not, because people using the word "performant" often value stability over absolute speed.
I guess the closest concept "performant" comes to is being well-balanced, or perhaps meeting some goal you had set during design.
So don't be too dismissive of a new word, it can be the case a new word was made because old ones wouldn't really fit without a lot of verbosity.
"There is more worth loving than we have strength to love." - Brian Jay Stanley
A system which has good performance is said to be performant.
Your own link says this has been in use since at least the 70s.
It's hardly a new term. It may only come up in specific contexts related to computing performance, but it ain't new.
Lost at C:>. Found at C.
Really what we're seeing is storage finally starting to catch up with CPU after lagging behind for nearly 30 years. The author is freaking out that ZOMG the disk isn't always the slowest thing on the system anymore, but this is not really news at this point. The exciting part for me is that in some cases you may be able to eliminate one cache from the system. Caches are a necessary evil that introduces big headaches into system design, so being able to eliminate one can greatly simplify parts of your system.
I did find it odd that the use case he kept going back to was someone buying some crazy expensive RAM storage array, then sticking a single commodity server on the thing and being shocked that the server was CPU bound. The point about the Linux IO subsystem not being up to the task is interesting, but having not looked into it myself I can't help but wonder whether there is some kernel tuning or optional module support he could have enabled to improve the situation.
I read the internet for the articles.
At least, not totally correct. Memory bus non-volatile storage such as Intel's X-Point stuff still requires significant cache management by the operating system. Why? Because it doesn't have nearly enough durability to just be mapped as general purpose memory. A DRAM cell goes through trillions of cycles in its lifetime. Something like X-Point might be 1000x more durable than standard flash, but it is still 6 orders of magnitude LESS durable than DRAM. So you can't just let user programs write to it however they like.
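A back-of-the-envelope Python sketch of why that durability gap matters. Every figure in it is an assumption chosen for illustration (the flash baseline and the write rate are hypothetical); only the "1000x more durable than flash" ratio comes from the comment.

```python
def seconds_until_worn_out(endurance_cycles: int, writes_per_second: int) -> float:
    """Time until one cell exhausts its write endurance at a constant write rate."""
    if writes_per_second <= 0:
        raise ValueError("writes_per_second must be positive")
    return endurance_cycles / writes_per_second


# All figures below are illustrative assumptions, not measurements.
ASSUMED_FLASH_CYCLES = 10_000                        # hypothetical flash baseline
ASSUMED_XPOINT_CYCLES = ASSUMED_FLASH_CYCLES * 1000  # "1000x more durable" per the comment
ASSUMED_HOT_WRITES_PER_S = 1_000_000                 # a tight loop rewriting one location

xpoint_seconds = seconds_until_worn_out(ASSUMED_XPOINT_CYCLES, ASSUMED_HOT_WRITES_PER_S)
print(f"Hot cell worn out after about {xpoint_seconds:.0f} s")
```

Under those assumptions a single hot location is exhausted in seconds unless the OS caches writes in DRAM or spreads them around, which is the cache management the comment is talking about.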
Secondly, in terms of data-center machines becoming obsolete: also not correct. SSDs make a fine bridge between traditional HDD or networked storage and something like X-Point, for two reasons. First, all data center machines have multiple SATA buses running at 6 Gbit/s. Gang them all together and you have a few gigabytes/sec worth of standard storage bandwidth. Second, you can pop NVMe flash (PCI-E based flash controllers) into a server and each one has in excess of 1 GByte/sec of bandwidth (and usually much more).
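The "few gigabytes/sec" figure can be checked with a short Python sketch. It assumes SATA's 8b/10b line encoding (8 payload bits per 10 bits on the wire) and ignores protocol overhead; the port counts are example values, not taken from any particular server.

```python
SATA3_LINE_RATE_GBIT = 6.0      # raw signalling rate per port
ENCODING_EFFICIENCY = 8 / 10    # 8b/10b encoding: 8 payload bits per 10 line bits


def sata_payload_mb_per_s(ports: int) -> float:
    """Upper bound on combined payload bandwidth of `ports` SATA 6 Gbit/s links, in MB/s."""
    if ports < 0:
        raise ValueError("ports must not be negative")
    per_port_mb = SATA3_LINE_RATE_GBIT * 1000 * ENCODING_EFFICIENCY / 8
    return ports * per_port_mb


# Example port counts (assumed) for a typical server board.
for ports in (4, 6, 8):
    print(f"{ports} ports: about {sata_payload_mb_per_s(ports) / 1000:.1f} GB/s")
```

Each port tops out at roughly 600 MB/s of payload, so six ganged ports land in the 3-4 GB/s range before real-world overhead.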
Third, in terms of memory management, paging to/from SSD or NVMe 'swap' space, or using it as a front-end cache for slower remote storage or spinny disks, already provides servers with a fresh new life that means they won't be obsolete for years to come.
And finally there is the cost. These specialized memory-bus non-volatile memories are going to be expensive. VERY expensive. To the point where existing configurations still have a pretty solid niche to play in. Not all workloads are storage-intensive and these new memory-bus non-volatile memories don't have the density to be able to replace the storage required for large databases (or anywhere near it).
So, the article is basically a bit too pie-in-the-sky and ignores a lot of issues.
-Matt
Sorry, but "unabridged" means nothing with respect to the comprehensiveness of a dictionary -- except that if there is an abridged edition of the same dictionary the abridged version will have fewer words (or smaller definitions, or something).
Your conclusion is based on a flawed assumption. I've been using a small dictionary for around thirty-five years now. I can't check it (the dictionary is at home), but based on frequency of word use over time and the quality of that dictionary I expect the word would be listed. Maybe your unabridged dictionary isn't as good as you think it is -- if you actually care about words it is worthwhile having more than one dictionary.
People often don't really register words that they hear or see (especially if the words seem familiar) and tend to underestimate the frequency of usage or overestimate their recency of usage as a result.
Let's compare "performant" (which has apparently been in use for over a hundred years and is still used with many people having a basic idea of what it means from its form and in context) with "tergiversator" (which comes from Latin and is essentially unused in recent years though still listed in dictionaries -- and I'd wager most people have no idea what it means without consulting a dictionary).
And VMware made VSAN to go on top of them - RAID5 for compute/disk nodes w 10G network backplanes instead of SAS.
There are two types of people in the world: Those who crave closure
Language is a tool. Just because you're not versed in its intricacies doesn't mean that someone who is is inferior to you.
People who use buzzwords to hide the fact that they aren't really saying anything are tools. It's been a long time since I've read an academic article so full of bullshit.
Ceph is cool; I just want non-RAID cards to link the backplanes to the system board. Hardware RAID was good in the past, but nowadays multi-node software is better, without the hardware RAID lock-in or the risk that losing 1-2 disks means lost data.
Sorry, but performant isn't in the Webster online dictionary, and even Chrome thinks it's a misspelled word. Google's Ngram Viewer also shows that up until the last few decades it was a rarely used word. However, in the books the Ngram Viewer references, it's not being used to indicate performance in at least one case in 1812. Heck, even the usage in the 70s and 80s seems to be referencing it as an actor in something, having nothing to do with performance or efficiency as this article wants it to.
Performance is a noun. Performant is an adjective. I guess he could have said "faster"
performant
noun
a performer
Word Origin
based on informant, etc.
I thought that was "performer".
performant is to performer as informant is to informer.
PCIe SSDs exist now.
Contribute to civilization: ari.aynrand.org/donate
Never seen the word "performant" until today. Must be an obscure five-dollar word that scientists love to toss around. Meanwhile, I'll stick with cheap performance as my word of choice.
https://en.wiktionary.org/wiki/performant
Performance is a noun. Performant is an adjective, meaning "having (high) performance" (or performing well). If you stick with performance, be sure to reword your sentence so that it makes sense.
You're right, performance is a noun while performant an adjective. Most of the time, though, you can just say fast.
Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo
(slashdot lameness filter prevention plz ignore)
Linux, you magnificent bastard, I read the fucking manual!
Concur. I am the proud possessor of a paper copy of the 20-volume Oxford English Dictionary. "Performant" isn't in the OED. "Performancer", as in "he / she who performs", however, is....
Religious speak to God. Insane are spoken to by God. When all shut up, one can finally hear Shostakovich in peace
Thank fucking Gawd. Finally, someone says something on slashdot based on fucking data. Wow.
Religious speak to God. Insane are spoken to by God. When all shut up, one can finally hear Shostakovich in peace
Everyone has an obligation to speak clearly in a way that everyone else can understand. As a college graduate, I could write enough buzzwords to make a scientist feel dizzy. But I write at an eighth grade level to reach the widest possible audience. If being clear makes me ignorant, I don't have a problem with that.
Well. I think the authors do have some points, although at least some of them have been true of embedded systems (which execute directly out of Flash) for a long time:
* The most efficient disk caching algorithms are hungry for CPU cycles, and they are not that efficient anymore once "disk" (or rather Flash) access manages to catch up to the CPU. Less efficient but also less resource-hungry algorithms might be advantageous then.
* Issuing lots of read accesses in advance to keep your worker threads busy might only occupy RAM without speeding up processing anymore if data arrives long before workers finish their previous job.
* Multi-core access to the Flash needs better (and less blocking) synchronization than with disk, where actual colliding accesses were rarer because of the long time between them being executed.
* If serial and random accesses show only small difference in access times (as they do for Flash: a few clock cycles for the Flash to throw away its read-ahead cache and get new data instead of the huge wait for head positioning and sector arrival of spinning disks), caching strategies might have to change, e.g. maybe caching leaf inodes is then not efficient anymore (just guessing here).
* And they seem to be talking about networking due to attaching disk clusters to servers via ethernet. But there I guess the authors are not radical enough: Why not connect the Flash devices directly to the server, they take much less space and power than spinning disks.
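The read-ahead point in the list above can be made concrete with a deliberately simplified Python model: to keep one worker busy you only need enough reads in flight to cover the device latency, and anything beyond that just sits in RAM. The function names and all latency and depth figures are assumptions for illustration.

```python
def useful_prefetch_depth(io_latency_us: int, work_time_us: int) -> int:
    """Reads that must be in flight so one worker never waits (ceiling of latency / work time)."""
    if io_latency_us <= 0 or work_time_us <= 0:
        raise ValueError("latency and work time must be positive")
    return max(1, -(-io_latency_us // work_time_us))  # integer ceiling division


def idle_buffers(configured_depth: int, io_latency_us: int, work_time_us: int) -> int:
    """Prefetched buffers that only occupy RAM because they arrive before they are needed."""
    return max(0, configured_depth - useful_prefetch_depth(io_latency_us, work_time_us))


# Assumed example figures: 10 ms per spinning-disk access, 100 us per flash
# access, and 100 us of CPU work per item.
disk_depth = useful_prefetch_depth(io_latency_us=10_000, work_time_us=100)
flash_depth = useful_prefetch_depth(io_latency_us=100, work_time_us=100)
flash_idle = idle_buffers(configured_depth=64, io_latency_us=100, work_time_us=100)

print(f"disk needs {disk_depth} reads in flight, flash needs {flash_depth}")
print(f"a prefetch depth of 64 tuned for disk leaves {flash_idle} buffers idle on flash")
```

With these numbers a queue depth that was sensible for disk is almost entirely wasted memory on flash, which is why tuning carried over from spinning disks may need revisiting.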
But as I said, I would have expected much of this to have been explored long ago for embedded computing, and for servers ever since the advent of the first SSDs (by talking to the embedded guys). Now seems a bit late for an ACM magazine article about it (unless ACM is falling behind tech development).
Well, looks to me like Oxford says Webster can suck it: http://www.oxforddictionaries....
"I opened my eyes, and everything went dark again"
According to the link you provided, popular usage peaked in 1990 at 0.000001551%. :/
An important exception occurs when we're talking about MTBF. Fast doesn't mean performant, at all.
Right, then you can say something like reliable. We have now seen how useless the word performant really is. It's completely dependent on context. So it means nothing more than good yet takes three times the bandwidth.
The basic point of the article is dead on. The major assumption that I/O is extremely slow has driven the organization of computer architecture from the beginning. But, as the article noted, in the last few years that equation has changed drastically. The memory hierarchy is going to get more complicated: DRAM, NVDIMM, NVM, SSD, HD, Optical/Tape. Making the best use of that hierarchy means that changes need to be made.
For one, I think there will be a lot of research in this area. Just as modern network cards do a lot of processing before involving the CPU, it may be necessary to give storage similar abilities: for example, allowing the network and I/O devices to work with each other with much less CPU intervention, or making the I/O controllers smarter so that computation can be moved to the disks. There is also the question of how best to organize operating systems to use this new memory hierarchy well.
Well there is new kernel tech for it (https://www.thomas-krenn.com/en/wiki/Linux_Multi-Queue_Block_IO_Queueing_Mechanism_%28blk-mq%29).
That is really interesting, thanks for pointing it out. I missed your question when you posted it:
When do you decide to have a system managed service (for example apache) or a /etc/init.d initscript?
If it is a process I want to stay up, then I use inittab. Apache is a pretty good choice for an init service; other examples are databases or messaging systems. However, if it is someone else's system I just do it how they do it to fit in.
For example, apache could be set up in inittab with a 'respawn' directive, so if the process is terminated it restarts automatically. If there is a problem with the service and it won't start, init will disable it once it is respawning too rapidly.
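The "disable it if it respawns too rapidly" behaviour can be sketched in Python as below. This is not init's actual code: the class name is hypothetical and the default thresholds are illustrative assumptions rather than values taken from any init's documentation.

```python
from collections import deque


class RespawnLimiter:
    """Permit restarts unless too many happen inside a sliding time window."""

    def __init__(self, max_starts: int = 10, window_s: float = 120.0, disable_s: float = 300.0):
        # Defaults are illustrative assumptions only.
        self.max_starts = max_starts
        self.window_s = window_s
        self.disable_s = disable_s
        self._starts = deque()                  # timestamps of recent permitted starts
        self._disabled_until = float("-inf")

    def may_start(self, now: float) -> bool:
        """Return True and record the start if a (re)start is allowed at time `now`."""
        if now < self._disabled_until:
            return False
        # Forget starts that have fallen out of the sliding window.
        while self._starts and now - self._starts[0] >= self.window_s:
            self._starts.popleft()
        if len(self._starts) >= self.max_starts:
            # Respawning too rapidly: refuse and disable the service for a while.
            self._disabled_until = now + self.disable_s
            self._starts.clear()
            return False
        self._starts.append(now)
        return True


# A service that crashes immediately: three starts are allowed, the fourth
# is refused, and starts are allowed again once the disable period is over.
limiter = RespawnLimiter(max_starts=3, window_s=10.0, disable_s=60.0)
results = [limiter.may_start(t) for t in (0.0, 1.0, 2.0, 3.0, 70.0)]
print(results)
```

A real supervisor would call something like `may_start` each time the child exits and only spawn the process again when it returns True.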
My ism, it's full of beliefs.