Slashdot Mirror


Enterprise Datacenter Hardware Assumptions May Be In For a Shakeup (acm.org)

For the entire careers of most practicing computer scientists, a fundamental observation has consistently held true: CPUs are significantly more performant and more expensive than I/O devices. The fact that CPUs can process data at extremely high rates, while simultaneously servicing multiple I/O devices, has had a sweeping impact on the design of both hardware and software for systems of all sizes, for pretty much as long as we've been building them. This assumption, however, is in the process of being completely invalidated.


  1. Can this entry be any more click bait? by turbidostato · · Score: 4, Funny

    Yes!

    For the entire careers of most practicing computer scientists, a fundamental observation has consistently held true... and you won't believe what happens next!!!

    1. Re:Can this entry be any more click bait? by jellomizer · · Score: 2, Interesting

      Technology changes, and so does how best to use it.

      Old long-term storage technology was very slow, so we used the CPU to compute a lot of data. Think of Mario and Luigi on the original NES: they were one bitmap and you just swapped the palettes, as with many of the creatures. You reused the same expensive bitmap image and used the CPU to cheaply give each one different colors. As time goes on, storage gets cheaper and faster, so now we have independent bitmaps for Mario and Luigi and they differ in appearance, Luigi being taller and thinner. Technology advanced to the point where it was good enough to have different images, and that was worth it versus spending all your time on silly hacks.
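The palette swap described above can be sketched in a few lines. This is only an illustration of the idea, not NES code: the tiny sprite, the palette names, and the RGB values are all made up for the example.

```python
# One shared sprite, stored as palette indices rather than colours.
# Index 0 is treated as transparent. The shape is illustrative only.
SPRITE = [
    [0, 1, 1, 0],
    [1, 2, 2, 1],
    [0, 3, 3, 0],
]

# Hypothetical palettes mapping index -> RGB. Only index 1 differs.
MARIO_PALETTE = {0: None, 1: (216, 40, 0), 2: (252, 152, 56), 3: (136, 112, 0)}
LUIGI_PALETTE = {0: None, 1: (0, 168, 0), 2: (252, 152, 56), 3: (136, 112, 0)}


def render(sprite, palette):
    """Turn a grid of palette indices into a grid of RGB tuples (or None)."""
    return [[palette[index] for index in row] for row in sprite]


# Same stored bitmap, two characters: only the small lookup table changes.
mario = render(SPRITE, MARIO_PALETTE)
luigi = render(SPRITE, LUIGI_PALETTE)
```

The stored image is paid for once; each extra character costs one small lookup table plus a cheap per-pixel lookup at draw time.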

      But let's get away from games and onto serious computing.

      1980's Mainframes: Computing was expensive, so you were better off having a centralized computer accessed through dumb terminals. Communication with the terminals was fast enough to keep pace with the processing, allowing many users on one system. A lot of the data was stored in active RAM and was very small, since most data retrieval had to go through tape. These programs were very small and concise. And they wouldn't be used today because they were very buggy and could be hacked into. People who used the computers often had the title Computer Operator, as they knew what to do, and what not to do, to avoid causing a problem.

      1990's Desktops: The PC reached a price point where, instead of sharing one mainframe, people would have their own PC. This meant faster display of data, and let people do big calculations without worrying about slowing down everyone else. Applications could query the disk for programs much more readily and page data. This allowed for more advanced UIs that kept people from doing stupid things as often. However, program sizes got larger.

      2000's Networked Desktops: With servers now being built on desktop technology and 100Mbps to 1000Mbps networks common in offices, PCs could talk to the server easily and quickly enough for most data retrieval. Data files began to be stored on the network, freeing the desktop to have smaller drives, just enough to run the application, and RAM requirements peaked at about 4GB, while the improvements went toward the servers. Also, with cheap servers, we could install effective databases on them: MySQL, PostgreSQL, MS SQL Server. All were very affordable DBs able to deal with most businesses' level of work (causing the decline of systems such as FoxPro, dBase and MS Access, which were meant for PC usage).

      2010's The "Cloud": Well, web-based applications really; they may not be set up on a cloud platform. But with higher-speed internet access available to most people and businesses, web standards have matured to the point where most UI functionality is available. We are able to run our apps in a browser or thin client, so the performance of our personal device matters a lot less. Most things we do, we can still do on a PC that is 10 years old, whereas back in the 90's a 4-year-old PC was too out of date to do anything new. Servers have been designed to be more distributed, and faster storage means we have more than enough processing power. The limits are usually down to total bandwidth.

      As time passes, how we use software changes. These are trends, not hard and fast rules. I still work on a mainframe daily for work, and it is handling a lot of work rather well. I still run across FoxPro and Access applications in production use that need to be maintained. There are still file share drives. And now there is a set of hosted apps to use. They all come with tradeoffs. But changing technology gives us more options.

      --
      If something is so important that you feel the need to post it on the internet... It probably isn't that important.
    2. Re:Can this entry be any more click bait? by gstoddart · · Score: 2

      I tried it, it put my back out for a week, and I nearly lost an eye.

      Wait, are we talking about the same thing?

      --
      Lost at C:>. Found at C.
  2. The slowest thing in the datacentre ... by petes_PoV · · Score: 3, Interesting
    ... would be the pundits.

    This piece is citing articles written in 2005 as "ye olde world" and saying "OMG! something amaaaazing has happened".

    Well, those 10 years represent 2 or 3 generations of datacentre hardware, depending on how you amortise your assets. So if the author has only just woken up to SSDs or SCMs then what have they been doing for the past decade?

    In practice, the biggest bottleneck in the datacentre has been the network for a longish time. And the biggest bottleneck in most systems is the user's think-time. It is that last aspect which lies at the heart of multi-user systems.

    However, the guy does have a point: the need for "olde worlde" performance management - designing the bottlenecks out of a system and diagnosing where the choke-points are (ans. the network) when things slow down has largely disappeared. But as for the rest of his stuff? Yes, we know all that.

    --
    politicians are like babies' nappies: they should both be changed regularly and for the same reasons
  3. Re:You say performant, I say performance... by jonnythan · · Score: 3, Informative

    Performance is a noun. Performant is an adjective. I guess he could have said "faster"

  4. Re: I'm not reading all that by Fwipp · · Score: 3, Funny

    "SSDs exist now"

  5. Re:You say performant, I say performance... by Captain+Splendid · · Score: 4, Insightful

    Language is a tool. Just because you're not versed in its intricacies doesn't mean that someone who is is inferior to you.

    --
    Linux, you magnificent bastard, I read the fucking manual!
  6. Author doesn't know what he's talking about by guruevi · · Score: 2

    This issue has been known to anyone using SSDs. The CPUs are still fast enough, but the bandwidth between clients and servers (10Gbps is the average these days within a datacenter) can no longer use the full capacity of the disk subsystem (which is now connected at >10Gbps to each drive). Even with multiple disks in a single subsystem you can no longer use the capacity, not because of CPU issues but because of bandwidth issues between the CPU and the PCIe bus. That's why we're moving away from large disk arrays and using 1U or 2U servers with 4-12 SSDs, hooked together with 'object storage' or other distributed storage mechanisms. That way you don't have a single point of failure and resource contention slowing you down.
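A rough back-of-the-envelope version of that mismatch, using the figures from the comment (10Gbps network, 4-12 SSDs per server). The 12Gbps per-drive link is an assumption borrowed from the 12Gbps figure mentioned later in the same comment, and these are link rates, not measured throughput.

```python
NETWORK_GBPS = 10     # "10Gbps is the average these days within a datacenter"
DRIVE_LINK_GBPS = 12  # assumption: per-drive link rate, taken as 12Gbps


def aggregate_drive_gbps(n_drives, per_drive_gbps=DRIVE_LINK_GBPS):
    """Sum of the per-drive link rates in one server."""
    return n_drives * per_drive_gbps


def network_share(n_drives, network_gbps=NETWORK_GBPS):
    """Fraction of the aggregate drive link rate one network link can carry."""
    return network_gbps / aggregate_drive_gbps(n_drives)


for n in (1, 4, 12):
    total = aggregate_drive_gbps(n)
    print(f"{n:2d} drives: {total:3d} Gbps of drive links, "
          f"network can carry {network_share(n):.0%} of that")
```

Even a single drive link outruns the network link under these assumptions, which is the argument for spreading drives across many small servers.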

    But that's not the point. The point the article is making is that CPUs are getting too slow, and that's not true. The CPUs are plenty fast, and using any sort of off-loading mechanism would result in RAID controllers with CPUs that have to be just as powerful, because if they aren't, you get the issues you have with current RAID controllers: they are slow and expensive (a single link to a 12Gbps chip is a bottleneck to an entire array of 12Gbps drives). You also lose the scheduling, checksumming, hardware monitoring and all the other fancy things software-based solutions do these days.

    Using CPUs as glorified RAID controllers is just fine, and I don't foresee another solution as long as your software is fast and concise (e.g. ZFS). If you start handing off anything to dedicated CPUs then you're just losing the control and customization a software-based solution allows you to have.

    --
    Custom electronics and digital signage for your business: www.evcircuits.com
  7. summary of TFA...not really about the CPU at all.. by nimbius · · Score: 2

    storage is slower than processors until you consider caching things in ram, in which case it's magically faster.
    Other points mentioned:
    Balanced Systems: you can have lots of ram but make sure you have the network to serve it. CPUs were unavailable for comment.
    Contention-Free I/O-centric Scheduling: uh, has been around for nearly 15 years since the invention of the x86_64 architecture...at least...formally in the domain of commodity hardware. CPUs could not be reached for comment.
    Workload-aware Storage Tiering: remember all that crap we mentioned about memory caches for everything? well now we're drifting into the realm of object stores so sit tight. tiered storage has existed for 15 years as well...so we're a bit late to the party for this one.

    The Future: RAM + Acronym + expensive support contract = Storage Class Memory! learn it, embrace it, and most importantly, make sure it's on the fucking purchase order this year*


    *not applicable if you're using redis, memcached, ceph, couch, hadoop, hypercube, or any one of about 30 other different commodity hardware centric distributed data frameworks designed to purge the vendors from the budget as jesus purged the jews from the temple.

    --
    Good people go to bed earlier.
  8. Performant does not (necessarily) mean "faster" by SuperKendall · · Score: 4, Insightful

    Performant is actually a pretty useful word in place of "real" ones like "faster", because "performance" is a word that can change meaning depending on what you consider to be good (or desired) performance.

    Maybe good performance means that it's using all of the cores on a CPU well. Maybe it means that it's not using much of the system at all, but is using the network very well, or that work is spread out across a cluster in an extremely balanced fashion. "Faster" may be a by-product, but it may not, because people using the word "performant" often value stability over absolute speed.

    I guess the closest concept "performant" comes to is being well-balanced, or perhaps meeting some goal you had set during design.

    So don't be too dismissive of a new word, it can be the case a new word was made because old ones wouldn't really fit without a lot of verbosity.

    --
    "There is more worth loving than we have strength to love." - Brian Jay Stanley
  9. Article is kinda pie-in-the-sky wrong by m.dillon · · Score: 3, Interesting

    At least, not totally correct. Memory bus non-volatile storage such as Intel's X-Point stuff still requires significant cache management by the operating system. Why? Because it doesn't have nearly enough durability to just be mapped as general purpose memory. A DRAM cell goes through trillions of cycles in its lifetime. Something like X-Point might be 1000x more durable than standard flash, but it is still 6 orders of magnitude LESS durable than DRAM. So you can't just let user programs write to it however they like.
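To put those ratios in concrete terms, here is a small sketch using only the orders of magnitude from the comment. The hot-spot write rate is a hypothetical figure chosen for illustration, and the model assumes no wear-levelling or caching at all.

```python
# Orders of magnitude taken from the comment above, not from datasheets.
DRAM_CYCLES = 1e12                   # "trillions of cycles"
XPOINT_CYCLES = DRAM_CYCLES / 1e6    # "6 orders of magnitude LESS durable"
FLASH_CYCLES = XPOINT_CYCLES / 1e3   # X-Point is "1000x more durable" than flash

# Hypothetical: a program hammering one memory location a million times/sec.
HOT_WRITES_PER_SECOND = 1e6


def seconds_to_wear_out(endurance_cycles, writes_per_second):
    """Time until one cell hits its endurance limit, with no wear-levelling."""
    return endurance_cycles / writes_per_second


for name, cycles in (("flash", FLASH_CYCLES),
                     ("X-Point", XPOINT_CYCLES),
                     ("DRAM", DRAM_CYCLES)):
    secs = seconds_to_wear_out(cycles, HOT_WRITES_PER_SECOND)
    print(f"{name:8s} {cycles:.0e} cycles -> {secs:.3g} s at the hot spot")
```

Under these assumptions an unmanaged hot location would exhaust an X-Point-class cell in about a second, which is why the operating system has to sit in between.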

    Secondly, in terms of data-center machines becoming obsolete: also not correct. SSDs make a fine bridge between traditional HDD or networked storage and something like X-Point, for two reasons. First, all data center machines have multiple SATA busses running at 6 Gbit/s. Gang them all together and you have a few gigabytes/sec worth of standard storage bandwidth. Secondly, you can pop NVMe flash (PCIe-based flash controllers) into a server and each one has in excess of 1 GByte/sec of bandwidth (and usually much more).
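The "few gigabytes/sec" figure is easy to check. SATA at 6 Gbit/s uses 8b/10b encoding, so each port carries about 600 MB/s of payload at best; the port count of six below is an assumption for illustration, not something stated in the comment.

```python
SATA_LINE_RATE_GBIT = 6.0       # SATA line rate quoted in the comment
ENCODING_EFFICIENCY = 8 / 10    # SATA uses 8b/10b line encoding


def sata_port_mbytes_per_s():
    """Peak payload bandwidth of one SATA port in MB/s."""
    payload_mbit = SATA_LINE_RATE_GBIT * 1000 * ENCODING_EFFICIENCY
    return payload_mbit / 8     # bits -> bytes


def ganged_gbytes_per_s(ports):
    """Peak aggregate bandwidth of several SATA ports in GB/s."""
    return ports * sata_port_mbytes_per_s() / 1000


PORTS = 6                       # assumption: number of SATA ports on the board
print(f"one port : {sata_port_mbytes_per_s():.0f} MB/s")
print(f"{PORTS} ports  : {ganged_gbytes_per_s(PORTS):.1f} GB/s")
```

These are ceilings on the link, before protocol overhead and before whatever the drives themselves can sustain.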

    Third, in terms of memory management, paging to/from SSD or NVMe 'swap' space, or using it as a front-end cache for slower remote storage or spinny disks, already gives servers a fresh new life that means they won't be obsolete for years to come.

    And finally there is the cost. These specialized memory-bus non-volatile memories are going to be expensive. VERY expensive. To the point where existing configurations still have a pretty solid niche to play in. Not all workloads are storage-intensive and these new memory-bus non-volatile memories don't have the density to be able to replace the storage required for large databases (or anywhere near it).

    So, the article is basically a bit too pie-in-the-sky and ignores a lot of issues.

    -Matt

  10. Re: I'm not reading all that by NatasRevol · · Score: 2

    And VMware made VSAN to go on top of them - RAID5 for compute/disk nodes with 10G network backplanes instead of SAS.

    --
    There are two types of people in the world: Those who crave closure