Slashdot Mirror


ARM In the Datacenter Isn't Dead Yet (theregister.co.uk)

prpplague writes: Despite Linus Torvalds' recent claims that ARM won't win in the server space, there are very specific use cases where ARM is making advances into the datacenter. One of those is software-defined storage with open-source projects like CEPH. In a recent article in The Register, SoftIron's CTO Phil Straw says of their ARM-based CEPH appliances: "It's a totally shitty computer, but what we are trying to do here is storage, and not compute, so when you look at the IO, when you look at the buffering, when you look at the data paths, there's amazing performance -- we can approach something like a quarter of a petabyte, at 200Gbps wireline throughput." Straw claimed that, on average, SoftIron servers run 25C cooler than a comparable system powered by Xeons. So... ARM in the datacenter might be saying, "I'm not quite dead yet!"

26 of 147 comments

  1. Linus is more nuanced ;-) by e70838 · · Score: 2

    Linus's claim is "that as long as everybody does cross-development, the platform won't be all that stable". I have my web hosting on ARM and I compile on ARM. I cannot find any good and cheap ARM laptop with Ubuntu. If this does not happen soon, ARM in servers will die shortly, like any hype. IMHO the future is not decided yet, and what Linus gives is a good indicator for analysing where it is heading. If this summer we have many ARM laptops that sell reasonably well on the market, I will continue hosting on ARM. Otherwise I will go back to Intel.

    1. Re:Linus is more nuanced ;-) by e3m4n · · Score: 2

      Sometimes the best engineering, the best design, or the best science still fails to gain market share because of marketing, financials, and deliberate partnerships. Betamax vs VHS. Or HD-DVD vs BluRay or even X2 vs 56KFlex. It’s not always the best design that wins. There are patents which result in recurring revenue from licensing at stake. This usually results in a whole lot of finagling and cutting deals in board rooms regardless of which technology was superior.

    2. Re:Linus is more nuanced ;-) by squiggleslash · · Score: 2

      I worked in a place where we did the bulk of our development on a set of DEC Alpha servers but one of the production servers (a customer whose product was self-hosted) was an HPUX thing.

      While I wouldn't say it was great (I had to recompile a local copy of GCC because the HPUX C compiler was K&R, for example), it certainly wasn't hard.

      And here's the thing: that was way more complex than most situations where you develop on ix86 and deploy for ARM or vice versa. In the modern world you're usually using the same operating system, and even the same "binaries" - yeah, I know, binaries is the wrong term, but do you think most people who write stuff that runs on servers in datacenters are programming in C++?

      No. They're programming in Java. Or .NET. Or *shudder* PHP. The job of doing the CPU specific part has already been done.

      Torvalds is a C programmer, and he's laser focused on C and machine code. I don't necessarily blame him for not understanding the intricacies of modern server software. But if anything Torvalds was saying about how people decide what to deploy in datacenters was true, then IBM would be wasting its time with POWER. Indeed, the logic wouldn't just apply to datacenters, it'd apply to all situations where people develop for one platform and deploy on another. That'd mean Google would have to give up on Android, for example, as Acorn hasn't made an ARM workstation in decades, so how are people going to develop for it?

      Torvalds is good at making hardware sing. He is not an expert in everything. He's made some boneheaded *cough* Bitkeeper *cough* mistakes before even in the field he's good at, to expect him to know everything about everything outside of how to write a kernel in C and make it really good is unrealistic.

      Can ARM do well in datacenters? No idea. But like everything, it'll boil down to third party support coupled with the merits of the chip itself, not programmers.

      --
      You are not alone. This is not normal. None of this is normal.
    3. Re:Linus is more nuanced ;-) by squiggleslash · · Score: 2

      Yeah, he's probably looked all over the manpage for 'php' and cannot find the damned cross compilation flag ;-)

      --
      You are not alone. This is not normal. None of this is normal.
  2. AWS ARM instances by Anonymous Coward · · Score: 2

    ARM adoption will increase because AWS offers the a1 instance family now. You can now easily fire up servers with ARM hardware to work on your software solutions. For many applications it will be a viable solution with substantial cost savings. Watch the stories and statistics that you start seeing at the summits and re:Invent from customers in 2019.

  3. Re: LOL shitty computer? Er, no. It's an Intel kil by Anonymous Coward · · Score: 3, Insightful

    Lack of Intel Management Engine or other built-in spyware features that can't be removed without a high risk of permanently damaging the hardware.

  4. Re:Requires changes to software by Anonymous Coward · · Score: 2, Funny

    Wait for the new generation of Amiga computers :-)

  5. Re: LOL shitty computer? Er, no. It's an Intel kil by fuzzyfuzzyfungus · · Score: 2

    What they are used for varies by implementation, since ARM is all kinds of things to various people; but 'Trustzone' extensions are specifically designed to provide analogous capabilities(at lower cost, the invisible super-privilege enclave is logically separated but runs on the same CPU rather than being a separate processor); and tends to be used for similar purposes in cases where conditional access enforcement or 'platform integrity' are design goals. ARM SoCs commonly also implement all the features one requires for a full crypto bootloader lockdown.

    If you are working at some scale this matters less because you get to dictate if some of those features are enabled, whose keys are burned in as trusted, etc.(unlike Intel, where your leverage is likely to be substantially lower: there is at least one exception, the High Assurance Platform ME firmware variant, but for the most part they aren't terribly open to suggestions in that area); but if you are buying consumer or small business quantities of off-the-shelf ARM there's no particular reason to be more optimistic about how much control you have over the low level behavior.

  6. remember that time... by sad_ · · Score: 4, Insightful

    remember that time when everybody said intel x86 would never make it in the data center...

    --
    On a long enough timeline, the survival rate for everyone drops to zero.
    1. Re:remember that time... by squiggleslash · · Score: 2

      No. When did anyone say that? Considering there was a time 8080s and their descendants were in the data center (the Chicago Stock Exchange had a room full of S-100s in racks processing most of their data at one point in the 1980s) why would anyone assume the most popular CPU line on Earth would be excluded from there?

      --
      You are not alone. This is not normal. None of this is normal.
    2. Re:remember that time... by jwhyche · · Score: 3

      Can't seem to recall anyone saying that. What I do recall is there was a significant effort for everyone to have their own custom processors, PowerPC, Sparc, PA-risc, Clipper, etc etc etc. All of them eventually gave way to the x86.

      --
      I read at +2. If your post doesn't reach that level I will not see or respond to it.
    3. Re:remember that time... by munch117 · · Score: 2

      I can. Of course the people saying it were the people pushing "PowerPC, Sparc, PA-risc, Clipper, etc etc", but yeah, I remember the notion that 80x86 wasn't proper server hardware being expressed from time to time back in the 1980's and maybe even early 1990's.

      I don't believe anyone used the term "data center", though. Was the term even invented back then? "Mainframe" and "server", more like.

  7. amd epyc is good for pci-e storage nodes with 128 by Joe_Dragon · · Score: 2

    amd epyc is good for pci-e storage nodes with the 128 pci-e lanes.

    also CEPH / ZFS like lots of ram as well.

  8. The ARM in the datacenter... by Freischutz · · Score: 3, Funny

    There's a disembodied zombie ARM in the datacenter! Oh God, it's not dead yet and, ... and it's coming for us!!!!!! AAAAAAAAAAAAAAAAAAAAAAAAAAAAH THE DOOR IS LOCKED!!!

  9. Re:Immune to speculative execution by weilawei · · Score: 2

    The A72 does, however, do speculative execution. And ARM chips aren't invulnerable to cache attacks, either. That said, I really like the A53 (as implemented by the BCM2837). But what got me was this about Ceph:

    The ceph-deploy utility must login to a Ceph node as a user that has passwordless sudo privileges, because it needs to install software and configuration files without prompting for passwords.

    Hard pass, thanks.

  10. Re:Requires changes to software by AmiMoJo · · Score: 2

    It's the lack of custom stuff that has held ARM back. To get really high throughput you need high end NICs and storage controllers, which in turn need proprietary drivers that need porting to ARM. It's not just a case of re-compiling either, or just one or two components you can bolt on. For example even now there are not many options for ARM boards with huge numbers of PCIe lanes to feed all that stuff, but if you buy a Xeon or Threadripper/Epyc system that's standard.

    --
    const int one = 65536; (Silvermoon, Texture.cs)
    SJW, n: "Someone I don't like, and by the way I'm a fuckwit" - AC
  11. Re:Requires changes to software by weilawei · · Score: 2

    The boot situation is particularly tangled. Still, the *way* the Raspberry Pi brings up the system is a step in the right direction (see the GPU being used in the same manner as the Z80 was to bring up the 6502 on the Commodore 128). Now, that's specific to one vendor (Broadcom), but maybe ARM could take a hint and formalize that a bit?

    It's trivial to drop a new kernel onto one via the SD card or serial.

  12. Re:Not "dead yet".. It has not even grown up yet. by fuzzyfuzzyfungus · · Score: 2

    A lot of the RAID controllers as well(I think that's the purpose that Intel held onto their last bit of ARM for when they sold the rest to Marvell, not sure if they've fully divested at this point or if it remains the case).

    That's what makes the report of these SoftIron storage nodes (and the fact that the storage nodes are accompanied by management nodes whose architecture the article doesn't specify and 'router' nodes that expose iSCSI, NFS, and SMB; architecture also unspecified) unsurprising (if you want to do relatively computationally cheap 'expose disk to network' stuff, an AMD A1100 with 2x 10GbE and 14x SATA integrated into a relatively low power SoC is just what the doctor ordered); but also not terribly novel or relevant to the viability for more general purpose stuff: ARM cores have been wildly successful doing specific things, tailored to their strengths by the vendor, for ages now; so much so that we don't even notice most of them. What they haven't been is terribly viable as substitutes for general purpose computers.

    The degree to which that hasn't panned out is honestly a trifle surprising: It's not a surprise that they can't go head-to-head with Xeons or AMD's current gen actually good parts; but what is a bit curious is how quickly the lower power/lower performance options start suffering from one or more serious deficiencies: there are the ones that seem promising, and are often astoundingly cheap, but are cripplingly awful on the software side(the usual fate of rPi-alikes derived from Chinese tablet SoCs); there are the ones that have reasonably sane software, either OSS or at least competently vendor supported, but cap out at very low performance(the rPi and various router/networking oriented parts tend to fall here; lacking things like RAM expansion options and usually with I/O options that are sharply limited in one or more areas); and there are the ones that actually seem really cool, but either only show up in expensive appliances and a pricey dev board or two, or do show up for sale in at least a few boards/systems but are just really expensive for what they do compared to a basic x86 that won't require any special attention(The AMD ARM Opterons seem to have done this a bit; the SoftIron storage appliances seem interesting as storage appliances; but if you want just an ATX board or a barebones server based on one, good luck even matching, much less beating, the price of a boring x86 box that's at least as fast and much better supported).

    I, apparently naively, would have expected a few more products of the "We licensed whatever ARM's current 64-bit core is, support a bunch of DIMMs and a PCIe root; and the firmware is designed to get you into mainline Linux with minimum fuss" persuasion to be available.

  13. Re:Requires changes to software by ctilsie242 · · Score: 2

    By no means is the Raspberry Pi perfect (more RAM would be nice), but it is very easy to get started working with it. It would be nice if more ARM SBC makers agreed on a chipset standard, making the SoC modules available to (and hopefully part of) the mainline Linux kernel.

    I can see a niche for SBCs designed to be desktops, or perhaps small blades for a dense enclosure, similar to a Raspberry Pi Compute Module, but with a non-trivial amount of RAM. This would make for a very useful VDI structure.

  14. Re:LOL shitty computer? Er, no. It's an Intel kill by Miamicanes · · Score: 2

    > 2) Cost of running the CPU in heat and power is *significantly* less than intel/amd

    ONLY true when the ARM's performance is significantly less than Intel/AMD as well. Beef an ARM up to i9 specs, and it's going to burn as much power and throw off as much total heat AS an i9 with identical raw performance.

    It's like LED lighting. A single LED might throw off light with just milliwatts of power... but crank it up so it's throwing off EXACTLY the same amount of light as a 100-watt halogen lightbulb (measured from every direction), with color fidelity that's at least as good as that 100-watt halogen bulb (none of this "80+ CRI" shit, or even "92+ CRI with weak R9"), and it's going to CONSUME at least 70-80 watts and throw off almost as much heat AS the original incandescent bulb. Because the only way to get deep, saturated reds without making the light appear 'pink' is to crank up the near-infrared output (which stimulates your 'long' cones, without bleeding into green/blue territory and desaturating it). And even if you settle for lower-quality light, the power consumption is no better than fluorescent bulbs, because a "white" LED basically IS a "fluorescent" bulb.

    If you want the equivalent of an elderly personal servant or ten thousand army ants instead of a half-dozen deity-like level bosses, ARM might win over Intel/AMD64. Try to scale the army ants TOO much, and you end up wasting most of your effort just trying to keep them coordinated (the current bane of multithreaded programming).

  15. Re: LOL shitty computer? Er, no. It's an Intel kil by Miamicanes · · Score: 3, Interesting

    > ARM can be easily scaled to hundreds of cores

    And yet, an Android phone with 8+ cores and nominal clock speed of 2GHz+ still can't render a Javascript-heavy web site (like Amazon, Walmart, or Sears) as well as a 15 year old 700MHz Pentium III.

    > without having an astronomical price

    Scale an ARM-based solution up to the point where it's capable of genuinely matching the performance of an i9, and you'll find that the ARM-based solution is probably quite a bit MORE expensive.

    > without requiring a nuclear power station sitting on the desk

    Compared to the power and cooling requirements of a Pentium IV with 15kRPM hard drive, an i9 with RTX and SSD is practically a laptop watt-wise. 20 years ago, I literally cut a hole in the wall between my computer room and the hallway so I could put my computer in the hall & pass the cables through the wall to get the heat and noise out of my face.

  16. Re:Not "dead yet".. It has not even grown up yet. by Spazmania · · Score: 5, Informative

    it reverses the traditional storage server layout by moving CPUs to the front of the PCB and storage drives to the back. This means cool air from the fans blows over the drives first, and then the CPUs - which wouldn't make any sense in a compute server.

    Um... What?

    Modern servers cool front to back. They place the drives in front where they are cooled first. Some place the CPUs behind the drives. Others place the CPUs in parallel with the drives so that they're also cooled directly from ambient air.

    No one in their right mind places the drives after the CPUs. Losing a CPU is just money. Losing the drive is everything.

    Were you maybe trying to say they put the fans in front of the drives instead of behind them, pushing air instead of pulling it? That doesn't make any real difference to the cooling, but it makes hot-swap harder.

    If these machines actually cool back to front, that's a bad thing. Modern data centers are laid out with a hot-aisle cold-aisle design. The wiring and exhaust side (the back) faces the hot aisle. Equipment which reverses that flow effs everything up. Cisco is especially bad about this, but servers mostly get it right.

    --
    Moderating "-1, Disagree" is simple censorship. Have the guts to post your opinion.
  17. Re: Not "dead yet".. It has not even grown up yet by hey! · · Score: 4, Funny

    Well, technically you'd have to say he observes, then opines (literally "to state as one's own opinion"). If all he ever did was observe then we'd never know, would we?

    --
    Post may contain irony: discontinue use if experiencing mood swings, nausea or elevated blood pressure.
  18. Re:LOL shitty computer? Er, no. It's an Intel kill by bluefoxlucid · · Score: 3, Interesting

    AMD created the x86-64 architecture, and is making inroads with Epyc. AMD also has some RISC-V work in the pipeline. I'm predicting RISC-V will be big: Intel may try to capitalize on ARM thanks to mobile space, and AMD will start shoving RISC-V (no license fees) into processors for Chromebooks and the like, then into servers running Linux for RISC-V or something.

    The next Raspberry Pi might be RISC-V. It's been mentioned. Nobody's taking that seriously yet, and they're not suggesting it seriously yet.

    AMD beat Intel once doing this. They invented a whole new architecture and killed IA-64.

  19. Re:LOL shitty computer? Er, no. It's an Intel kill by bluefoxlucid · · Score: 4, Interesting

    ONLY true when the ARM's performance is significantly less than Intel/AMD as well.

    ARM has historically had more performance per clock than x86 and x86-64; and modern ARM chips run like 2.4GHz at a watt of peak TDP on four cores.

    Think about linear character matching ("abc" in "aaabc" -> "a=a, b!=a" -> "a=a, b!=a" -> "a=a, b=b, c=c" -> match) versus Boyer-Moore ("abc" in "aaabc" -> "c:a = 3" -> "c=c, b=b, a=a" -> match). Boyer-Moore finds a string--faster with longer search strings--in large amounts of text with few comparisons, and thus issues fewer CPU instructions.
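
    To make the comparison-count point concrete, here is a minimal Python sketch (an illustration added in editing, not the poster's code). It uses Horspool's simplification of Boyer-Moore, which keeps only the bad-character shift table, rather than the full algorithm. The function names `naive_search` and `horspool_search` are made up for this example. Both return the match index and the number of character comparisons performed.

```python
def naive_search(text, pattern):
    """Left-to-right scan at every offset; returns (index, comparisons)."""
    comparisons = 0
    n, m = len(text), len(pattern)
    for start in range(n - m + 1):
        matched = 0
        while matched < m:
            comparisons += 1
            if text[start + matched] != pattern[matched]:
                break
            matched += 1
        if matched == m:
            return start, comparisons
    return -1, comparisons


def horspool_search(text, pattern):
    """Boyer-Moore-Horspool (bad-character rule only); returns (index, comparisons)."""
    comparisons = 0
    n, m = len(text), len(pattern)
    # For each pattern character except the last, how far its last
    # occurrence sits from the end of the pattern.
    shift = {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}
    start = 0
    while start <= n - m:
        j = m - 1
        # Compare the window right to left.
        while j >= 0:
            comparisons += 1
            if text[start + j] != pattern[j]:
                break
            j -= 1
        if j < 0:
            return start, comparisons
        # Skip ahead based on the text character under the window's last slot;
        # characters absent from the pattern allow a full-length skip.
        start += shift.get(text[start + m - 1], m)
    return -1, comparisons
```

    On the comment's own example, `naive_search("aaabc", "abc")` makes 7 comparisons and `horspool_search("aaabc", "abc")` makes 4, both finding the match at index 2. The gap widens as the pattern gets longer, since each mismatch can skip up to a full pattern length.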

    CPUs can implement ALUs, decoders, and pipelines to execute the same instruction code in fewer clock cycles. Just like using a different software algorithm, you can use a different hardware approach.

    Prefixed (conditionally executed) instructions and fixed-length instruction sets are core to classic 32-bit ARM. Nearly every instruction is prefixed with a condition. That means where you might compare for one cycle, then jump or not jump on the next cycle, ARM simply jumps or doesn't jump. One fewer cycle.

    The decoder doesn't have to deal with figuring out instruction size or the content if it picks an instruction prefixed to only execute if ZF is set, so if you SUB r2, r1 and the result is zero, the next instruction that executes only if ZF is not set is just skipped and the decoder moves on.

    Because the CPU will read ahead and cache (preload) the next several instructions (fetches from RAM are slow!), it's technically-possible to block out the next e.g. 10 instructions as IFZ [INSN], and have an ARM CPU internally identify the next several instructions are prefixed IFZ and just skip the instruction pointer ahead that many. Remember: every instruction is exactly one word wide; you don't need to know what the next instruction is to know where the following instruction starts. You don't have to decode the instructions if they won't be executed.

    This feature frequently eliminates a large number of comparisons and jumps, trimming down the size of the code body (you'd think variable-length insns would do that, but that usually doesn't work out). More instructions fit into cache, and branch prediction becomes simpler (less power) and more-effective.
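
    The effect described above can be sketched with a toy interpreter in Python. This is an illustration added in editing, not a model of any real ARM core: the instruction names (`cmp`, `addi`, `b`), the condition tags (`al`, `eq`, `ne`) and the `run` helper are assumptions made up for the example, loosely modeled on the condition field of classic 32-bit ARM instructions. It counts instructions issued and branches taken for the same if/else written once with branches and once with conditional instructions.

```python
def run(program, regs):
    """Execute a toy program; return (registers, instructions issued, branches taken)."""
    regs = dict(regs)
    zero_flag = False
    pc = issued = branches = 0
    while pc < len(program):
        cond, op, *args = program[pc]
        pc += 1
        issued += 1
        # An instruction whose condition fails is skipped like a no-op.
        if (cond == "eq" and not zero_flag) or (cond == "ne" and zero_flag):
            continue
        if op == "cmp":
            zero_flag = regs[args[0]] == regs[args[1]]
        elif op == "addi":
            regs[args[0]] += args[1]
        elif op == "b":
            pc = args[0]
            branches += 1
    return regs, issued, branches


# if r1 == r2: r0 += 1 else: r0 -= 1, written with explicit branches.
BRANCHING = [
    ("al", "cmp", "r1", "r2"),
    ("ne", "b", 4),            # jump to the else arm
    ("al", "addi", "r0", 1),
    ("al", "b", 5),            # jump over the else arm
    ("al", "addi", "r0", -1),
]

# The same logic with conditionally executed instructions and no branches.
PREDICATED = [
    ("al", "cmp", "r1", "r2"),
    ("eq", "addi", "r0", 1),
    ("ne", "addi", "r0", -1),
]

if __name__ == "__main__":
    for name, prog in (("branching", BRANCHING), ("predicated", PREDICATED)):
        for r2 in (5, 7):
            regs, issued, branches = run(prog, {"r0": 0, "r1": 5, "r2": r2})
            print(name, "r2 =", r2, "r0 =", regs["r0"],
                  "issued =", issued, "branches =", branches)
```

    Both programs leave the same value in `r0`, but the predicated version never redirects the program counter. How much that saves on real hardware depends on the pipeline and branch predictor, which this toy does not model.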

    64-bit ARM also has 31 GPRs. x86-64 has 16 GPRs, counting the source/destination/base/count pointer registers that are basically GPRs. A lot happens without using RAM as an intermediate scratch pad.

    It's like LED lighting. A single LED might throw off light with just milliwatts of power... but crank it up so it's throwing off EXACTLY the same amount of light as a 100-watt halogen lightbulb (measured from every direction), with color fidelity that's at least as good as that 100-watt halogen bulb (none of this "80+ CRI" shit, or even "92+ CRI with weak R9"), and it's going to CONSUME at least 70-80 watts and throw off almost as much heat AS the original incandescent bulb

    Halogen and incandescent bulbs are black-body emitters: much of their light is in the infrared range. LEDs are narrow emitters and use combinations of materials to emit in multiple ranges when providing white light. That means an LED operating on 100 watts of power emits about 80 watts of visible light, while a halogen operating at 100 watts emits about 20 watts of visible light, and an incandescent tungsten-coil bulb emits about 10 watts of visible light.

    An LED emitting the same broad-spectrum visible light as a 100-watt halogen would consume 25 watts of power.
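
    The arithmetic behind that 25-watt figure can be written out as a short Python sketch. It takes the efficiency estimates stated in this comment at face value (about 20% of input power emitted as visible light for halogen, about 80% for LED); those are the poster's numbers, not measured values, and the function name `input_power_for_same_light` is made up for illustration.

```python
def input_power_for_same_light(reference_watts, reference_efficiency, target_efficiency):
    """Input power a target source needs to emit the same visible power as a reference.

    Efficiencies are fractions of input power emitted as visible light.
    """
    visible_watts = reference_watts * reference_efficiency
    return visible_watts / target_efficiency


if __name__ == "__main__":
    # Figures assumed from the comment: halogen ~20% visible, LED ~80% visible.
    led_watts = input_power_for_same_light(100, 0.20, 0.80)
    print(f"LED input power for the same visible output: about {led_watts:.0f} W")
```

    With different efficiency estimates the result changes proportionally, so the conclusion is only as good as the two fractions fed in.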

  20. ARM well represented in Network Hardware by aaronb1138 · · Score: 2

    Both Juniper and Palo Alto use Cavium ARM processors in their hardware, usually for management plane tasks (FPGAs and ASICs do the heavy traffic processing on high end units). And ARM SoCs are popular for switches and routers where raw compute power isn't necessary. Certainly Cisco is the only one willing to stick with low end, neglected Intel Atom offerings even after the Nexus 9k, ISR 4k, and ASA 55x6 series got bit by defective Atom C2000s (sorry bro, your $55k switch just died because of a $41 CPU).

    So ARM is great anytime you don't care about CPU processing power, but still want to move data -- storage appliances and network. Which is odd given that in the mobile space the few Atom x86 Android phones to reach the market had lesser raw CPU benchmarks than their ARM contemporaries at the time, yet in actual usage felt much smoother because of wider / faster buses and superior throttling (Had a Zenfone 2 with the Atom and it's still smoother than a lot of Snapdragon 6xx midrange phones).