ARM Readies Cores For 64-Bit Computing
snydeq writes "ARM Holdings will unveil new plans for processing cores that support 64-bit computing within the next few weeks, and has already shown samples at private viewings, InfoWorld reports. ARM's move to put out a 64-bit processing core will give its partners more options to design products for more markets, including servers, the source said. The next ARM Cortex processor to be unveiled will support 64-bit computing. An announcement of the processor could come as early as next week, and may provide further evidence of a collision course with Intel."
hi
ARM has to walk the power way up. I don't see how 64 bit computing would let them snatch server oriented clients. Similarly, I doubt Intel would be wise to deliver chips for the wristwatch market without first having something more compelling for the smartphone.
To do list for Windows
I know folks think it's 'overkill' to have 64-bit CPUs in portable devices, but consider that the -entirety- of storage and RAM can be mmapped in the 64-bit address space... That opens up a lot of options for stuff like putting entire applications to sleep and instantly getting them back, distributing one-time-use applications that are already running, sharing a running app with another person and syncing the whole instance (not just a data file) over the Internet, and other cool futuristic stuff.
I'm wondering when the first server/desktop OS is going to come out that realizes this and starts to merge the 'RAM' and 'Storage' into one 64-bit long field of 'fast' and 'slow' storage. Say goodbye to Swap, and antiquated concepts like 'booting up' and 'partitions'.
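For anyone who wants to poke at the "storage mapped into the address space" idea today, here is a minimal sketch in Python. It assumes an ordinary temporary file standing in for "slow storage"; `bump_counter` and `demo` are made-up names for illustration, not part of any OS proposal.

```python
import mmap
import os
import tempfile


def bump_counter(path):
    """Map a file into the address space and increment its first byte in place."""
    with open(path, "r+b") as f:
        # Length 0 maps the whole file; reads and writes now look like memory access.
        with mmap.mmap(f.fileno(), 0) as view:
            view[0] = (view[0] + 1) % 256
            view.flush()  # ask the OS to write the dirty page back to the file
            return view[0]


def demo():
    # Hypothetical stand-in for "storage": a one-page temporary file of zero bytes.
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, bytes(mmap.PAGESIZE))
        os.close(fd)
        first = bump_counter(path)   # the state lives in the file, not in the process
        second = bump_counter(path)  # a fresh mapping picks up where the last one left off
        return first, second
    finally:
        os.remove(path)
```

Each call creates a new mapping, yet the counter survives between them because the "memory" is really the file. On a 32-bit system the size of such a mapping is capped by the address space, which is the limit the comment above wants to get rid of.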
"Sometimes, I think Trent just needs a cup of hot chocolate and a blankie." -Tori Amos on Nine Inch Nails
It has to be cheap, power efficient, dense (performance per rack unit) and most of all _stable_ if they want to use it for servers.
If they can manage those details it would be an instant hit; x86 servers are mighty expensive for small businesses, at least around where I live.
Either way some competition would be welcome and is sure to drive costs down.
The other essential problem is getting motherboards to meet the same criteria.
Consider something more important like having time_t up to 64-bit to have dates beyond 2038... among others.
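To put a date on the 2038 problem, here is a small sketch assuming a signed `time_t` counting seconds since 1970-01-01 UTC. The helper names are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def last_representable(bits):
    """Latest UTC instant a signed time_t of the given width can hold."""
    max_seconds = 2 ** (bits - 1) - 1
    return EPOCH + timedelta(seconds=max_seconds)


def wrap_to_signed(value, bits):
    """Reinterpret an integer as a two's-complement signed value of the given width."""
    mask = (1 << bits) - 1
    value &= mask
    return value - (1 << bits) if value >> (bits - 1) else value
```

`last_representable(32)` is 2038-01-19 03:14:07 UTC. One second later the counter wraps: `wrap_to_signed(2**31, 32)` is `-2**31`, which as an offset from the epoch lands in December 1901. Do not call `last_representable(64)`; the result is so far in the future that Python's `datetime` raises `OverflowError`, which rather makes the point.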
... years after 64 bit computing is already available, ARM thinks they're going to be innovative and do the same... how about some forward thinking and better planning, like moving forward further to 128 bit.
P.S. I'm not looking for some technical analysis based on today's limitations saying 'oh that requires too much power' or whatever; the point of innovation is to change what is possible.
MOD UP!
Would be the most exciting revolution to watch. Since it has a totally different design it changes the parameters of how hardware end products can be built.
As ARM cores are so simple and ARM Holding does not have their own fabs, anyone could come up with their own optimized ARM-compatible CPUs. It's one of those moments when the right economics and the right technology could fuse together and change stuff.
The problem is... Windows. More precisely, proprietary closed-source software which can't just be recompiled for a new architecture.
The huge amount of installed Windows software out there won't run on ARM, so it won't change the mainstream laptop/desktop market any time soon.
Can you please explain the advantage of ARM over X86 in the server room because this one has me scratching my head. While I'm all for different arches (I have a PPC G3 Mac just so I could play with non x86) I thought the whole point of ARM was it was super low power for mobile devices? while I'm sure cutting down power usage in the server room would not be a BAD thing, considering how much software, both for Windows AND Linux, that isn't for ARM based CPUs I just don't get what the advantage of this would be over say a Bobcat, Nano, or Atom based solution.
Now in mobile I get it, as you can make a cheap iPad knockoff that can get 8+ hours of battery life, but in servers? Maybe there is a use case I don't know of, but when I was setting up servers while power was a consideration it certainly wasn't looked at as a priority over the performance in server roles. How well does ARM handle large amounts of users? How well does it scale with increased demands? While I wish them all the best I just haven't seen a screaming need for these, not when you already have Atom and Nano and are about to have Bobcat and Bulldozer (which from the looks of it will be nice as it has a well built GPU in the Bobcat and Bulldozer so AMD stream coding could be used) all in that same market. What am I missing here?
ACs don't waste your time replying, your posts are never seen by me.
Well, considering that somewhere between 60-90% of the desktop market in reality does not care what their computer is running, so long as they have access to a browser and Facebook and, in the worst case, an office suite on the side for minor work, it would not really have mattered.
The only real problem is not Windows, it is getting the computers into the mainstream stores to be sold alongside the MacBooks and the various normal Windows OEM solutions. Just getting them there would mean instant market share overnight, because only a minority is application bound in reality.
It should in theory scale better than x86-64 anyhow, and the performance per watt is quite superior, so yes, it has a major place in the server room.
The problem is... Windows. More precisely, proprietary closed-source software which can't just be recompiled for a new architecture.
Much less of a problem than it used to be. Aside from games, how many closed-source software packages do you run that are CPU-limited? In typical usage, the CPU monitor on my laptop rarely goes over 20%. Even emulating everything, it wouldn't be too slow to use. Modern emulators don't emulate everything though, they thunk out to native libraries for things like drawing. That's how Rosetta works on OS X, for example; OS X ships with stub versions of all of the native frameworks for PowerPC, which call the x86 versions outside of the emulator. When you call a library function from an emulated program, you're calling native code. Even if the emulator only runs at 20% of native speed, the apps typically run at over 50% of native speed, meaning that they use 10% of the CPU instead of 5%. You wouldn't want to run all of your code this way, but for the one or two apps that you can't get native versions of, it's acceptable.
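The 20% and 50% figures above fit together with some simple arithmetic. This is a rough sketch, not a measurement: it assumes a program's run time splits cleanly into library code that runs natively and application code that is emulated. The function name and the 75% figure below are illustrative assumptions.

```python
def effective_speed(native_fraction, emulator_speed):
    """Overall speed relative to native when only part of the work is emulated.

    native_fraction: share of the original (all-native) run time spent in
        library code that still runs natively under the emulator.
    emulator_speed: speed of emulated code relative to native (0.2 = 20%).
    """
    emulated_fraction = 1.0 - native_fraction
    # Native part costs the same as before; emulated part is stretched.
    relative_time = native_fraction + emulated_fraction / emulator_speed
    return 1.0 / relative_time
```

With an emulator at 20% of native speed, `effective_speed(0.75, 0.2)` comes out at about 0.5. Under this model an app reaches the "over 50% of native speed" mark once roughly three quarters of its time is spent in native libraries. With no native library time at all, `effective_speed(0.0, 0.2)` falls back to about 0.2.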
I am TheRaven on Soylent News
Wake me up when ARM has the performance part of the package at least partially addressed. If we want low cost, low power, low performance servers, we already have Atom and Nano, both of which offer x86 binary compatibility and can run the latest releases of Windows and any Linux flavor of the month, and both of which deliver superior performance (to ARM). Anyone thinking that they are heading on a collision course with Intel any time in the next decade... I want some of what you are smoking.
I guess it is nice that they are contemplating servers and thousand dollar cellphones for overpaid yuppies, but where are the hundred buck low power good enough for surfing ARM desktops or "nettops"? That's what I am really interested in, cheap, good enough, cool running, electron sipping can run linux and not x86 machines.
Mind you, if ARM ever gets there, there will be a Windows version almost immediately. NT is actually quite portable. Historically, it's been on MIPS, Alpha, and PPC, in addition to x86, x64, and Itanium (the currently available ports). There's no reason Microsoft couldn't port it to ARM, and if they see a reason to do so (such as a servers-running-ARM market) they will certainly do so.
There's no place I could be, since I've found Serenity...
Apple, Google and Canonical have seen the writing on the wall: Make the apps independent of the ISA, and your platform can go anywhere.
Best way to do this is to provide the storefront, and handle distribution integrated with the OS.
I think the App Store is the biggest software revolution from the 00's ... and it's yet to play out completely.
Make sure everyone's vote counts: Verified Voting
I was going to mention a few, but then I realized that almost all of them are .NET based. MS already has a .NET implementation on ARM (for their mobile devices) and I believe Mono also works on ARM.
The remaining ones are MS Office (ported to x64 and PPC), Visual Studio (partially .NET and hopefully somewhat portable), Opera (portable), Foxit (there are other PDF apps even if it's not portable), and probably a few more.
Of course, you can't just ignore games. Relatively few of those are portable, and I happen to care about them quite a bit.
There's no place I could be, since I've found Serenity...
Sure, but they will lose markedshare on the initial wave when the markeds starts appearing. When it finally comes to "5% of desktop(desktop+laptop,+etc) sales and rising?!", then Windows will pull out a version.
Before that, Linux will gain markedshare, most likely, unless they mess up attempts at markeding again.
In large datacenters, power and cooling costs have become a significant part of the TCO. For smaller server rooms x86 compatibility is probably more important.
Imagine these scenarios:
Building a Linux kernel on a dual-core AMD64: "make -j3 bzImage"
Building a Linux kernel on a quad-core or 8-core ARM: "make -j5 bzImage" or "make -j9 bzImage"
Any bets on which one will finish sooner? The smaller ARM die means the same wafer can hold more ARM cores than any current Intel x86 or AMD cores. The term "embarrassingly parallel" comes to mind.
No! Not the dreaded "collision course." Can you imagine the energy that will be released when these 2 behemoths collide!
Quick, call the Intel bunnies and tell them to don their purple Nikes! Phone the folks at the LHC and let them know so they can accelerate their schedules before it's too late and we all die without knowing if the Big Bang really abhors vacuuming or Newton only thought he saw stars after being hit on the head with an apple.
We are all on a one-way train to marketing-speak Valhalla, and we're never getting off~!
Seriously, someone should burn Rupert Murdoch's style book.
Low Power - High Performance ... that is already occupied by Cavium, Tilera and others ...
However in the MOBILE space this will have some applications ...
Did he mention servers? Oh well then.
I laughed.
Sure, but they will lose markedshare on the initial wave when the markeds starts appearing. When it finally comes to "5% of desktop(desktop+laptop,+etc) sales and rising?!", then Windows will pull out a version.
Before that, Linux will gain markedshare, most likely, unless they mess up attempts at markeding again.
Are you redarded?
This isn't like the 16->32 bit transition where it quickly became apparent that the benefits were large enough and the costs both small enough and rapidly decreasing that all but the smallest microcontrollers could benefit from both the switch and the economies of scale. 64-bit pointers help only in select situations, they come at a large cost, and as fabs start reaching the atomic scale we're much less confident that Moore's Law will decrease those costs to the level of irrelevance anytime soon.
Most uses don't need >4 gigabytes of RAM, and it takes extra memory to compensate for huge pointers. Cache pressure increases, causing a performance drop. Sure, often x86-64 code beats 32-bit x86 code, but that's mostly because x86-64 adds registers on a very register-constrained architecture and partly because of wider integer and FP units. 64-bit addressing is usually a drag, and it's the addressing that makes a CPU "64-bit". ARM doesn't have a similar register constraint problem, and the cost of 64-bit pointers would be especially obvious in the mobile space, where cache is more constrained- one of the most important things ARM has done to increase performance in recent years was Thumb mode i.e. 16-bit instructions, decreasing cache pressure.
Most of those who do need more than 4GB don't need more than 4G of virtual address space for a single process, in which case having the OS use 64-bit addressing while apps use 32-bit pointers is a performance boon. The ideal for x86 (which nobody seems to have tried) would be to have x86-64 instructions and registers available to programs but have the programs use 32-bit pointers, as noted by no less than Don Knuth:
It's funny to continually hear people clamoring for native 64-bit versions of their applications when that often will just slow things down. One notable instance: Sun/Oracle have told people all along not to use a 64-bit JVM unless they really need a single JVM instance to use more than 4GB of memory, and the pointer compression scheme they use for the 64-bit JVM is vital to keeping a reasonable level of performance with today's systems.
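To make the "huge pointers" cost concrete, here is a back-of-the-envelope sketch. It assumes a C-like list node with a 4-byte payload followed by two pointers, natural alignment, and a 32 KiB data cache. The layout rule, the names, and the cache size are all assumptions for illustration, not figures for any particular chip.

```python
def node_size(payload_bytes, pointer_count, pointer_bytes):
    """Size of a C-like struct: payload first, then pointers, padded to pointer alignment."""
    def align(offset, alignment):
        return (offset + alignment - 1) // alignment * alignment

    offset = align(payload_bytes, pointer_bytes)  # pad payload so the pointers are aligned
    offset += pointer_count * pointer_bytes
    return align(offset, pointer_bytes)           # tail padding for arrays of nodes


def nodes_per_cache(cache_bytes, node_bytes):
    """How many whole nodes fit in a cache of the given size (ignores line granularity)."""
    return cache_bytes // node_bytes
```

A doubly linked node with a 32-bit payload is `node_size(4, 2, 4)` = 12 bytes with 32-bit pointers and `node_size(4, 2, 8)` = 24 bytes with 64-bit pointers. The same assumed 32 KiB cache therefore holds 2730 of the small nodes but only 1365 of the large ones. That halving is the cache pressure the comment is describing, at least for pointer-heavy data.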
Funny, but in 1990 I bet they said the same thing about Intel.
In any office of say 50 or so people a 64 bit ARM would probably do just fine. NAS and SANs in bigger installations would probably also run very well on a 64 bit ARM. And then one has to wonder just how many ARM cores might fit on a die?
ARM is a much more modern ISA than X86 so it will be interesting to see just where it goes. Trust me if you had told anyone in 1982 that someday there would be an X86 that was faster per clock cycle than a Cray1, ran with a multi ghz clock, and had a 64 bit address space they would have locked you in a rubber room.
See my blog http://ilovecookes.blogspot.com/ for light hearted technical information.
Not that big of an issue in the server space. Sparc and Power5 don't run Windows. And almost all the big server apps already run under Linux so those can recompile without much effort.
See my blog http://ilovecookes.blogspot.com/ for light hearted technical information.
Current Windows software won't run on ARM. Maybe that's not a big concern for Linux, since most of the Linux software is open source and can be compiled on whatever platform you want, but I don't see companies buying ARM computers instead of x86 ones (you know, the ones that still use IE6 because some business app requires it, going to ARM will be even worse, since all their current apps won't work, not just the badly written ones).
Apple managed to make the switch from PowerPC to Intel almost seamlessly, thanks to a well-written emulator. Microsoft might be able to do the same.
The huge amount of installed Windows software out there won't run on ARM
All the software for Pocket PC aka Windows Mobile (based on Windows CE) already runs on ARM.
Yet benchmarks consistently show that despite the overhead of 64 bit pointers, nearly every program is faster on AMD64.
For windows maybe, but what popular software for linux is tied to x86?
I have run lots of stuff on Debian arm.
He's a marked mad.
Everybody gets what the majority deserves.
So, now I can really get WinCE Jamming! lol J/K of course...
"Computers are a lot like Air Conditioners" "They both work great until you start opening Windows"
Microsoft could write an emulation layer to run x86 code on ARM. Apple created a 68000 emulator when they transitioned from 68k to PowerPC, and then a PowerPC emulator (Rosetta) when they switched to Intel x86 processors.
x86 isn't as easy to emulate, and the performance would probably be terrible, so it's not too likely. But it's an option if some future architecture beats the pants off x86 enough to make emulating x86 for legacy apps run at a reasonable speed.
oh god, not another architecture to maintain! This is going to set back the next release a few years for sure!
@mods: it was a joke, I'm not trolling.
Performance per watt is what matters in the server room, and that's one area where ARM handily trumps x86.
I do not fail; I succeed at finding out what does not work.
No floating point is ever required for filesystems or encryption.
When will it run Android?
deleting the extra space after periods so i can stay relevant, yeah.
Yes, emulation is an option, but I don't think that ARM running x86 emulation layer will be competitive with native x86 CPUs. Didn't this happen to Itanium? Slow x86 performance and AMD's x86-64 resulted in virtually zero market for Itanium.
...considering how much software, both for Windows AND Linux, that isn't for ARM based CPUs...
CPU architecture doesn't really matter with FOSS - once you have a working compiler, you just compile everything from source. Alright, you need some arch-specific work in the kernel and a few other places too. But by the time you get to end-user applications, all of that is long gone. So I would reply with "almost all Linux software already is for ARM-based CPUs". Or MIPS. Or POWER/PowerPC. Or whatever architecture you want.
And one advantage that ARM's low power/heat could bring is high density. Take a look at the Gumstix boards. Now imagine a "blade server" board with 16 or more processors crammed onto one board. You could easily get at least a few hundred CPUs in a 19 inch rack, with each CPU drawing less than a watt of power. Now I'm not really sure what could be done with such a system - either do everything over the network (NFS or ATAoE), or equip each CPU with a good lump of flash storage for data and programs. But it would draw very little power and is something to think about.
There are a lot of boxes out there doing nothing but serving files and printers, if ARM did start to be popular you can be sure that MS would be sure not to lose that business. And then, once you have the things installed, it suddenly makes sense to write some of your new programs to run on them...
What makes you assume Apple won't switch to ARM sometime in the next couple years? They dumped PPC for X86 due to the more favorable power/performance ratio. It's only natural to assume that when high-powered ARM processors appear, Apple will switch to that architecture without a moment's hesitation.
To be fair, that doesn't counter his argument, amd64 has more registers than i386 and they do make a big difference. Repeat the tests with 32-bit pointers and 64 bit registers and then get back to us.
As of today, the method he mentions would probably provide a bit better performance, assuming the processor optimizations didn't break when their expectations weren't met.
However, I think it is very short-sighted to miss the fact that about the only thing increasing these days is memory and that apps tend to grab all the address space they can get. By 2050 I can see machines with 1TB ram, but I can't see apps keeping themselves under 0xFFFFFFFF.
Furthermore, thanks to ASLR, which is a feature available now on most OSes, address space fragmentation is a problem today even for programs well under the 4GB mark. The future is 64:64. 32-bit architectures are already dead, they just haven't realized it yet.
10 little-endian boys went out to dine, a big-endian carp ate one, and then there were -246.
What you should be asking is "how many VMs will it run?". That's where servers are now really. If it has some similar VM CPU extension like the current Intel/AMDs do and it gets superior performance-per-watt there could be a big argument for using ARM in the server room...
"UNIX is very simple, it just needs a genius to understand its simplicity." -Dennis Ritchie
That reminds me of something else I don't get: why didn't they just drop a single X86 CPU into Itanium for running needed x86 code? in the old days we often had several different chips working together, ala the Amiga. Now it is all or nothing, and that makes NO sense to me! We should be able to just drop in a PPC, or ARM, or hell even custom ASICs, and be able to get the best of ALL platforms, just as we are able to get vector processing now with Stream and CUDA thanks to GPUs.
I just don't get why we don't go back to that, especially since we have high speed interconnects like HT on AMD. So why are we all or nothing now? Wouldn't it make more sense to have highly specialized chips in the server for specific needs than trying to force X86 to do it all?
ACs don't waste your time replying, your posts are never seen by me.
You must not be aware of JIT style binary translators. I used to use x86 Windows binaries on my DEC Alpha running NT4 (and NT5 up till build 2000) under FX!32 all the time. Heck, I even ran Win32 games that used OpenGL. Nothing prevents somebody from doing the same with ARM. In fact, I'm waiting for somebody to do that for WINE.
I think I saw a coprocessor that worked with AMD's HT and plugged in place of another CPU (on a motherboard with multiple sockets).
Itanium+x86 CPU would be expensive and still slow, I mean if all your current programs are x86 (because Itanium was a new CPU, it did not have any software written for it at first) it does not make much sense to buy a CPU that is slow on x86 but would be faster if you rewrote or recompiled your software (in case of locally written apps) or bugged the developer to do it (in case of licensed apps). Also, IIRC, Intel was trying to get away from x86.
It probably is also much more expensive to get the specialized chips and write software specifically for them instead of just buying a faster general purpose CPU. CUDA and similar are useful in only a minor subset of jobs (ones that can be split to a huge number of cores, do not need a lot of memory and do not need to access it often).
What you should be asking is "how many VMs will it run?". That's where servers are now really. If it has some similar VM CPU extension like the current Intel/AMDs do and it gets superior performance-per-watt there could be a big argument for using ARM in the server room...
But would you really need to even have any VMs if you can fit 10-20x as many servers in the same amount of space? Instead of running a server on a VM you could just dedicate it to its own ARM computer.
"To prevent this day from getting any worse, I'll just read ERROR as GOOD THING" 1GJU8xLuDKDxEs4KLf8fAGyptoDsqvEsBT
Some stuff just needs to get to a lot of disks and does not need much in the way of CPU power, such as the 500MHz Sun stuff I already have in my server room to run tape drives, or a webserver with some simple static pages that gets about 3 hits a day etc. Other functions such as local DNS, ntp, dhcp and a pile of others really don't need much grunt and are sometimes handy to have on a bit of hardware that isn't doing anything else that is taxing. Then there's the file server/NAS situation where so long as the CPU can feed what comes off the disk controller card to all the NFS and SMB connections then it's fast enough. All niches, but one or two niche devices in every server room adds up to a lot. The 64-bit nature is just icing on the cake for future compatibility while the real driving force is fewer watts and less space to get tasks done. We don't need it much while there are Atom and others, but ARM might need it to continue as a platform despite the obvious killer apps in mobile/embedded devices.
If it's cheaper, doesn't use much power and gets the little jobs done then it will have an advantage over Atom etc.
He's ignoring at least two benefits that help no matter the architecture.
First, address space allocation is a hashing problem. Calls to malloc (mmap) need to find free space; this gets slow as the address space fills up.
Second, we care about security and we realize that programs may have bugs. ASLR benefits greatly from extra address space. This can be the difference between an attacker having a 1-in-256 chance or having about a 1-in-trillion chance.
He's also focusing on the cache-related downside while ignoring the cache-related upside. Unless you get rid of all 64-bit libraries, adding 32-bit stuff to the system will double the amount of code that the CPU has to cache. Context switching between 64-bit and 32-bit software causes one or the other to start getting thrown out of the cache.
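The ASLR odds quoted above are easiest to read as bits of randomness. This is a sketch under the assumption that every candidate base address is equally likely. The 8-bit and 40-bit figures are inferred from the "1-in-256" and "1-in-trillion" odds in the comment, not taken from any particular OS.

```python
def guess_odds(entropy_bits):
    """Number of equally likely base addresses for the given bits of randomness."""
    return 2 ** entropy_bits


def expected_attempts(entropy_bits):
    """Average guesses needed when trying each candidate once, in random order."""
    return (guess_odds(entropy_bits) + 1) / 2
```

`guess_odds(8)` is 256, and `guess_odds(40)` is 1,099,511,627,776, which is the "about a trillion" case. An attacker who can retry needs on average 128.5 guesses at 8 bits, which is trivially brute-forced, against roughly 550 billion at 40 bits.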
But by the time you get to end-user applications, all of that is long gone.
C and the C like bits of C++ are a very leaky abstraction.
Take unaligned accesses for example. Some architectures will just quietly fix them up. Some will terminate your app with a SIGBUS and some will return bogus results (older ARM chips did the last of these; with modern ARM chips the kernel can trap it, but IIRC it doesn't by default).
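As a rough illustration of the unaligned-access point, here is the usual portable workaround sketched in Python: assemble the value byte by byte instead of reading a whole word at an odd address. In C the equivalent is typically a `memcpy` or explicit shifts rather than a pointer cast. The function names are made up for this sketch.

```python
def is_aligned(offset, size):
    """True if an access of `size` bytes at `offset` is naturally aligned."""
    return offset % size == 0


def read_u32_le(buf, offset):
    """Read a little-endian 32-bit value one byte at a time, so alignment never matters."""
    b0, b1, b2, b3 = buf[offset:offset + 4]
    return b0 | (b1 << 8) | (b2 << 16) | (b3 << 24)
```

For a packed record such as `bytes([0, 0xEF, 0xBE, 0xAD, 0xDE])`, the 32-bit field starts at offset 1, so `is_aligned(1, 4)` is false, yet `read_u32_le` still returns `0xDEADBEEF` on any architecture.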
And then there is the fun of va_list. On x86 it's a simple pointer; on other architectures it's a more complex structure, and this can cause problems if you try to use it in certain ways.
As someone who has watched Debian RC bugs, I can say architecture-specific failures are not at all unusual. Sometimes it is actual bugs in the toolchain, other times it's portability issues in the user code.
For common FOSS these issues have already been largely fixed (at least to the extent that they broke something obvious) because of the work of projects like debian but if you have custom C or C++ code written by code monkeys then you have a problem.
And if your custom code is in java you potentially have a much bigger problem. There is an arm port of openjdk but it's rather immature at the moment. There is gcj too but don't expect good compatibility there.
note: I'm known as plugwash most places but I screwed up registering that here somehow in the past and now can't register
They basically did add an x86 to Itanium. I think the Itanium 2 had an updated Pentium core on it, to improve the emulation speed. Frankly, though that took up a lot of die room, and was very slow.
Comparatively, I've seen x86 emulation be faster in many cases than native x86. Granted, this was with Alphas and fx!32, where the chip simply was faster. Trying that with a slower chip, much as Transmeta tried, didn't work so well. One of the problems with just using HT on AMD, is that there wasn't really a standard interface, and let's face it FPGAs are rare. FPGAs also require a design to run.
The thing that most people don't get is this: compilers are not magic. Special hardware almost always requires special code. Code is relatively expensive/time-consuming to create. Compilers being unable to automagically generate great code for even a VLIW such as IA-64 basically killed any performance of Itanium. (This was one of the major reasons for Itanium failing: Intel bet compilers would be a lot better than they were, or currently are.) (Never mind a lot of (at least from the outside) stupid design decisions that got reversed when Intel brought in the HP engineers from PA-RISC & what remained of the Alpha engineers from DEC via Compaq, and produced the relatively much better Itanium 2.)
Now, as to why not differing architectures, it's hard. The hardware has to support it well, which can be done somewhat, the OS has to support it. Right now, there isn't a mainstream OS which is well setup to handle non-SMP multiprocessor. Usually if it does, that's only clockspeed differences, and maybe slightly different chips of the same architecture. (It was somewhat common on older SGI and Sun MP boxes.) Granted, Linux can be worked into doing cross-architecture MP (such as with IBM's RoadRunner), but it's not common and needs a lot of tweaking. You will note that there is only one RoadRunner system. Some of the Crays (also Linux) are that way as well. However, those are top10 level supercomputers, with a dedicated and usually very highly competent & trained staff. They would likely be useless without that staff. The only reason things like CUDA or OpenCL are useful is they are well defined interfaces, with a lot of hardware, and it doesn't really need to alter the Operating System.
For a custom piece of hardware, if it's deployed widely enough, it might be useful, see for example cell on the PS3, where you have a slightly different kind of semi-MP setup. Mostly older machines are coming to mind here for examples, some of them successful (PS2 emulation of PS1 via a chip which was usually used for sound) and the Nintendo DS, but most others have failed. Again, code is time-consuming, and unless you either have a very large base (DS/PS2) or a very specific reason(Supercomputers), you don't really want to deal with anything excessively complicated.
It's a large and accepted platform. We don't want to be stuck with x86 forever just because there is no valid alternative and no innovation because it's dominating the market.
The other main reason is that power actually is very important in the data center. Everything you power, you have to cool. All large server room users, the cloud providers and firms that need their own centers because they have a whopping amount of servers for their use, are looking at moving to colder climates, using natural resources to get a more friendly power bill and carbon footprint and all. Sun is playing the power consumption card with their latest SPARC generation and it seems to be very beneficial for certain usages.
I was promised a flying car. Where is my flying car?
ARM is a much more modern ISA than X86.
Yes - there are about 5 years difference between them.
You might not fully appreciate it - but ARM is 27 years old by now. I wonder what percentage of slashdot readers wasn't born when the first ARM driven machine (Acorn's Archimedes) was introduced.
(I do take your point of it being more modern - but please also place it into perspective - it's not the latest/greatest thing that hasn't had time to prove itself in the market - it's been around for a large part of my lifetime...)
When I was a youngster like you, we had 200MHz, single-core ARM processors and we liked it! Get off my lawn!
Think of how stupid the average person is, and realize half of them are stupider than that.
I get what you're saying but I don't think I explained myself well. What I was talking about was like this: Think of distributed computing...in a box. Instead of sending a job to another machine, you would send it to another part of the box itself. With things like virtualization and hypervisors this should be possible. And I'm not talking about having a dozen different CPUs running full bore 24/7; what I was thinking of was more like what we have with CUDA, where specialized jobs that can be done much better by CPU B will be sent to CPU B instead of trying to power through it with CPU A.
Let me give an example: Say you have a server and part of its job is encryption. Let us say that PPC does encryption 1000% faster than X86; then my question would be "Why can't we use hypervisors to hand certain tasks to those chips while keeping the "main CPU" for others?". I mean, we already have such a thing in mobile with Broadcom and specialized chips that let Atom CPUs decode BD content they would never otherwise be able to process, so why can't we do the same in the server?
Now I'll admit I'm not a coder and I'm sure as you pointed out there would be some pretty big hurdles to overcome, but considering how much money there is in datacenters anything that made these machines even 30% more efficient energy and processor wise would save billions and be worth the effort. And it isn't like there aren't plenty of jobs that could probably be done better by a custom ASIC or separate arch with the main CPU conducting, like encryption, Java and Silverlight VMs, certain math like they use in HPCs which IIRC Itanium excelled at, etc. I remember when HT first came out there was all this talk of plugging different chips into the socket to allow the Opterons to have a fast link to communicate, but then the multicore war heated up and the idea seemed to be forgotten. I simply think it is an idea worth revisiting is all.
ACs don't waste your time replying, your posts are never seen by me.
Current Windows software won't run on ARM.
We're talking server software so...
All of those Java web apps will work. All of those ASP / ASP.NET web apps will work. All of those PHP web apps will work...
I am TheRaven on Soylent News
Actually, Apple did not create Rosetta. They licensed it from a spin-out from Manchester University, called Transitive. The same company also sells x86 to MIPS translators and has demo'd SPARC to PowerPC and vice versa emulation. I don't know if they make x86 on ARM emulators, but I'd be surprised if they didn't.
The neat thing about the Transitive stuff is that it makes it very easy to call native libraries from emulated code. This is really useful when you have a binary-only app that uses a lot of shared libraries. If you ran on Windows/ARM, any x86 code would call native code for any of the Windows system DLLs, which accounts for quite a large proportion of their CPU time.
I am TheRaven on Soylent News
There are a lot of boxes out there doing nothing but serving files and printers,
running on ARM for a couple of years already. It's funny how many comments are talking about hypothetical situations, while home servers like Buffalo Linkstations have been available for years. You can install a proper distro (at least Debian or Gentoo) on one and see for yourself, then imagine what they could do with more CPU power and RAM.
Escher was the first MC and Giger invented the HR department.
But that's where virtualisation comes into play on x86. Run lots of little machines on one bigger one..
I am a viral sig. Please copy me and help me spread. Thank you.
It's sometimes handy to have an extra bit of hardware (eg. the example above where you need a machine to physically connect to the tape drives). For the sort of tasks I was talking about (DNS, DHCP, low end httpd etc) you don't need the overhead of virtualisation to run a lot of them on the same host with any sort of modern operating system - one very low end machine can run a lot of things.
I'll bet virtualisation has inspired a lot of people to do really stupid stuff like run a MS Windows PDC and BDC on the same bit of hardware or, outside the MS Windows world, multiple DNS servers in the same box.
I thought the whole point of ARM was it was super low power for mobile devices?
The ARM didn't start off that way: the first ARM chips (back when the "A" stood for "Acorn" not "Advanced") were designed as tasty desktop/workstation processors for systems like this which could seriously put the wind up the x86s (for x in 2,3,4) of the day. There was even an add-on "accelerator" card for PCs (PDF file). The shift of emphasis to embedded/mobile processors was because Wintel had the desktop market sewn up.
while I'm sure cutting down power usage in the server room would not be a BAD thing,
For anybody with a significant sized server farm, power consumption is a very BIG thing - first you have to pay for all that energy, then you have to pay again for all the air conditioning to get rid of the resulting heat. Also, less heat means you can pack more processors into a smaller space.
considering how much software, both for Windows AND Linux, that isn't for ARM based CPUs
Possibly true for Windows, but not so much for Linux, where most software is distributed as source and there's a long tradition of supporting multiple processor architectures. Debian supports ARM, for example. Heck, how many ARM-based NAS boxes are already out there running Samba and LAMP stacks? If you have Samba, Apache, PHP/Perl/Python and MySQL or PostgreSQL then that's already a pretty big range of server apps.
In a survey of 100 programmers, 111111 thought that duck-typing was a good idea.
They did effectively drop a single x86 CPU into one of the Itanium chips. However, since the (Windows) customers wanted to use x86 programs almost exclusively, they weren't so impressed with their new Itaniums performing like 400MHz Celerons.
There is practically no reason to prefer a specific instruction set for specific tasks. POWER isn't incredibly fast because it uses the POWER instruction set; you could use the exact same design techniques to make an incredibly fast ARM (or even x86, with a little more trickery). Good luck with selling a $20k ARM or x86 though.
Finally! A year of moderation! Ready for 2019?
What you say is true in general, but x86 and ARM provide almost identical C-level abstractions. Both are little-endian. Both are 32-bit. Both support unaligned access, but it's very slow[1]. Both support the same set of GCC intrinsics for atomic operations. I don't recall how va_list is implemented on ARM, but I've never seen any code that does anything with this other than use the standard macros, which are portable.
[1] Not true for all ARM chips, but the ones that we're talking about all trap to the OS for unaligned loads and stores. This is very slow, but it's pretty slow on x86 if it spans a cache line too.
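A couple of the properties listed above can be checked on whatever machine is at hand. A minimal sketch in Python using only the standard library; the `describe_platform` name is made up for illustration, and it reports what the running interpreter was built for, which is not necessarily everything the hardware can do:

```python
import struct
import sys


def describe_platform():
    """Report byte order and pointer width as seen by the running interpreter."""
    return {
        "byte_order": sys.byteorder,               # "little" or "big"
        "pointer_bits": struct.calcsize("P") * 8,  # width of a C void*
    }


info = describe_platform()
print(info)
```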
I am TheRaven on Soylent News
You can gain 30% CPU efficiency just by picking the L series Xeons. Or delaying your purchase by a year. Chip architectures are only orders-of-magnitude faster than other architectures at specific jobs in the small window until the other CPU designers catch up. Notice SSE vs. Altivec, or the various dedicated crypto/hash instructions.
The only place where it makes sense to have different architectures for different jobs is in GPUs, and you can already mix-and-match those to your heart's content.
Finally! A year of moderation! Ready for 2019?
The buzzword that ARM server vendors are spouting is physicalisation. The idea is that you get something with an ARM core or four (with integrated network controller), a blob of RAM and a blob of flash all stacked onto the same package. You put these densely on a board, probably an 8x8 arrangement, giving 64 discrete computers, connected to a SAN for anything that doesn't fit on the flash (probably 1-2GB in a single chip for the OS). Each one consumes 1-2W and can be powered down when not in use. If you want a new server, you flash a new chip with your boot image and power it up.
That said, the Cortex A15 does support virtualisation, up to four cores per die, and a 40-bit physical address space.
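Rough arithmetic for those figures, as a sketch: the 8x8 layout, the 1-2W per node and the 40-bit address space are the numbers quoted above; the rest is plain multiplication and ignores the SAN, networking and power supply losses.

```python
NODES_PER_BOARD = 8 * 8      # the 8x8 arrangement quoted above
WATTS_PER_NODE = (1, 2)      # quoted range per node, in watts

# Board-level draw for the low and high end of the quoted range.
board_watts = tuple(NODES_PER_BOARD * w for w in WATTS_PER_NODE)

# A 40-bit physical address space, expressed in GiB.
addressable_gib = 2 ** 40 // 2 ** 30

print(f"{NODES_PER_BOARD} nodes draw {board_watts[0]}-{board_watts[1]} W per board")
print(f"40-bit physical addressing covers {addressable_gib} GiB")
```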
I am TheRaven on Soylent News
Isn't the main problem with IA64 that it's slow executing its own code, possibly due to the quality of compilers, let alone emulating that of another chip?
the ones that we're talking about all trap to the OS for unaligned loads and stores.
While modern ARMs can trap to the OS and the OS can fix them up, that isn't the default behaviour.
http://lecs.cs.ucla.edu/wiki/index.php/XScale_alignment#Have_the_kernel_find_the_problem_for_you
This is very slow, but it's pretty slow on x86 if it spans a cache line too.
IIRC kernel traps are extremely slow (for example, reports I've seen say that kernel floating point emulation is 10 times slower than pure software floating point), which may be why this was never turned on by default. Is unaligned access on x86 really THAT slow?!
note: I'm known as plugwash most places but I screwed up registering that here somehow in the past and now can't register
There was no attempt to counter the OPs argument, only a single point that is untrue for the steady transition from i386 to AMD64. While I can respect your fairness, the facts remain that people are asking for AMD64 apps because they exhibit better real world performance and because few can be bothered with multilib systems.
Surely 32bits better for each pointer :P
A few years ago, Microsoft bought a company called Connectix. Their flagship product was called VirtualPC, and was an x86 emulator for PowerPC. Not only could they do it - they have the expertise in house to do it. I believe they used this code to run XBox games on the XBox 360 and if the people who wrote it haven't left they could almost certainly come up with an ARM version.
Or they could use it as an opportunity to push .NET - anything written with .NET will run in Windows on any CPU...
I am TheRaven on Soylent News
Maybe Google is interested in using ARM on the server-side. 64-bits sounds logical over there and it would heavily reduce power consumption on applications that are optimized.
It's not as slow on x86, but it's sufficiently slow that you want to avoid it. Basically, no one is doing the kind of horrible pointer casting that generates this kind of access on x86 in anything performance critical.
It's also worth noting that compilers will never generate unaligned loads if they can avoid it. If the compiler can tell that the load might be unaligned, it will generate a pair of load-mask-shift sequences and xor the results together (the shift is free on ARM, so this is not as expensive as it could be). It's only if you do some evil pointer arithmetic and casting that makes the compiler think it's got an aligned pointer when it actually has an unaligned one that you get problems.
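For anyone who wants to see the two-aligned-loads trick spelled out, here is a simulation in Python. It is only a sketch of the idea for little-endian 32-bit words, not what any particular compiler emits; the function names are made up, and a `bytes` object stands in for memory.

```python
def aligned_load32(mem, addr):
    """Load a little-endian 32-bit word; only aligned addresses are allowed."""
    assert addr % 4 == 0, "this load only works on aligned addresses"
    return int.from_bytes(mem[addr:addr + 4], "little")


def unaligned_load32(mem, addr):
    """Build an unaligned 32-bit load from two aligned loads plus shifts."""
    base = addr & ~3                 # round down to the enclosing aligned word
    offset = addr - base
    if offset == 0:
        return aligned_load32(mem, addr)
    lo = aligned_load32(mem, base)
    hi = aligned_load32(mem, base + 4)
    shift = 8 * offset
    low_part = lo >> shift                           # drops the bytes before addr
    high_part = (hi << (32 - shift)) & 0xFFFFFFFF    # keeps only the bytes we need
    # The two parts occupy disjoint bits, so xor and or give the same result.
    return low_part ^ high_part


memory = bytes(range(16))
print(hex(unaligned_load32(memory, 1)))
```

The problem case described above is when code skips this and hands the hardware a misaligned pointer that the compiler believed was aligned.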
I am TheRaven on Soylent News
Just imagine some 32 or so ARM cores in a chip. With x86, that thing would probably melt, but ARM power requirements are way lower.
Rethinking email
But would you really need to even have any VMs if you can fit 10-20x as many servers in the same amount of space? Instead of running a server on a VM you could just dedicate it to its own ARM computer.
Yes. VMs are massively easier to manage than physical machines.
The main problem isn't so much that compilers weren't good enough.. the main problem was the expectation that compilers *could* be good enough.
Intel's asymmetric execution unit philosophy only goes so far.. sure, you can get an extra amount of performance from the same number of transistors, but only for problems for which that asymmetry doesn't become a barrier.
The proper way to design asymmetric execution units is to sample lots of existing code and then produce an optimal set for that sample set.
This is how Intel's x86/x64 line of CPUs has been optimized and it works. The Itanium on the other hand was done backwards. They designed the asymmetric units first and then expected compilers to eventually produce code which leverages the asymmetry well..
AMD has always been more symmetric in its design. Each ALU can handle all of the same operations, and so forth.
"His name was James Damore."
And when you answer with the example be sure to tell us why not being able to run that application effectively "is totally crippling" to the platform.
I'm not saying that having more memory would not be useful, simply pointing out how idiotic the argument is that the platform is "crippled" simply because it has less memory than many desktop machines.
You also have to remember that many large 'web-properties' (top websites) all run on Linux or BSD. And that for example Google is supposedly the largest designer of motherboards only behind HP and Dell.
Only a quarter of the websites run on Windows, so that is still a lot of servers that could be replaced over the years with more efficient processors.
You could argue they already have; they sell many, many times more ARM-based devices than they sell desktop machines.
New things are always on the horizon
I've got eight 8-bit AVRs and duct tape right here. That's almost the same thing.
That isn't necessarily true. For an example look up the Via Nano design and check it out. They have placed a good chunk of silicon specially designed for crypto and RNG, with higher AES and Blowfish going through that chip like crap through a goose. Now I can't picture Intel and AMD suddenly deciding to just add a big chunk of silicon for a specific job like that which would only help in certain roles. But in a server I can definitely see how having that chip cranking the crypto while a nice Opteron or Xeon does the heavy lifting would be good.
The whole "we'll make it bigger LOL!" is how we ended up in the MHz wars in the first place. There is a GOOD REASON why ARM is having more and more things like BD encoding done off chip, it is because often a specialized chip can do it MUCH better than a general CPU. That is why you don't see Xeons in TV sets, and why Intel is making an in order CPU that is like a cranked up P1. I simply think it is an avenue worth exploring and that someone with a "new way" that came up with a drop in design could make serious $$$$$. Just look at those shops that have turned Atom into a blade server chip. By splitting the work to hundreds of smaller chips you end up with pretty significant power savings and that turns into damned good profits for the company that thought it up.
If ARM tries to put everything on chip they will end up just as power hungry as any Intel chip so it is pretty obvious they will probably go multi-chip, so maybe that route is worth exploring? Breakthroughs often come down to "hey why don't we try" in the tech field, and I still say this is a good niche which could make some good money. Imagine being able to boost performance and add server roles by simply dropping in an HT chip or even a PCIe board with the RAM and CPU onboard?
ACs don't waste your time replying, your posts are never seen by me.
Exactly. How could they pass it up?
Mac: I'm a Mac ... [PC slumps over].
PC: and I'm a PC
Mac: You're looking a little sluggish PC.
PC: Yeah, my Dr. says I should stop using this Intel processor, but I just can't quit.
Mac: That's too bad, because now that I use half the power I can run an entire day without recharging.
PC: Sure, but who would want
Mac: PC, are you OK?
PC: [picking back up] Sure, I'm just demonstrating how I can go a day without recharging by using sleep mode.
Mac: How much of the day will you be sleeping then?
PC: About 20 hours [PC slumps over again].
---------
If we reach a point where ARM chips are available that match x86 on two of price, performance, and power usage and beat x86 by say 20% on the other factor, you can bet that the only platforms which will remain using x86 are those that can't make the leap. And any platform that can't switch over will be obsoleted pretty soon.
That isn't necessarily true. For an example look up the Via Nano design and check it out. They have placed a good chunk of silicon specially designed for crypto and RNG, with higher AES and Blowfish going through that chip like crap through a goose. Now I can't picture Intel and AMD suddenly deciding to just add a big chunk of silicon for a specific job like that which would only help in certain roles.
It's a shame you can't picture it, because at least Intel has already implemented AES acceleration. I haven't followed AMD closely, but I doubt they'll let themselves fall behind for long.
Finally! A year of moderation! Ready for 2019?
I could be wrong, but having an ARM-based system with Linux/btrfs as iSCSI SAN target doesn't sound too bad either.
New things are always on the horizon
Can you please explain the advantage of ARM over X86 in the server room because this one has me scratching my head.
Perhaps the type of "servers" they had in mind were more akin to the SheevaPlug.
"Slashdot - News and Chat Sites Deviant". (Click "homepage" link above for details).
As someone who has watched Debian RC bugs, I can say architecture-specific failures are not at all unusual. Sometimes it is actual bugs in the toolchain, other times it's portability issues in the user code.
And how many of us non-Linux users have had to put up with bash-isms when someone specified /bin/sh, Linux-ism syscalls, and GNU-isms on BSD or Solaris when it comes to CLI utilities in scripts?
Portability bugs can be found all over the place. That's one of the problems with Unix to a certain extent.
Let me rephrase that, if the hosts using it are Linux-based, why not use Ceph instead of iSCSI ?
New things are always on the horizon
The advantage of ARM over X86 is lower power consumption. They are presumably at a point where they can replace an X86 with an ARM processor and both will be able to do the job. It's just that keeping an ARM one on 24/7/365 is cheaper on the electric bill than an X86... At least that is how I understand it.
09F911029D74E35BD84156C5635688C0
+2 Troll is Slashdot's way of saying groupthink is confused
I would put it at 9 years, since it was an evolution of the 8080 and 8085. Even when new it was considered primitive and limited. The 386 and 64bit versions have helped a lot, but ARM has also evolved a lot since then.
But the ARM has had the advantage of a much more advanced starting point than the X86 did.
The ARM has proven itself in the market. That is just it: as the x86 has crept up, so has the ARM, and just as the X86 has grown from the bottom up, so will the ARM.
See my blog http://ilovecookes.blogspot.com/ for light hearted technical information.
That's why I said servers, specifically. If MS produces an ARM port of NT, they'll port over IIS, Active Directory, Exchange, SQL Server, Sharepoint, and all the rest - everything that they need to tell people "you can run our software on your servers, no matter the architecture!" Third-party proprietary software will be slower, of course, but open-source stuff should also be ported quickly. In the end, the ecosystem for server software is actually a lot more architecture-flexible anyhow.
The thought of an ARM server hosting IE6-only ActiveX controls in x86 binary makes me several kinds of sad, but there's no reason it couldn't happen. IIS doesn't care what instruction set the bits it serves are intended for.
There's no place I could be, since I've found Serenity...
Are you redarded?
No, he just has a head cold.
Tiller's Rule: Never use a word in written form that you've only heard and never read. You will end up looking foolish.
True enough for the physical connection stuff, but pretty much everything else (bar the really (cpu|memory) intensive servers) can be virtualised :)
Running things on the same box is a problem if you don't have something like vSphere or Citrix XenServer, with their live migration abilities.
I am a viral sig. Please copy me and help me spread. Thank you.
Why suddenly does it appear that somebody is following jcr around with a bucket of "Troll" mods? "Score 3, Troll" just cracks me the frack up.
If you mod me down, I shall become more powerful than you could possibly imagine.