Sun To Release 8-Core Niagara 2 Processor
An anonymous reader writes "Sun Microsystems is set to announce its eight-core Niagara 2 processor next week. Each core supports eight threads, so the chip handles 64 simultaneous threads, making it the centerpiece of Sun's "Throughput Computing" effort. Along with having more cores than the quads from Intel and AMD, the Niagara 2 has dual, on-chip 10G Ethernet ports with cryptographic capability. Sun doesn't get much processor press, because the chips are used only in its own CoolThreads servers, but Niagara 2 will probably be the fastest processor out there when it's released, other than perhaps the also little-known 4-GHz IBM Power 6."
...If they put THESE under the GPL, along with the T1, they'd be getting more press than they could imagine. If they used these a bit more aggressively - such as using them as a graphics processor on a PC - they'd be getting some amazing press. If they keep them locked in a server closet, it's only then that nobody will care.
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
This processor will also have a floating-point unit for each core, unlike the UltraSPARC T1 (Niagara), which had only one shared amongst all 8 cores. This should make it much more suitable than the T1 for a wide variety of applications. The T1 did great on multithreaded server-type tasks (e.g. web, email, database) but would have been pretty hopeless for anything doing more than a bare minimum of FP work.
Yes, but will it run Vista?
...Quite literally I suspect if the cooling system ever breaks!
I like it. In my work with high performance computers, a significant limiting factor in a lot of our tasks was the interprocessor bandwidth. The Niagara 2 has a crossbar, with a huge amount of bandwidth available between the different cores and their L2 caches.
I'd like to see some benchmarks, and more technical specs, on these babies.
Well, we tend to have jobs that are somewhat interesting and potentially even what is commonly known as "a life". This may be an unfamiliar concept but it includes things that are more important than the processing capability of the latest SUN processor (though not by much) but a lot of the added value comes from the fact that conversations with a two year old are generally more interesting than debates here.
I've had a wonderful time, but this wasn't it -- Groucho Marx
(nt)
Because it isn't main stream. Their stuff is really, really expensive. That doesn't mean there's not a place for it (they'd not be in business if there wasn't) but most people just don't care because they can't pay that price. As an example we just took an old Sun server off of maintenance because we've moved all functions off of it. The cost of maintenance was $2,500 per year. No, that's not a typo. Ok well for that price, we can literally buy a new fairly high performance server from someone like Dell or Gateway (with a 3 year warranty).
Well, when you are talking in price classes of that nature, most people just don't give a shit. A new Intel or AMD processor is exciting because it is something that is in the realm of what people can actually afford. Even if the item itself is high end, you know it'll be coming down soon enough. Intel's quads were over a grand when they launched, now you can get one for like $300. Sun stuff based on their own chips (and even not) is just damn expensive. If you aren't an enterprise type user it just isn't going to be on the list of things you'll get.
Thus much less press.
Also, their processor division has been kinda lagging. The SPARC offerings prior to this really haven't stacked up that well against what Intel and AMD have. We got a Sun Fire V440 and it works fine and all for the SPARC only apps we have, but for things that will run on x86, it gets blown away by Core 2 Duos.
The Niagara looks cool but the base model is $10,000 which gets you the 4 core version of the chip and 8GB of RAM. If you want the 8 core setup, that's $21,500 minimum. At those prices, there's going to be little mainstream press as that is out of the range of even most companies. Thus most people just don't care, as Sun never will be bringing it to the masses (barring a massive strategy change).
Customers just want to fit 4 cores in one socket. That's all that matters. That you can get a 1U with two sockets and put 8 Intel cores in it for under $2k is a big deal right now.
That said I've always wanted to get my hands on some of these new multicore UltraSparcs. I think they have a lot of potential, and the new ones seem extremely powerful.
Now if only Sun would put the low end one in a Mac mini form factor and sell it as a Java developer's kit, then maybe I could play with one. The low end Sun Fires are something I could almost afford, but I don't really want to keep a 1U on my desk just to try out the technology.
I think the big 64-bit address space and the ability to run lots of threads seems to fit well with Sun's Java. Not that I am a Java developer, I just think it's a good match, and it seems to be that's why people were using the older CoolThreads systems, enterprise Java.
“Common sense is not so common.” — Voltaire
you make it sound like talking to a 2 year old is something bad to compare something worse to... Have you ever sat down and talked to a 2 year old? the conversations you have are much more interesting than that of your peers. You should try talking to a few younger kids, or even have some of your own. Have a good one.
... will a beowulf cluster of these run linux, or blend?
Tie two birds together: although they have four wings, they cannot fly. (The blind man)
"Sun to Release 8-Core Niagara 2 Processor"
So does that mean the Canadian version is going to be ten times better than the American?
huh? ethernet ports where? anyone care to explain?
Am I the only person who read the headline as "Sun to Release 8-Core Viagra 2 Processor"?
They're nocturnal, like Vampires..or raccoons :P
Parent: "Hey, would you like to .."
14-yo: "You hate me, don't you! I wish I wasn't born!"
molmod.com - computing tips from a molecular modeling
The two year old here wants me to tell you that "you are mister poo head".
I've had a wonderful time, but this wasn't it -- Groucho Marx
I think the Niagara is a pretty solid design, but it's not the processor to end all processors. For service workloads, I don't think you can get a better processor, but you probably don't want one of these processors in your workstation. Sun Microsystems is also headed in the right direction, establishing an open-community around these processors and Solaris.
I'm Trappped at Berkeley.
Then again, many of us have moved to Australia (or Portland, OR - which is almost the same thing), so don't read too much into the "time of post"
Niagara? I don't want to know what happens when one of these has to compute an integer overflow, do I?
"Let's face it, it's a good story. Accuracy would kill it."
can it blend? - yes I'm sure it can, the iphone blended.
speaking of which how much does this processor cost, and why doesn't Sun Microsystems make laptops, I was looking for Unix machines recently and I decided to go with the Mac book pro, rather than the Linux machines (laptops) at Dell, because of the hardware and general lack of processing power, which doesn't seem to lend itself to virtualizing other Operating systems.
Under the influence of Post-Cyberpunk Gonzo Journalism
I'm still waiting for an assembly that can actually allow me to work w-i-t-h-o-u-t the da.mn b..ox slowing down because something ins.i.d.e has decided to update/upgrade or otherwise do something that has nothing to do with my work. And no, it's virus clean and yes, it has plenty of resources. It's one of the reasons I like Linux, but even that has not yet solved the other problem especially laptop users have:
:-)
Why, after 20+ years of PC chips, does it still take so bloody long to boot up? Only "resume" is reasonably usable. I'm waiting for a BIOS that doesn't show me heaps of wonderful stuff other than when I, for instance, hold a key down during powerup, I'm waiting for an OS that is smart enough to realise that I'm unlikely to rip the video card out of my box, nor is the setup of HDD + CDROM drive going to change anytime soon so it would be nice if it didn't spend a friggin' year probing for change. Probe once, register, get on with it, do not probe again on any reboot. Give me a boot option to request a re-probe instead.
There's an open BIOS out there that boots so fast they had to slow it down to give the HDDs a chance to spin up - no sign of improvement for the rest of the world. It's a wide open marketing opportunity..
. I am obviously in serious need for my medicine, or at least coffee
Did anyone else notice TFA's mention of Sun's "bang up profits" of $329? Niagara for teh win! Seriously, though: the processor looks cool and all that, but how many of us are likely to seriously be able to play with one anytime soon? (And if anyone answers with 'I am!', I will happily trade jobs with you for a week)
http://xkcd.com/313/
They do. Ultra 3 Mobile.
There are also the units from Tadpole, and I'm sure others
Time to flip the record.
http://www.personaltours.ca/niagara-info.html
Deleted
How dare you correct your own mistakes!
I, and my fellow grammar Nazi overlords, were just about to rip your lousy post to shreds.
Looks like the Ultra 3 Mobile might have been a nice laptop, but it's no longer orderable. I didn't look to see if they have anything to replace it.
"In a time of universal deceit, telling the truth is a revolutionary act!" -- George Orwell (Eric Arthur Blair)
Only one silly meme per customer please.
Will it run Doom?
To me the most exciting part is that they're putting 2x10Gb ethernet ports directly on the CPU. The crypto is cool too: I hope it's not encapsulated entirely in the ethernet, so apps can call it directly.
If they made these CPUs cheap enough, we could put them on PCI-e cards in a Xeon, and run a Linux cluster over the PCI-e, coordinated by apps running on the Xeon. Or maybe stuff a Niagara/PCI-e box with extras, like we used to do with Mac Quadra 950/NuBus cards. But this time with 20Gbps ethernet per node, for a networked grid of nodes.
--
make install -not war
The quads from Intel provide four physical cores per socket. That is the definition of a quad in this context. The exact workings of how many bits of silicon there are, how they talk to each other and to the rest of the system is, to 99.999% of users and computer buyers, background fluff.
This was the same as when Intel put two single-core chips into a package to release a 'dual core'. Lots of people like you jumped up and down and pointed out it wasn't *real* dual core, and how the FSB issue would cripple performance. Amazingly, it wasn't the case - they sold in droves, and real-world performance was good enough to carry Intel through to the 'true' dual core, the Core 2 Duo.
If the competition had anything out that was the same cost and performed significantly better than the 'fake' quad cores, you would have an argument. But they haven't and you don't. Bear in mind I'm talking about the huge x86/x64 market, not the relatively low volume non-x86 server market.
What Intel did back then and again now is perfectly sensible. They have millions of high yield, robust dual core chips being churned out, and they have built into the infrastructure the ability to put two into a package, lower the speed a bit to drop the per-core heat output, and sell reasonably priced (now) quad core chips. When the drop to 45nm happens, they will release their 'real' quad cores, and pretty quickly put two of those into a package to start selling oct-core (whatever we're going to call them). And so it goes.
What's the alternative? Not sell quads until 45nm comes out? Not working out too well for AMD, is it? I've asked the question before here and on realworldtech.com - at what point will the FSB problem actually become a painful problem for the Intel chips? Well, not yet (4 core) is the answer, despite dire predictions from the AMD camp for years. My guess is that, shock of shocks, Intel have actually thought it through - and that's why CSI is coming. When the number of cores gets to the point where FSB will actually hurt performance relative to the AMD architecture, that's when CSI will kick in. Maybe at 8 cores, maybe at 16.
What, you don't need quad core yet? Fine, stop your bitching and choose what's right for you. Vive la difference, and 3 cheers for a market that gives us the choice.
One quite important point with the T2000 cost is that Oracle requires 0.25 license per core as opposed to 0.50 license/core with Intel/AMD systems.
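To put rough numbers on the licensing point, here is a minimal sketch. It assumes the per-core factors quoted above (0.25 for the T2000, 0.50 for Intel/AMD) and that a fractional total is rounded up to whole licenses; the rounding rule and the function name are my own assumptions, not Oracle's published terms.

```python
import math

def licenses_needed(cores: int, factor_per_core: float) -> int:
    """Licenses for one box: cores times the per-core factor, rounded up (assumed rule)."""
    return math.ceil(cores * factor_per_core)

t2000 = licenses_needed(8, 0.25)  # one 8-core T2000
x86 = licenses_needed(8, 0.50)    # 8 Intel/AMD cores, e.g. two quad-core sockets

print("T2000 licenses:", t2000)
print("x86 licenses:  ", x86)
```

On those assumptions an 8-core T2000 needs half the licenses of an 8-core x86 box, which can matter more than the hardware price when the database license is the expensive part.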
I've got 2 T2000/32GB ram boxes here and if you remember their limitations and run what they are designed for, they are awesome.
Resistance is not futile - www.gnu.org
"Do No Evil"
It's like it's 1999 all-over again, except this time Sun actually has revenue in-line with expectations. I continue to maintain Sun is this century's Bell Labs and Xerox PARC all rolled into one.
Your troll-fu is weak, Young Sycraft. You're arguing something we already know, and something that most people will acknowledge - Sun isn't for the hobbyist or the home/small business market.
But saying they don't get press is a misnomer. They do get press - in the publications that matter to their market. You'll find ads and articles for Sun in places like CIO magazine, Infoworld, and Information Week. Tom's Hardware, in the big scheme of things, is great for commodity hardware review, but when you're building a 24/7 datacenter, they are probably the last place to look for information.
And finally, with there being a greater focus on virtualization and data center consolidation, a server that can handle 64 simultaneous threads will go a long way to conserving datacenter floor space, lowering cooling costs, and using less electricity. So it may have a greater upfront cost, but if you factor it out over 10 years, it will pay for itself in lower energy bills for the datacenter.
Don't get me wrong. Most of a large financial house doesn't need a Sun server. However, those who do (e.g. quants) really do. Same goes for government, biotech, military contractors, etc.
Me wants it...my precious.
I have been actively interested in the T1 and T2 series for a while. Currently, my backup server at work is a v880 (Sparc III) with 8 GigE interfaces.
I could replace it and get more throughput from a T2000, but the issue was that restores would lose that edge because of poor single-thread performance.
The Niagara 2 series is set to have 1.4X the single thread performance, plus the higher simultaneous threads (Though a slightly longer pipeline).
Since I am moving away from tape and going to Virtual Tape Library tech, I won't be constrained by how many backups I can do and avoid over multiplexing. I plan on doing 24-32 (or even more) simultaneous backups to virtual tape drives without skipping a beat. The only thing then will be keeping the network from being over-saturated.
Don't have any 10Gbe switches in house yet, but that can't be too far off. I'd likely put in 2 4 port 1Gbe cards and pump them like no tomorrow. I'm getting about 20-30MB/sec from each machine, so assuming 140MB/sec on a GigE port, and 8 of them, I can handle over 1100MB/sec, but doing 32 backups would be about 950MB/sec. It is close, but should work.
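The back-of-envelope sum above can be checked with a few lines. This is only a sketch: the 140 MB/sec per-port figure and the 20-30 MB/sec per-client rate are the numbers from the post, the function name is made up, and the 125 MB/sec case is simply the raw line rate of gigabit Ethernet (1000 Mbit/s divided by 8) added for comparison.

```python
def headroom_mb_s(ports, per_port_mb_s, streams, per_stream_mb_s):
    """Spare network capacity in MB/sec: total port capacity minus total backup demand."""
    return ports * per_port_mb_s - streams * per_stream_mb_s

PORTS, STREAMS = 8, 32

for per_port in (140, 125):        # post's assumption vs. raw GigE line rate
    for per_stream in (20, 30):    # observed per-machine range from the post
        spare = headroom_mb_s(PORTS, per_port, STREAMS, per_stream)
        print(f"{per_port} MB/s per port, {per_stream} MB/s per stream: {spare} MB/s spare")
```

With the post's own numbers and every client at the top of its range, there is still some spare capacity; at the raw line rate the margin gets much thinner, so "close, but should work" seems about right.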
It's an apples vs. oranges argument. Pick the tool based on what you need, not on what you see on the shelves at Best Buy or Fry's.
Last I heard, they don't. However, the UltraSPARC line isn't really what you want in a laptop anyways. Much better to get an X64 laptop and run Solaris10/x86 on it. You can use the list of tested and proven hardware for Solaris x86 to make sure it'll run without fiddling.
There's been a lot of (justified) doubt in the past about Sun's commitment to Solaris x86, but it clearly is the future of consumer-directed Solaris. And it rocks.
"People who do stupid things with hazardous materials often die." -- Jim Davidson on alt.folklore.urban
I agree about the x64, and I am/was one of those doubters.
"In a time of universal deceit, telling the truth is a revolutionary act!" -- George Orwell (Eric Arthur Blair)
All that and the 64 threads run at 84 watts maximum (not TDP)
:-P
Please note that it doesn't *run* 64 threads simultaneously. It *manages* 8 threads per core -- but each core has only two integer units, one load-store unit, and one floating point unit. At best a core can have ops from four different threads in simultaneous execution, but this will be a very rare case (when int, int, float, load/store happen at same cycle). Most often each core will be able to simultaneously execute instructions from just one or two threads -- which all is still excellent for 84W!
It just irks me when people read "manages 64 threads" as "is a 64-core uber-chip", when what they have is just a wider version of Intel's idling-eliminating HyperThreading (in each of the eight cores). Surprisingly Sun PR hasn't made much of an effort to remedy the misconception
Yeah. I have to use an umbrella every time I put a shrimp on the barbie.
(Of course that's nonsense, none of us ever use umbrellas)
[...] and why doesn't Sun Microsystems make laptops, I was looking for Unix machines recently and I decided to go with the Mac book pro, rather than the Linux machines (laptops) at Dell, because of the hardware and general lack of processing power, which doesn't seem to lend itself to virtualizing other Operating systems.
That's odd.
Right now I'm looking at a Dell Latitude D630 that has 3.5 GB of addressable RAM, a Core 2 Duo T7300 dual core chip @ 2 GHz, and an NVidia Quadro graphics card. And it's running CentOS 5 just fine.
~Wx
sig?
Fuck Everything, We're Doing Five Blades
(The funniest thing about this article is that a year after they published it, Gillette actually did release a five-bladed shaver!)
Each core supports eight threads, so the chip handles 64 simultaneous threads, making it the centerpiece of Sun's "Throughput Computing" effort.
Wow! Only 64 threads, eh? That's the problem with threads, you can't have too many of them because switching from one thread to another is very expensive, cycle-wise. In other words, as long as threads remain the only multitasking mechanism used by the computer industry, super fast, fine-grained multiprocessing will remain a dream. It gets worse. There is another problem with threads that is even worse than this. Threads are inherently asynchronous. Until and unless the computer industry comes to its senses and realizes that asynchronous processing makes it impossible to implement programs with deterministic timing, we will continue to pay the heavy price of software unreliability. Switch to a non-algorithmic, signal-based, synchronous software model (with the supporting CPU architecture), and the problem will disappear. Threads suck! Period. One man's opinion.
Thanks for the info, but I'm sure you can forgive me if I just want something off the shelf. I also want the virtualization software to just run, and I need to virtualize whatever OS I want. Your system would be nice if Dell was selling them with the option of adding Xen, and incidentally running a *nix system from the start. Honestly, I don't understand why Dell doesn't offer this sort of thing. It kinda seems like a waste to put M$ on a 64bit system, and selling Linux without a 64bit option is kinda goofy too.
Under the influence of Post-Cyberpunk Gonzo Journalism
Anyone got links to user discussions or reviews on 4.7GHz Power6 since they've been out for 2 months? I've read all the IBM, Oracle, Data Center promo stuff.
"Will a beowulf cluster of these run Vista?"
:(
Well, no. It wouldn't. I guess the joke wasn't that funny then
Quite true. For more info on the way its cores work, see the UltraSPARC T1 article on Wikipedia (which I have edited quite a bit). Each core is a barrel processor, meaning each stage in the pipeline is handling an instruction from a different thread. This adds complexity, but in exchange it means that branch mis-prediction is no longer a problem - any branch instruction has already been through the execute stage and the Program Counter modified before the next instruction of the thread gets fetched.
The other big advantage with the multi-threaded UltraSPARC T1/T2 design is that it has high throughput. While a single-threaded CPU has to wait on cache misses, the T1/T2 just continues chugging along with its remaining threads. It's switching threads on every clock cycle, so each thread gets only 1/8th of the 'power' of each core. But because it's doing something on every single clock cycle, it can do a lot of work - as long as the work is multi-threaded. That's its weakness.
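The latency-hiding argument can be illustrated with a toy model. This is a sketch under made-up assumptions, not a model of the real T1/T2 pipeline: one instruction issues per cycle, every 10th instruction of a thread is a cache miss that stalls that thread for 20 cycles, and the core picks the next non-stalled thread in round-robin order. The function name and all the numbers are hypothetical.

```python
def run_core(n_threads, instrs_per_thread, miss_every, miss_latency):
    """Cycles a toy round-robin core needs to issue every instruction of every thread."""
    remaining = [instrs_per_thread] * n_threads
    ready_at = [0] * n_threads   # first cycle at which each thread may issue again
    cycle = 0
    nxt = 0                      # round-robin pointer
    while any(remaining):
        # Find the next thread that still has work and is not stalled on a miss.
        for offset in range(n_threads):
            t = (nxt + offset) % n_threads
            if remaining[t] and ready_at[t] <= cycle:
                remaining[t] -= 1
                issued = instrs_per_thread - remaining[t]
                if issued % miss_every == 0:
                    # Cache miss: this thread sits out for miss_latency cycles.
                    ready_at[t] = cycle + 1 + miss_latency
                nxt = (t + 1) % n_threads
                break
        cycle += 1               # an idle cycle if no thread could issue
    return cycle

one = run_core(1, 100, miss_every=10, miss_latency=20)
eight = run_core(8, 100, miss_every=10, miss_latency=20)
print("1 thread :", 100 / one, "instructions per cycle")
print("8 threads:", 800 / eight, "instructions per cycle")
```

In this toy the eight-thread run does eight times the work in far less than eight times the cycles, because other threads fill the slots one thread would have spent waiting on memory. The flip side is also visible: any single thread takes longer to finish when it shares the core than when it runs alone.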
I was lucky enough to get to play around with a Niagara 1 demo unit a year or so ago, and it was mediocre as a general-purpose server. The system was amazingly fast if you could keep its 32 threads saturated (and I notice that the new one is 64 threads), but if you were only running, say, 8 threads, you would do as well on a more mundane server. I don't have the exact numbers here, but from what I recall:
The first test was a "make" test. On my desktop machine (generic dual-core Athlon), configure for some large software package (BerkeleyDB, I think, to run more benchmarks on) took a minute, and make -j 3 took 5. On the Niagara, configure took 5 minutes, and make -j 40 took only one.
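Adding up those recalled timings makes the point about the serial step explicit. A minimal sketch, using only the approximate minutes given above (the poster says they are from memory) and treating configure as the single-threaded phase; the helper names are mine.

```python
# Approximate minutes recalled in the post above.
desktop = {"configure": 1, "make": 5}  # dual-core Athlon, make -j 3
niagara = {"configure": 5, "make": 1}  # Niagara, make -j 40

def total_minutes(run):
    """Wall-clock time for the whole build."""
    return sum(run.values())

def serial_share(run):
    """Fraction of the build spent in the single-threaded configure step."""
    return run["configure"] / total_minutes(run)

for name, run in (("desktop", desktop), ("niagara", niagara)):
    print(name, total_minutes(run), "min total,",
          f"{serial_share(run):.0%} spent in configure")
```

On these numbers the two machines finish the whole build in the same time: the Niagara wins the parallel make step and gives all of it back in the serial configure step.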
For high-concurrency database benchmarks, the cost of synchronization made the Niagara slower than a standard AMD-based server. For a less concurrent load, the Niagara was of course much faster. Interestingly, a dual-core server performed much better here than a dual-processor single-core server, because the synchronization cost was lower.
For web applications, the Niagara did well for simple applications, but introduced unacceptable latencies for more CPU-intensive ones.
For anything floating-point, the original Niagara choked due to its single FPU, but that's what the T2 is supposed to fix.
I hereby place the above post in the public domain.
There's some (very little) talk of Sol 10 x86, mostly pushed by the "If it ain't Sun/Solaris, it's crap!!" crowd, but it's not getting much more than a little mention from the ISVs because to them it's another whole platform to support, and why bother when Linux is working OK and they did all that work to move to Linux (at customer request) a few years back?
Price of a system is rarely a company's most significant cost (within an order of magnitude) when you're dealing with high performance computing. It's the people and the data vendor relationships that usually cost you the bulk of your outlay. Hardware of just about any sort is fairly cheap by comparison.
True, but for us the highest cost is the license costs of the software. So they want to get as many jobs as they can out the door as fast as they can, and SPARC can't do it.
Now the big push from us (IT and engineering) is for the ISVs to parallelize their tools as much as possible. They were ok for awhile there when CPU speeds kept going up, but now that the CPU makers have basically given up on the speed jumps and are going multi-core, the tools need to parallelize too. Thus far, the multi-core and multi-hosted tools still run orders of magnitude faster on linux hosts vs even the latest from Sun.
[*]There was a very short stint with IA64 as king for huge RAM jobs (128Gig+ in the box) for the last testing stages, but x86_64 overtook the cpus in speed awhile ago, and now you can get 256Gig+ hosts (we've recently ordered a few of those). We've got a small amount of 32bit linux boxes left too.. mostly for support of older tools.
- My favorite error message: xscreensaver, running on an old Sparc 5 w/ 8bit color: bsod: Couldn't allocate color Blue
Thanks for the clarification.
The 84 W is impressive considering that in addition to the CPU itself, not only is the 921 Gb/sec RAM controller on the chip (AMD does that too) but also the two 10 Gb Ethernet links (at 50 Gb/sec) and the PCIe interface (at 40 Gb/sec). Normal Intel north bridges don't have 10 GbE, period, and still consume plenty of power alongside the CPU.
Where I work our datacenter is a bit constrained on space, power, and cooling. Adding these bad boys allows us to support many more applications, websites, and whatever else the business wants with less power and cooling and capital cost than what we used last year. And, yes, you can get a three year warranty on brand name Intel servers but the reliability and serviceability of Sun gear lasts way beyond three years.
I think their desktops suck. And I wasn't too much of a fan of Solaris until Sol 10. It was boring. Run Solaris x86 if you want to try it cheap. Linux has made it much better by forcing new features like DTrace and ZFS. The cost of entry is a bit steep (and overpowered) for SME, but if you want serious computing power you can do much worse than Sun. They've been written off more times than I care to count (kind of like Apple) but they're still standing.
Now, I'm no Sun fan boy. I work with Linux exclusively these days because that's the world I live in, and it's what my application wants, but if I were in the business of high-powered computation and I needed a box to run a parallel, non-distributed application, I would run it on a Sun.
There was a very short stint with IA64 as king for huge RAM jobs (128Gig+ in the box) for the last testing stages, but x86_64 overtook the cpus in speed awhile ago, and now you can get 256Gig+ hosts (we've recently ordered a few of those). We've got a small amount of 32bit linux boxes left too.. mostly for support of older tools.
I think you live in the world where very fast, single-threaded, memory-hungry computation is king. In that world, Intel running Linux might be the right call. Just don't make the mistake of assuming that all the world is your back yard.
"Sun Smoke. Don't breath this."
My abilities are only limited by my imagination
SPARC64 is a waste of money. Don't buy it.
Buy x86 or x86-64 instead.
This processor is ideal for a Veritas Netbackup media server. Netbackup is heavily multi-threaded, and getting the backup data across the network to the server will be tremendously aided by the on-chip 10G bandwidth. Symantec has been doing this with great success.
Imagine 8 web servers each on its own "server" in an LDom - with the ability to handle multiple requests. While you can do something similar with Solaris 10 containers, LDoms give you control over the memory and independent O/S images.
LDoms can be used with containers. In my example, those 8 "servers" could be owned by different departments who could use containers to have a test and development containers within the same LDom.
People are having good results from similar efforts with VMware for a non-trivial fee. By the way, LDoms are free.
Did anyone mention that the T1-based servers such as the T1000 and T2000 don't require a lot of power (300-350 watts)? Replacing 8 older Sun SPARC servers with one (as in my example) can save a lot of space and electricity.
-johnj
John J. McLaughlin, Editor-in-Chief/CTO, System News Inc. Publishers of "System News for Sun Users"
I was also 'lucky' enough to play around with a Niagara processor.
I happened to have my own highly threaded app.
It ran really, really slowly. The sun engineers eventually figured out that two of my
threads would saturate one of their processor cores.
(there was zero fp in my code).
With 32 threads, their fastest T2000 took 10,600 seconds.
My old, dual xeon 2.4ghz machine took 11,494 seconds.
An 8 way opteron 2.8ghz machine was 11.6 times faster than the niagara.
A Sun Sparc V890 (8 CPUs, 16 cores) was 8.3 times faster than the niagara.
Since their t2000 cost around $20k, and a dual 2.4ghz xeon cost around $500-$1000,
that made the price/performance roughly 20-40 times better for a xeon than the niagara.
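For anyone who wants to redo that arithmetic, here is a small sketch using only the run times and rough prices quoted above. Treating performance as one over the run time is my simplification, and the function names are made up.

```python
def perf_per_dollar(seconds, dollars):
    """Runs per second per dollar spent: higher is better."""
    return 1.0 / (seconds * dollars)

def advantage(box, baseline):
    """How many times better 'box' is than 'baseline' on performance per dollar."""
    return perf_per_dollar(*box) / perf_per_dollar(*baseline)

t2000 = (10_600, 20_000)     # seconds, dollars (32 threads, approximate price)
xeon_low = (11_494, 500)     # dual 2.4GHz Xeon at the low price estimate
xeon_high = (11_494, 1_000)  # same machine at the high price estimate

print(f"Xeon at $1000: {advantage(xeon_high, t2000):.1f}x")
print(f"Xeon at $500:  {advantage(xeon_low, t2000):.1f}x")
```

The Xeon is slightly slower on this job, so the result comes out a little under the raw 20-40x price ratio, but it stays in the same ballpark.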
Perhaps their new chip is higher performing than their earlier generation.
You can see all the gory details at www.weasel.com/comp-perf.html
source code available upon request.