Is Sun's Niagara Server Viagra?
argonaut writes "Ace's Hardware has an in-depth article on Niagara -- Sun's upcoming parallel server processor with 8 cores and 4 threads each. The article discusses the chip's radical architecture and what kind of performance can be expected from it in traditionally thread-heavy server applications like web hosting, databases, and other multi-user applications. Given the recent cancellation of the UltraSPARC V, it seems this is going to be Sun's new direction for its in-house CPU design efforts. Furthermore, both Intel and IBM are working on other highly parallel processors and AMD is expected to eventually introduce a dual-core Opteron. So, will more threads prop up Sun's performance?"
Not to destroy the lovely mental image in this thread. Well, here is the story: Sun is working on Niagara and Rock. Rock would combine the single-threaded approach of the UltraSPARC product line with the multithreaded architecture of the Niagara processor ... check out the complete article
The only really significant change needs to be in the lower levels of Solaris' scheduler, so that it handles the context switches properly. Solaris already does that for existing SPARC architectures with thread level parallelism support. The only difference the OS sees is the caches and the number of available "slots" for running LWPs.
Of course, you'll only see a significant benefit when you have lots of threads in the run-ready state (which mostly happens when you have lots of threads, period). Given Java's fondness for threads, and Solaris' already outstanding handling of systems with thousands of threads, this seems like a smart optimisation choice.
So, with the necessary Solaris installed, your existing Tomcat running on your existing JVM will see all the benefits.
## W.Finlay McWalter ## http://www.mcwalter.org ##
It's interesting that you should mention that, because one of the early multi-threaded processors (at Tera) was specifically designed to solve that problem. The theory was, and still is, that if one thread has to stall it's OK because there are still plenty of others that can keep running from cache. So no, you won't have N threads all running without waits and yielding N threads' worth of performance, but you'll still have enough live threads to give you more performance than you'd have with a single-threaded core.
Only time will tell which way it will really go. Most likely, there will be some workloads on which this approach works extremely well, some on which it provides no benefit, and a few on which you would have been better off with a "fat" single-thread CPU design. One thing to remember is that if the system has X threads, cache pollution and memory bandwidth are going to be problems either way. The fact that the multi-thread processor can still get some work done on some threads even while others are blocked waiting for memory will probably allow it to maintain an advantage over a faster single-thread processor that blocks completely more often.
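The latency-hiding argument above can be put into a back-of-the-envelope model. This is only a sketch under loud assumptions: every thread alternates between a fixed burst of compute cycles and a fixed memory stall, switching threads is free, and cache pollution and memory bandwidth contention (the very problems mentioned above) are ignored. The function name and the cycle counts are made up for illustration and are not Niagara figures.

```python
def core_utilization(threads, compute_cycles, stall_cycles):
    """Estimate the fraction of cycles a core spends doing useful work.

    Idealized model (an assumption, not a measurement): each thread runs
    for `compute_cycles`, then stalls on memory for `stall_cycles`, and
    the core can switch to another ready thread at no cost.
    """
    if threads < 1:
        raise ValueError("need at least one thread")
    # Fraction of time a single thread is runnable rather than stalled.
    busy_fraction = compute_cycles / (compute_cycles + stall_cycles)
    # More threads fill more of the stall time, up to a fully busy core.
    return min(1.0, threads * busy_fraction)


if __name__ == "__main__":
    # Hypothetical workload: 25 cycles of work per 75-cycle memory stall.
    for n in (1, 2, 4, 8):
        print(n, "threads ->", core_utilization(n, 25, 75))
```

In this toy model a single thread keeps the core busy a quarter of the time, and four threads are enough to saturate it. Real workloads will fall short of that once the threads start evicting each other's cache lines.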
Slashdot - News for Herds. Stuff that Splatters.
Niagara might have on-chip Ethernet controllers as well as an on-chip TCP/IP Offload Engine. For network-intensive applications this could help performance in several ways. Firstly, communication between the Ethernet controllers and the CPUs would be internal to the chip, and would not require using system bandwidth. With on-chip buffers, even I/O buffers in main memory could possibly be eliminated (or at least reduced), which would help reduce the burden on main memory bandwidth, as well as improving latency. Niagara also seems to have built-in SSL acceleration, which helps reduce CPU load and improve overall performance. It would be interesting if Niagara has hardware GZIP acceleration too, as dynamically compressing HTML pages for browsers which say they support it can achieve 10:1 to 20:1 compression ratios.
(excerpt from the article)
FYI, the server is using a single 500MHz UltraSparc IIe CPU...
A great number of people use SPARCs to run Oracle databases.
Current Oracle licensing schemes require that clients pay PER CPU CORE for multi-core processors. This screws anyone that uses Sun boxes, because the cores are US2 based. So the Oracle client has to pay heaps of cash to use, effectively, a 5-year-old processor design. In addition, Oracle licensing requires that if your server has the capacity to hold more than 4 processors (i.e., cores) then you have to pay the "enterprise" rates.
So in conclusion, the price of Oracle on a 2-CPU Xeon, AMD, or UltraSPARC III is about $6000. The price for Oracle on a 2-CPU Niagara (8 cores each) will be $320,000. Only an idiot will use this CPU (or this database). Since a lot of companies have a huge investment in Oracle, they will have no choice but to switch to x86 hardware. Sun is going to kill themselves with this design, despite the fact that the design, in itself, will greatly improve the throughput of their servers.
Oracle licensing is heavily slanted toward Intel architecture; they have always penalized people for using RISC-based processors.
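For anyone checking the arithmetic behind those two figures, here is a quick sketch. The only inputs are the dollar amounts and core counts quoted above; the per-core rate is simply what those numbers imply, not something taken from an Oracle price list, and the function name is made up for illustration.

```python
def implied_rate_per_core(total_price, cpus, cores_per_cpu):
    """Return the per-core price implied by a quoted total."""
    return total_price / (cpus * cores_per_cpu)


# Figures quoted in the post above (not verified against any price list).
single_core_total = 6_000    # 2 single-core CPUs
niagara_total = 320_000      # 2 Niagara CPUs with 8 cores each

per_core_single = implied_rate_per_core(single_core_total, cpus=2, cores_per_cpu=1)
per_core_niagara = implied_rate_per_core(niagara_total, cpus=2, cores_per_cpu=8)

print("implied rate per single-core CPU:", per_core_single)
print("implied rate per Niagara core:", per_core_niagara)
print("total cost ratio:", round(niagara_total / single_core_total, 1))
```

So the quoted totals work out to 16 licensed cores at an implied "enterprise" rate per core, and a total bill roughly 53 times the 2-CPU figure.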
The backplane is what facilitates communication between CPU boards. Yes, they *rate* throughput at 9.6GBps, and that may be the rate. Of course they have more throughput than (typical) Intel machines; those are generally lower-cost machines and they don't have the margins to support high-end features such as high-bandwidth backplanes. My point is that Sun's backplane CAN'T really improve, as they've nailed the clock speed to support multi-speed CPU boards. IBM's backplanes scale 1:3 with CPU speed; you have to have all CPUs at the same rate (e.g. 1.1GHz, 1.45GHz, 1.9GHz), but the backplane is scaled at 1/3, or 367MHz, 483MHz, 633MHz, etc. IBM's backplane CAN increase sustained throughput as faster CPUs are installed, for better overall system scalability.
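The 1:3 scaling is easy to check. A tiny sketch, assuming nothing beyond the ratio and the CPU clocks quoted above (the function name is made up for illustration):

```python
def backplane_mhz(cpu_mhz, ratio=3):
    """Backplane clock for a CPU clock under a 1:ratio scaling, in MHz."""
    return round(cpu_mhz / ratio)


# CPU clocks quoted above, expressed in MHz.
for cpu_mhz in (1100, 1450, 1900):
    print(cpu_mhz, "MHz CPU ->", backplane_mhz(cpu_mhz), "MHz backplane")
```

The rounded results match the 367MHz, 483MHz and 633MHz figures given above.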
As for domains with greater h/w isolation vs. LPARs/VPARs with more flexibility: all I'm saying is that IBM and HP have designed in the ability to offer single and sub-CPU system images because, as CPU speeds increase, how many systems will really require 4 / 8 / 16 + CPUs in a single system image? Our company is stinking with 1- and 2-CPU DB servers and we could see having sub-CPU LPARs for tech test / dev servers...
150 MHz * 288 bits
What part of hyperthreading and "both Intel and IBM are working on other highly parallel processors and AMD is expected to eventually introduce a dual-core Opteron" says to you that "Intel or IBM are not going into that direction that far"?
It might be just the way I'm reading it, but the only difference is that Intel started small (hyperthreading) and still currently relies on several physical processors. IBM's Power already has multiple cores, and this isn't the first time a dual-core Opteron was mentioned.
It seems to me that, in a manner of speaking, Sun is just currently ahead of the pack. Ultra4 is already a dual core, and with the way Solaris handles multiple threads and multiple processors, I doubt it's much of a leap to have it perform very well with an 8-core module. They saw that Ultra4 did what was expected, Solaris worked well, and took the next step and said, how far can we take this? I doubt that, at the very least, Power won't have 4 or 8 cores itself in the future.
I don't see that too many changes would have to be made to Solaris to make some very good use of this processor, so this could be a very good thing for Sun. Just think of apps that are licensed per-processor, and now you have one that you have to pay for that can do the work that several were doing before.
"I use a Mac because I'm just better than you are."
Well, the thing with NIO is, you can write what used to be a one-thread-per-client app with a single thread. So it will actually reduce the need for threads overall; you will just use one per CPU.
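To make the single-thread idea concrete, here is a minimal sketch of readiness-based multiplexing. Note the swap: it uses Python's standard `selectors` module as a stand-in for Java NIO's `Selector`, not NIO itself. The `serve_once` helper and the use of `socket.socketpair()` to fake three clients are assumptions made purely so the example is self-contained.

```python
import selectors
import socket


def serve_once(connections):
    """Answer one message per connection from a single thread.

    Hypothetical helper: registers every connection with one selector and
    replies (upper-cased echo) as each one becomes readable.
    """
    sel = selectors.DefaultSelector()
    for conn in connections:
        conn.setblocking(False)
        sel.register(conn, selectors.EVENT_READ)
    handled = 0
    while handled < len(connections):
        ready = sel.select(timeout=1.0)
        if not ready:
            break  # nothing readable within the timeout, so stop waiting
        for key, _events in ready:
            conn = key.fileobj
            data = conn.recv(1024)
            conn.sendall(data.upper())
            sel.unregister(conn)
            handled += 1
    sel.close()
    return handled


# Simulate three clients with connected socket pairs (no network needed).
pairs = [socket.socketpair() for _ in range(3)]
for i, (client, _server) in enumerate(pairs):
    client.sendall(f"client {i}".encode())

handled = serve_once([server for _client, server in pairs])
replies = [client.recv(1024).decode() for client, _server in pairs]

for client, server in pairs:
    client.close()
    server.close()

print(handled, replies)
```

One thread services all three connections because it only touches a socket once the selector reports it readable. Scaling that to "one per CPU" would mean running one such loop per processor, each with its own share of the connections.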
Karma: It's all a bunch of tree-huggin' hippy crap!
The Niagara processor and its successor, Rock, are based almost entirely on the Hydra processor that Professor Kunle Olukotun developed at Stanford University. He co-founded the company, Afara Websystems, that Sun Microsystems purchased. If you want to know how Niagara works, just check out the Hydra processor.
The reason that Sun Microsystems abandoned the UltraSPARC V and successors is that the design teams who developed the UltraSPARC processors after the UltraSPARC II were just horrible. Normally, when engineers develop the microarchitecture and eventually the Verilog model of the chip, a documentation engineer documents all aspects of the chip. In the case of Sun Microsystems, there was no documentation engineer. Ultimately, on the very day that Sun released its processor to the market, no documentation existed.
Even Sun's own engineers did not have the documentation to develop the boards that would accept an UltraSPARC processor. The whole experience is incredibly stupid but true. Most engineers on the processor teams are Indians or Taiwanese, and they just "do not do documentation". Various Linux gurus complained about the lack of documentation needed to port Linux to the latest version of the UltraSPARC. Sun would have loved to produce the documentation if it existed. Unfortunately, it just did not exist.
UltraSPARC V had the same problem. The whole design process for the UltraSPARC V was a mess, and canceling the project fixed the mess.
Sun does not have the engineers with the skills to build a fat-core processor. So, Sun moved to thin-core processors like Niagara. They are easier to build and to document. They simply matched Sun's skill set, which is derived mostly from foreigners.
Unfortunately for Sun, what is easy for Sun to design and build is also very easy for IBM and HP to design and build. If you IBM and HP engineers are reading this article, you are in luck. Just check out the Hydra processor, and you will know 80% of the microarchitecture of the Niagara processor. Fortunately for you guys, building a Hydra-based processor that executes the Power instruction set architecture (ISA) or the HP ISA is much easier than building a processor that executes the SPARC ISA. Those damned 128-register register windows diminish the number of cores that can be squeezed onto the die.
I would like nothing more than to see Sun's processor department setting by 2008. Sun should not be in the business of designing processors. The UltraSPARC-III fiasco should have been a big clue.
If Sun were purely a software house, we'd have a chance of making a profit.
2) Solaris 10 with N1 Grid containers should give Solaris the finer grain control that users want.