Domain: spec.org
Stories and comments across the archive that link to spec.org.
Comments · 448
-
Re:Macintoshes
It was never about performance per se -- there are plenty of faster things out there than the Core 2 Duo.
Please tell me what processor does better than Core 2 Duo on SPEC CPU?
-
Re:Hmm, so which better reflects real-world usage?
It seems unfortunate that The Tech Report is the one that has to step up and measure energy efficiency. OK, so AMD is more efficient at idle and Xeon is more efficient at 100%. Who ever really runs at either of those levels? What about 10%, 20%, 30%, etc. Those are real-life utilization rates. SPEC is looking into doing something. So is the EPA. Maybe they can get together and figure it out.
-
Meaningful, normalized values of watt/performance?
It's very useful to have some normalized way of measuring watts/performance, as they try to do in this article. But at least they could have used a more general and useful benchmark, like those offered by www.spec.org.
-
Re:Nice chip but... will we get to see the benefit
All this is a big yawn. Two important metrics: Frame rates and http://www.spec.org/cpu2006/results/ / http://www.spec.org/cpu2000/results/ .
Also, you want to save big on power, try making mass storage solid state or holographic. Who cares about faster CPUs at the moment when memory needs to get faster and less latent and less reliant on pipelining and more geared towards faster random access and throughput along with faster mass storage without moving parts.
AMD: Losing in SPEC after having a huge lead and having the superior HyperTransport architecture (direct descendant of the EV7 bus from Alpha's glue-less SMP).
AMD is now a partner with WildTangent, what a joke. http://www.theinquirer.net/default.aspx?article=35085
Also, AMD bought ATI and is seriously de-focused - I'm deeply saddened that Intel won't be getting the same level of competition as before because it seems to me AMD/ATI are not headed in the right direction.
PS: Ever notice Intel "VIIV" is VI IV or 6 4 in roman numerals? Lol, right.
I'm no Intel shill, but these new Conroe/Woodcrest CPUs, well, try them out. You'll see. -
Show me the SPEC numbers!
"ExtremeTech has a plethora of application and synthetic benchmarks on QuadFX, including gaming and media-encoding tests."
Bleh. The benchmarks I'd really like to see are the ones geared towards scientific computing, like STREAMS and SPEC. Nowadays the Intel chips seem to score better on those "gaming" and "media-encoding" benchmarks, but that doesn't necessarily mean that Intel FPUs are better for general scientific computing. (In fact, experience so far shows the opposite.)
-
Monty Python: Argument Clinic
a point
shooting by
passing through
nearly or entirely without mass?
ephemeral without interception
like a neutron
through a mind
unnoticed
The poem "a point" is Copyright © 2006 Gary W. Longsine, All Rights Reserved.
Perhaps I was insufficiently succinct. Let me clarify. I will assume that you are genuinely curious about the Mach kernel, and not merely a trolling closeted Windows Fan Boy.
You made an entirely bogus claim about Mach performance above the 2-4 CPU count which you did not support. Your follow-up makes it clear that you have read the Wikipedia page, but perhaps don't fully grok the IPC microkernel performance issue which is the stuff of Monday Night Hacker armchair kernel designer legend.
In contrast, I provided as a counter argument:
- ample historical context which suggests that your claim is wrong, and
- pointers to authoritative sources, and
- some hints that perhaps I might know a little more about the narrow subject of how many CPUs the Mach kernel has been running on for a very long time (more than 2 to 4, I assure you) than one might hope to glean from buying scrap iron.
(Please check yourself into the Argument Clinic for further treatment regarding the mechanics of argument.)
One-off hot rods are not required. Apple could buy or rent large systems from companies that have them off the shelf, like Unisys, Fujitsu and others (Intel), Sun (SPARC), HP (PA-RISC) or IBM (Power). At various times since at least 1992 the NeXT/Apple version of the kernel compiled on some or all of those hardware architectures. It probably still does in the labs, even if Darwin, the fork of the source that builds production releases of Mac OS X, may not. Steve Jobs has demonstrated that he values the portability of Mach, and is willing to devote resources to make sure that it remains portable. If market conditions changed, Steve Jobs could wake up one day and decide he wants to ship Mac OS X for SPARC and have it running on his desk by the end of the week if not the end of the day. In a few months, Mac OS X Server for SPARC could be a shipping product. People doubted this for years when NeXT Fanboys said this, but history has proven that this can happen. It can be running and ready to hand out a CD to thousands of developers before IBM even knows you're going to switch processors to one that has a roadmap that extends beyond 2003. You doubt such a scenario? What if Google one day decided they want to convert their data centers to SPARC because they can squeeze 64 cores into the power envelope used by 4 cores in their current data centers, and that they wanted to use development tools for Mac OS X to build a new service. Do you think Apple would be in position to take advantage when a member of the board calls Steve and says something like, "Hey Steve, I have an idea I want to talk with you about..." You bet your stock options, Apple could have a demo in front of Google ASAP.
This issue regarding Mach IPC that has your undies in a bunch is pretty clearly not a problem in the production kernel, judging by the SPECint and SPECfp and other benchmarks that Apple is getting on multiple CPU architectures. Don't take my word for it. Get a box yourself and put Linux , FreeBSD, and Windows on it and see if you come up with similar numbers. Undoubtedly you would. If the IPC inside the Mach kernel was such a performance problem with multiple CPU, this problem would undoubtedly be measurable in a dual or quad core system. The fact that the internet isn't abuzz with people posting bona fide exposes about this issue probably means it's not an issue. (Be sure to use benchmarks which are designed to measure performance on multi-processor machines like the Spec Int Rate benchmark).
Better yet, get a friend at a university or at some place that makes such iron, a -
Re:Rediculous
The second sentence of the linked web page on the FX 4500 X2 says: "Architected with two NVIDIA Quadro FX 4500 graphics processing units (GPUs)". And, as indicated by viewperf results it runs in SLI mode (compare that to a normal fx4500
-
Re:why: because it's powerful and cost-effective
The article is looking at business transformation and evolution, part of a collection on that subject in the same Fortune issue.
MySQL is completely free of charge for all companies, commercial or not, provided the company isn't redistributing MySQL outside the company, notably as part of its own products. Support contracts are per-server (except for the MySQL Cluster engine), not per-seat and are optional (though recommended for any serious business, of course).
Those who do get to pay are those who distribute non-open source applications with MySQL and/or its libraries outside their own company.
If you do want to compare on cost and performance:
- MySQL with Network Silver support for four years delivering 712 java operations per second: $5,985
- Oracle 10g EE 8 core delivering 15% fewer JOPS, three years maintenance: $531,200.
Source: SPEC jAppServer2004 results and licensing fees from the companies. Please see the SPEC page for full disclosures and system descriptions.
That translates to massive savings coupled with tremendous real-time load capacity, particularly with multiple servers in a modern cost-effective scale-out architecture, and is part of why MySQL is so popular.
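As a rough sketch of the comparison above, the quoted figures can be turned into dollars per JOPS. The numbers are only the ones cited in this comment; the function and variable names are made up for illustration, and the calculation ignores hardware cost and the differing support periods (four years versus three).

```python
# Figures as quoted in the comment above; JOPS = jAppServer operations per second.
MYSQL_JOPS = 712.0
MYSQL_COST_USD = 5_985.0
ORACLE_JOPS = MYSQL_JOPS * (1 - 0.15)  # "15% fewer JOPS"
ORACLE_COST_USD = 531_200.0


def dollars_per_jops(cost_usd: float, jops: float) -> float:
    """Return licensing/support dollars paid per unit of throughput."""
    return cost_usd / jops


mysql_ratio = dollars_per_jops(MYSQL_COST_USD, MYSQL_JOPS)
oracle_ratio = dollars_per_jops(ORACLE_COST_USD, ORACLE_JOPS)

print(f"MySQL:  ${mysql_ratio:,.2f} per JOPS")
print(f"Oracle: ${oracle_ratio:,.2f} per JOPS")
print(f"Oracle costs roughly {oracle_ratio / mysql_ratio:.0f}x more per JOPS")
```

On these inputs the gap is about two orders of magnitude, which is the point the comment is making; the SPEC disclosures are still the place to check what hardware each result ran on.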
-
Re:Processors?
Parent wrote:
> "10,240 Itanium processor"
> They could have done the same with 7,869 Athlon 64 processors.
No. Not even close.
It's true that a 2600MHz Athlon64 runs the same speed as a 3800MHz P4 on floating point, but a 1600MHz Itanium2 runs 25% faster than either of those (again, I'm talking floating point). It would take about 13,000 2600MHz Athlon64's to equal the performance of 10,240 1600MHz Itanium2's.
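The "about 13,000" figure follows from simple scaling. A minimal sketch, assuming floating-point throughput scales linearly with processor count (a simplification; real clusters lose some of that to interconnect overhead) and using only the 25% figure quoted above. The names are illustrative.

```python
import math

ITANIUM2_COUNT = 10_240
ITANIUM2_SPEEDUP = 1.25  # "runs 25% faster" on floating point, per the comment


def equivalent_cpu_count(count: int, relative_speed: float) -> int:
    """Number of slower CPUs needed to match `count` CPUs that are
    `relative_speed` times as fast, assuming perfect linear scaling."""
    return math.ceil(count * relative_speed)


athlon64_needed = equivalent_cpu_count(ITANIUM2_COUNT, ITANIUM2_SPEEDUP)
print(f"Athlon64 processors needed: {athlon64_needed:,}")
```

That gives 12,800, which rounds to the 13,000 quoted and is well above the 7,869 the parent claimed.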
[ See for yourself at http://www.spec.org/cpu2000/results/cfp2000.html ] -
Re:AMD Vs Intel: Round 9
Really? Is it not weird, then, that Sun's octuple-core T1 processor outclassed the competition with at least 2:1, and normally closer to 3:1, in the last SPECweb round?
-
Re:Octacore - T1 "weak cores"?
See here: http://www.spec.org/ [spec.org].
T1000 and T2000 do not appear as a result of the search. Can you point me to the result?
C// -
Re:Octacore - T1 "weak cores"?
Huh?
A computer is measured on the ability to get real work done quickly, not what year the design was created. That is irrelevant. The 8 core T1 CPU supports 32 native threads running since most of the time a CPU is waiting for IO (ram, disk). If Sun got this so wrong, then why is Intel also following this technique?
The workload is important, so for a DB server, the T1 CPU isn't the best fit. But for an application server - stand back! Check out the price/performance ratio.
Due to non-disclosure agreements that may or may not have expired, I can't provide any more details. The 8 core T2000 will eat the lunch performing application server duties of a group of CPUs from Intel (or AMD) that add up to 8 cores for the same cost. See here: http://www.spec.org/.
I will give you that almost any current multicore CPU will toast the T1 on floating point - there's only 1 FPU shared across all 8 cores. -
Systems Designer
I work as a system designer leveraging HP-PA-RISC, IBM, Sun, and HP-win32/64 servers. I primarily work in the UNIX space, but occasionally have to purchase Windows servers. Enough background.
Prior to the UltraSPARC 4+ and the T1 processor from Sun, the Power5 core was approximately equiv to 4 UltraSPARC4 CPUs. For example, a P550 http://www-03.ibm.com/systems/p/hardware/entry/550q/index.html with 4 cores, is roughly equiv to a V1280 with 10 CPUs. When you add up the total cost of ownership, most software packages are licensed either by core or CPU count. 4 CPU licenses are always less costly than 10 CPU licenses.
Now the Power5+ processors (of which I've already purchased almost 30 servers) feel 2x faster than a Power4. The P550Q Sun's T1000/T2000 application servers are also very impressive, but they aren't general purpose and the workload needs to match what these servers are designed to handle or you will be disappointed. OTOH, a $8k list price with appsrv performance like a $50k box is impressive (assuming that is a true number). I haven't gotten any T1-based servers http://store.sun.com/CMTemplate/CEServlet?process=SunStore&cmdViewProduct_CP&catid=141649 yet.
These servers are too new for many benchmarks outside what each vendor claims. I've found IBM and Sun to be truthful in their performance for nearly available systems. They have to be since it is too easy to see where they might have lied in just a few months by checking http://www.spec.org/ -
Re:AAAARRRRRGGHHHH!!!
But to read about them putting this much money into a piece of crap like that Itanium after the way they chucked out the Alpha, is especially galling when you consider that in HP's own internal testing, Alpha EV8s and 9s consistently wipe the floor with even the latest Itaniums.
The Alpha EV8 never made it to silicon, and the Alpha EV9 never even entered the design stage. In short: you're full of shit. Then again, it appears that the latest "Madison 9M" Itaniums run a good 20-50% quicker than the last (EV78z) Alphas to be made.
IOW, you're full of shit, with shit for icing on top. :) -
Re:Well, from what I remember from the Keynote
SPEC MARK
SPEC = Standard Performance Evaluation Corporation
Formerly System Performance Evaluation Cooperative
http://www.pcmag.com/encyclopedia_term/0,2542,t=SPECmark&i=51813,00.asp
An organization founded in 1988 to establish standard benchmarks for computers. Its first benchmark was a single CPU rating known as the "SPECmark," in which one SPECmark was equivalent in performance to a VAX 11/780. Although SPEC benchmarks continue to rate CPUs, SPEC has a variety of benchmarks to measure graphics subsystems as well as Java, and Web, mail, application and file servers.
And no, it isn't a free download from anywhere
http://www.spec.org/order.html
CPU2000 V1.3
- Retail ($500)
- Upgrade ($250)
- Educational/non-profit ($125)
-
Re:Any equivalent for Linux?
SPEC viewperf
http://www.spec.org/benchmarks.html#gpc -
Re:Intel dominates in 1 out of 8 tests?
3D Max, Maya, etc.
http://www.spec.org/gpc/opc.static/vp81info.html
vs. POVRay, Doom, and some OSS video encoder.
So I'd say the OP's point still stands. AMD for kids, Intel for professionals. -
What labs test?
Slashdot title suggests lab testing - special environment, procedures and so on. TFA is nothing like that - just a summary of features and prices.
So far, the only serious test of MySQL I have seen is jAppServer2002.
I expect TPC-C results for MySQL to be published in 2006, unless Oracle bought Innobase only in order to kill it. -
You call that a list? *this* is a list.
Heh, you call that a list?
SPECcpu beats this hands down. THG is great and all, but SPEC are a non-profit organization *dedicated* to measuring the performance of computing systems. Believe me when I say their "CPU 2000" benchmark is not only the standard benchmark, but the *best* standard benchmark out there. It's cross-platform: Windows, Linux, HP-UX, AIX, whatever: you name it, it's been tested. It's cross-compiler: GCC, Intel ICC, AMD/Pathscale, IBM xlC, they're all here.
Here's the list. It's big.
Enjoy. -
Re:SPEC whips those sorry gamer "benchmarks"
OK wise guy. Whatever. Let me sell you a nice 486SLC2-66 with a 33 MHz external coprocessor. Most people don't like the benchmark numbers, but I'm sure you won't be swayed by that foolishness. The external coprocessor helps with math. You don't get that with most systems, so this is a special deal.
Go read up on SPEC. It's not like BogoMIPS.
-
Re:Checkout Mirapoint
I completely agree here. While I work for a smaller ISP that started off with 10,000 mail accounts using Mirapoint's solution, we have easily scaled to 35,000 accounts using their scalable appliance approach. The implementation has a super-hardened BSD base with no known vulnerabilities (we've used them for 5 years with no successful hacks). Their new Commtouch based Anti-Spam and highly reliable Sophos antivirus solution are some of the best in the industry.
In fact, if you want to see a hardware solution that will handle 1,000,000+ users check out Spec Mail (as well as compare to others, but do keep in mind the price and actual cost per user):
http://www.spec.org/mail2001/results/res2004q1/mail2001-20040126-00034.d.html
And a SAN based solution:
http://www.spec.org/mail2001/results/res2003q2/mail2001-20030519-00031.d.html
I don't work for Mirapoint, but I've been putting my trust and customers in their hands for email for several years. Considering everything from security to easy, scalable upgrades and performance that can be managed by a couple of admins (rather than a large team), you would be hard pressed to find a solution to beat them for the money for a project of this scale. -
Re:Who to talk to
Communigate has a SPEC entry with a cluster system capable of hosting roughly a million users (10K emails a minute, SPEC says 2M users, I cut that in half =) Mirapoint has a much more modest cluster capable of 5K emails a minute, so I'm sure they could make you one capable of scaling to 1M users. SPEC benchmarks for email solutions are available here.
-
Commercial Package Options
If you are interested in commercial packages, either Sun's Java System Messaging Server or Openwave's Mx product will easily scale to a million accounts and beyond. Many of the larger ISPs are using these packages or have their own custom mail server. Other possibilities may be Mirapoint (which offers an appliance type solution) or Sendmail.com
If you are into benchmarks, the folks at SPEC have published results from several packages. -
When all else fails... goto spec.org
Using this as a reference point (and from recommendations I've heard)...
I recommend CommuniGate. -
Why do you like the Itanium's design?
Since you like the design, can you please explain why the Itaniums have 2x to 4x the number of transistors but are only in the same performance league as the P4s or Athlon64/Opterons? Is Intel burying the Itanium by making Itaniums with more transistors than they need? Or is the Itanium that inefficient?
See the SPEC CPU2000 results.
And the Itanium physical specs. You can click on the side bar for other CPU physical specifications.
With 2x to 4x the transistors you could get a dual core or even two dual core x86 CPUs.
Don't forget each of the x86 cores taken alone will perform quite well, even in FPU tasks. The Itanium is 2x to 4x faster for some SPEC FPU subtasks, but is slower in others.
If it becomes easy for compilers to parallelize execution across many VLIW/EPIC units, then would it be so much harder for them to parallelize execution across multiple x86 cores?
Heh, or start running some of those FPU tasks on commodity GPUs ;). -
Huh?
Nothing wrong with Itanium?
Look at the SPEC CPU2000 benchmark results.
Compare the performance of the top Itaniums with the top P4s and Opterons.
Also compare[1]:
number of transistors
(don't forget to factor caches as well).
die area used.
power consumption.
price
Now can you really say there's nothing wrong with the Itanium?
The Itanium 2 needs about 210-410 million transistors to perform in the same ballpark as P4s or Opterons with about half to 1/4th the number of transistors.
A dual core Athlon64/Opteron with 1MB cache only needs 154 million transistors, 2MB cache versions need 233 million transistors.
Academicians and "True Believers" can talk about VLIW/EPIC and fancy compilers, but I argue if the application you run is so easy to parallelize so that it makes really good use of all the VLIW/EPIC units, then will it really be so hard to make it work in parallel and make good use of both the cores of a dual core opteron?
Maybe in theory there's nothing wrong with the Itanium. But in practice there's nothing that great about it, except for some FPU tasks (but if that's the case how about a bunch of DSPs?)...
[1]
Itanium
P4
PM
Opteron/Athlon64
Dual core Athlon64/Opterons
Let me know if the above site has the numbers wrong. -
Re:Hardware Cost
I believe you are referring to page 5 of the PDF where the study compares the price for the hardware each OS requires.
In TFA on page 4 of 9 :
...
Hardware acquisition costs were calculated as follows:
1. The closest matching system was selected from the SPECjbb® database, yielding a
known test configuration and its expected performance. ...
http://www.spec.org/jbb2000/ is the report quoted. They are using SPECjbb2000 (Java Business Benchmark) to ensure that they compensate for inefficient OSes. Though this only tests the performance of Java programs, it can be a good indication.
And on the second page : ...
Linux is 40 percent less expensive than a comparable x86-based Windows solution and 54
percent less than a comparable SPARC-based Solaris solution, based on a 3-year period of
ownership for a system supporting 100,000 operations per second on the SPECjbb® benchmark. ...
BTW WRT ".. really there is not much difference in TCO"
Even with the same hardware it would be $50k vs. $67k with Linux @ 27% lower TCO. -
The summary is completely wrong (as usual)
"Rather than clustering a lot of smaller servers together, large ISPs can now use fewer systems to handle massive traffic load."
From the testing reports linked in the article we have:
- CommuniGate Pro Dynamic Cluster - Backend Servers (4 systems)
- High Speed Network File Server for Main Mail Storage (1 system)
- CommuniGate Pro Dynamic Cluster - Frontend Servers (5 systems)
For a total of 10 systems separated into 3 levels (front end, backend, data storage).
High quality editing for Slashdot as usual.
Source: http://www.spec.org/mail2001/results/res2005q3/mail2001-20050707-00039.html -
The tester responds...
Just thought I'd add a few details and address some of the questions here. My name is Thom O'Connor and I work for CommuniGate Systems (CGS), and I was the one who put together and ran these tests - you can (mostly) verify this by looking at the comments in the source on the results page.
First off, on the SPECmail test itself. SPECmail is a standardized test (the only one I'm aware of for email) that attempts to closely regulate a level playing field for measuring email performance. It is critical to understand that this is not just measuring SMTP. The "30 million messages a day" text is a little vague, but it is important that this includes a distribution of delivered, relayed, and retrieved email. Sure, anyone can just relay many millions of messages an hour.
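For a sense of scale, the headline figure can be converted to an average per-second rate. This is only a back-of-the-envelope sketch: it assumes the load is spread evenly over 24 hours, which a real mail workload is not, and it says nothing about the mix of delivered, relayed, and retrieved messages. The names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_per_second(messages_per_day: float) -> float:
    """Average message rate, assuming a perfectly even load over the day."""
    return messages_per_day / SECONDS_PER_DAY


rate = average_per_second(30_000_000)
print(f"30 million messages/day is about {rate:.0f} messages/second on average")
```

Peak-hour rates would be higher than this average, which is part of why the mix and the QOS criteria matter more than the raw daily total.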
SPECmail does POP and SMTP, so the test measures not just MTA behaviour but also local delivery and then retrieval of the messages. The SPECmail test also uses Quality of Service (QOS) measurements such that a message injected via SMTP to the system MTAs (the CommuniGate Pro Frontend servers in this diagram) must then be delivered locally into the users' account, then be retrieved within 60 seconds. Satisfying the QOS criteria during the benchmark is often the most difficult part.
So, SPECmail itself just does POP and SMTP, which is a little 1990s I agree, but SPEC is coming out with a SPECimap test in the near future, and CGS is also very interested in seeing a SPEC VoIP/SIP test for measuring CommuniGate Pro's Real-Time capabilities.
A few other questions I've seen raised here:
1. The CommuniGate Pro Dynamic Cluster described in this test is fully and completely appropriate for production use in all aspects. In fact, if you're running a 2+ million user ISP on a CommuniGate Pro Dynamic Cluster, we'd recommend that you use these results as a guide for your architecture (although load balancers should be added to the gateway point for all inbound connections). In fact, CGS has ISP customers running architectures which match the layout of the described system almost exactly. All systems in the Cluster service all accounts - you could lose 4 Frontend Servers and 3 Backend Servers, and all users could still access their email (albeit with decreased capacity).
2. HyperThreading was disabled in the BIOS because the downloadable Solaris 10 x86 operating system would not (yet?) support the Intel x86_64 Potomac chipset properly. That said, on top of the recent security vulnerabilities on the topic, we have also discovered miscellaneous threading and even NFS issues related to having HyperThreading enabled on Linux 2.6, FreeBSD 5.4, and Solaris 10 x86 systems.
3. On NFS...NFS is used safely and securely in this test. The integrity of data storage is one of the major criteria that the SPEC organization closely evaluates when reviewing a SPECmail submittal. Obviously, there are many ways to cheat and/or cut corners using Solid State Disks, unsafe RAM for message queueing, and other techniques that you would never want to use on your production message system. However, the test described here was performed using a standard (albeit excellent) BlueArc Titan Storage System with write caching only in NVRAM and using proper mount options and layout for security, redundancy, and data integrity.
Hope this clears up any misconceptions. Obviously, I'm clearly biased about the work here, but assembling and then passing a SPECmail test of this size is a gigantic effort. If anyone thinks -
The tester responds...
Just thought I'd add a few details and address some of the questions here. My name is Thom O'Connor and work for CommuniGate Systems (CGS), and was the one who put together and ran these tests - you can (mostly) verify this by looking at the comments in the source on the results page.
First off, on the SPECmail test itself. SPECmail is a standardized test (the only one I'm aware of for email) that attempts to closely regulate a level playing field for measuring email performance. It is critical to understand that this is not just measuring SMTP. The 30 million message a day text is a little vague, but it is important that this includes a distribution of delivery, relayed, and retrieved email. Sure, anyone can just relay many millions of messages an hour.
SPECmail does POP and SMTP, so the test measures not just MTA behaviour but also local delivery and then retrieval of the messages. The SPECmail test also uses Quality of Service (QOS) measurements such that a message injected via SMTP to the system MTAs (the CommuniGate Pro Frontend servers in this diagram) must then be delivered locally into the users' account, then be retrieved within 60 seconds. Satisfying the QOS criteria during the benchmark is often the most difficult part.
So, SPECmail itself just does POP and SMTP, which is a little 1990s I agree, but SPEC is coming out with a SPECimap test in the near future, and CGS is also very interested in seeing a SPEC VoIP/SIP test for measuring CommuniGate Pro's Real-Time capabilities.
A few others questions I've seen raised here:
1. The CommuniGate Pro Dynamic Cluster described in this test is fully and completely appropriate for production use in all aspects. In fact, if you're running a 2+ million user ISP on a CommuniGate Pro Dynamic Cluster, we'd recommend you to use these results as a guide for your architecture (although load balancers should be added to the gateway point for all inbound connections). In fact, CGS has ISP customers running architectures which match the layout of the described system almost exactly. All systems in the Cluster service all accounts - you could lose 4 Frontend Servers and 3 Backend Servers, and all users could still access their email (albeit with decreased capacity).
2. HyperThreading was disabled in the BIOS because the downloadable Solaris 10 x86 operating system would not (yet?) support the Intel x86_64 Potomoc chipset properly. That said, on top of the recent security vulnerabilities on the topic, we have also discovered miscellaneous threading and even NFS issues related to having HyperThreading enabled on Linux 2.6, FreeBSD 5.4, and Solaris 10 x86 systems.
3. On NFS...NFS is used safely and securely in this test. The integrity of data storage is one of the major criteria that the SPEC organization closely evaluates when reviewing a SPECmail submittal. Obviously, there are many ways to cheat and/or cut corners using Solid State Disks, unsafe RAM for message queueing, and other techniques that you would never want to use on your production message system. However, the test described here was performed using a standard (albeit excellent) BlueArc Titan Storage System with write caching only in NRRAM and using proper mount options and layout for security, redundancy, and data integrity.
Hope this clears up any misconceptions. Obviously, I'm clearly biased about the work here, but assembling and then passing a SPECmail test of this size is a gigantic effort. If anyone thinks -
The tester responds...
Just thought I'd add a few details and address some of the questions here. My name is Thom O'Connor and work for CommuniGate Systems (CGS), and was the one who put together and ran these tests - you can (mostly) verify this by looking at the comments in the source on the results page.
First off, on the SPECmail test itself. SPECmail is a standardized test (the only one I'm aware of for email) that attempts to closely regulate a level playing field for measuring email performance. It is critical to understand that this is not just measuring SMTP. The 30 million message a day text is a little vague, but it is important that this includes a distribution of delivery, relayed, and retrieved email. Sure, anyone can just relay many millions of messages an hour.
SPECmail does POP and SMTP, so the test measures not just MTA behaviour but also local delivery and then retrieval of the messages. The SPECmail test also uses Quality of Service (QOS) measurements such that a message injected via SMTP to the system MTAs (the CommuniGate Pro Frontend servers in this diagram) must then be delivered locally into the users' account, then be retrieved within 60 seconds. Satisfying the QOS criteria during the benchmark is often the most difficult part.
So, SPECmail itself just does POP and SMTP, which is a little 1990s I agree, but SPEC is coming out with a SPECimap test in the near future, and CGS is also very interested in seeing a SPEC VoIP/SIP test for measuring CommuniGate Pro's Real-Time capabilities.
A few others questions I've seen raised here:
1. The CommuniGate Pro Dynamic Cluster described in this test is fully and completely appropriate for production use in all aspects. In fact, if you're running a 2+ million user ISP on a CommuniGate Pro Dynamic Cluster, we'd recommend you to use these results as a guide for your architecture (although load balancers should be added to the gateway point for all inbound connections). In fact, CGS has ISP customers running architectures which match the layout of the described system almost exactly. All systems in the Cluster service all accounts - you could lose 4 Frontend Servers and 3 Backend Servers, and all users could still access their email (albeit with decreased capacity).
2. HyperThreading was disabled in the BIOS because the downloadable Solaris 10 x86 operating system would not (yet?) support the Intel x86_64 Potomac chipset properly. That said, on top of the recent security vulnerabilities on the topic, we have also discovered miscellaneous threading and even NFS issues related to having HyperThreading enabled on Linux 2.6, FreeBSD 5.4, and Solaris 10 x86 systems.
3. On NFS: NFS is used safely and securely in this test. The integrity of data storage is one of the major criteria that the SPEC organization closely evaluates when reviewing a SPECmail submittal. Obviously, there are many ways to cheat and/or cut corners using Solid State Disks, unsafe RAM for message queueing, and other techniques that you would never want to use on your production message system. However, the test described here was performed using a standard (albeit excellent) BlueArc Titan Storage System with write caching only in NVRAM and using proper mount options and layout for security, redundancy, and data integrity.
Hope this clears up any misconceptions. I'm clearly biased about the work here, but assembling and then passing a SPECmail test of this size is a gigantic effort. If anyone thinks -
Re:Not that great - try RTFA
This isn't just a mail relay, this is (from spec's site):
A standardized mail server benchmark designed to measure a system's ability to act as a mail server servicing email requests, based on the Internet standard protocols SMTP and POP3. The benchmark characterizes throughput and response time of a mailserver system under test with realistic network connections, disk storage, and client workloads.
So that includes users connecting, picking up email, deleting from their data store etc etc etc.
Disclaimer: I have two friends who work for Bluearc but have no other connection to the company
-
Re:Not that great
> Firstly I assume this is just a raw delivery setup
It is not. In addition to the SMTP service, the benchmark models POP3 service as well. From the FAQ, http://www.spec.org/mail2001/docs/faq.html :
"SPECmail2001 is an industry standard benchmark designed to measure a system's ability to act as a mail server compliant with the Internet standards Simple Mail Transfer Protocol (SMTP) and Post Office Protocol - Version 3 (POP3). The benchmark models consumer users of an Internet Service Provider (ISP) by simulating a real world workload. The goal of SPECmail2001 is to enable objective comparisons of mail server products." -
for NFS
If you would look at the details, the mail cluster uses NFS. Invented at Sun.
-
They use Solaris [BSD still dying]
Hey, look here -- they use "Sun Solaris 10 x86" for the OS.
There is still life in BSD. -
Note about CommuniGate Pro
While the focus of this article is on Groupware products, CommuniGate Pro is unique in that it is scalable to millions of users. It also broke the SPECmail record. Read more here.
-
the numbers:
Athlon 64 FX-57: (2.8GHz!) 104W TDP
Pentium 4 571: (the 3.8GHz demon) 115W TDP
So there you have it, the maximum power consumption on non-pathological "power virus" code, for AMD's and Intel's highest-clocked CPUs. If you find a Mack truck that runs on 11% more fuel than a Fiat, let me know, I think that would be kinda cool to drive around in ;)
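For anyone who wants to check the 11% figure, it falls straight out of the two TDP numbers. A quick Python sketch (the helper name is just for illustration):

```python
def percent_more(a: float, b: float) -> float:
    """How much larger a is than b, as a percentage of b."""
    return (a / b - 1.0) * 100.0


fx57_tdp_w = 104.0    # Athlon 64 FX-57 TDP in watts, from the list above
p4_571_tdp_w = 115.0  # Pentium 4 571 TDP in watts, from the list above

extra = percent_more(p4_571_tdp_w, fx57_tdp_w)
print(f"The P4 571 is rated for {extra:.1f}% more power than the FX-57")
```

That comes to about 10.6%, which rounds to the 11% used here. Keep in mind TDP is a rated design figure, not a measured draw.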
FWIW, there are certainly some situations where the P4 will jump ahead of the A64FX on performance by more than 11%, e.g. mp3 encoding with LAME (13%), the kribibench software renderer (17%), and so on.
And that's all Pentium 4. Don't laugh, but if you can live without absolute leading performance, the Pentium M simply trounces most everything else out there in terms of power consumption:
Opteron 275 @ 43W per core
Pentium M @ 21W (single core)
Basically, power consumption depends on what you buy. Both AMD and Intel offer processors spanning a huge range of power consumption levels, but over the last couple of years Intel has had the upper hand in terms of performance at the low end, because while AMD are selling essentially the one core (K8) in many different guises, Intel have been selling two completely different cores ('P7/Netburst' and 'P6++/Centrino') for two quite different markets. -
SPEC CPU2000 Benchmark
If you search the SPEC CPU2000 results for "Advanced Micro Devices", you will see that AMD uses the Intel compiler. If the Intel compiler is that bad, why do they use it?
http://www.spec.org/cgi-bin/osgresults
I personally compared the Intel C++ 8.1 Compiler, gcc (MinGW and Cygwin) and Visual Studio .Net 2003 using the STREAM benchmark and SPEC CPU2000 on AMD Opteron and Athlon 64. The benchmark result with the Intel C++ 8.1 Compiler was a bit faster than gcc and much faster than the VC++ Compiler with -O3 optimization. The same held when using the auto-vectorization option and enabling SSE2.
Maybe programs could run even faster if the Intel Compiler were fair, but it's still the fastest compiler. -
Re:You're still the moron.
My stats are not misleading, the 2 way spec measures throughput, while the uniprocessor measures raw speed. System architecture doesn't account for the vast majority of the G5's crappy throughput. If you just want raw speed, the G5 is 71% as fast as the Opteron, or the Opteron is 41% faster - a "29% difference" is a worthless number without direction.
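To see why the direction matters, here is a small Python sketch of the conversion. It assumes only the 71% figure quoted above; the function names are illustrative.

```python
def percent_faster(fraction_as_fast: float) -> float:
    """If A is `fraction_as_fast` times as fast as B, how much faster is B than A?"""
    return (1.0 / fraction_as_fast - 1.0) * 100.0


def percent_slower(fraction_as_fast: float) -> float:
    """If A is `fraction_as_fast` times as fast as B, how much slower is A than B?"""
    return (1.0 - fraction_as_fast) * 100.0


g5_vs_opteron = 0.71  # G5 speed as a fraction of Opteron speed, from the post

print(f"Opteron is {percent_faster(g5_vs_opteron):.0f}% faster than the G5")
print(f"G5 is {percent_slower(g5_vs_opteron):.0f}% slower than the Opteron")
```

The same gap reads as 41% in one direction and 29% in the other, which is why a bare "29% difference" says nothing until you name the baseline.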
Spec tests the processor, not the compiler. The point of the benchmark is to test the system under optimal conditions, and Apple used GCC, which is bad at optimizing for x86. Re: hand coding - go learn something about the complexity of modern processors and you'll find that hand coding has gone the way of lookup tables.
How is spec not accountable? People bitched and moaned about Apple's spec fraud, why wouldn't the same happen if it was Dell? You want spec test conditions? Top500 seems to be missing that. You want verification? I refer you to the Apple fiasco. Top500 seems to be missing that as well.
Your points about the G5's supposed efficiency and use in consoles have no bearing on its performance in benchmarks.
I'm still waiting for evidence of "SPEC is subject to all kinds of problems.", why Apple doesn't use your "benchmark" to advertise their products, these supposed Intel optimizations, evidence of why spec isn't accountable, and why I am "blinded by marketing". -
You are a moron.
The basic SPEC methodology is to provide the benchmarker with a standardized suite of source code based upon existing applications that has already been ported to a wide variety of platforms by its membership. The benchmarker then takes this source code, compiles it for the system in question and then can tune the system for the best results. The use of already accepted and ported source code greatly reduces the problem of making apples-to-oranges comparisons.
If you'd read the link I provided, you'd see that Lapack was created because of changes in supercomputer architecture, providing a more accurate benchmark.
Linpack is one type of test: linear algebra, and again, if you'd bothered to read the links, or even my post, you'd notice that 26 different types of test > 1 type.
Finally, you're using a very muddled secondhand source based on 1 test to desperately hang onto your delusion that the G5 is anywhere near the Opteron.
Continue reading the following until you realize that the G5 is crap. crap crap crap. And keep your pie hole shut before you remove all doubt about being a fool as well.
SPECint_rate2000
2200 Opteron 68.1 64.2
2200 PowerPC 970 21.5 20.2
SPECfp_rate2000
2200 Opteron 69.1 63.9
2200 PowerPC 970 20 19.2 -
Re:Your facts are wrong, wrong, wrong
Five things: Linpack was obsolete in the 80s, so Lapack was created.
A one program benchmark is essentially worthless, so spec tries to remedy this with 12 integer programs and 14 fp programs.
Rpeak is hardly a valid benchmark, since it can only be achieved with instructions doing basically nothing, and the erratic differences between Rpeak and Rmax on the top500 make it even more useless.
The G5 has similar architecture to the POWER4+, so I guess its SPECfp2000 would be around 1400 +/-100.
And finally, these benchmarks don't matter if you have the program that you intend the machine to run, say Photoshop, and it's faster on the Mac than x86.
This still leaves valid the results I gave, which show the G5 is about a third as fast as an Opteron in both int and fp rates. -
Re:Mourn this...
I don't know of any official G4 benchmarks, but IBM submitted G5 (970) results of the Spec CINT2000 test to spec.org, and compared to a contemporary (October 2004) Pentium 4, it's not very impressive.
G5: 2200MHz, 986 SPECint_base2000 => 0.448 SPECint_base2000 per cycle
P4: 3466MHz, 1701 SPECint_base2000 => 0.491 SPECint_base2000 per cycle
It looks like the G5 is both slower (ca. 9%) per cycle than the P4, and (much) slower overall. If the G4 were twice as fast per cycle as the P4, it would have to be more than twice as fast per cycle as the G5. This would mean a 1.67GHz G4 would be roughly as fast as a 3.66GHz G5. This is obviously not so, or Apple wouldn't be using 2.7GHz G5s in its most powerful desktop systems and 1.67GHz G4s only in its laptops.
I did find some unofficial SPECint2000 results suggesting a score of 187 for a 1.0GHz G4, which would be about 312 for a 1.67GHz G4. This is presumably the peak result, not the base (as above), which means it overstates the true performance. Even so, 312 is extremely poor compared to anything from Intel, and the resulting figure of only 0.187 SPECint2000 per cycle suggests the G4 is only about 38% as fast per cycle as the P4, or 42% as fast per cycle as the G5.
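The per-cycle arithmetic above can be reproduced with a few lines of Python. This is only a sketch of the calculation as described: "per cycle" really means score divided by clock speed in MHz, the G4 figure is the unofficial one mentioned above, and scaling it linearly with clock speed is an assumption of this comparison, not a measured result.

```python
def score_per_mhz(score: float, mhz: float) -> float:
    """Benchmark score divided by clock speed in MHz."""
    return score / mhz


g5 = score_per_mhz(986, 2200)   # G5 (970), SPECint_base2000
p4 = score_per_mhz(1701, 3466)  # Pentium 4, SPECint_base2000
g4 = score_per_mhz(187, 1000)   # G4, unofficial SPECint2000 (probably peak)

print(f"G5: {g5:.3f}  P4: {p4:.3f}  G4: {g4:.3f}")
print(f"G5 per-cycle deficit vs P4: {(1 - g5 / p4) * 100:.1f}%")
print(f"G4 relative to P4: {g4 / p4 * 100:.0f}%")
print(f"G4 relative to G5: {g4 / g5 * 100:.0f}%")

# Assumed linear scaling of the 1.0GHz G4 score to 1.67GHz
print(f"Estimated 1.67GHz G4 score: {187 * 1.67:.0f}")

# Clock a G5 would need to match a G4 that was twice as fast per cycle as the P4
print(f"Equivalent G5 clock: {1.67 * (2 * p4 / g5):.2f} GHz")
```

The output matches the figures quoted here: 0.448 and 0.491 per MHz, a gap of a little under 9%, and 38% and 42% for the G4 against the P4 and G5.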
Some PowerPC supporters have long claimed that CPU benchmarks like SPECint are unfair to PowerPC, in comparison to Intel and AMD, but given the results of Apple's own demonstrations, such benchmarks appear to have been reasonably accurate all along.
In any case, a CPU which is more efficient per cycle is not necessarily a better design, just a different design. For example, HP PA-RISC and DEC Alpha were always very close competitors in terms of performance, but with very different designs. The PA-RISC designers had focussed on efficiency per cycle, so the resulting CPUs ran at relatively low clock speeds (and were unable to scale to higher ones). The Alpha designers, in contrast, had focussed on scalability to higher frequencies, and so Alphas always ran at much higher clock speeds than PA-RISC. Neither was a better or worse design, since each got about the same amount of processing done in the same amount of time.
-
Re:Hating x86
Why do any of these supposed problems matter? If an x86 (or amd64) CPU (a) works, and (b) is faster than the alternatives, why should anyone care that it happens to implement a fairly archaic instruction set, on top of a completely modern core? Hardly any code is written in assembly language any more, and x86/amd64 assembly is actually very easy to read, write and understand (especially compared to 'advanced' instruction sets like IA64).
Despite all of the PowerPC advocacy from Apple and Mac users, Apple's demo of OS X on a Pentium 4 has pretty well proved the point that Intel CPUs are faster. Other benchmarks, like the Spec CINT2000 test, have long shown the G5 is no match for the Pentium 4 too.
The Pentium 4 is much faster than the G5, or any other PowerPC, so what exactly do you think the 'x86 world' are 'paying for'? Moreover, why do you think Apple switched to x86, and why do you think Sun are so interested in amd64 (which is essentially x86 with 64-bit extensions and some other improvements), if these architectures are so bad? Do you really think you know so much more about CPU architectures than they do?
-