Slashdot Mirror


Intel Lindenhurst Xeon DP Platform Discussion

Steve from Hexus writes "Hexus.net has an article looking at Intel's latest Xeon platform: Lindenhurst, discussing the Paxville dual-core processor, E7520 core-logic, where it could go right for Intel, and where it could all go wrong." From the article: "If you're I/O bound by your threads in any way, you can hit problems (all threads touch the MCH, then there's a 266MiB/sec bus link to the I/O processors to cross, then the data hits disks or network hardware). If you're memory subsystem bound in any way, especially on a majority of compute threads, performance is likely gone. There's just too much resource sharing for it to all conceivably work well, especially compared to Opteron. I can foresee many a scenario where dual-core Opteron will give Paxville Xeon DP a beating."

22 of 111 comments

  1. Men in Black? by schon · · Score: 5, Funny

    there's a 266MiB/sec bus link

    Wow - that's a *LOT* of Tommy Lee Joneses and Will Smiths!

    1. Re:Men in Black? by swillden · · Score: 4, Informative

      Wow - that's a *LOT* of Tommy Lee Joneses and Will Smiths!

      :-)

      Looking past the joke, for anyone who may be wondering why that 'i' is there, they're just being accurate. "MiB" is the abbreviation for "mebibyte", which is 2^20 bytes. The more "common" notation, "MB", is the abbreviation for "megabyte", which is 10^6 bytes.

      The terms "gibibyte", "mebibyte", "kibibyte", etc. were defined in 1998 by the IEC to disambiguate "megabyte", etc. The "giga", "mega", "kilo" prefixes from the SI units have always referred to powers of 10. With the advent of computers, it became convenient to use them to refer to powers of two that are close to powers of 10. So, "kilo" was used to mean 1024, "mega" was used to mean 1048576 and "giga" was used for 1073741824. The context was generally sufficient to disambiguate those usages from the standard powers-of-ten usages. Basically, everyone figured that if you were talking about computers, the prefixes referred to powers of two.

      But there are plenty of computer-related contexts where the prefixes have their traditional meanings. Hard disk drive storage sizes, for example, are measured with powers of 10 by drive manufacturers, but file systems generally use binary prefixes. This is why your 80GB drive shows up as only 74.5GB "formatted". It's not that lots of space is wasted by the formatting; the issue is that 80*10^9/2^30=74.5. The two measurements are using different units. Data rates are also traditionally specified in powers of 10. RAM sizes are powers of two.

      So, to disambiguate the prefixes while not disturbing the traditional meanings, the IEC coined a new set of binary prefixes, along with corresponding abbreviations. The new prefixes all end in "bi", for "binary".
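
      The arithmetic above can be checked with a minimal sketch. The constant and function names are illustrative only; the 80 GB and 266 MiB/sec figures are the ones quoted in this thread.

```python
# Decimal (SI) versus binary (IEC) prefixes.
MB = 10**6    # megabyte (SI)
GB = 10**9    # gigabyte (SI)
MiB = 2**20   # mebibyte (IEC)
GiB = 2**30   # gibibyte (IEC)


def convert(amount, from_unit, to_unit):
    """Convert a quantity expressed in one unit into another unit."""
    return amount * from_unit / to_unit


# An "80 GB" drive as a file system using binary units would report it.
drive_gib = convert(80, GB, GiB)
print(f"80 GB = {drive_gib:.1f} GiB")

# The 266 MiB/sec link from the article, expressed in decimal megabytes.
link_mb = convert(266, MiB, MB)
print(f"266 MiB/s = {link_mb:.1f} MB/s")
```

      The first line comes out at about 74.5, matching the "formatted" size mentioned above; the second shows that a MiB figure is roughly 5% larger when restated in MB.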

      --
      Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
  2. gooooo Intel! by tomstdenis · · Score: 4, Interesting

    cost 3 times as much as the 820D ... it's a copy of the 820D ... see where I'm going with this?

    The dual-core intels may cost half as much as the dual core Athlon64s but they still suck twice as bad. What you save in initial purchase cost you lose in electricity bills and time doing work.

    The fact they're STILL making Netburst based processors just sickens me. Give it up already and go P6 or something new. I mean if they put half the money they put into the netburst into the P6 designs of late they'd already have a 2.5Ghz P6 core that would give AMD a run for their money.

    I think the cat's out of the bag for the most part. And not like you're gonna sell a lot of dual-core based Dells to grandma so she can write emails.

    Times like this make me feel proud I'm an AMD whore :-)

    Tom

    --
    Someday, I'll have a real sig.
    1. Re:gooooo Intel! by IAmTheDave · · Score: 2, Insightful

      The fact they're STILL making Netburst based processors just sickens me. Give it up already and go P6 or something new. I mean if they put half the money they put into the netburst into the P6 designs of late they'd already have a 2.5Ghz P6 core that would give AMD a run for their money.

      Agreed. What ever happened to Intel leading the pack? Their processors are bloated, slow, and quite unfortunately behind the curve.

      --
      Excuse my speling.
      Making The Bar Project
    2. Re:gooooo Intel! by tomstdenis · · Score: 2, Insightful

      They invested too heavily on the Mhz-myth of the Netburst. To turn around and say "whoops, we're wrong" is hard. That and they have partners that ALSO invested in it.

      What does Dell use? "Dell uses Intel Pentium four processors (cue P4 sound theme)" ...

      It's probably not easy to say "Dell uses Intel P6 processors because the P4 sucks ass, we're sorry, we lied all this time." There is also a huge cultural gap between the engineers and marketers/VPs. I'm sure if any of the engineers escaped and bought an AMD64 box they would be envious. Provided of course, they're not too full of their own shit to see outside their little box.

      Truth be told I'd love to try a ~2.4Ghz PentiumM as a desktop processor. It's probably loads faster than a 1.8Ghz sempron and can hold its own on the power usage front. It would make a great work station or compute box. Upgrade the core to x86_64 and they'd be set.

      Alas, they are their own undoing.

      Tom

      --
      Someday, I'll have a real sig.
    3. Re:gooooo Intel! by Jeff+DeMaagd · · Score: 2, Insightful

      "The fact they're STILL making Netburst based processors just sickens me."

      So?

      The reason that Intel is still making Netburst processors is because chip development is a lot slower than the "speed of internet". Figure two to three years from concept to production. AMD took that long or longer to put out their A64 line. This is why Intel can't make large architecture shifts in a month.

    4. Re:gooooo Intel! by cowbutt · · Score: 3, Insightful
      They invested too heavily on the Mhz-myth of the Netburst. To turn around and say "whoops, we're wrong" is hard. That and they have partners that ALSO invested in it.

      Saying 'MHz-myth of the Netburst' is a bit harsh. There was a time when it made sense - if it allows Intel to sell processors that perform faster than AMD's and retail for similar prices, who cares about the clock speed required to do this? Heck, this was pretty much DEC's strategy for the Alpha - design an architecture that's easily scalable to ever-faster clock speeds, and ramp up the performance by aggressively increasing the clock speed.

      But it was short-sighted of Intel to over-invest in such a strategy without any guarantees about power consumption, consequent heat output, or the growing importance of those issues to its customers.

      In the long run, though, this won't kill Intel, and they'll be back. I'd also expect them to learn from the experience, the same way that after the infamous Pentium FP bug, every processor has had field-upgradeable microcode to (hopefully) eliminate the chance that they'll need to perform a recall of that size - and expense - ever again.

    5. Re:gooooo Intel! by imroy · · Score: 3, Informative
      Heck, this was pretty much DEC's strategy for the Alpha - design an architecture that's easily scalable to ever-faster clock speeds, and ramp up the performance by aggressively increasing the clock speed.

      Except the Alpha was a RISC processor (and a pretty clean one at that), so its short pipelines didn't lose as much performance to branch mispredictions as the P4/Netburst does. IIRC, both the P4 and Athlon CPUs had to get up to around 1.4-1.5GHz before they beat the performance of the 800MHz 21264, the last and fastest Alpha produced. *sigh*

  3. why I don't build a new PC... by pointbeing · · Score: 3, Insightful

    Got a plain old dual processor 1GHz box that with video and hard drive upgrades is still competent. It does everything I need it to do, although processor- or memory-intensive processes are getting a bit sluggish. Rendering video takes a little time, but that's more because the application I use renders in a single thread - but I can play games and render video at the same time ;-)

    I still believe if you could remove all the latency from I/O subsystems in a modern PC you'd have more processor than you could use by a longshot - IMO high-end PCs just wait for data faster than older machines, and a lot of the performance boost you see with a new machine is simply masking latency in other subsystems.

    PCI-X and improved memory bandwidth will solve some of these problems, but it's a bandaid at best. I do tend to chuckle at people buying the newest/fastest peripheral, not understanding that a lot of the time the peripheral will talk faster than the nine(?) year-old PCI bus that's feeding it.

    When troubleshooting performance issues the component that's working at 100% capacity is *always* the bottleneck - and with most home and business users, that bottleneck is almost never the CPU itself.

    --
    we see things not as as they are, but as we are.
    -- anais nin
  4. Pointless? by plumby · · Score: 3, Insightful

    I'll admit that I'm no great expert on the details of multi-core, hyper-threaded CPU design, but from what's in the article isn't the memory access bottleneck a rather fatal, and obvious, flaw in the whole design? Unless I'm missing something, I'm really struggling to see how this got off the drawing board. What is its point if the only applications that can ever take advantage of it are the very few that rarely need to access main memory?

    1. Re:Pointless? by mprinkey · · Score: 3, Informative

      I've thought the same. I have racks of single core 3.0 GHz Xeons that strain the memory bus to the limit. Adding more cores to that mix is a waste. So, the new cluster is dual-core AMDs. The Intel architecture is generally good for the codes that we run, but I couldn't justify not buying AMDs. Price, thermal footprint, and performance all went that way.

      Protip to Intel: Stop trying to feed your users this crap.

    2. Re:Pointless? by tomstdenis · · Score: 2, Informative

      They wanted to get their Netburst cores into the DP world as quickly as possible.

      Where AMD uses the HT bus for their 754 and 939/940 parts, Intel was still using the good ole 64-bit FSB of yesteryear.

      Most of what Intel does nowadays in the processor world is entirely market driven. The Netburst is a good example. High clock rate, low efficiency processor. Sounds good on paper but works poorly in practice. The EM64T extensions are another example. A lot of code on the P4 in 32-bit mode takes roughly the same number of cycles on the 64-bit P4s with the notable exception being 64-bit math [e.g. additions and multiplies].

      For example, most block ciphers are the same speed on both the 540J and 820D [in terms of clock cycles]. I think partially because they're just using rename registers for the additional GPRs. But compare the AthlonXP to the Athlon64 and there is a huge difference. The Athlon64 is an improvement over the 32-bit cousin. They didn't just slap 64-bits on the core they actually made it better.

      I refer to my nice chart again

      Operations per second at doing RSA-1024 decrypt

      AMD64 = 2.2Ghz
      AMD32 = 1.8Ghz
      P4 = 3.2Ghz
      Nocona = 2.8Ghz

      At the 32-bit side of things the AMD32 can match or beat the P4 even though it's slower by 1.4Ghz. At the 64-bit side there simply is no comparison. I mean the dual-core RSA on the Nocona can't even match the SINGLE-CORE RSA on the Athlon64.

      How pathetic is that?

      Ever since the 64 came out Intel has basically been a poser in the CPU world. The only really proud achievements [outside the pure sciences they do in the background] are the ARM and P6 core designs...

      Tom

      --
      Someday, I'll have a real sig.
    3. Re:Pointless? by magarity · · Score: 2, Insightful

      isn't the memory access bottleneck a rather fatal, and obvious, flaw in the whole design? Unless I'm missing something
       
      What you're missing is that Intel's PC CPU business is all about the CPU. The chipset and all that other tedious little stuff is there only because it has to be for the CPU to function. Their entire focus is CPU, CPU, CPU. Look how fast it runs through clock cycles! Look how many cores and pseudo-cores (HT) it has! They've been doing this for ages. Recall that the first generation of Pentium IIs had to make do with PPro chipsets, because introducing a new chipset for a new CPU was put on a far back burner behind the new CPU itself, as long as the old one could be made to function.

  5. I/O Bound via DP by faqmaster · · Score: 4, Funny

    Yep, I'd say that if both her input and her output are busy, she's DP.*

    *See, kids? This is why you should avoid too much pr0n, it just totally warps your mind.

    --
    Are you...Are you some kind of genius?
    No, ma'am, I'm just a regular Slashdot reader.
  6. Re:Who comes up with these names? by schon · · Score: 2, Funny

    What's next, DVDA?!!!

    Hehehe.. type that into Google and hit "I'm feeling Lucky". Good for a laugh. Almost as funny as the National Association of Marlon Brando Look Alikes (or even funnier, as it's real.)

  7. Re:Lindenhurst? They ARE running out of names... by mooingyak · · Score: 2, Interesting

    I used to live in Lindenhurst in NY (on LI). I was once told that it had the most bars per square mile in all of the US.

    --
    William of Ockham had no beard. The most likely explanation is that it was chewed off by squirrels every morning.
  8. Regarding the electricity consumption... by hirschma · · Score: 2, Interesting

    I just put together a Xeon based server. It was a rare case where a Xeon solution met my needs better than an Opteron based solution.

    My company is _very_ sensitive to power consumption. So, I picked a very new motherboard from Tyan, and a Xeon that supported Enhanced Speed Step. I figured that I'd install cpudyn, like I did with all of our AMD boxes, and save a few bucks on electricity.

    So, cpudyn doesn't work... because Speedstep isn't supported by Tyan's BIOS. I email Tyan, and I find out two things:

    * Tyan wasn't aware that Speedstep was an option on the Xeon platform,

    * That none of their BIOS suppliers are supporting Speedstep at this time.

    Amazing! Intel put this in the CPU as a way to compete with this great feature from AMD, but you CANNOT USE IT.

    Most certainly my last Intel purchase, ever.

    jh

    1. Re:Regarding the electricity consumption... by tomstdenis · · Score: 5, Interesting

      Better though is that the D series can only clock down to 2.8Ghz whereas the AMD64s can go down to around 1Ghz [depending on your part]. Clocking from 3.2Ghz to 2.8Ghz doesn't save you that much power [maybe 10W at most ...].

      My AMDX2 is sitting here running Linux and is clocked when idle to 1Ghz ... at 32C with a copper heatsink. The processor draws around 20-30W when idle compared to the Intel processors which draw nearly double that at idle.

      In no way is a Netburst based processor a wise decision over the offerings of AMD.

      Tom

      --
      Someday, I'll have a real sig.
  9. Slashdot posts anything these days by tayhimself · · Score: 4, Insightful
    GamePC has real benchmarks showing the Paxville Xeons getting blown away by Opterons. http://www.gamepc.com/labs/view_content.asp?id=paxville&page=1&cookie_test=1/

    The Hexus article is just a summary of their results along with several inaccuracies.

    If you're I/O bound by your threads in any way, you can hit problems (all threads touch the MCH, then there's a 266MiB/sec bus link to the I/O processors to cross, then the data hits disks or network hardware). If you're memory subsystem bound in any way, especially on a majority of compute threads, performance is likely gone.
    This is misleading. First off, the MCH is a 6.4 GB/s link so I don't understand how it could bottleneck I/O even if you're compute bound. The 266 MB/s I/O bus is for legacy peripherals (USB/serial/SATA). Considering SATA-I (what the ICH5R supports) is 150 MB/s per channel, and USB 2.0 is 480 Mb/s, I can't see how this is a big problem. If you want fast (SCSI/FibreChannel/SATA-II HW RAID) disks and network, there are PCI-X 64-bit and PCIe x4, x8 slots that you can have your important I/O subsystem hanging off of.

    Here is a link to the intel datasheets for the chipsets which shows 3 x8 PCIe interfaces for the 7520 and 1 for the 7320. http://www.intel.com/products/chipsets/E7520_E7320/
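
    A rough sketch of that bandwidth budget, using nominal peak rates only. The 266 and 150 MB/s figures come from this thread; the 480 Mb/s USB 2.0 rate and the 250 MB/s per-lane PCIe rate are assumptions based on the nominal USB 2.0 and PCIe 1.x signalling rates, and all names are made up for illustration.

```python
HUB_LINK_MB_S = 266      # MCH-to-I/O-hub link figure quoted in the thread
PCIE1_LANE_MB_S = 250    # assumed: PCIe 1.x, per lane, per direction


def megabits_to_megabytes(mbit_per_s):
    """Convert a rate in megabits/s to megabytes/s (8 bits per byte)."""
    return mbit_per_s / 8


def pcie_slot_mb_s(lanes):
    """Nominal one-direction bandwidth of a PCIe 1.x slot."""
    return lanes * PCIE1_LANE_MB_S


# Nominal peak rates of the legacy devices behind the hub link.
legacy_devices_mb_s = {
    "SATA-I channel": 150,
    "USB 2.0 bus": megabits_to_megabytes(480),
}

for name, rate in legacy_devices_mb_s.items():
    share = rate / HUB_LINK_MB_S
    print(f"{name}: {rate:.0f} MB/s ({share:.0%} of the hub link)")

for lanes in (4, 8):
    print(f"PCIe x{lanes}: {pcie_slot_mb_s(lanes)} MB/s, not routed over the hub link")
```

    Any single legacy device fits under the hub link at these nominal rates, while several busy ones together could approach it, which is why the fast storage and network cards belong on the PCI-X and PCIe slots.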

    All that being said, the CPU itself is a dog.

  10. Re:FSB @ 200Mhz quad piped? by Lonewolf666 · · Score: 4, Informative

    For a single CPU, the quad piped 200Mhz FSB does fine. It can fully utilize two channels of DDR 400 RAM, which is the standard on the better desktop mainboards. A single AMD CPU does no better.
    Things are different with multiprocessor setups:
    Here each Opteron has its own memory interface, while the Xeons have to share one FSB. As a result, the total Opteron memory bandwidth is proportional to the number of sockets. Total Xeon bandwidth does not grow with more sockets.
    This does show up heavily in reviews of 2-processor machines; expect it to be worse in 4- and 8-way systems.
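
    The scaling argument can be put into numbers with a small sketch. It uses theoretical peak figures only, and it assumes a 64-bit data path everywhere and two DDR 400 channels per Opteron socket; the function names are hypothetical, and remote-memory (NUMA) effects are ignored.

```python
BYTES_PER_TRANSFER = 8  # assumed 64-bit data path


def fsb_bandwidth_mb_s(base_clock_mhz=200, transfers_per_clock=4):
    """Peak bandwidth of a quad-pumped front side bus, in MB/s."""
    return base_clock_mhz * transfers_per_clock * BYTES_PER_TRANSFER


def ddr400_bandwidth_mb_s(channels=2):
    """Peak bandwidth of DDR 400 memory (400 MT/s per channel), in MB/s."""
    return 400 * BYTES_PER_TRANSFER * channels


def total_memory_bandwidth_mb_s(sockets, shared_fsb):
    """Aggregate peak memory bandwidth of a multi-socket system."""
    if shared_fsb:
        # Xeon-style: every socket contends for the same bus.
        return fsb_bandwidth_mb_s()
    # Opteron-style: each socket brings its own memory controller.
    return sockets * ddr400_bandwidth_mb_s()


for sockets in (1, 2, 4, 8):
    xeon = total_memory_bandwidth_mb_s(sockets, shared_fsb=True)
    opteron = total_memory_bandwidth_mb_s(sockets, shared_fsb=False)
    print(f"{sockets} socket(s): shared FSB {xeon} MB/s, per-socket memory {opteron} MB/s")
```

    With one socket both come out at 6400 MB/s; from two sockets up, the per-socket design pulls ahead while the shared bus stays flat.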

    --
    C - the footgun of programming languages
  11. Why, thanks for bringing it up! by ratboy666 · · Score: 2, Interesting

    The article gets the point of Hyperthreading... backwards.

    Yes, the memory interface gets congested, so the processor takes a stall. But, instead of just leaving the ALU idle, it has another thread in reserve to schedule on it. Thus improving the utilization of the ALU subsystem.

    And THAT'S the point of this "Hyperthreading" thang...

    The rest? Well, if the local L1/L2 cache isn't big enough, you are going to suffer. Yes, a bigger pipe to memory would help, but you are STILL several times slower than you could be. That's why you have the cache.

    Anyway, it's a balancing act.
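
    A toy model of the stall-hiding idea, not a description of how the real scheduler works. It assumes each thread spends a fixed share of its cycles computing and the rest stalled on memory, that the threads never need the ALU at the same moment, and that the extra thread does not add to the memory congestion (which is exactly what the article worries about). The function name is made up.

```python
def alu_utilization(compute_cycles, stall_cycles, hw_threads=1):
    """Idealized fraction of cycles in which the ALU has work to do."""
    busy_fraction = compute_cycles / (compute_cycles + stall_cycles)
    # Extra hardware threads fill stall slots, capped at a fully busy ALU.
    return min(1.0, hw_threads * busy_fraction)


# A thread that computes for 40 cycles, then waits 60 cycles on memory.
one_thread = alu_utilization(40, 60, hw_threads=1)
two_threads = alu_utilization(40, 60, hw_threads=2)
print(f"1 thread: {one_thread:.0%}, 2 threads: {two_threads:.0%}")
```

    In this idealized case the second thread doubles ALU utilization; on real hardware the gain is smaller because both threads share the caches and the same congested memory interface.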

    --
    Just another "Cubible(sic) Joe" 2 17 3061
  12. Re:II have a dual Xeon with hyper threading by magarity · · Score: 2, Informative

    Your Xeon system with the SCSI disks is hugely faster doing DBMS than the system with the SATA drive in large part (probably larger than the other reasons you've listed, although those do matter) because DBMSs tend to throw a heck of a lot of disk IO commands at the disk subsystem all at once. The SCSI disks and their controller are simply better able to handle the barrage. I'll bet that a test with the drive subsystems reversed shows that while the Xeons are still faster, the P4 is only somewhat behind, not waaaay behind.