Slashdot Mirror


First 16-Core Opteron Chips Arrive From AMD

angry tapir writes "After a brief delay and more than a year of chatter, Advanced Micro Devices has announced the availability of its first 16-core Opteron server chips, which pack the largest number of cores available on x86 chips today. The new Opteron 6200 chips, code-named Interlagos, are 25 per cent to 30 per cent faster than their predecessors, the 12-core Opteron 6100 chips, according to AMD."

140 of 189 comments

  1. Compared to Intel? by Ed Avis · · Score: 2

    So... how do these compare to the new Sandy Bridge chips Intel announced on the same day? There must be some overlap of the target market - whether to buy a quad-socket Intel server or dual-socket AMD one, for example.

    --
    -- Ed Avis ed@membled.com
    1. Re:Compared to Intel? by Anonymous Coward · · Score: 1

      The Sandy Bridge chips released so far are all "Extreme" versions which suck power so much you'd be insane to use them for a server.

    2. Re:Compared to Intel? by Surt · · Score: 4, Interesting

      This would compete with the Xeon-E chips that aren't out yet. But per core it delivers about 75% of the performance, so this is the equivalent of a 12-core Intel chip.

      --
      "Who is the Journal of Quantum Physics going to believe?" --Stephen Hawking
    3. Re:Compared to Intel? by rrossman2 · · Score: 1

      Not sure what Tyan has planned and what the chips can do, but Tyan had boards that supported 4 quad-core Opterons, plus you could add a "daughter board" that allowed you to add 4 more (plus more RAM slots).

      Now that setup using 16-core CPUs in an EATX format would be crazy.

    4. Re:Compared to Intel? by 0123456 · · Score: 2

      Given that an 8-core Bulldozer already needs its own power station to operate, I can't imagine Intel could have a worse TDP than a 16-core.

    5. Re:Compared to Intel? by ByOhTek · · Score: 1

      Yeah. I could ditch my furnace in the winter with a computer like that... Might even have to open a few Windows.

      --
      Self proclaimed typo king, and inventor of the bear destroying coffee table (patent not pending).
    6. Re:Compared to Intel? by ByOhTek · · Score: 1

      If they aren't out yet, how can you know? I wouldn't trust the performance benchmarks from either manufacturer.

      --
      Self proclaimed typo king, and inventor of the bear destroying coffee table (patent not pending).
    7. Re:Compared to Intel? by unity100 · · Score: 1

      Intel can't field more than 6 cores at the same time, even in Sandy Bridge-E. Multithreaded apps, like server apps, shine on Bulldozer.

    8. Re:Compared to Intel? by Talderas · · Score: 2

      The idle heat would be sufficient, no? I don't see why you would need to open some windows just to ramp up the temperature unless you're using this thing to provide heat for a sauna.

      --
      "Lack of speed can be overcome. In the worst case by patience." --Znork
    9. Re:Compared to Intel? by the linux geek · · Score: 5, Informative

      Intel's server chips are 8- and 10-core, and outperform Opterons by a considerable margin.

    10. Re:Compared to Intel? by Kjella · · Score: 2

      Even the fastest Sandy Bridge-E draws less power than a Bulldozer even at much higher performance. It also costs 3-4 times as much, so performance/$ is quite shitty (hey, it's an extreme $999 proc) but the winner in performance is clear. But thanks for trolling, come again.

      --
      Live today, because you never know what tomorrow brings
    11. Re:Compared to Intel? by beelsebob · · Score: 4, Interesting

      Put simply, the AMD ones are slower than the Intel ones by about twofold per core. This isn't because AMD sucked at design, so much as their marketing department sucked at telling the truth. In reality, we're looking at 8-core AMD CPUs with 2 integer units per core, i.e. no more 16-core than Intel's are 16-core chips because of hyperthreading.

      Once that's ironed out, the AMD chips turn out to have rather good performance if you want lots of integer work done, and the Intel chips to have rather good performance if you want anything else done.

    12. Re:Compared to Intel? by beelsebob · · Score: 4, Interesting

      What are the Xeon E5-2650L, 2650, 2660, 2665, 2670, 2680, 2690 and 2687W then?

      Hint: they're all 8-core SNB-E chips. Second hint: AMD's 16 "core" CPUs don't have 16 cores; they have 16 integer units. They only have 8 instruction fetch units, 8 decode units, 8 L2 caches, etc. That is, they're 8-core CPUs with strong integer support. SNB-E's particular strength is floating point, but it tends to beat the Opterons at pretty much anything that isn't heavily integer biased.

    13. Re:Compared to Intel? by Surt · · Score: 1

      This assumes that performance is not significantly different from the desktop line, which is usually the case.

      --
      "Who is the Journal of Quantum Physics going to believe?" --Stephen Hawking
    14. Re:Compared to Intel? by Surt · · Score: 2

      Slight correction: on threaded workloads we'd be talking about a 6-core chip, since Intel runs 2 threads per core.

      --
      "Who is the Journal of Quantum Physics going to believe?" --Stephen Hawking
    15. Re:Compared to Intel? by KingMotley · · Score: 1

      Well, except the 130 W TDP of the 3960X is less than the 140 W TDP of the (almost equivalent) 6282 SE from AMD. Don't let facts get in the way of your beliefs.

    16. Re:Compared to Intel? by bloodhawk · · Score: 1

      You need to find a better line for such fanboyism; using stuff that is easily known and proven wrong is just silly. Intel server lines are 8 and 10 core. So far they have also trounced AMD in performance, though it would be nice if AMD can edge closer or even pass them with something new. Competition is much needed in this area and is something Intel has not had for a few years now.

    17. Re:Compared to Intel? by unity100 · · Score: 4, Interesting

      Is that why there have been 3 supercomputer orders in the last 3 weeks with AMD's Bulldozer Opterons?

    18. Re:Compared to Intel? by beelsebob · · Score: 2

      Really? Given that an 8 "core" Bulldozer FX-8150 gets beaten by a 4 core i5 2500, you would reasonably expect that this 16 "core" Bulldozer would get beaten by an 8 core Sandy Bridge chip with no hyperthreading at roughly the same clock speed. A little bit of imagination might convince you that a 6 core with hyperthreading might perform similarly too.

      AMD – 16 "core" bulldozer – $1000
      Intel – 6 core + HT Xeon E5-1650 at much higher clock – $583.
      Alternatively, if you want to be able to stick the Intel chips in NUMA:
      Intel – 6 core + HT Xeon E5-2640 at the same clock as the AMD chip – $884, but with only 95W power consumption.

      Final alternative:
      Intel – 4 core (with no HT) Xeon E5-2609 at roughly the same clock – $294, stick two of them in, and there you are.
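
      The price list above can be turned into a rough dollars-per-hardware-thread figure. This is only a sketch: the prices and core counts are the ones quoted in this comment, the thread counts assume two threads per core with HT and one per Bulldozer "core", and the function names are made up for illustration. It says nothing about how fast each thread actually is.

```python
# Prices and core counts are the ones quoted in the comment above.
# Assumption: HT means 2 hardware threads per core; a Bulldozer "core" is 1.
OPTIONS = [
    # (label, price_usd_each, sockets, cores_per_chip, threads_per_core)
    ("AMD 16-'core' Bulldozer Opteron", 1000, 1, 16, 1),
    ("Intel Xeon E5-1650 (6 core + HT)", 583, 1, 6, 2),
    ("Intel Xeon E5-2640 (6 core + HT)", 884, 1, 6, 2),
    ("2x Intel Xeon E5-2609 (4 core, no HT)", 294, 2, 4, 1),
]


def cost_per_thread(price_usd_each, sockets, cores_per_chip, threads_per_core):
    """Total purchase price divided by total hardware threads."""
    total_price = price_usd_each * sockets
    total_threads = sockets * cores_per_chip * threads_per_core
    return total_price / total_threads


def summarize(options=OPTIONS):
    """Map each option label to its price per hardware thread, in dollars."""
    return {label: round(cost_per_thread(*rest), 2) for label, *rest in options}


if __name__ == "__main__":
    for label, dollars in summarize().items():
        print(f"{label}: ${dollars:.2f} per hardware thread")
```

      Price per thread is only half the argument; the other half is the per-thread speed being disputed throughout this thread.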

    19. Re:Compared to Intel? by the linux geek · · Score: 1

      SPECcpu results, TPC-H, and personal experience.

    20. Re:Compared to Intel? by Vancorps · · Score: 2

      Which would be what? It sounds to me like databases and webservers benefit greatly from the AMD approach. Alternatives such as render farms use GPUs, so what strength is Intel actually offering?

    21. Re:Compared to Intel? by beelsebob · · Score: 2

      Not really, no: databases and web servers don't spend their time doing parallel integer work, they spend their time doing logic work. Sandy Bridge kicks the snot out of it there.

    22. Re:Compared to Intel? by gilboad · · Score: 2, Interesting

      While I do agree that AMD is *well* behind Intel's latest and greatest in the 1P / desktop world, I fail to see how you could make such a bold statement, unless you have had the chance to compare an AMD 4S machine to an Intel 4S machine (say, Opteron 62xx based HP DL585G7 vs. Xeon 75xx/E7 based HP DL580G7).

      In my experience (and I venture to guess that it is just as good as yours, if not better) the picture is far from being black-and-white and greatly (!!!) depends on the application that is being tested. The picture becomes even more complex once you factor in the Xeon E7's excessive price. ... So I ask again, have you had any experience in benchmarking the Opteron 6200 or are you simply making things up as you go along?

      - Gilboa

    23. Re:Compared to Intel? by Luyseyal · · Score: 1

      If the logic is parallelizable, then the AMD chips could be a good choice. A webserver would be a good example of parallel logic in run-of-the-mill software were it not hampered by all that pesky I/O.

      -l

      --
      Help cure AIDS, cancer, and more. Donate your unused computer time to worldcommunitygrid.org. Join Team Slashdot!
    24. Re:Compared to Intel? by sneakyimp · · Score: 1

      How about we take all the energy we are putting into this pissing contest and do some actual benchmarking?? Put up or shut up.

      COME ON EVERYONE! BENCHMARK! BENCHMARK! BENCHMARK!

    25. Re:Compared to Intel? by sneakyimp · · Score: 2

      "Honey, it's kinda cold. Can you fire up Linpack on the server?"

    26. Re:Compared to Intel? by sneakyimp · · Score: 1

      BENCHMARK!
      In the meantime, *please* STFU.

    27. Re:Compared to Intel? by im_thatoneguy · · Score: 1

      Render farms don't use GPUs. Good luck fitting a 3D scene into 1GB of memory!

      Maybe for a couple specialty applications custom written for a few narrow pipeline tools but certainly not the backbone which is still all PRman, Arnold, Vray, Brazil and Mental Ray. None of which use the GPU yet. Only production renderer nearing GPU acceleration is Final Render and maybe Brazil/Vray for specialized passes.

    28. Re:Compared to Intel? by hairyfeet · · Score: 1

      Because, thanks to Intel rigging their compiler (which they do to this very day), you can't just benchmark: without knowing which compiler the benchmark software was built with, the benchmark is useless. I mean Nvidia kicked ass on Q3 with the FX series until you changed the exe to Quack.exe and then whoops! It turned out to be a scam. Same thing here, as Intel runs any CPU that gives a CPU-ID of Authentic AMD a pile of shit code while the Intel chip gets the latest SSE optimization. Not exactly a fair test now, is it?

      --
      ACs don't waste your time replying, your posts are never seen by me.
    29. Re:Compared to Intel? by greg1104 · · Score: 1

      Databases that are working on in-RAM workloads (so not I/O bound) spend most of their time moving data pages around in memory. There are few computational components to database work, compared with how often chunks of data are touched. Neither the floating point nor the integer speed is the real limiting factor on how fast that can happen. The size of the CPU caches and the speed of the CPU->RAM interconnect are the important factors.

      I've been working on a memory oriented benchmark aimed at testing for this particular area of performance for a while now. The Intel vs. AMD situation is very complicated. It depends quite a bit on how many concurrent programs are running, especially on the big servers where you can't fully utilize all of the memory channels available. I see Intel as having an edge on smaller systems, their performance with only one or two cores going can be much better. At larger active core counts, the two manufacturers are much closer to equal. I don't see this new product line as changing that.
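
      To make "memory oriented benchmark" concrete, here is a minimal single-process sketch. It is not the benchmark described above; the function name, buffer size, and best-of-N timing are all assumptions chosen for illustration. It times one bulk copy of a large buffer and reports bytes read plus bytes written per second.

```python
import time


def copy_bandwidth_gib_s(buffer_mib=64, repeats=5):
    """Best-of-N throughput for copying one large buffer, in GiB/s.

    Hypothetical micro-benchmark: it measures a single-threaded bulk copy,
    so it only hints at what one core can pull from RAM.
    """
    size = buffer_mib * 1024 * 1024
    # Fill with a non-zero byte so every page of the source is really touched.
    src = bytearray(b"\xa5") * size
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        dst = bytes(src)  # one bulk copy of the whole buffer
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
        del dst
    best = max(best, 1e-9)  # guard against a zero reading on coarse timers
    # Count both the bytes read from src and the bytes written to dst.
    return (2 * size) / best / (1024 ** 3)


if __name__ == "__main__":
    print(f"single-process copy: {copy_bandwidth_gib_s():.1f} GiB/s")
```

      Running one copy of this per core at the same time, and watching how the per-process number falls as the count grows, is a crude way to see the concurrency effect described above.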

    30. Re:Compared to Intel? by Surt · · Score: 1

      I think that Intel's hyperthreads and AMD's Bulldozer 'cores' both use a resource sharing arrangement, and in neither case are full cores. The benchmarks bear this out: intel's hyperthreading is nearly as good as AMD's.

      --
      "Who is the Journal of Quantum Physics going to believe?" --Stephen Hawking
    31. Re:Compared to Intel? by ak3ldama · · Score: 1

      That is why I stick to the benchmarks on Linux for fair comparisons of AMD and Intel. Not to mention most gaming benchmarks (or real performance) don't give two shits about which CPU they are run on. The main benchmarks I've seen so far have been from Phoronix. It seems like a perfectly decent CPU, though maybe the price should drop a bit. My two cents: the AMD A-series is the way to go, especially in laptops like this A6-3400.

      --
      "but money is the God of Algiers & Mahomet their prophet." - Rich. O'Bryen June 8th 1786
    32. Re:Compared to Intel? by hairyfeet · · Score: 2

      Oh I have NO problem with FOSS benchmarks, it's one of the few places where you can be sure to get a rigging-free test. Not real big on the A-Series though, I find that the Deneb and Thuban chips are a better deal ATM and in many tests Deneb and Thuban stomp Bulldozer. It looks like Bulldozer is gonna be another Phenom I, where it took them a generation to get the bugs out and crank up the clocks like they did with Phenom II.

      But I'd say the E series is another story altogether. It's priced the same as Atom but frankly stomps it and gets scores usually above Atom+ION, which is a more expensive option. It's great for netbooks and all-in-ones, I've even built a couple of HTPCs with it and it works great in that role, quiet as a churchmouse while having no trouble with 1080p. I liked what I saw enough that I put my own money where my mouth was and sold my Athlon II Wind for an EEE with an E-350 and I just love the thing. It's light, gets great battery life, never gets hot, and just about every video format under the sun is accelerated with DXVA.

      So I say stay away from the new socket for now, go with a Deneb or Thuban, and by the time that is long in the tooth and you are ready to upgrade, the chips after Piledriver will be out and they'll have any performance problems licked. Again I intend to put my money where my mouth is and after the holidays upgrade from Deneb quad to Thuban. Do I need it? Not really but I WANTS IT precious, I WANTS IT!

      --
      ACs don't waste your time replying, your posts are never seen by me.
    33. Re:Compared to Intel? by Surt · · Score: 1

      Why not provide some evidence that it can't? Every benchmark says you're wrong and I'm right.

      --
      "Who is the Journal of Quantum Physics going to believe?" --Stephen Hawking
    34. Re:Compared to Intel? by c++0xFF · · Score: 1

      You're right -- it's not really 16 cores. But nor is it just 8 cores with HyperThreading. Bulldozer is an interesting compromise that gives every concurrent thread of execution its own set of computation resources. This should result in faster execution than an 8-core machine, with or without hyperthreading, but probably isn't quite as fast as a true 16-core machine.

      Also, (and I might be very wrong here), I thought there were 16 floating point units, too.

    35. Re:Compared to Intel? by cheesybagel · · Score: 1

      If you actually did your math you would realize AMD's Bulldozer 16-core has the same peak theoretical FP performance as a 16-core Sandy Bridge would if it existed. A Sandy Bridge 256-bit AVX instruction typically has 2 cycles latency while Bulldozer's 2x128-bit AVX has 1 cycle latency for the same math operation. At the same time Bulldozer should have twice the integer performance. My guess is there is some bottleneck, hardware bug, or lack of OS/compiler optimizations to enable it to perform adequately.

  2. Re:Only 16? by unity100 · · Score: 1

    20 next year, 24 the next, and so on.

  3. really 16 core? by neuro88 · · Score: 1

    Hmmm... According to the article, these new chips seem to be based on the Bulldozer architecture, so it might be better to think of these Opterons as 8 core chips that have really good hyperthreading.

    1. Re:really 16 core? by ackthpt · · Score: 1

      Hmmm... According to the article, these new chips seem to be based on the Bulldozer architecture, so it might be better to think of these Opterons as 8 core chips that have really good hyperthreading.

      Hold your horse, cowpoke.

      Just because it's based upon Bulldozer doesn't mean it will suffer the same issues. Perhaps this is the core which really works well, while the more consumer oriented Bulldozer is the red-headed stepchild.

      --

      A feeling of having made the same mistake before: Deja Foobar
    2. Re:really 16 core? by Zan Lynx · · Score: 4, Interesting

      Maybe...

      It'll be interesting. Most server applications are integer-only and never touch the floating point units. That should mean that Bulldozer designs work close to the full core count in contrast to the poor benchmarking results it puts out in Photoshop filters and video encode.

    3. Re:really 16 core? by Anonymous Coward · · Score: 1

      Exactly. The way I see Bulldozer is that it is a good chip for things like web hosting, databases, middleware (i.e. "the cloud"). Floating point performance is not that important if your threads do not do floating point. Heck, even if 1/2 of the threads do floating point, then you are fine.

      Frankly, I only care how fast each thread can run and access memory. This is what is important in server consolidation. Floating point, meh.

    4. Re:really 16 core? by the linux geek · · Score: 2

      They both have the same issues, including that each module (two 4-issue cores) has a single 4-instruction decoder in front of it. Cache latency is also likely to be similar if not the same.

    5. Re:really 16 core? by gweihir · · Score: 1

      Well, as Intel hyperthreading is basically brain-dead (had to disable it for decent performance as some things were glacially slow), really good hyperthreading just means usable hyperthreading for me. If Intel did not have so much money, AMD would have blown them away a long time ago. Intel technology sucks badly.

      --
      Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
    6. Re:really 16 core? by AmiMoJo · · Score: 1

      It's just a shame Bulldozer is their desktop CPU. I really don't know how they could have screwed up so badly... Something major must have happened for them to end up releasing their next gen architecture in a state where even at a high clock speed performance is worse than the previous (and much cheaper) generation for most customers.

      --
      const int one = 65536; (Silvermoon, Texture.cs)
      SJW, n: "Someone I don't like, and by the way I'm a fuckwit" - AC
    7. Re:really 16 core? by gweihir · · Score: 1

      Does not help if the designs they put on the chips are stupid. And they are.

      --
      Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
  4. Bulldozer Cores are not that Great by TheTyrannyOfForcedRe · · Score: 4, Interesting

    The "cores" in Bulldozer are not your typical first-class x86 core. Bulldozer "cores" are worth 2/3 of a modern x86 core. The 6200 is more like a 10 core. Add to that the crappy IPC and I'm not impressed.

    I was excited about Bulldozer before it was released. It's not often that CPU makers take chances on radical new architectures. Too bad this one turned out to be a huge pile of fail.

    --
    "Liechtenstein is the world's largest producer of sausage casings, potassium storage units, and false teeth."
    1. Re:Bulldozer Cores are not that Great by synapse7 · · Score: 1

      Hopefully they can be improved upon. I remember the first P4s had enough suck to be the target of a class-action suit.

    2. Re:Bulldozer Cores are not that Great by Theovon · · Score: 5, Informative

      Your description is inaccurate, but that's not surprising since most slashdot readers don't know much about CPU architecture.

      Bulldozers are essentially full-fledged cores, where the two cores in each module are mostly independent. There are two completely independent integer pipelines, so people seem to want to harp on the fact that the FPU is "shared". It's really a single split FPU, where each half can execute independent instructions, as long as the data width is 128 bits or less. Only when it is executing 256-bit AVX instructions is there any competition for resources. This is a very sensible design decision, since you don't find enough AVX software right now to justify completely dedicated AVX logic. (Plus, IIRC sandy bridge's FPU is only 128 bits wide and issues AVX instructions in two cycles, so what's the difference?) Moreover, even with AVX-heavy workloads, most software won't issue AVX instructions every cycle, and two AVX-heavy tasks on the same module won't really run into much contention. Assuming my memory of Sandy Bridge's FPU is correct, then Bulldozer has the advantage of having lower latency within the FPU on isolated AVX instructions.

      The PROBLEM with Bulldozer is that they just have not done some of the really aggressive and costly things that Intel has done in their design. Bulldozer is still a 3-issue design. While going to 4-issue doesn't help that much that often, it still gives Sandy Bridge a slight edge. But where SB REALLY gets its advantage is the huge instruction window. Intel found clever ways to shrink the logic for various components so that they could make room for a much larger physical register file and reorder buffer. As a result, SB can have many more decoded instructions in flight, which exposes more instruction-level parallelism and, critically, absorbs more memory access latency.

      A Sun engineer (discussing Rock, among other things) once described modern CPU execution as a race between last-level cache misses. When you have a miss on your L3 cache, it can cost hundreds of cycles, upwards of 1000. During that miss, the CPU fills up its reservation station with other instructions and then stalls, waiting on something to retire. This won't happen for a long time. Because of the disparity in speed (and latency) between compute and memory access, this is typically the most significant bottleneck. By enlarging the instruction window, SB can achieve much higher throughput, and it shows in the benchmarks.
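
      The "race between cache misses" picture can be turned into a deliberately crude toy model. Everything here is an assumption made for illustration: one outstanding miss at a time, a fixed issue width, a 300-cycle miss, and a miss every 1000 instructions. The window sizes of 168 and 128 are the in-flight figures quoted elsewhere in this thread. Real cores overlap several misses, so this sketch if anything understates what a larger window buys.

```python
def effective_ipc(window, issue_width, miss_latency, instrs_between_misses):
    """Toy estimate of instructions per cycle when last-level misses dominate.

    Assumed model (not a simulator): after a miss the core keeps issuing
    until `window` instructions are in flight, then stalls until the miss
    returns. Only one miss is outstanding at a time.
    """
    # Cycles of useful issue that overlap with the outstanding miss.
    hidden = window / issue_width
    # Whatever part of the miss latency is not hidden is a pure stall.
    stall = max(0.0, miss_latency - hidden)
    # Cycles spent issuing the instructions between two misses.
    busy = instrs_between_misses / issue_width
    return instrs_between_misses / (busy + stall)


if __name__ == "__main__":
    for name, window in (("168-entry window", 168), ("128-entry window", 128)):
        ipc = effective_ipc(window, issue_width=4, miss_latency=300,
                            instrs_between_misses=1000)
        print(f"{name}: about {ipc:.2f} instructions per cycle")
```

      In this model the gap between the two windows widens as the miss latency grows or misses get more frequent, which is the direction of the argument above; the absolute numbers mean nothing.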

      This is Bulldozer's Achilles' heel. I know there are a few benchmarks where Bulldozer is faster than SB, but they're not typical workloads with typical memory footprints. Anyhow, so if you're going to rag on Bulldozer, rag on it for the right reasons. Bulldozer's "shared" FPU is a red herring.

    3. Re:Bulldozer Cores are not that Great by ericloewe · · Score: 2

      Bulldozer was very poorly handled from the beginning. What really surprises me is that they tried the NetBurst approach: when all else fails, go for clocks. Unfortunately, ARM seems to be focusing on a similar strategy (more cores, higher clocks, less focus on IPC)... Anyways, I don't buy their "poorly optimized" story. They knew all about it and could've waited - surely they realized at the early stages of development that OSes aren't optimized for this yet. They could've delayed Bulldozer and pushed out yet another incremental upgrade to the Phenoms - the die shrink alone would probably yield better results than those achieved by Bulldozer. Meanwhile Intel is able to get away with what is essentially 50% more performance in multi-threaded applications, 0% more in single-threaded ones (save minor influences from the memory subsystem and cache, which surprisingly have a HIGHER latency than SB). All this for around 100% more cash, plus added costs for "high-end" motherboards (still lacking native USB 3.0 from the chipset, along with only two native SATA 6.0Gb/s ports), quad-channel memory and a cooler.

    4. Re:Bulldozer Cores are not that Great by Artraze · · Score: 5, Informative

      The OP is right, and seems to understand the issues far better than you. It isn't that the FPU is shared, it's that nearly _everything_ is shared: Instruction cache, fetch and decode, FPU, L2 data cache. The only things that aren't shared are L1 data and integer operations (scheduler and ALU).

      Instruction issuing and cache misses are big performance areas, but these are precisely the resources the cores share! You're running two threads off (with the exception of L1 data) the same caches and instruction fetches. So, in reality, the second core in Bulldozer is much more like ultra-hyperthreading than it is a second core. I think the fact that they're even listed as cores is a marketing strategy that has backfired pretty hard.

      P.S. L3 cache has proven to be quite useless in many workloads... It helps a bit in servers, IIRC, but that's about it. So it's more a race to L2 cache, which, again, is a shared resource. AMD, in fact, has indicated that it may drop the L3 from desktop parts.

    5. Re:Bulldozer Cores are not that Great by Tomato42 · · Score: 1

      Then why are Bulldozers slower than Phenom IIs in file compression (rar, zip, 7z, pick your poison) clock for clock? That's definitely not a shared FPU problem...

    6. Re:Bulldozer Cores are not that Great by mestlick · · Score: 1

      There are a few big mistakes about Bulldozer here.

      The FP is completely shared between the integer clusters. The FP is 4-wide and the two clusters compete for all the resources in the FP.

      Each Bulldozer integer cluster is 4-wide. The shared instruction fetch is also 4-wide.

      Sandy Bridge has 168 instructions in flight and Bulldozer has 128 per cluster. Sandy Bridge has a combined FP/INT scheduler with 54 entries. Bulldozer has separate schedulers with 40 INT per cluster and 60 FP entries.

      You are correct about BD's Achilles heel. The L2 and L3 latencies are longer than SB's. I think the solution is to reduce the latencies, not increase the in-flight window size.

    7. Re:Bulldozer Cores are not that Great by Rockoon · · Score: 1

      If you look at the performance numbers comparing the Phenom II X4 830 (2.8 GHz) to the new A8-3850 (2.9 GHz), you see that the lack of L3 isn't a problem at all when you can also pack on twice as much L2.

      --
      "His name was James Damore."
    8. Re:Bulldozer Cores are not that Great by WilliamBaughman · · Score: 1

      Your description is also inaccurate. Instruction decode and L2 cache are shared between cores in Bulldozer modules as well; I wouldn't ding Bulldozer for the shared L2 cache but the L1 cache is write-through, and there doesn't seem to be enough cache bandwidth to keep both integer cores busy. Bulldozer is not a 3-issue design, it is a 4-issue design. With regards to Bulldozer's Achilles' heel, I think that its deficiency in single-threaded performance comes more from actual cache misses and latency than the smaller instruction window. I could be proven wrong by architectural studies that come out in the future. Either way, those studies will be interesting.

    9. Re:Bulldozer Cores are not that Great by loufoque · · Score: 1

      (Plus, IIRC sandy bridge's FPU is only 128 bits wide and issues AVX instructions in two cycles, so what's the difference?)

      My SSE code converted to AVX runs two times faster (not all of it though -- certain instructions do run in two cycles)

    10. Re:Bulldozer Cores are not that Great by goarilla · · Score: 1

      Huh, Northwood was a big improvement over Willamette !

  5. Re:Only 16? by ackthpt · · Score: 4, Interesting

    Pfft, how much harder can it be to design one with 32 :)

    Design? Easy.

    Manufacture? Tricky.

    Make work? Trickier.

    To read about? Interesting.

    --

    A feeling of having made the same mistake before: Deja Foobar
  6. Poor performance by Anonymous Coward · · Score: 1

    I have a test machine with the 12-core version and the single-core performance is truly dreadful. Intel chips that are several years older perform way better in this regard. Even with a workload where the 16 cores can all be used to the fullest extent, I doubt the performance comes close to modern Intel chips.

    1. Re:Poor performance by nomel · · Score: 2

      This isn't the point. You get 16 cores (slowish compared to top of the line, they may be) that will fit in a single socket on a single motherboard, with a single power supply. This is a *huge* cost saving for machines that it makes sense to use them in...servers, where single core performance is relatively stupid to consider.

    2. Re:Poor performance by the linux geek · · Score: 3, Insightful

      Servers need single-thread too; think stuff like big database writes, joins, ERP, and CRM. Think outside the embarrassingly-parallel web-serving box.

      If multithreaded performance was all that matters, the Sun Niagara chips would have done a lot better than they did.

    3. Re:Poor performance by timster · · Score: 1

      The big question I have is if it will be like AMD's previous 12-core chips, where you could get 4 of them crammed into a 2U server for not all that much money. 4-Xeon configurations are way more expensive.

      --
      I have seen the future, and it is inconvenient.
    4. Re:Poor performance by DarkOx · · Score: 3, Insightful

      Umm, joins can be done in parallel, in lots and lots of cases. ERP and CRM are applications that ought to see big improvements from more cores, if you have more than a few users anyway. It also simplifies things: you don't have to figure out how to architect the thing to run across 10 hosts anymore; good multi-core systems deliver the performance these apps need if you can get the disk IO solved. A good SAN with multipath support and multiple HBAs can get there.

      Niagara failed because each individual core was too slow: a comparable-cost Intel CPU could do two jobs in serial on one core in less time than Niagara could do one job on one core. The question here, for most parallel workloads like a database where all cores will be used, is whether AMD's 16-core chips are at least 62% the speed of Intel's 10-core chips on a core vs. core basis. If so, other things being equal, these Opterons will be better for *some* workloads.
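
      The 62% figure is just the ratio of core counts. A minimal sketch of the break-even arithmetic, assuming perfectly parallel work and ignoring memory, I/O, and clock differences (the function names are made up for illustration):

```python
def break_even_per_core_ratio(many_cores, few_cores):
    """Per-core speed the many-core chip needs, relative to the few-core
    chip, for both to reach the same total throughput.

    Assumes throughput scales linearly with core count.
    """
    return few_cores / many_cores


def aggregate_throughput(cores, per_core_speed):
    """Total throughput under the same linear-scaling assumption."""
    return cores * per_core_speed


if __name__ == "__main__":
    ratio = break_even_per_core_ratio(many_cores=16, few_cores=10)
    print(f"break-even per-core speed: {ratio:.1%} of the 10-core chip")
```

      For 16 cores against 10 the break-even comes out at 62.5%, so any per-core speed above that favors the 16-core part on such workloads, and anything below it favors the 10-core part.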

       

      --
      Repeal the 17th Amendment TODAY! Also Please Read http://www.gnu.org/philosophy/right-to-read.html
    5. Re:Poor performance by the linux geek · · Score: 1

      The whole point of the Niagara was to provide zero-impact context switches and to effectively hide latency, and to get close to 100% utilization as a result. On embarrassingly parallel workloads (web serving was the one Sun hyped hardest) a 32-thread T1 or 64-thread T2 did quite a bit better than its Intel contemporaries. The problem is that a lot of workloads expect to be able to consistently issue more than 200 million instructions per second per thread to do well, even when you are hiding latency by doing fast thread switches. Databases, a workload you cite as parallelizing well, tended to run like crap on Niagara.

      With Bulldozer, you effectively have 16 2-issue cores, since each module has a shared 4-instruction decoder. I'm skeptical that it will perform all that much better on multithreaded integer workloads than the T3 did, which had 16 1-issue cores with aggressive multithreading and latency hiding on top. On the other hand, Westmere-EX is 10-core, 4-issue, 2-way SMT-capable per core, and has big caches with good latency numbers. On both a technical level and based on early benchmarks (SPECcpu), things don't look good for Interlagos.

  7. Wish List by Nom du Keyboard · · Score: 3, Informative

    I so much want some real competition for Intel. Competition that doesn't artificially limit clock speeds and fuse off perfectly good working features in order to market a dozen overlapping and conflicting SKUs at a dozen different price points. And working drivers, current standards (DirectX 11 and OpenCL for starters), and USB-3 that doesn't require a $50 cable between every device would be nice.

    --
    "It's the height of ridiculousness to say for those 9 lines you get hundreds of millions."
  8. Re:Only 16? by sjames · · Score: 1

    So what are you waiting for? Hop to it and corner the market!

    Go ahead, I'll just wait over here and read the paper.

  9. Intel vs AMD's philosophy as of late by Anonymous Coward · · Score: 1

    Intel: "Let's improve the memory controller's bandwidth, increase our IPC and also improve our platform by adding more PCIe lanes to the chipset that enthusiasts will find a use for"
    AMD: "MOAR COARS!!!111!one"

    1. Re:Intel vs AMD's philosophy as of late by level_headed_midwest · · Score: 4, Interesting

      Eh, how about this:

      Intel: I know, let's try to see just how many features/cores/cache we can fuse off in our dies and different socket combinations to try to make *puts pinky finger to mouth* one MILLION SKUs! Oh, and while we're at it, let's add a FOURTH memory channel, because more is better! Sure, we could get all the bandwidth we need with two DDR3-1866 or -2133 channels and that you really only get about three channels' worth of bandwidth because we have to clock the IMC down to DDR3-1333 with two modules per channel- but we still have FOUR channels! Oh, and we forgot, it's the start of a new quarter so we need to release a new socket. Can't let those socket suppliers get lazy making last quarter's socket design. What, you guys want us to release Sandy Bridge-based Xeon MPs because MP platforms actually need that much bandwidth and core count? We just released the Westmere-based ones a few months ago! Don'tcha know that Xeon MPs run two years behind everything else? Geez, what did you do, wake up yesterday? Next you'll want us to stop crippling our chips, stop using a new socket every other month or something ridiculous like that. Where do you guys get those ideas?

      AMD: Based on market analysis, most server applications use primarily integer code and require a lot of bandwidth, memory capacity, and a high core count. We don't have over a hundred billion dollars in market cap to fund several parallel R&D teams to design a specific CPU for every edge use case, so we will design a CPU that is highly modular, has good integer performance (because that's what our research indicated most server apps are), and has a lot of cores. Experience with Intel's HyperThreading is less than stellar with regards to predictable performance, so we will use our CMT approach that leads to better integer performance than HyperThreading but doesn't increase the die size by a huge amount, since we can't afford to make 400-600 mm^2 dies like Intel does to have a lot of physical cores. Oh, and we'll continue to use the existing server platforms out there so our customers can drop-in upgrade and we'll also not change any feature sets in the SKU stack other than the clock speed and number of enabled modules and their associated caches. We do apologize for being "late" with these parts since we usually release server and client at about the same time...

      --
      Just "gittin-r-done," day after day.
    2. Re:Intel vs AMD's philosophy as of late by Anarke_Incarnate · · Score: 2

      AMD already had the on die memory controller. Their answer to intel's Hyperthreading was real cores. The QPI bus that intel uses is very similar to the one AMD pioneered with Hypertransport. Let's not forget that AMD64 (oh, did you want me to call it EM64T or x86_64?) was a product of AMD's engineering effort rather than forcing people toward the EPIC architecture which seems to be niche based.

    3. Re:Intel vs AMD's philosophy as of late by CajunArson · · Score: 1

      2005 called; they want their list of bragging rights back. Oh, and HyperTransport is mostly technology that AMD bought from Digital along with parts of the Alpha team. They get some credit for bringing their version to market first, but it's hardly like they came up with the idea for HyperTransport out of thin air. As for x86-64, AMD brought it to market first, but Intel had internal builds of 64-bit-enabled x86 chips around for some time, which is why they could bolt it onto the P4 and not require a brand new microarchitecture.

      Of course, what people forget most about x86-64 is that Microsoft was the big proponent of a 64-bit x86 chip since it meant they could start moving into larger scale data server applications. The irony is that EPIC, for all the hate it gets on this website, is dominated by Linux and other UNIX derivatives where MS is almost completely absent.

      --
      AntiFA: An abbreviation for Anti First Amendment.
    4. Re:Intel vs AMD's philosophy as of late by CajunArson · · Score: 1

      Oh.. and integrated memory controller:
      1. The 486 had one too.
      2. Look at Bulldozer's atrocious memory performance: There's a difference between slapping any memory controller on-die and slapping a *good* memory controller on-die. Intel has the good one.

      --
      AntiFA: An abbreviation for Anti First Amendment.
    5. Re:Intel vs AMD's philosophy as of late by Anarke_Incarnate · · Score: 1

      Which is why they fought tooth and nail and then implemented a poorer version of it (original EM64T had issues with pointers) after the fact. Somebody forgot their history.

      The problems with EPIC are the poor performance for certain applications as well as limited jumps in performance compared to the leaps that x86 gets, despite being a problematic architecture.

    6. Re:Intel vs AMD's philosophy as of late by Anarke_Incarnate · · Score: 1

      Intel's memory controller has issues too. The configuration of servers with high amounts of RAM becomes needlessly complicated when dealing with intel as to prevent the performance dropping when stepping from 1333 to 1066 to 800MHz in certain configurations. The new Interlagos chips run at 1600. Call me when you have actual benchmarks of Interlagos, and finally decide that explicitly parallel jobs benefit more from actual cores than from hyperthreading.

    7. Re:Intel vs AMD's philosophy as of late by CajunArson · · Score: 1

      --> Call me when you have actual benchmarks of Interlagos,
      You don't have them either. What's amusing is that I'm using known data from 1/2 of an interlagos chip (Bulldozer) at much higher clockspeeds than what Interlagos will operate at to make my assumptions. There's plenty of data from just the 6 core 3960 and 3930 chips that came out today that indicate that even desktop Bulldozer x 2 with theoretically perfect scaling won't beat the upcoming Xeons. You ain't gonna get perfect scaling and you ain't gonna get desktop Bulldozer clock speeds. You're just hoping that reading John Fruehe's blog will result in a miracle.

      --> and finally decide that explicitly parallel jobs benefit more from actual cores than from hyperthreading.

      Nobody has ever argued that hyperthreading is better than *real* cores. AMD's problem is that they didn't really introduce *real* cores but this bizarro quasi-core setup. In practice it looks like AMD's solution is about on par with hyperthreading, so you can insult Intel all you want and scream MOAR COARSS just like AMD told you to, but that won't make Interlagos magically destroy Intel.

      --
      AntiFA: An abbreviation for Anti First Amendment.
    8. Re:Intel vs AMD's philosophy as of late by yuhong · · Score: 1

      Don't forget Intel pricing Xeon MP at thousands of dollars per CPU while AMD brags about the lack of this "4P tax".

    9. Re:Intel vs AMD's philosophy as of late by Anarke_Incarnate · · Score: 1

      The point was "There are no benchmarks", so hold off judgement until we see how it handles work. AMD did introduce real cores; what they do have, however, is a poorly implemented shared FPU design. It is like they are trying to shoe-horn server chips at people. FPU matters a lot less to most server tasks.

    10. Re:Intel vs AMD's philosophy as of late by Anarke_Incarnate · · Score: 1

      While not an issue right NOW, the original EM64T did have an issue with pointers. The issue was that there was no hardware IOMMU for them. Thus, in order to DMA memory above 32bit allocation, they had to use pointers.

      http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/3/html/Release_Notes/as-amd64/RELEASE-NOTES-U2-x86_64-en.html

      Software IOTLB — Intel® EM64T does not support an IOMMU in hardware while AMD64 processors do. This means that physical addresses above 4GB (32 bits) cannot reliably be the source or destination of DMA operations. Therefore, the Red Hat Enterprise Linux 3 Update 2 kernel "bounces" all DMA operations to or from physical addresses above 4GB to buffers that the kernel pre-allocated below 4GB at boot time. This is likely to result in lower performance for IO-intensive workloads for Intel® EM64T as compared to AMD64 processors

    11. Re:Intel vs AMD's philosophy as of late by Anarke_Incarnate · · Score: 1

      Actually I am not an AMD Fanboy. I currently only own 1 AMD machine, an older Dell that I bought back in '07 that acts as a fileserver. My 2 laptops, and desktop are all running intel CPUs. At work, we use intel almost exclusively (I'm a systems engineer). At my previous company, we did switch to AMD, at my recommendation, because they needed the actual threads and HT would have cost them about 30% performance.

      My issue with intel was not the fact that there are electrical signaling issues when trying to install a lot of memory, but rather that intel's design required a multiple of 3 instead of 4 which often caused a dance of the slots. I have spoken to AMD and Intel engineers. AMD's new offering will run at 1600 until you fill past a major point, then step down to 1333. Intel does not make it as simple, and when dealing with field service it is easier to have them work on an AMD machine.

    12. Re:Intel vs AMD's philosophy as of late by Anarke_Incarnate · · Score: 1

      Ok, I tried to be somewhat cordial, but perhaps you need the proverbial smack in the mouth.

      HT is a performance sync in a lot of applications. Maybe you work in a small shop and do stupid things, but I do not. If the word enterprise is a foreign one, perhaps you should pay attention. Oracle recommends turning off HT on databases, due to performance problems. Solid NTP is near impossible with HT without locking NTP down with processor affinity. Also, changing clock speeds with any form of turbo or reduced clock speed can impact your time resolution. When you need to be within about 300 microseconds (that's right, not milli, micro) between machines, that is a Big F'n Deal! If machines are not all performing the same, this can cause an SEC investigation if the results are shown to favor someone in the market more than others. Oh, you didn't work for an exchange? I did.

      My points were that HT is not necessarily the performance boost that it is marketed as. It SHOULD be turned off in numerous cases and still causes issues when there is contention of resources. That can cause a context switch, which in turn, is usually a 50-150 microsecond penalty. The reason AMD was chosen was the application needed more threads than could be serviced by a similar intel setup (needed to go to 4 sockets rather than 2) without HT. HT was a non starter.

      Also, the reason that a 3 channel memory configuration was a pain in the ass was not because I, as a systems engineer couldn't figure it out, but again, as enterprise seems to be a difficult concept for you, imagine having to do a field upgrade or change, where all machines have to be identical, but 5 different field service techs are dispatched, all with the equivalent of a GED and rudimentary grasp of English or technology. I did not sit in the data center. Do you know how lights-out data centers operate? The machines had to be serviced around the world. The configurations being simpler on the AMD side made it more likely that the field service guys would not fuck it up.

      Lastly, I give a shit about your Tyan motherboard (Oh, and I did know about the socket name, thanks very much. I was there for NDA screenings of AMD's roadmap at my previous employer's premises). I am talking about vendors like HP, Dell and IBM who have world support. Tell me when a technician from Tyan is going to drop into a DC in Switzerland to do a rip and replace, then I might care.

      So, in conclusion, your working on a few racks of machines is irrelevant. When you have to design an infrastructure around how an application behaves in a complex interconnected way, on different continents as well as taking into account the serviceability of the machines, latency, throughput, and overall performance characteristics to the point where the machines cannot be more than X feet apart due to preferential lengths of network cable you and I can talk more.

    13. Re:Intel vs AMD's philosophy as of late by Anarke_Incarnate · · Score: 1

      Sink, not sync

  10. Re:Only 16? by unixisc · · Score: 1

    lesser incremental value? Even more difficult!

  11. sandy bridge ep 95W by Chirs · · Score: 3, Informative

    There will be server versions as well...I've seen specs (publicly available) for an 8-core (16-thread) sandy bridge EP with a 95W TDP. I suspect it's clocked a bit lower and maybe binned for efficiency.

  12. Re:how do they compare ? by PIBM · · Score: 2

    1: You can buy your new sandy bridge from newegg or such right now, while those new bulldozers are nowhere to be found.
    2: Overclocking any chip is bound to require a lot more power than the TDP no matter which one you are using.
    3: Dozer's cores, as you said, feel like they are dozing on the job..

  13. Re:Only 16? by beelsebob · · Score: 1, Informative

    Pffft, it's only 8 cores anyway, 8 cores each with 2 integer units. It's no more 16 core than intel's 8 cores with hyperthreading.

  14. Re:Only 16? by jessehager · · Score: 1

    It's 8 cores per chip *and 2 chips per package* for a total of 16 cores.

  15. yes by unity100 · · Score: 2

    that must be why 3 supercomputers with dozer opterons have been ordered in the past 3 weeks.

    1. Re:yes by cheekyboy · · Score: 1

      AMD still has a 5% server market share.

      3 orders might push it to 6%

      --
      Liberty freedom are no1, not dicks in suits.
    2. Re:yes by unity100 · · Score: 1

      these are supercomputers. not servers. but you are right, dozer will push server share up, a lot.

  16. Re:Only 16? by beelsebob · · Score: 3, Informative

    No, 8 integer cores per chip, but 4 actual real cores. For a total of 8 cores across 2 chips.

  17. fool. by unity100 · · Score: 1

    they are like 3/4 cores. neither 1 core, nor half core.

    1. Re:fool. by beelsebob · · Score: 2

      The problem is, while this is true, bulldozer also suffers from being a fairly crappy arch design compared to sandy bridge. The result is that AMD's 8 "core" bulldozer is only roughly as fast as intel's 4 core i5 without hyperthreading. Extrapolate this to bolting two 8 "core" bulldozers together and you get to... well, that would only be about as fast as an 8 core sandy bridge with no hyperthreading, or a 6 core with hyperthreading. Given that Intel is selling 6 core E5 Xeons with hyperthreading for less than the $1000 AMD is asking for this, that really isn't boding well, is it? And that is before remembering that this Bulldozer is very underclocked to keep power consumption down. This really doesn't look promising for AMD.

    2. Re:fool. by Afell001 · · Score: 1

      1) Sandy Bridge is on its second generation. It inherits from the long line of progression from the Core legacy and has done very well considering the amount of money that Intel has pumped into developing these processors. To say that these chips are very mature would be an understatement.

      2) AMD has invested a fraction of the R&D expense that Intel has sunk into developing SB/Core architecture when comparing it to BD development. On top of that, BD is in its infancy and is exploring new paths to try and gain efficiencies. I think BD developers need to be proud of their accomplishment, even if it doesn't quite match up clock-for-clock against SB. As the design for these processors matures, and AMD releases a few more Steppings, we will probably see improvements in power usage and performance.

      3) As this was a new model, none of the OS kernels out today use these processors in the most optimal way. As the architecture matures, I'm sure that the OS developers will redevelop thread initiation and assignment to make better use of BD's assets. This in itself will net better performance even without improvements in the overall design.

      You might think I am just rooting for the underdog, but as a consumer, so should you. Without AMD to keep Intel on its toes in the x86 market, we will eventually see new chips from Intel that are nothing more than speedbumps, but at prices that will make it difficult for anyone to afford. Intel still prices competitively where AMD still has alternative product, but look at where AMD has no kit to compete. Intel will price there accordingly, because they can. No competition means that the price will float as high as demand.

      I try to alternate my personal machines. One year, I will buy AMD, while the next, I will buy Intel. For one machine, I may buy NVidia graphic cards, while another will use AMD. The home media server in the closet is due for an upgrade. I went with an Intel Xeon build three years ago. This time, I will build it with BD Opterons. Do you think anyone besides me will notice the difference, unless I told them? Probably not.

  18. Re:Only 16? by kvvbassboy · · Score: 4, Funny
  19. When are multiple cores going to help me? by craftycoder · · Score: 4, Interesting

    I just got a fancy 8 core T7500 Dell workstation and only one of my compilers actually takes advantage of the multiple cores when it is compiling. As a result this expensive desktop is only 15% faster in terms of time to compile than the 4 year old PC it replaced (the new PC has twice the ram as the old though which may account for some of that speed increase). I am seriously unimpressed with all these cores. Maybe they are useful for something, but I've not found anything that I do that shows significant improvement. Putting my development projects on a SSD did much more for my work flow performance than this fancy new computer, that is for certain.

    1. Re:When are multiple cores going to help me? by Anonymous Coward · · Score: 4, Informative

      You're doing it wrong.

      make -j8

    2. Re:When are multiple cores going to help me? by fyngyrz · · Score: 2

      Try doing DSLR image editing with Lightroom or Aperture. Those cores make one hell of a difference.

      --
      I've fallen off your lawn, and I can't get up.
    3. Re:When are multiple cores going to help me? by Anonymous Coward · · Score: 1

      Consider taking advantage of "make -j" if you use that tool.

    4. Re:When are multiple cores going to help me? by onefriedrice · · Score: 1

      I just got a fancy 8 core T7500 Dell workstation and only one of my compilers actually takes advantage of the multiple cores when it is compiling.

      If your compiler isn't threaded, then at least run multiple compile jobs simultaneously--this is probably better anyway. If your build system can't do this, your tools are broken.
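
      As an illustration of the idea (not how any particular build tool is implemented), here is a minimal sketch of running independent jobs side by side. The `run_jobs` helper and the stand-in commands are hypothetical; a real build would invoke the compiler once per source file and would also have to respect dependencies between targets, which this sketch does not:

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor

def run_jobs(commands, max_workers):
    """Run independent commands concurrently; return exit codes in input order."""
    def run(cmd):
        # Each command runs in its own child process, so threads are enough here.
        return subprocess.run(cmd, capture_output=True).returncode

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run, commands))

# Stand-in "compile" jobs: each just starts a Python interpreter that exits.
jobs = [[sys.executable, "-c", "pass"] for _ in range(4)]
exit_codes = run_jobs(jobs, max_workers=4)
print(exit_codes)
```

      With `max_workers` set to the number of cores, a single-threaded compiler can still keep every core busy as long as there are enough independent files to build.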

      --
      This author takes full ownership and responsibility for the unpopular opinions outlined above.
    5. Re:When are multiple cores going to help me? by JohnnyBGod · · Score: 2

      And if it's too much for one machine, use distcc.

    6. Re:When are multiple cores going to help me? by Renegrade · · Score: 1

      Yeah, they do in image editing.

      However, there will always be things that must be done in series, and always a maximum speed-up you can get from multiprocessing. (Amdahl's Law comes to mind) Plus, you'll often hit other bottlenecks, especially if you have an obscene number of cores. Memory, disk, video, network..
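
      For reference, Amdahl's Law puts the speedup on n cores at 1 / ((1 - p) + p / n), where p is the fraction of the work that can run in parallel. A small sketch, using a made-up workload that is 90% parallelizable (the 0.90 is an assumption for illustration, not a measurement):

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Speedup predicted by Amdahl's Law: 1 / ((1 - p) + p / n)."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)

# Hypothetical workload: 90% of the work parallelizes, 10% is strictly serial.
for n in (1, 2, 4, 8, 16):
    print(n, round(amdahl_speedup(0.90, n), 2))
```

      With p = 0.90 the speedup can never exceed 1 / (1 - p) = 10, no matter how many cores are added.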

      Memory has always been a problem after the 6502 era. Even single core systems splat into the performance barrier that is main memory.

      I'd rather have a single-core system that's 8x faster than an 8-core system. However, it's my belief that we're seeing crazy core increases not because it's the best way to better performance, but rather that the CPU makers are hitting walls (or at least massive difficulties) with traditional speed increases (mhz/ipc/branch prediction accuracy/etc).

      Intel engineer: Our new architecture, with the die shrink, is about five percent faster...
      Intel manager: How are we going to sell that to people for $300-1200??
      Intel engineer: Well, we COULD put two/four/six/eight of them into a single die, as they're much smaller than before..
      Intel manager: Do it!!
      Intel engineer: Sir, it would cost us thirty or forty percent more due to--
      Intel manager: Nobody's going to buy a 5% increase without this! DO IT NOW!!

    7. Re:When are multiple cores going to help me? by friedmud · · Score: 3, Informative

      What do you mean by "only one of my compilers actually takes advantage of the multiple cores when it is compiling"?

      Are you on Windows? Because any compiling done in linux with a "make" based (or similar) build system can use as many cores as you can throw in a machine (regardless of the actual compiler it's running). It should be the same in Windows...

      Don't look to your compiler to be multithreaded... look at the build system (i.e. in Visual Studio there should be an option somewhere to tell it how many processors to use while compiling). For make you just do "make -j8" to use 8 "jobs" total for compiling (i.e. 8 instances of the compiler will be running).

      Here is a test for one of my software projects doing "make -j#" where # is 1,4,8,12,16,24:

      1 : 15m9.614s
      4 : 3m57.947s
      8 : 2m6.354s
      12 : 1m33.426s
      16 : 1m25.559s
      24 : 1m17.345s

      That is on my dual 6-core hyperthreaded Mac workstation (so it had 12 "real" cores and 12 "hyperthreads"). You can see that hyperthreads definitely aren't as good as real cores... but do provide some speedup. That said, I thank God every time I compile (which is all day long) for the cores he has bestowed upon me...
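
      To put numbers on that, a quick sketch that turns the timings above into speedup and per-job efficiency relative to the single-job run (the times are copied from the list above; the variable names are just illustrative):

```python
# Wall-clock times from the "make -j#" runs quoted above, in seconds.
timings = {
    1: 15 * 60 + 9.614,
    4: 3 * 60 + 57.947,
    8: 2 * 60 + 6.354,
    12: 1 * 60 + 33.426,
    16: 1 * 60 + 25.559,
    24: 1 * 60 + 17.345,
}

baseline = timings[1]
speedup = {jobs: baseline / t for jobs, t in timings.items()}

for jobs, s in speedup.items():
    # Efficiency = speedup divided by the number of jobs requested.
    print(f"-j{jobs:<2}  speedup {s:5.2f}x  efficiency {s / jobs:6.1%}")
```

      The 12 real cores get roughly a 9.7x speedup, while going from 12 to 24 jobs only adds about two more on top of that, which matches the point about hyperthreads.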

      Good to hear that you are already on SSD... because parallel compiling does need speedy disk to keep the processors humming. The timings above are for two 256GB SSD's in RAID0.

    8. Re:When are multiple cores going to help me? by craftycoder · · Score: 1

      Neither my Java nor Android compilers do a good job of taking advantage of the multiple cores from within Eclipse. I am able to get significant improvement when compiling GWT projects by giving it a 6 core directive. I save about 40% of the time it used to take. Plain old Java and Android showed little improvement though.

    9. Re:When are multiple cores going to help me? by craftycoder · · Score: 1

      I mostly work with Eclipse doing Java, Android, and GWT. Only GWT offered an effective way to use those cores. It is VERY possible that I just don't know how to use Eclipse to the best of its ability, but I can tell you that Eclipse never pushes more than one core during a build except when its building GWT projects for me (I had to tell it explicitly to do that though).

    10. Re:When are multiple cores going to help me? by lexman098 · · Score: 1

      To everyone who's replying about compiler switches to activate multithreading: it's only relevant in this instance (maybe). I can totally sympathize with the disappointment in this machine, as I too have recently begun using this exact computer and have realized zero gains in having 5 extra cores. My work day involves running vim and about 50 verilog logic simulations, which are single threaded, and having this supercomputer under my desk (which I connect to remotely anyway) is absolutely useless. I wish they would give me a crappier one and a $4000 bonus instead.

    11. Re:When are multiple cores going to help me? by Bitmanhome · · Score: 1

      Yeah, you're screwed, sorry. Eclipse integrates nicely with Ant, but Ant doesn't do multi-core builds either. And Ant tasks are very heavy, so parallelizing them wouldn't help much anyway. You might try rebuilding your build process in plain 'make' and try that -j option.

      Also, I'm sorry you have to use GWT. That thing was just absurdly slow last time I used it, to the point that it would be faster to hand-code JavaScript.

      --
      Not that this wasn't entirely predictable.
    12. Re:When are multiple cores going to help me? by cbhacking · · Score: 1

      Well, you could consider using a better compiler, or a better configuration for it. Many parts of compilation parallelize reasonably well, especially if you have a lot of source files. Some things will have dependencies on other parts (which limits parallelism) and some have dependencies on the entire previous stage (which severely limits or prevents it, for that stage).

      Besides, unless you're just building a pure build machine (and I doubt it, if your compilation setup is so bad), multiple cores can help a lot in other places too. Things like background syntax checking and storing symbol information can be done in parallel with your workload. Looking up stuff online, or streaming music or even video, can be done without impacting the performance of your dev tools. Many web browsers themselves will get much faster (even Firefox to some degree, since it's multi-threaded even though it still uses a single process). There's plenty of places for heavy workloads to be spread across cores.

      Granted, 16 cores is more workloads than I almost ever have, but software is also becoming increasingly parallel. My build system defaults to splitting the workload across 4 cores, some of my games can use 5 or 6 pretty well, and my computer can remain responsive for doing other things too.

      --
      There's no place I could be, since I've found Serenity...
    13. Re:When are multiple cores going to help me? by craftycoder · · Score: 1

      That's what I thought. I research it every couple months when I get annoyed by multi minute builds. I never get any answers.

      GWT is slow and deployment is a little cumbersome, but the code is so elegant I just don't care. I love GWT. I wish Google provided more libraries, but I'm pleased with it. I'm not certain it has a future though.

      I loathe Ruby. What's a fella to do if he wants a strongly typed object oriented website?

    14. Re:When are multiple cores going to help me? by lexman098 · · Score: 1

      Scratch that. $1000 bonus.

    15. Re:When are multiple cores going to help me? by evilviper · · Score: 1

      only one of my compilers actually takes advantage of the multiple cores when it is compiling.

      Send your octo-core my way, I'll see that it gets some use...

      For any RPM based Linux distro, just edit your RPM macros file to add eg. -j8 option to make, and every "rpmbuild" will max-out all 8 cores with 8 instances of gcc operating on different files each.

      And if you're lzma compressing the RPMs in question, and they're a non-trivial size, you can get a pretty good speed-up using either parallel-xz or p7zip across multiple cores. If you're packing-up large quantities of data in RPMs, or just using xz in general, we're talking a serious number of wall-clock hours savings.

      For video encoding, while you lose a bit of quality with threading (so I discourage it on mere dual-core systems) you can see a pretty impressive speed-boost. And for video-decoding, multithreading is a no-brainer.

      In conclusion, you bought an SUV before measuring if it had enough cargo room to haul your toys, and lost out. Those who need to haul different cargo find it a great solution. There will always be some usage cases which don't benefit.

      The real benefit is servers, though. I can't remember the last time you could get as big a performance boost on your server from upgrading the CPU as you can today, going from dual-core to 16 core, without needing a new mobo due to socket changes. And if you're lucky, your server can take 2 or 4 of them...

      --
      Slashdot gets worse every day... Pipedot: News for nerds, without the corporate slant
    16. Re:When are multiple cores going to help me? by Bitmanhome · · Score: 1

      What's a fella to do if he wants a strongly typed object oriented website?

      I think JBoss is the usual answer to that. That only takes care of the back end, but GWT has your front end covered anyway, and the more code you can move into JBoss, the less you have to crank through GWT's slow processor.

      --
      Not that this wasn't entirely predictable.
    17. Re:When are multiple cores going to help me? by cthulhu11 · · Score: 1

      I was about to post the same thing -- Lightroom can profitably use as many as 8 cores. More cores vs fewer faster cores depends a lot on the workload.

    18. Re:When are multiple cores going to help me? by craftycoder · · Score: 1

      GWT is only the front end. I use Glassfish for the back end of websites. That is just J2EE stuff so it's not nearly as slow as GWT though it does only use one core. In essence I already am doing as you suggest.

      As a developer and not a gamer or video maker I stand by my original complaint. These multi-core processors have been a step in the wrong direction for me from a performance standpoint. Some of us would benefit from 4 cores that packed as much punch (whatever that really means) as the 8 cores we can buy. I really do appreciate that my OS doesn't lock up anymore when it's working hard. I just get annoyed when I'm waiting and waiting on a build and my CPU is pegged at 13%.

      I also feel like the marketing of these processors is confusing. Back in the day, when I was doing my first professional software projects on an IBM XT it was very clear what the performance boost would be when transitioning to the 80286 processor or the addition of the 80287 to your IBM AT. I continued to understand what I was buying throughout the next decade or two. I chose the 486DX 50 rather than the DX2 66 because the distinction was clear. At some point though the marketing materials just got too confusing. Maybe it's just that I'm old now, but I can't figure out what they are selling or why I would choose one processor over another anymore. I went to Dell and ordered the fanciest one (based on price). That clearly was not the right answer. I've asked people who appeared knowledgeable on the topic to explain it to me, but the answers sounded more like BS than CS. Perhaps as it has become more difficult to differentiate their products based on merit they chose to obscure their offerings with lingo and slogans in hopes of gaining sales through confusion. Or, maybe I'm just too old and dumb to get it anymore.

    19. Re:When are multiple cores going to help me? by BuildMonkey · · Score: 1

      I run a recent Dell T3500, 24GB RAM and dual GTS450 graphics cards. The extra cores help a lot with Adobe After Effects. A. LOT. Not so much with Adobe Premiere, because you get far more bang for your buck with a medium-to-high end NVIDIA graphics card: Premiere makes excellent use of CUDA, particularly for encoding. (By default, Premiere will only see and use "pro" cards: Quadro, Fermi, Tesla. There is an easy configuration hack that lets it use any 200 series or better card. My GTS450 encodes full 1080 HD in real-time. ) There is a caveat there: only for encoding in the foreground, and it only uses a single NVIDIA graphics card: SLI does not matter.

      I find it strange that the foreground encoding uses GPU acceleration, but batch encoding does not. So the extra cores would help you there, too.

      In my programming, (large scale network simulation, real-time audio processing) any cores beyond 2 do little to nothing. When I re-compile the latest version of Boost, the extra cores substantially speed the build, but this is something I only do 2-3X per year.

    20. Re:When are multiple cores going to help me? by Bitmanhome · · Score: 1

      I believe we live in "the between times:" The laws of physics have made faster cores impossible, so we now have multi-core chips .. but we don't have enough cores to make multi-core software effective. You can either run on one core and ruin performance by not taking advantage of the chip, or you can run on all cores and ruin performance with synchronization overhead.

      I suspect this problem won't be resolved until we top 100 cores, where the new programming paradigm (whatever that turns out to be) will be able to be effective. In the meantime .. we're just screwed.
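
      The "ruin performance with synchronization overhead" trade-off can be put in rough numbers. What follows is only a toy model, not a measurement: it is Amdahl's law plus a made-up linear penalty per extra core, and the names speedup, serial_fraction and sync_cost are hypothetical.

```python
def speedup(cores: int, serial_fraction: float, sync_cost: float) -> float:
    """Amdahl-style speedup with a linear per-core synchronization penalty.

    Run time is normalized so that one core takes 1.0. sync_cost is an
    assumed knob: extra time added for each core beyond the first.
    """
    parallel_fraction = 1.0 - serial_fraction
    run_time = (serial_fraction
                + parallel_fraction / cores
                + sync_cost * (cores - 1))
    return 1.0 / run_time


if __name__ == "__main__":
    # Illustrative inputs only: 5% serial work, 1% sync cost per extra core.
    for n in (1, 2, 4, 8, 16, 100):
        print(n, round(speedup(n, serial_fraction=0.05, sync_cost=0.01), 2))
```

      With a non-zero sync_cost the curve peaks and then falls, so under this particular assumption adding cores alone does not help; whatever the new paradigm turns out to be, it would have to shrink that term.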

      --
      Not that this wasn't entirely predictable.
    21. Re:When are multiple cores going to help me? by craftycoder · · Score: 1

      That rings true to me. I don't know the numbers, but I do know that a lot of software I use is not very thoughtful about using available resources. Until developers or our tools are smarter about using the resources on the target devices we will continue to see disappointing performance numbers. We've been spoiled by Intel for a long time now. I think it's our turn to start writing better software because Intel isn't saving our bacon anymore.

  20. Re:Only 16? by afidel · · Score: 5, Insightful

    No, there are 16 integer pipelines with one scheduler and 4 logic units each, 16 128bit floating point units that can also be combined into 8 256bit units, and 8 fetch/decode units. This is not an MCU, it's one chip with the above mentioned components. Whether it's 16 cores or 8 or 4 modules is kind of academic unless you are trying to optimize a scheduler for it, in which case the labels still don't matter; only the actual implementation and achievable performance matter.
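
    To show why the label is academic, here is a small sketch that counts "cores" in three different ways from the figures above. The class, field names and the three definitions are invented for illustration; only the resource counts come from this comment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChipResources:
    integer_pipelines: int   # each with its own scheduler, per the comment
    fpu_128bit: int          # can pair up into half as many 256-bit units
    fetch_decode_units: int

    def count_by(self, definition: str) -> int:
        """Return the 'core count' under a chosen (hypothetical) definition."""
        counts = {
            "integer": self.integer_pipelines,
            "fpu_256bit": self.fpu_128bit // 2,
            "front_end": self.fetch_decode_units,
        }
        return counts[definition]


# Figures taken from the comment above.
chip = ChipResources(integer_pipelines=16, fpu_128bit=16, fetch_decode_units=8)

if __name__ == "__main__":
    for definition in ("integer", "fpu_256bit", "front_end"):
        print(definition, chip.count_by(definition))
```

    The same silicon gives a different answer under each definition, which is the point: a scheduler cares about the shared resources, not the marketing number.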

    --
    There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
  21. Re:how do they compare ? by KingMotley · · Score: 1

    No, they aren't.

  22. Re:Only 16? by beelsebob · · Score: 4, Interesting

    The basic point is that it has a total of 8 instruction fetch units, it has a total of 8 instruction decode units that they feed, and it has a total of 8 chunks of L2 cache. The fact that each of these 8 cores has 2 integer units on it is neither here nor there. Hell, for years cores have had several floating point units on them, and it didn't make them more than one core. Not only that, but this CPU behaves badly when the scheduler treats it as 16 cores instead of 8. The bottom line is that this chip behaves in every single way like an 8 core CPU; what's more, it's slower than Intel's 8 core CPUs at a similar clock even with hyper threading disabled.

  23. Re:Only 16? by Killjoy_NL · · Score: 1

    pffff why the troll mod, it's funny and on topic :)
    probably not very accurate, but still quite enjoyable :)

    --
    This is the sig that says NI (again)
  24. Re:how do they compare ? by beelsebob · · Score: 1

    Really? Because this looks like the FX-8150 getting beaten 3 ways silly by even an i5-2500 at photoshop:
    http://images.anandtech.com/graphs/graph4955/41688.png

  25. Re:how do they compare ? by unity100 · · Score: 1

    really.

    http://www.tomshardware.com/reviews/fx-8150-zambezi-bulldozer-990fx,3043-15.html

    radial blur, shape blur, median, polar coordinates.

    This test employs threaded filters, taxing as many cores as we throw at it. Zambezi’s eight integer units capitalize, flying past the Core i5 and Core i7, outright trouncing the six-core Phenom II X6 1100T, too.

  26. Re:Only 16? by Vancorps · · Score: 2

    What are you basing this on? As someone that runs both database and web servers using both AMD and Intel I find your conclusions to be completely counter to my experience and to the experience of almost everyone I know that does virtualized infrastructure.

    I ran into a number of problems when I first tried to deploy them because SQL 2005 wouldn't install on it. SQL 2008 runs just great with 24 cores as they were dual processor 12 core servers. I have no reason to think the 16 cores variants would be much different.

  27. Re:how do they compare ? by unity100 · · Score: 5, Informative

    and many, many, moooreeee

    -mainconcept http://www.lostcircuits.com/mambo//i...&limitstart=17
    -mediashow http://www.guru3d.com/article/amd-fx...ssor-review/14
    -h.264 http://www.guru3d.com/article/amd-fx...ssor-review/14
    -vp8 http://www.guru3d.com/article/amd-fx...ssor-review/17
    -sha1 http://www.guru3d.com/article/amd-fx...ssor-review/17
    -photoshop cs5 http://www.lostcircuits.com/mambo//i...&limitstart=14
    -photoshop cs5 http://www.tomshardware.com/reviews/...x,3043-15.html
    -winrar, faster than 2600k http://www.techspot.com/review/452-a...pus/page7.html
    -winrar, improves over x6 http://www.tomshardware.com/reviews/...x,3043-16.html
    -7-zip better than 2600k here: http://images.anandtech.com/graphs/graph4955/41698.png http://www.anandtech.com/show/4955/t...x8150-tested/7
    -7-zip same perf as 2600k http://www.tomshardware.com/reviews/...x,3043-16.html
    -POV-ray, faster than 2600k http://www.legitreviews.com/article/1741/10/
    -POV-ray http://www.nordichardware.se/test-la...art=15#content
    -x264(2nd pass AVX enabled) http://www.anandtech.com/show/4955/t...x8150-tested/7
    -x264 (2nd pass, better overall than 2600k) http://www.bjorn3d.com/read.php?cID=2125&pageID=11108
    -x264 (2nd pass +.3 than SB2600k) http://www.legitreviews.com/article/1741/7/
    -handbrake; http://www.legitreviews.com/article/1741/9/
    -truecrypt; http://www.bjorn3d.com/read.php?cID=2125&pageID=11111
    -solidworks; faster than 2600k http://www.techspot.com/review/452-a...pus/page7.html
    -abbyy finereader http://www.tomshardware.com/reviews/...x,3043-16.html
    -C-Ray, as fast as $1k i7-990X, http://i664.photobucket.com/albums/v.../c-rayir38.png

  28. Re:how do they compare ? by beelsebob · · Score: 1

    Good work digging up all the graphs where Bulldozer manages to get between the i5 and the i7 (which, based on its price point *it damn well should*, being priced half way between the two). Unfortunately, while you've dug up a nice bunch of places it just about holds its own, there are many times more where the Sandy Bridge chip eats it for breakfast, including heavily multithreaded work. As I said above – Bulldozer is good at very multithreaded integer work, and pretty much nothing else.

  29. Re:Only 16? by Chrisq · · Score: 1

    Pfft, how much harder can it be to design one with 32 :)

    To run at the same speed - very difficult. Think about twice the heat unless you make major changes.

  30. Crippling chips by Quila · · Score: 2

    It's common, live with it. Every Cell processor in a PS3 comes with eight cell processing units, with one disabled. That way they can set the standard for seven and use most of the chips that come off the line.

    Even AMD had a problem with too-good yield about ten years ago, so they restricted the clock and sold "crippled" low-end chips that were technically rated to run at much higher speeds.

  31. Re:how do they compare ? by unity100 · · Score: 1

    there are many times more

    yes. then instead of shooting from the hip, recount those times and occasions.

  32. Re:how do they compare ? by Rockoon · · Score: 1

    Logic work *is* integer work, fool.

    --
    "His name was James Damore."
  33. Re:how do they compare ? by unity100 · · Score: 1

    Bottom line – Bulldozer isn't good at multithreading, it's good at integer work. Unfortunately, servers are mostly logic work, so sandy bridge is likely to destroy it.

    oh boy. i just saw this. you dont know shit.

    'servers are mostly logic work' hahahahaa. luckily someone else gave your answer.

    next time, dont talk without knowing shit. 'servers' mean heavily multithreaded integer work. in these, bulldozer excels. and that is also one of the reasons why there have been 3 amd opteron (bulldozer 16 core) supercomputer orders in the past 3 weeks. NOT intel. amd. opteron, bulldozer. SUPERcomputer.

  34. Re:how do they compare ? by unity100 · · Score: 1

    Bottom line – Bulldozer isn't good at multithreading, it's good at integer work. Unfortunately, servers are mostly logic work, so sandy bridge is likely to destroy it.

    excuse me but you have posted the same bullshit without knowing SHIT about what you are talking about for the second time here. apparently you havent read what you have been told about logic work being integer work by another slashdotter.

    i replied to you on your ignorance in the other post. 3 supercomputers that are bulldozer based, in the past 3 weeks. a supercomputer a week. yes. sandy bridge e must be 'LIKELY' to destroy bulldozer in heavily multithreaded workloads.

    how about not talking on stuff you dont know shit about next time, and not coming up like a moron as a consequence ? please.

  35. Re:how do they compare ? by unity100 · · Score: 1

    even the i5 beats the shit out of it

    are you aware that the tooling process and silicon cutting in the factories for this chip, has not matured yet ? do you even know what these mean ?

  36. Re:Build your own tablet? by Eggbloke · · Score: 1

    derp, wrong article

    --
    I care not for your karma and your mod points.
  37. Re:how do they compare ? by beelsebob · · Score: 1

    No, no it's not, logic work includes all kinds of things like branch prediction, pipeline length and hence amount flushed when it all goes titsup, etc. Notably, Bulldozer does terribly at this, but not so badly at pure integer work.

  38. Re:Only 16? by Mashiki · · Score: 1

    Doesn't really matter until developers get off their asses and start including multi-threading code. You'd think that after multicore and multiprocessor usage started jumping through the roof, that you'd see it.

    --
    Om, nomnomnom...
  39. Re:Only 16? by certain+death · · Score: 3, Informative

    It matters to virtualization. Higher density equates to more systems on a single server, which equates to less power for the same number of servers.
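
    A back-of-the-envelope way to see the density argument, as a sketch only: the function name, the VM counts and the one-vCPU-per-core default are all assumptions, and RAM and I/O limits are ignored.

```python
import math


def hosts_needed(vm_count: int, vcpus_per_vm: int, cores_per_host: int,
                 vcpus_per_core: float = 1.0) -> int:
    """Hosts required to place vm_count VMs, ignoring RAM and I/O limits."""
    vcpu_capacity = cores_per_host * vcpus_per_core
    return math.ceil(vm_count * vcpus_per_vm / vcpu_capacity)


if __name__ == "__main__":
    # Hypothetical load: 96 VMs with 2 vCPUs each on dual-socket hosts.
    print("2 x 12-core hosts:", hosts_needed(96, 2, cores_per_host=24))
    print("2 x 16-core hosts:", hosts_needed(96, 2, cores_per_host=32))
```

    Fewer hosts for the same VM count is where the power saving would come from, provided the per-host power draw does not rise by as much as the core count.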

    --
    "My immediate reaction is "WTF? What kind of moron doesn't make things 64-bit safe to begin with?" Linus
  40. Re:Only 16? by RalphTheWonderLlama · · Score: 1

    The term "core" has about lost its meaning so this is all useless arguing. Funny that you tried to use how the chip "behaves" as definition for how many cores it has. Is that a better way? Come on.

    --
    simple, fast homepage with your links: http://www.ngumbi.com/
  41. Re:how do they compare ? by zixxt · · Score: 1

    Good work digging up all the graphs where Bulldozer manages to get between the i5 and the i7 (which, based on its price point *it damn well should*, being priced half way between the two). Unfortunately, while you've dug up a nice bunch of places it just about holds its own, there are many times more where the Sandy Bridge chip eats it for breakfast, including heavily multithreaded work. As I said above – Bulldozer is good at very multithreaded integer work, and pretty much nothing else.

    Nice Trolling

    --
    ---- GENERATION 26: The first time you see this, copy it into your sig on any forum and add 1 to the generation.
  42. Re:Only 16? by Pence128 · · Score: 1

    A M D is real-ly real-ly great. MOAR CORES!

    --
    404: sig not found.
  43. read the I7-3000s TH review by cheekyboy · · Score: 1

    You want fast IO? check the i7-3ks

    And don't forget power: the Intels use less at idle and at max usage.

    Compute power usage over 4 years, and the AMD's extra power bill will match what it costs to buy an Intel.

    Don't forget AES speeds are 6-8x FASTER on Intel.

    Every benchmark, single core and combined cores yield faster results.

    But hey, if you want a 48-core AMD server and it's $5000 cheaper for you, go for it. (If your utilization is 20% it doesn't matter.)
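
    The four-year power comparison is simple arithmetic. A rough sketch, with every input an assumption (the function name, the wattage gap and the electricity price are placeholders, not measured figures for either chip):

```python
HOURS_PER_YEAR = 24 * 365


def extra_energy_cost(extra_watts: float, years: float,
                      price_per_kwh: float) -> float:
    """Cost of drawing extra_watts more, on average, than the alternative."""
    kwh = extra_watts / 1000.0 * HOURS_PER_YEAR * years
    return kwh * price_per_kwh


if __name__ == "__main__":
    # Placeholder inputs: 100 W average difference, $0.10 per kWh, 4 years.
    print(round(extra_energy_cost(100, years=4, price_per_kwh=0.10), 2))
```

    Plug in the real average draw difference and your local rate, then compare the result with the purchase price gap between the two servers.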

    --
    Liberty freedom are no1, not dicks in suits.
  44. I am an AMD fan by Travoltus · · Score: 1

    and I can clearly see that beelsebob has done his/her research.

    http://hardware.slashdot.org/comments.pl?sid=2524922&cid=38053256

    --
    --- Grow a pair, liberals... stop letting the Republicans bully you!
  45. wow by unity100 · · Score: 1

    branch prediction, pipeline length and all the calculations happen over what ? floating point ?