Slashdot Mirror


Next-Gen Intel Chip Brings Big Gains For Floating-Point Apps

An anonymous reader writes "Tom's Hardware has published a lengthy article and a set of benchmarks on the new "Haswell" CPUs from Intel. It's just a performance preview, but it isn't just more of the same. While it's got the expected 10-15% faster for the same clock speed for integer applications, floating point applications are almost twice as fast, which might be important for digital imaging applications and scientific computing." The serious performance increase has a few caveats: you have to use either AVX2 or FMA3, and then only in code that takes advantage of vectorization. Floating point operations using AVX or plain old SSE3 see more modest increases in performance (in line with integer performance increases).
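
To see why fused multiply-add could plausibly account for the "almost twice as fast" figure, here is a rough back-of-the-envelope sketch. The execution-unit counts are assumptions made for illustration (one multiply unit plus one add unit without FMA, two FMA units with it); they are not taken from the article, and real code rarely reaches the theoretical peak.

```python
# Back-of-the-envelope peak single-precision FLOPs per cycle per core.
# The unit counts passed in below are illustrative assumptions,
# not figures taken from the article.

def peak_flops_per_cycle(vector_bits, element_bits, units, flops_per_lane):
    """Lanes per vector times number of units times FLOPs each lane does."""
    lanes = vector_bits // element_bits
    return lanes * units * flops_per_lane

# 256-bit AVX without FMA: assume one multiply unit and one add unit,
# each producing one FLOP per lane per cycle.
avx = peak_flops_per_cycle(256, 32, units=2, flops_per_lane=1)

# 256-bit FMA3: assume two FMA units, each doing a multiply and an add
# (two FLOPs) per lane per cycle.
fma = peak_flops_per_cycle(256, 32, units=2, flops_per_lane=2)

print(avx, fma, fma / avx)
```

Under these assumptions the ratio comes out to exactly 2, and only for code that is already vectorized and can pair its multiplies with adds.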

176 comments

  1. Would that improve hashing speeds in, say, Bitcoin by d33tah · · Score: 1

    Would that improve hashing speeds in, say, Bitcoin?

  2. Let's see... by bluegutang · · Score: 5, Funny

    " Next-Gen Intel Chip Brings Big Gains For Floating-Point Apps "

    How much of a gain? More or less than 0.00013572067699?

    1. Re:Let's see... by kimvette · · Score: 0

      FTFS:

      While it's got the expected 10-15% faster for the same clock speed for integer applications, floating point applications are almost twice as fast

      HTH

      --
      The Christian Right is Neither (Christian nor right). See: Matthew 23, Matthew 25, Ezekiel 16:48-50
    2. Re:Let's see... by 0100010001010011 · · Score: 5, Informative

      It's a joke. The Intel P5 Pentium FPU had a bug where

      4195835/3145727=1.333739068902037589 The correct answer is 1.333820449136241002.
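
The size of that error can be checked with a short sketch. It uses exact rational arithmetic so the comparison does not depend on the host FPU; the flawed value is copied from the comment above, and the variable names are only illustrative.

```python
from fractions import Fraction

# Quotient reported for the flawed Pentium FPU, as quoted in the comment.
FLAWED = Fraction("1.333739068902037589")

# Exact rational value of the division.
exact = Fraction(4195835, 3145727)

# Relative error of the flawed result, converted to float for printing.
relative_error = float((exact - FLAWED) / exact)

print(float(exact))
print(relative_error)
```

The relative error is about 6e-5, so the flawed result is already wrong in the fifth significant digit.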

    3. Re:Let's see... by unixisc · · Score: 1

      Okay, so how will it compare w/ the Itanium?

    4. Re:Let's see... by kimvette · · Score: 2

      Oh right, that bug an Intel rep laughably claimed one would only encounter once every 2,500 years or so. I'd forgotten about that.

      --
      The Christian Right is Neither (Christian nor right). See: Matthew 23, Matthew 25, Ezekiel 16:48-50
  3. Re:Would that improve hashing speeds in, say, Bitc by Anonymous Coward · · Score: 0

    That does not sound like something that would benefit from faster floating point operations.
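
A minimal sketch of why: Bitcoin's proof of work is a double SHA-256 hash, and SHA-256 is built from 32-bit integer additions, rotations and bitwise operations, with no floating-point math. The input bytes below are a hypothetical placeholder, not a real block header.

```python
import hashlib

def double_sha256(data: bytes) -> bytes:
    """Apply SHA-256 twice, as Bitcoin does for block hashing."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

# Hypothetical placeholder input; a real block header is 80 bytes.
header = b"hypothetical block header bytes"
digest = double_sha256(header)
print(digest.hex())
```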

  4. So in other words by Anonymous Coward · · Score: 0

    Certain kinds of apps will get a nice performance boost if they're running in house, or on a vendor managed server. If the customer installs the software, then no.

    1. Re:So in other words by swilde23 · · Score: 1

      Regular users will see the regular increase (roughly the same as the integer increase).

      But, anytime a chip releases a new feature that relies on specific code, of course only "certain kinds of apps" will get a boost.

      Or maybe I'm misreading the summary (because, I don't read articles)

      --
      There are 10 types of people in the world. Those that understand this sig, and those that beat up people who do.
    2. Re:So in other words by 7-Vodka · · Score: 1

      Unless you can just recompile your OS and all your software with a new version of GCC...

      --

      Liberty.

  5. Hope it's going in the new Mac Pro by GlobalEcho · · Score: 3, Interesting

    I hope there's really a new Mac Pro coming and that it has these chips in it! I do a heck of a lot of PDE solving, statistics and simulations, and would love to have a screamin' machine again.

    1. Re:Hope it's going in the new Mac Pro by Anonymous Coward · · Score: 5, Insightful

      Do you really need a Mac for that? If not, it seems you're limiting your potential by having to wait for the holy artifacts to be released.

    2. Re:Hope it's going in the new Mac Pro by semi-extrinsic · · Score: 5, Interesting

      If you're doing numerics, what the fuck (if you'll pardon my French) are you doing buying Apple? I'm working on two-phase Navier-Stokes solvers myself, and I just bought a new rig consisting of 3 boxes each with an Intel Core i7 @ 3.7 GHz, 12 GB RAM, an SSD drive and a big-ass cooling system. In total that cost less than the Mac Pro with a single Core i7 @ 3.3 GHz listed in that article. You're paying 3x more than you should, and you get what extra? A shiny case? Puh-lease.

      --
      for i in `facebook friends "=bday" 2>/dev/null | cut -d " " -f 3-`; do facebook wallpost $i "Happy birthday!"; done
    3. Re:Hope it's going in the new Mac Pro by spire3661 · · Score: 3, Interesting

      Why not just do that on real workstation hardware and tap into it remotely?

      --
      Good-bye
    4. Re:Hope it's going in the new Mac Pro by Charliemopps · · Score: 1

      He gets to tell his friends he bought an apple... apparently he keeps friends that care.

    5. Re:Hope it's going in the new Mac Pro by Anonymous Coward · · Score: 1

      Most physics researchers (source: physics PhD) use Mac desktops/laptops and Linux servers. Macs are perfect environments for a mix of coding and general computing, with good support for *nix tools. Anything serious gets done on a cluster. I've seen this in several universities, all of them top tier (e.g. Oxford, Imperial, UCL, Warwick), so it's not isolated.

      But hey, this is Slashdot.

    6. Re:Hope it's going in the new Mac Pro by Anonymous Coward · · Score: 0

      Your Navier-Stokes solutions aren't going to be anywhere near as hip and thin as his.

    7. Re:Hope it's going in the new Mac Pro by Anonymous Coward · · Score: 0

      Because that doesn't allow you to show off your "Oooh, shiny!!". Duh.

    8. Re:Hope it's going in the new Mac Pro by Anonymous Coward · · Score: 0

      Macs have their advantages, but in the places I visited (more experimental physics than theoretical) Windows PCs are still far more common. Roughly 60% Windows, 20% Mac and 20% Linux here. But this is not set in stone, as we only need two or three Windows-only software packages that you can run in a virtual machine.

    9. Re:Hope it's going in the new Mac Pro by Anonymous Coward · · Score: 0

      Since when was the Mac Pro not "real" workstation hardware?

    10. Re:Hope it's going in the new Mac Pro by mozumder · · Score: 2

      The Mac Pros use Xeon chips, which are usually updated about 1 year after the mainstream Core processors are out.

    11. Re:Hope it's going in the new Mac Pro by alen · · Score: 1

      Since it sells with 2 year old cpu's
      Or was it 2 generation old cpu's

    12. Re:Hope it's going in the new Mac Pro by newcastlejon · · Score: 1

      If your experimental labs are anything like our workshops you'll probably find them running a few ancient Win95/DOS tools that don't take kindly to being cooped up in a VM without direct access to hardware. As I think back, though, I do recall a lonely old G3 being used as a data logger.

      --
      If God forks the Universe every time you roll a die, he'd better have a damned good memory.
    13. Re:Hope it's going in the new Mac Pro by Anonymous Coward · · Score: 0

      How stupid. Most workstations companies buy are no different. Almost no company is buying bleeding edge hardware in their workstations.

    14. Re:Hope it's going in the new Mac Pro by mozumder · · Score: 0

      The Core i7's are consumer-grade processors and are slower than the Xeon's the Mac Pros use, they don't even use ECC cache memory. Good luck running a week-long simulation job with one random bit-error in your data. So yes, Core i7's are amateur junk, and using them in a pro workstation is a good way to get you fired from your job, because you do not know professional requirements. "Herpaderp why can't we just use my overclocked Core i7 herpaderp! It's good for gaming! It should be good enough for this nuclear simulation! herpaderp!"

      So if you want absolute speeds, ECC reliability, & cheap prices, you have to go with Apple Mac Pro workstations. Even Dell & HP can't even compete against Mac Pros. Have you actually spec'd out equivalent systems from Sun, IBM, Dell & HP? Go ahead, try it, and see how much you save. There's a reason smart people use Mac Pros. These are physicists, not noob morons like you dorks. Your best bet is to learn from them.

      And don't make your boss laugh before he fires you when you tell him you actually want to build your own system...

    15. Re:Hope it's going in the new Mac Pro by petermgreen · · Score: 1

      The mac pros currently ship with westmere based CPUs. The most recent comparable CPUs are sandy bridge based. So even if you count both new core designs and die shrinks as "generations" it's still only one generation behind comparable CPUs.

      --
      note: i'm known as plugwash most places but i screwd up registering that here somehow in the past and now can't register
    16. Re:Hope it's going in the new Mac Pro by Anonymous Coward · · Score: 1

      Thank you for that imagery :D

    17. Re:Hope it's going in the new Mac Pro by fyngyrz · · Score: 1

      Not to put too fine a point on it, he gets OSX, the OSX ecosystem, the vast majority of the *nix ecosystem, the ability to VM several varieties of the Windows ecosystem *or* any one of a number of pure *nix ecosystems, all in parallel if he likes, the ability to drive a bunch of monitors (I've got six on mine), all manner of connectivity, and yes, perhaps last and even perhaps least, probably one of the best cases out there -- it's not just shiny. it's bloody awesome.

      I don't even *like* Apple the company -- they piss me off more than I can adequately say for a list of reasons I won't bore you with -- but my Mac Pro was worth every penny for all the things it brings to the table. Could it be better? Yep. Will it be better next time around? Almost certainly.

      Now go back to being happy with your stuff, and we'll go back to being happy with ours.

      --
      I've fallen off your lawn, and I can't get up.
    18. Re:Hope it's going in the new Mac Pro by Anonymous Coward · · Score: 0

      Your argument fails due to facts. The Mac Pro line is not available with Core i* processors. It's only available with Xeon processors.

    19. Re:Hope it's going in the new Mac Pro by GlobalEcho · · Score: 1

      If you're doing numerics, what the fuck (if you'll pardon my French) are you doing buying Apple?

      Fair question. It turns out, PDE solving etc. isn't all I do, so while I like my machine to be reasonably fast at the numerics, I require it to work well as a general-purpose computer, too. To me, Windows, Linux and FreeBSD fail to meet that criterion.

      I do small-to-medium problems locally without having to think about remote execution issues, and then farm truly heavy numerics out to parallel processing farms like anybody else (aside from the PDE solvers, much of what I do is embarrassingly parallel). It's really quite nice, say, running some giant calculation in Mathematica or Matlab and then being able to click-n-drag the output plot into presentation software. That workflow is unavailable in Linux, and probably full of pitfalls on Windows.

      [Same answer to the poster who wonders why bother to wait. ]

    20. Re:Hope it's going in the new Mac Pro by Anonymous Coward · · Score: 1

      He buys the special edition that comes with a dildo

      Oh, I get it! Because Mac owners are homosexuals...that's funny! Stupid homosexuals. Mod parent up!

    21. Re:Hope it's going in the new Mac Pro by Aardpig · · Score: 3, Insightful

      Erm -- ECC memory is slower than non-ECC memory, I think.

      --
      Tubal-Cain smokes the white owl.
    22. Re:Hope it's going in the new Mac Pro by washu_k · · Score: 5, Informative

      The Core i7's are consumer-grade processors and are slower than the Xeon's the Mac Pros use

      This is completely incorrect. The current Mac Pros use Nehalem based Xeons which are two generations back from the current Ivy Bridge i7s. Xeons may have differences in core count, cache and/or ECC support but their execution units are the same as their desktop equivalents. The base Mac Pro CPU is equivalent to an i7-960 with ECC support. The current Ivy Bridge i7s are a fair bit faster.

    23. Re:Hope it's going in the new Mac Pro by IWannaBeAnAC · · Score: 1

      Most of the people in the physics department here use windows desktops, but pretty much all of the numerics people use linux desktops. Naturally, all of the computing clusters are linux. It seems that virtually all laptops are macs though, which is curious. Possibly people would like to use macs on the desktop but there is some barrier (eg, purchasing or IT administration policies) ? I'll have to find out!

    24. Re:Hope it's going in the new Mac Pro by viperidaenz · · Score: 1

      The current top of the line Mac Pro has a pair of 3 year old CPUs (2x 6 core E5645/50/75, released Q1, 2010). You can't compare to any current HP or Dell etc as they use newer generation Xeons.

      A 12-core MacPro in NZ costs $6100.
      A top of the line 12-core Dell costs $6200.

      Dell has E5-2630 CPU's, Mac has E5645. Dell wins there, more cache, newer CPU.
      Dell has 16GB ECC RAM, Mac has 12GB. Dell wins there: 1600 MHz, 128 GB max; Mac is 1333 MHz, 64 GB max.

      I'm sure a 2 year old Dell is cheaper than a brand new Mac Pro.

      Of course, with those same CPU and RAM specs, there are cheaper Dells, down to $5300. So you save $900 and get more performance.

    25. Re:Hope it's going in the new Mac Pro by KonoWatakushi · · Score: 5, Informative

      ECC memory is only marginally slower. Considering error rates and modern memory sizes, it is far past time that it became a standard feature. The extra cost would be totally insignificant if it were standard, and not used as an excuse to gouge people on Xeons.

    26. Re:Hope it's going in the new Mac Pro by Anonymous Coward · · Score: 0

      Gay Mac user spotted! *whoop* *whoop* Gay Mac user spotted!

    27. Re:Hope it's going in the new Mac Pro by Anonymous Coward · · Score: 0

      http://store.apple.com/au/browse/home/shop_mac/family/mac_pro

      2999 dollars (au) for a mac pro with 6gb ram and a single quadcore 3.2ghz Xeon

      Got a quote yesterday for a HP with a 3.4 ghz Xeon 8gb ram, hardware raid and 1tb drive (2x 500gb presumably raided) including Windows 2008 r2 standard (so you can't say I didn't pay for OS)

      for 2248 dollars.

      What were you saying about mac pros being cheaper?

      About the only thing "better" in the mac pro; was the ATI 5770. Which is cheaper than the 700 dollar price premium.

    28. Re:Hope it's going in the new Mac Pro by Anonymous Coward · · Score: 0

      Other than OSX and the higher price tag; what was the point of the rest of your comment?

      Don't other PCs provide you access to nix and windows VMs? or is that a Mac "feature". "all in parallel?" driving bunches of monitors? I NEVER SAW THAT ON A PC EVER! and connectivity! never seen that since macs.

      I guess he gets older Xeon processors for his extra money. Cause that's better right. They don't make them like they used to right!!

    29. Re:Hope it's going in the new Mac Pro by LordLimecat · · Score: 2

      You're paying at least double for the same hardware on a Mac. The Mac cited in the article has 2x 6-core Xeons @ 2.4 GHz. Those (assuming E5645s) can be had for ~$575 each, with a motherboard at ~$275. Everything else is pocket change; a whole rig with SSDs etc. could be had for under $1700.

      But I'm sure someone somewhere will explain why the aluminum makes the extra $2000 for the Mac worth it.

    30. Re:Hope it's going in the new Mac Pro by Anonymous Coward · · Score: 0

      no. Just Steve Jobs was the homosexual. You can either willingly take his cock, or call it rape.

    31. Re:Hope it's going in the new Mac Pro by Anonymous Coward · · Score: 0

      You've made it sound so much better now. It's only 1 generation behind!

    32. Re:Hope it's going in the new Mac Pro by epyT-R · · Score: 2

      It depends. Depending on the generation of xeon, you pay for the privilege of some combination of ECC RAM/cache, more cache, and multisocket capability. In many cases (like the pentium 4 era), you got a p4 with more cache that wasn't much faster than the desktop variant, even with 'enterprise' loads like databases! In the pentium 3 xeon days, you got marginal benefits with the extra cache, yet paid A LOT more for the hardware. With Xeon, the performance boost rarely justified the cost. Intel knew this, so that's why, these days, multisocket capability is a xeon exclusive: to make you pay dearly for that privilege.

      Obviously, if you truly need these features you'll have no choice but to pay up, but these chips failure rates and performance are not any different than the consumer models of the same design at a given clockspeed. They're built on the same manufacturing technology and it is unlikely that intel bins either variant beyond the clockspeeds and TDP stamped on the box. While I don't deny that some critical systems need things like ECC, your post reads like a typical arrogant mac user perspective: someone desperate for social exclusivity trying to justify his overexpenditure.

    33. Re:Hope it's going in the new Mac Pro by fyngyrz · · Score: 1

      Other than OSX and the higher price tag; what was the point of the rest of your comment?

      The point was, and is, that he's happy with his Mac. I'm sorry you don't get it.

      Don't other PCs provide you access to nix and windows VMs?

      They don't, however, provide you with access to OSX. It's the combination of all of them, all working at once, that really brings the whupass. And you won't be doing that in any legit, supported fashion on anything but a Mac. That's well worth the candle. See, this is part of the "and you get OSX" point; when Windows is running over there in its VM, I just drag files into OSX or the other way, share filesystems, run any combination of apps on any OS I like. You may have multiple monitors on your windows machine, but again, you don't have as solid an environment, and you don't have it at all unless you're doing so in a most unsafe and unsupported manner. Which again is fine if that's what you want to do, but the point AGAIN is that not everyone wants to run that way.

      Remember how this started: Guy made a harmless remark hoping for X within the context of stuff he liked to run. He got jumped by people criticizing his choice. Surprising? No. Hardly. But it isn't reasonable, either. I'm saying to you, be reasonable. We have our reasons, you have yours, fine, leave off now.

      I guess he gets older Xeon processors for his extra money. Cause that's better right. They don't make them like they used to right!!

      Sigh. I'm sure you're very happy. Good. Wonderful. KThxBye.

      --
      I've fallen off your lawn, and I can't get up.
    34. Re:Hope it's going in the new Mac Pro by Anonymous Coward · · Score: 0

      >virtually all laptops are macs though, which is curious.

      Not really. The laptops really are great hardware regardless of which OS you run on it. Unless of course you are one of those people with an irrational hatred of all things Apple.

    35. Re:Hope it's going in the new Mac Pro by Anonymous Coward · · Score: 0

      Apple and Pro should not be used in the same sentence. Yes this will get modded down, but Apple cares more about the general consumer than pro level customers. It is where they make their money.

    36. Re:Hope it's going in the new Mac Pro by IWannaBeAnAC · · Score: 1

      >virtually all laptops are macs though, which is curious.

      Not really. The laptops really are great hardware regardless of which OS you run on it. Unless of course you are one of those people with an irrational hatred of all things Apple.

      I was referring to the dichotomy of using windows on the desktop but a mac laptop.

    37. Re:Hope it's going in the new Mac Pro by Anonymous Coward · · Score: 0

      If you're doing fluid simulations, what the shit (if you'll pardon my Mandarin) are you doing targeting x86?

      For the price of a high-end Core i7, you could buy two discrete GPUs that were each ten or twenty times as fast as the i7 at single-precision math. A Radeon HD 7970 with 6GB VRAM, for example, costs about half as much as an i7-3970X, and it doesn't need a whole motherboard to itself. That Radeon can manage over 3.7 TFLOPS, versus the i7's ~150 GFLOPS. And unlike a Titan or a Malta, that's something you can buy straight off the shelf, today.

      Okay in some applications, ECC is an important consideration, and so may be double precision performance, and so may be access to more RAM than you typically get in a discrete GPU - but if price/performance is the main consideration, then your best bet as a development platform is a high-end Geforce or Radeon, using CUDA or OpenCL. What they lack in precision, you can often make up for with a smaller simulation timestep or more solver iterations.

      I just think it's cute that you're scoffing "dude, why go with the third best option when you could go with the second best?"
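
For what it's worth, the ratio implied by the two throughput figures quoted above is easy to check. This sketch only divides the numbers as stated in the comment; it says nothing about real-world solver performance, and the variable names are illustrative.

```python
# Peak single-precision figures as quoted in the comment above.
gpu_gflops = 3700.0  # "over 3.7 TFLOPS" for the Radeon HD 7970
cpu_gflops = 150.0   # "~150 GFLOPS" for the i7

speedup = gpu_gflops / cpu_gflops
print(round(speedup, 1))
```

On paper that is closer to 25x than to the "ten or twenty times" stated earlier in the comment.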

    38. Re:Hope it's going in the new Mac Pro by Anonymous Coward · · Score: 0

      Funny how you immediately made that logical leap. Straight dudes can enjoy anal play. Bigoted much?

    39. Re:Hope it's going in the new Mac Pro by Anonymous Coward · · Score: 0

      Yawn, mozumder trolling again...

    40. Re:Hope it's going in the new Mac Pro by cpotoso · · Score: 1

      ???? Why do you need a mac for that? I run mac laptops and even imacs. Even have a mac pro from 2006 (at that time a good deal, 8 xeon 3GHz, not much more expensive than the equivalent Dell). Last month, I got a Dell Precision workstation with 2 hex core xeons (+ hyperthreading, making them effectively 24 cores--don't scream at me, I have benchmarked MY programs and for all practical purposes it acts as 24 CPUs) for just over $2k (including 32 GB ram, 3 TB disk). Runs linux nicely and the parallelism beats any Beowulf cluster due to the faster in-cpu or in-motherboard connection. The mac pro is about 2 cpu generations outdated, and 35% more expensive. Sigh!

    41. Re:Hope it's going in the new Mac Pro by cpotoso · · Score: 1

      You would still do a lot better getting an imac for your regular software and a linux machine for the computation. X11 makes all transparent too. And still spend less... See my post above.

    42. Re:Hope it's going in the new Mac Pro by Anonymous Coward · · Score: 0

      Lots of wanna-be high-horsepower computer kids with bux to burn and not much sense go for Apple kit. It's not bright, but oh so kewel. In the real world, I buy my computer in pieces and put it together myself, knowing which components are shit and which aren't. I'm not alone. Google assembles all their hardware from pieces. And if you go to TOP500.org, you look at machines with hundreds of thousands of processors, and it's usually beowulf clusters, assembled from pieces, and (oh horrors) more than 95% of those multi-million dollar supercomputers are running Linux! GCC will take advantage of the new instruction set soon. Also to parent: what are you building the Navier-Stokes solvers for? Weather forecasting? Jet turbine design? Airfoil design? It's computational fluid dynamics, but that is such a big field.

    43. Re:Hope it's going in the new Mac Pro by Jeremi · · Score: 1

      But I'm sure someone somewhere will explain why the aluminum makes the extra $2000 for the Mac worth it.

      The case is very nice, but it's not worth $2000 extra.

      The ability to run MacOS/X (without "hackintosh" style shenanigans) is really nice, and is worth $2000 extra if you have that kind of money lying around (or, more realistically, if your employer does).

      If you think $2000 extra is too much to spend, you're probably right. On the other hand, plenty of people will spend an extra $20,000 on a nicer brand of car; sometimes people want what they want, and are willing to pay extra for it.

      --


      I don't care if it's 90,000 hectares. That lake was not my doing.
    44. Re:Hope it's going in the new Mac Pro by unixisc · · Score: 1

      Aside from the salient point about him being happy w/ the Mac, the GP's other point - somehow missed - was that OS-X brings with it FBSD userland, which therefore makes available most if not all of Unix features. If he had a Wintel PC, he'd have had to run a Linux or a VirtualBSD VM, and if he had Linux, there would be a paucity of applications for it. Here, since it's OS-X/FBSD, it's very unlikely that he'll need Linux, except maybe to run any specialized program developed only for Linux. But if he does, he can do it under a VM.

      The only thing he'd be missing here would be Windows, but that's the trade one makes for OS-X. And with Microsoft trying to replace Windows 7 w/ Windows 8 in the long run, OS-X is a good choice. Yeah, there are issues like the iOSification of Mountain Lion, but one could try using a previous version.

    45. Re:Hope it's going in the new Mac Pro by unixisc · · Score: 1

      Why not just do that on real workstation hardware and tap into it remotely?

      What 'real workstation' is left? The only workstations available these days are x64 workstations. SPARC, POWER, MIPS and even Itanium workstations are dead. Where exactly could one buy a RISC workstation anymore, if one wanted to get it, get the latest and greatest version of Debian or *BSD and run w/ it? Everything is now Intel/AMD, and all the CPUs that had superior floating point are either dead, or exclusive to servers that would cost millions.

    46. Re:Hope it's going in the new Mac Pro by semi-extrinsic · · Score: 1

      Okay, I'll answer this. From my use of a Core i7, you can tell that I'll be coding a serial app with some OpenMP in the slow parts to utilize all 4 cores. Now that's much easier than writing the same code for a GPU. If I were serious in developing the "blazing fast" stuff (which I'm not, my focus is on implementing new multi-physics models) I could spend the same amount of effort and target MPI instead of GPUs, and then go run it on the 22,000 core cluster in the basement next door.

      --
      for i in `facebook friends "=bday" 2>/dev/null | cut -d " " -f 3-`; do facebook wallpost $i "Happy birthday!"; done
    47. Re:Hope it's going in the new Mac Pro by semi-extrinsic · · Score: 1

      Ehm, I have to say that ECC is over-hyped by server hardware vendors, especially for CFD applications. The failure rate for modern RAM is 1 bit error per 1 GB per 1 month of simulation. To be honest, a typical CFD code will have to handle much worse errors than that due to random programming bugs (if you think your 35,000 lines-of-code program is bug free, well... it's not) etc., such that if it crashes or becomes unphysical from 12 bit errors in a month, you're screwed anyway.

      On the other hand, if you're running a COBOL program that calculates how much your employees' taxes are going to be, then you really want ECC.

      --
      for i in `facebook friends "=bday" 2>/dev/null | cut -d " " -f 3-`; do facebook wallpost $i "Happy birthday!"; done
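
The "12 bit errors in a month" figure in the comment above follows from the quoted rate applied to a 12 GB machine. A minimal sketch of that arithmetic, taking the commenter's rate of 1 bit error per GB per month as given (it is their claim, not a measured value), with a hypothetical helper name:

```python
def expected_bit_errors(ram_gb, months, errors_per_gb_month=1.0):
    """Expected bit errors, assuming the error rate scales linearly
    with memory size and run time (the commenter's quoted rate)."""
    return ram_gb * months * errors_per_gb_month

# One 12 GB box running a simulation for one month.
print(expected_bit_errors(12, 1))
```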
    48. Re:Hope it's going in the new Mac Pro by Anonymous Coward · · Score: 0

      Most physics researchers (source: physics PhD) use Mac desktops/laptops and Linux servers. Macs are perfect environments for a mix of coding and general computing, with good support for *nix tools. Anything serious gets done on a cluster. I've seen this in several universities, all of them top tier (e.g. Oxford, Imperial, UCL, Warwick), so it's not isolated.

      But hey, this is Slashdot.

      perhaps in your neck of the woods. Most of my fellow (physics) PhD students use windows laptops with GNU and/or linux VMs to interact with linux clusters, because they're affordable and provide good performance compared to macs. I work fulltime in physics simulation and there is not a single mac in house.

    49. Re:Hope it's going in the new Mac Pro by lordbeejee · · Score: 1

      Not really. The laptops really are great hardware regardless of which OS you run on it. Unless of course you are one of those people with an irrational hatred of all things Apple.

      A same-specced laptop from another brand costs close to half the price of a MacBook, so it's not always irrational hatred; some people have limited funds.

    50. Re:Hope it's going in the new Mac Pro by fa2k · · Score: 1

      Actual error rates in good memory are very low. I didn't see a single error for a year. The main benefit of ECC on workstations is to detect memory that is slightly bad, which passes hours of memtest86, but still gives you errors maybe every month on your workload. This requires you to monitor ECC errors, and get alerts, so you can replace the DIMMs that give errors repeatedly. The problem is that ECC monitoring for the new Intel chips is not available in Linux (as far as I can tell, cheeky plug for my stackexchange question http://unix.stackexchange.com/questions/67999/how-to-monitor-ram-ecc-errors-on-intel-processor-in-linux )
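
For chips that the Linux EDAC subsystem does support, the corrected and uncorrected error counters are exposed under sysfs, and polling them could look roughly like the sketch below. It assumes an EDAC driver for the memory controller is loaded; whether one exists for the newer Intel chips mentioned above is exactly the open question, so the function simply returns an empty result when the directory is missing. The function name is hypothetical.

```python
from pathlib import Path

def read_edac_counts(root="/sys/devices/system/edac/mc"):
    """Return {controller: {"ce_count": n, "ue_count": n}} from EDAC sysfs.

    Returns an empty dict if no EDAC driver has populated the directory.
    """
    counts = {}
    base = Path(root)
    if not base.is_dir():
        return counts
    for mc in sorted(base.glob("mc[0-9]*")):
        entry = {}
        # ce_count: corrected errors, ue_count: uncorrected errors.
        for name in ("ce_count", "ue_count"):
            counter = mc / name
            if counter.is_file():
                entry[name] = int(counter.read_text().strip())
        counts[mc.name] = entry
    return counts

if __name__ == "__main__":
    print(read_edac_counts())
```

Run from cron and alerting when `ce_count` keeps rising on one controller, this would give the "replace the DIMM that errors repeatedly" workflow described above.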

    51. Re:Hope it's going in the new Mac Pro by tyrione · · Score: 0

      If you're doing numerics, what the fuck (if you'll pardon my French) are you doing buying Apple? I'm working on two-phase Navier-Stokes solvers myself, and I just bought a new rig consisting of 3 boxes each with an Intel Core i7 @ 3.7 GHz, 12 GB RAM, an SSD drive and a big-ass cooling system. In total that cost less than the Mac Pro with a single Core i7 @ 3.3 GHz listed in that article. You're paying 3x more than you should, and you get what extra? A shiny case? Puh-lease.

      Mac Pro: 64GB of ECC RAM. Yes a kick ass infrastructure throughout the entire experience. Don't get me wrong, just building a Corsair 650D, Vishera FX-8350, 32GB of 1866 DDR3, 256 SSD, Twin 2 TB Black WD Drives, waiting on the 8000 series AMD GPGPUs for Crossfire, AX850 Corsair Power Supply, Corsair H110 Water Cooler and splitting time between Debian and FreeBSD I'm already invested heavily to the tune of $2300-$2500 and no OS X Development. Running LLVM/Clang with OpenCL and the R600 stack soon to be ready for LLVM/Clang 3.3 as a Mechanical Engineer I look forward to seeing how fucking sweet this beast shits all over your set up.

      Then again, Apple's OS X OpenCL and GCD throughout is so advanced I know pairing it up with a Mac Pro and whatever the hell Apple comes out with with the same code via LLVM/Clang/LLDB would allow me with the AMD 7000 series now supported to really start eating up those numerical analysis needs, not to mention FEA/CFD. I tell you what: Go buy both rigs and learn Cocoa Dev and what both systems will offer. As soon as FreeBSD 10 is out with binary graphics drivers from AMD I will not even waste my time on Linux and I'll get ZFS to boot.

      The Dev Tools on OS X are robust, far more mature than Linux regardless of the BS about Qt 5.x and of course with WebGL/WebCL and WebKit 2.x nightly on either set up I have lots of options. Then with LLVM/Clang 3.3 comes the addition of OpenMP amongst other new and robust tools that the GCC crowd knows is pushing them to the limit. I can't fucking stand C++ and C/ObjC gives me the best of both worlds. I've still got FORTRAN to boot on both systems as well.

      Most Engineers have multiple high-end systems to do their work. You pissing about spending for the Mac tells me you haven't owned the Mac Pro, used OS X extensively and either choose or cannot choose to have both configurations for your needs.

      The beauty of the upcoming Mac Pro is the price point drops and that will piss off the competition.

      When you're incorporated doing work you also write down these purchases.

    52. Re:Hope it's going in the new Mac Pro by TheRaven64 · · Score: 1

      You can make RAM even faster - returning the result in a single cycle every time - if you don't care whether the result it returns is the correct one...

      --
      I am TheRaven on Soylent News
    53. Re:Hope it's going in the new Mac Pro by drinkypoo · · Score: 1

      On the other hand, plenty of people will spend an extra $20,000 on a nicer brand of car; sometimes people want what they want, and are willing to pay extra for it.

      The problem with this notion is that often the people are not buying a nicer brand of car, they're buying a prettier brand of car. A Lexus is just a Toyota with more asphalt and the same shit construction and the same shit handling. But a BMW costs the same as a Lexus and is, well, they're built shit since the eighties, but they're actually worth driving. For their extra $2000 they could have got something substantively better, but all they've done is buy a shinier Toyota with some options they could have added themselves for less money.

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
    54. Re:Hope it's going in the new Mac Pro by GlobalEcho · · Score: 1

      I agree with you from a price point of view, but workflow efficiency is very important to me, more so than workstation power.

      At one of my jobs, a powerful Linux workstation is my primary machine and we use a Linux compute farm, so I am keenly aware of the shortcomings of both the Linux user environment and of the hassle involved in dealing with remote jobs. If one doesn't have a very wide variety of calculations, or the calculations rarely change, then remote is no big deal. Otherwise it is a real time sink.

    55. Re:Hope it's going in the new Mac Pro by Anonymous Coward · · Score: 0

      no. the main benefit of ecc is to not just help uptime (correction), but also to guarantee your data is good. If it can't correct the issue, it'll Pinkscreen your ass, so as not to work with bad data. that is huge if you value your data. i agree with grandparent, this should be standard in modern times. for both desktop memory, and disk's memory.

      i don't care how you classify good memory as rarely producing errors/good/quality---your memory is still exposed to background radiation and cosmic rays--and that'll flip your bits.

    56. Re:Hope it's going in the new Mac Pro by TheTurtlesMoves · · Score: 1

      Not everything maps to GPUs all that well. Some fluid stuff would be rather hard work to get to work fast on GPUs, say for example 2 phase flows. Also mapping stuff to a GPU means it's often quite difficult to keep it flexible, which is often needed for R&D fluid codes.

      It's not just about FLOPs, it's also whether you can use 'em, and without spending 2 years optimizing the code to do so.

      --
      The Grey Goo disaster happened 3 billion years ago. This rock is covered in self replicating machines!
    57. Re:Hope it's going in the new Mac Pro by LordLimecat · · Score: 1

      The ability to run MacOS/X (without "hackintosh" style shenanigans) is really nice, and is worth $2000 extra if you have that kind of money lying around

      Which doesn't explain why a lower end Mac costs only $1000. And whether it's worth $2000 extra is about as subjective as it gets; particularly when I doubt you can name a capability that OSX has that Windows does not, or a benchmark showing a substantial performance difference.

      Why not just a debian or RH flavor and be done with it if you really want a *nix?

    58. Re:Hope it's going in the new Mac Pro by Anonymous Coward · · Score: 0

      Low is relative, and errors are still a common occurrence as evidenced by recent studies. Detecting bad memory is also an invaluable feature--something you certainly want to do before your data is riddled with corruption.

      Why gamble on corrupt files and filesystems when the chance of uncorrectable errors can be made statistically insignificant through ECC, at virtually no cost?

    59. Re:Hope it's going in the new Mac Pro by Anonymous Coward · · Score: 0

      Hey, I've been working on multi-physics stuff, too. I've been using OpenMP and OpenCL together, and the performance gains have been absolutely spectacular.

      There's a lot more that can be parallelised than you might think. The speed gains for embarrassingly-parallel portions can easily be 100x, as you might expect - but while some algorithms are harder to parallelise, I have yet to find any that couldn't be parallelised at all. The speed gains for the tricky stuff (sorts, scatters etc.) might only be 20x instead of 100x and they might take a few more days to find - but it's still worth it! Atomics etc. are fairly well supported on GPU now, and parallel algorithms for tricky problems like sorts & reductions are pretty well documented.

      I agree that OpenMP is easier to use than most GPU computing APIs, though. #pragma omp go_much_faster! However I'd like to suggest you take a fresh look at GPU computing as well, because a lot has changed in the last year or two. Using the GPU isn't that much harder than using OpenMP these days, thanks to some new(ish) developments.

      Have a look at C++ AMP first, if you're on Windows. Like OpenMP, you can just scatter it into existing Visual C++ projects, for far greater speedups on anything 'embarrassingly parallel' than you might get from OpenMP alone.

      In the near future, the Bolt template library from the HSA Foundation could be doing the same thing for you, and more (with many of the common tricky algorithms included), with even less coding effort.

      Or if you're cross-platform-inclined and you did want to get into OpenCL but found it too daunting, there's an OpenCL C++ Wrapper (one file: cl.hpp) that can be used to quickly initialise a CL context and its associated objects without all the tedious C API calls.

      Having all that esoteric GPGPU stuff taken care of would leave you with much more time to find creative ways to turn serial code into parallel code. Then with your simulations running 10-100x faster overall, you'll have much more time for... y'know... everything else you like to do.

      It's generally thought that GPU computing is just too hard and the speed gains don't justify it - but today that's no longer true at all. Today, it's just easy enough, and the speed gains are often astonishing. Not with Intel hardware, though. I don't know what the hell they're doing or why they expect us to settle for incremental performance improvements. Anyway, best of luck!

    60. Re:Hope it's going in the new Mac Pro by semi-extrinsic · · Score: 1

      What are you going to do with 64 GB of RAM on a single node? If you're actually using more than 2-4 GB per core, your program is fully limited by RAM speed, and you should REALLY be using MPI and more nodes with less ram each.

      --
      for i in `facebook friends "=bday" 2>/dev/null | cut -d " " -f 3-`; do facebook wallpost $i "Happy birthday!"; done
    61. Re:Hope it's going in the new Mac Pro by Jeremi · · Score: 1

      particularly when I doubt you can name a capability that OSX has that Windows does not

      Built-in bash shell and Unix environment by default is what does it for me. (I know you can sort of fake it using Cygwin and whatnot on a Windows box, but I'd rather pay the extra money and not have to fake it). I was a die-hard BeOS user back in the day, and MacOS/X is the closest thing to the BeOS user experience that is readily available now.

      Why not just a debian or RH flavor and be done with it if you really want a *nix?

      Because I also want to be able to buy and use commercial software. Linux/Unix are fine, but it's also nice to be able to get software X rather than having to search around for "something like X"

      Also, I think the Mac desktop experience is a bit nicer.

      Keep in mind that I'm pretty well paid and also I can usually get my employer to pay for my computer purchases. If money was an issue I'd probably be using Linux on a cheap PC, but it's not, so why skimp?

      --


      I don't care if it's 90,000 hectares. That lake was not my doing.
  6. Might be important, but probably not... by MasseKid · · Score: 4, Interesting

    For problems that need floating point AND are not multithread friendly AND need large computing power AND are specially coded, this will be of great use. However, most massive computing problems like this are multi-thread friendly, and this will still be roughly an order of magnitude short of the speeds you can get by using a GPU.

    1. Re:Might be important, but probably not... by semi-extrinsic · · Score: 3, Insightful

      The good thing about manufacturers speeding up SSE/AVX/etc. is that the linear algebra libraries (specifically the ATLAS implementation of BLAS and LAPACK) usually release code that makes use of the new hawtness in about six months after release. Do you know how much software relies on BLAS and LAPACK for speed?

      --
      for i in `facebook friends "=bday" 2>/dev/null | cut -d " " -f 3-`; do facebook wallpost $i "Happy birthday!"; done
    2. Re:Might be important, but probably not... by Anonymous Coward · · Score: 0

      How good are GPUs for large matrix multiplications nowadays compared to CPUs? This sounds like something that could help a lot with linear algebra, which is a huge part of scientific computing. Also, I think you are overestimating the number of problems where a GPU can get its full performance. A problem needs to be much more parallelizable to get good performance on a GPU than on a multi-core CPU.

    3. Re:Might be important, but probably not... by Anonymous Coward · · Score: 0

      Massive computing problems get specially coded, and don't run on a single machine anyways. The gpu bus bottleneck severely restricts the problem set that general purpose scientific computing users on workstations can benefit from offloading the cpu.

    4. Re:Might be important, but probably not... by Anonymous Coward · · Score: 0

      It's still relevant. If you're using OpenCL, then your GPU and CPU will both be pegged at 100% performing whatever multithreaded math you throw at it.

      At least with older mac pros, there were some things that actually were faster on the CPU than the GPU, unless you threw massive money at your video card.

    5. Re:Might be important, but probably not... by godrik · · Score: 1

      Intel Xeon Phi relies on avx (version 1 I believe) and using avx gets you good improvement compared to not using avx for both sequential and parallel codes. Of course, sequential code on Xeon Phi is typically slower than on a regular sandy bridge processor.

      Many applications can use 16 float operations simultaneously. Certainly many video codecs and physics engines.

      GPUs can be good for many computations but there are many cases where they are not so good. Most pointer-chasing applications tend not to be so GPU-friendly. If you need to go back and forth between CPU and GPU, then you pay some latency. GPUs suffer from programming abstraction problems (no CUDA on AMD, OpenCL is suboptimal on NVIDIA, openacc is only good for simple tasks).

      Larger SIMD lanes on the CPU side will certainly be a good thing for performance.

    6. Re:Might be important, but probably not... by godrik · · Score: 1

      replying to self. Xeon Phi uses larger lanes than AVX. It is 512 bits in Xeon Phi and 256 in AVX, I got the names mixed up.

    7. Re:Might be important, but probably not... by Bengie · · Score: 1

      Not all multi-threaded code is large matrix friendly and GPUs need large matrix math to become useful.

    8. Re:Might be important, but probably not... by Bengie · · Score: 2
      http://software.intel.com/en-us/articles/intel-xeon-phi-coprocessor-codename-knights-corner

      An important component of the Intel Xeon Phi coprocessor’s core is its vector processing unit (VPU), shown in Figure 5. The VPU features a novel 512-bit SIMD instruction set, officially known as Intel® Initial Many Core Instructions (Intel® IMCI). Thus, the VPU can execute 16 single-precision (SP) or 8 double-precision (DP) operations per cycle. The VPU also supports Fused Multiply-Add (FMA) instructions and hence can execute 32 SP or 16 DP floating point operations per cycle. It also provides support for integers.
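
      The lane counts in that quote follow from simple division. A minimal sketch of the arithmetic, assuming one vector instruction issued per cycle and counting a fused multiply-add as two floating point operations (the function names are made up for illustration):

```python
def simd_lanes(vector_bits, element_bits):
    """Number of elements that fit in one vector register."""
    return vector_bits // element_bits


def peak_flops_per_cycle(vector_bits, element_bits, fma=True):
    """Peak floating point ops per cycle, assuming one vector op per cycle.

    A fused multiply-add counts as two operations (one multiply, one add).
    """
    lanes = simd_lanes(vector_bits, element_bits)
    return lanes * (2 if fma else 1)


for name, width in (("256-bit AVX", 256), ("512-bit Xeon Phi VPU", 512)):
    sp = simd_lanes(width, 32)   # single precision lanes
    dp = simd_lanes(width, 64)   # double precision lanes
    print(name, "SP lanes:", sp, "DP lanes:", dp,
          "SP flops/cycle with FMA:", peak_flops_per_cycle(width, 32))
```

      Real chips can issue more than one vector op per cycle, so treat this as the per-unit figure only.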

    9. Re:Might be important, but probably not... by godrik · · Score: 1

      My bad, I realized later that AVX was the new instruction set for sandy bridge and not for xeon phi. AVX (version whatever) and IMCI instructions are quite similar (gather/scatter, Fused Multiply Add, swizzling/permute). Their main difference is the SIMD width.

      My overall point remains valid. Doing floating point arithmetic by packs of 256 bits is overall useful.

    10. Re:Might be important, but probably not... by GlobalEcho · · Score: 1

      That's one of the nice things about OpenCL. I wish they would come up with more (and better) math libraries.

    11. Re:Might be important, but probably not... by Aardpig · · Score: 1

      I wish NVIDIA would update their drivers to support OpenCL 1.1. Oh wait, that's not going to happen because they are trying to push CUDA instead...

      --
      Tubal-Cain smokes the white owl.
    12. Re:Might be important, but probably not... by pclminion · · Score: 1

      Yeah, pretty much. Basically, they just doubled the width of the vector execution units. Obviously, that will double the FLOPS for vectorized code. In other news, 8 cores can do twice the work of 4 cores, if your code is multithreaded properly.

    13. Re:Might be important, but probably not... by Anonymous Coward · · Score: 0

      Actually 8 cores can do twice the work of 4 cores, regardless of whether your code is multithreaded or single threaded. 8 separate jobs all running on a single core each is twice as much work as 4 separate jobs all running on a single core each. So long as you don't get page faults, data dependency errors or branch mis-predictions, your pipelines won't crash, and the system runs at full efficiency. Sadly there are page faults, data dependency errors and branch mis-predictions, and so the need to switch pipelines.

    14. Re:Might be important, but probably not... by drinkypoo · · Score: 1

      It will happen if AMD ever manages to make drivers reliable enough that significant numbers of people buy significant numbers of their cards, and nVidia has actual competition.

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
    15. Re:Might be important, but probably not... by TeXMaster · · Score: 1
      OpenCL is suboptimal on NVIDIA only because NVIDIA refuses to keep their support up to date, as it would chip away at their vendor lock-in attempt with CUDA.

      I honestly think everybody doing serious manycore computing should use OpenCL. NVIDIA underperforms with that? Their problem. Ditch them.

      --
      "I'm never quite so stupid as when I'm being smart" (Linus van Pelt)
    16. Re:Might be important, but probably not... by Anonymous Coward · · Score: 0

      What's wrong with AMD's drivers? No, I mean today?

      Yeah, I'm sure you've got anecdotal evidence that some years ago, some drivers for some devices under some operating systems were substandard - but today, AMD's drivers are at least as good as NVidia's drivers, and frankly, AMD are offering better compute performance and a better OpenCL implementation.

      Nobody outside of Slashdot cares that AMD had dodgy drivers once upon a time, because nobody outside Slashdot even knows that AMD had dodgy drivers once upon a time. This was never a big deal to the general PC-buying public.

      The real problem is that almost nobody knows that AMD are offering better compute performance and a better OpenCL implementation, because AMD aren't any good at marketing, and "dude, I think I heard once that CUDA is faster, or something?"

    17. Re:Might be important, but probably not... by Anonymous Coward · · Score: 0

      OpenCL is also suboptimal on NVidia because NVidia's architecture is scalar and not vector.

      If you have a million floating-point numbers to crunch on an NVidia GPU, you'll write code to crunch a million floats, and it'll go fine.

      If you write the same code on an AMD GPU, the docs will tell you to crunch 250,000 vectors instead, and the LLVM compiler will automatically vectorise your code for you anyway, and every AMD execution unit will crunch four floats at a time.

      If you optimise your kernel code for NVidia, there's a good chance you'll accidentally prevent that auto-vectorisation when you try to run it on AMD, and it'll under-perform. For reference, see: just about any benchmark sponsored (one way or another) by NVidia.

      If you optimise your kernel code for AMD, or just don't optimise it at all, there's a good chance it'll run four times as fast as it would've done on NVidia. Most of my early CL stuff was (accidentally) far faster on AMD than NVidia because of this. So now I just use AMD extensions that NVidia don't even support and life is beautiful. NVidia's officially ditched as far as I'm concerned!

      Supposedly, though, AMD are moving toward a more scalar architecture. I hope they know what they're doing.

  7. Re:Would that improve hashing speeds in, say, Bitc by slashmydots · · Score: 4, Informative

    Slightly, but you haven't been keeping up on the latest hardware? My pair of Sapphire 5830 graphics cards would top off at about 435MH/s at a total system wattage of around 520W. The new Jalapeno chips from butterfly labs will do 4500 MH/s using 2 watts total system power. For comparison, my i5-2400 performed 14MH/s at 95W or so. So the Jalapeno is about 321x faster and draws about 47x less power, so combined, I believe that's roughly 15,268x more hashes per watt.
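
    The ratios above can be checked in a few lines. This is only a sketch using the figures quoted in the comment as-is (they mix total system wattage with what is probably a CPU TDP, so it is not a like-for-like measurement); the helper name is made up:

```python
def mh_per_watt(mh_per_s, watts):
    """Hash rate per watt, in MH/s per W (equivalently MH per joule)."""
    return mh_per_s / watts


# Figures taken from the comment, not independently measured.
jalapeno = mh_per_watt(4500, 2)
i5_2400 = mh_per_watt(14, 95)
gpu_pair = mh_per_watt(435, 520)

speed_ratio = 4500 / 14               # raw hash rate, Jalapeno vs i5
power_ratio = 95 / 2                  # power draw, i5 vs Jalapeno
efficiency_ratio = jalapeno / i5_2400  # hashes per watt, Jalapeno vs i5

print(f"speed: {speed_ratio:.1f}x, power: {power_ratio:.1f}x, "
      f"hashes per watt: {efficiency_ratio:.0f}x")
print(f"GPU pair: {gpu_pair:.2f} MH/s per W")
```

    The "combined" number is just the product of the speed ratio and the power ratio, which is the same thing as the ratio of hashes per watt.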

  8. Awesome! by Anonymous Coward · · Score: 0

    I'm gonna buy some i5 "watchacally" chips soon and I'll wait for the price to come down.

    With tech, unless you need it NOW, wait because the price will always come down.

    And I win again!

  9. Nearing complete integration by bstrobl · · Score: 1

    The thing that interests me most about this generation is the progress towards a single chip solution. Ultrabooks and tablets can get a multi chip package with the PCH (last remnant of the old chipset) soldered alongside the CPU/GPU die. Shouldn't take long till everything is fabbed onto one piece of silicon, reducing power requirements and gadget size.

  10. Less rounding of floating point numbers by raymorris · · Score: 4, Informative

    While it's got the expected 10-15% faster for the same clock speed for integer applications, floating point applications are almost twice as a fast HTH

    Integer and floating point are separately implemented in the hardware, so an improvement to one often doesn't apply to the other. You can add integers by counting on your fingers. To do that with floating point, you have to cut your fingers into fractions of fingers - a very different process.
    See: http://en.wikipedia.org/wiki/FMA3
    It's common to have an accumulator like this:

    X = X + (Y * Z)

    To compute that in floating point, the processor normally does:

    A = ROUND(Y*Z)
    X = ROUND(X+A)

    Each ROUND() is necessary because the processor only has 64 bits in which to store the endless digits after the decimal point. FMA can fuse the multiply and the add, getting rid of one rounding step, and the intermediate variable:

    X= ROUND( X + (Y*Z) )

    That makes it faster. Since integers don't get rounded to the available precision, the optimization doesn't apply to integers. The above processor would do Y*Z, then +X, then round, then X=. A CPU designer can make that faster by including either an "add and multiply" circuit or an "add and round" circuit or a "round and assign" circuit. Any set of operations can be done in two clock cycles, if the maker decides to include a hardware circuit for it.
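
    A small sketch of the rounding difference (not the speed difference) between the two forms. It does not call a hardware FMA; the fused version is emulated by doing the arithmetic exactly with `fractions.Fraction` and rounding once at the end, which is an assumption made for portability. The function names and input values are chosen for illustration:

```python
from fractions import Fraction


def unfused_multiply_add(x, y, z):
    """A = ROUND(Y*Z); X = ROUND(X+A): two roundings."""
    a = y * z        # product rounded to a double
    return x + a     # sum rounded again


def fused_multiply_add(x, y, z):
    """X = ROUND(X + Y*Z): exact intermediate, one rounding at the end."""
    exact = Fraction(x) + Fraction(y) * Fraction(z)
    return float(exact)  # int/int division in float() is correctly rounded


# y*z is exactly 1 - 2**-54, which does not fit in a double and rounds to 1.0.
y = 1.0 + 2.0**-27
z = 1.0 - 2.0**-27
x = -1.0

unfused = unfused_multiply_add(x, y, z)
fused = fused_multiply_add(x, y, z)
print("unfused:", unfused)
print("fused:  ", fused)
```

    With these inputs the unfused version loses the small term entirely, while the fused version keeps it.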

    1. Re:Less rounding of floating point numbers by Anonymous Coward · · Score: 0

      Any set of operations can be done in two clock cycles, if the maker decides to include a hardware circuit for it.

      Any set of operations can be done in any number of clock cycles, if your clock cycles are the appropriate length.

  11. wtf? fma3? by convolvatron · · Score: 1, Offtopic

    could someone tell me how many separate instruction sets, pipelines and register files I
    get in a mainline CPU these days? i turned away for a second and completely lost track.

    what happens with the 10 that you aren't using? just sitting there reducing the yield?

    1. Re:wtf? fma3? by Anonymous Coward · · Score: 0

      agree. those idiots at intel have no clue what they're doing.

    2. Re:wtf? fma3? by Anonymous Coward · · Score: 0

      With respect to vector extensions, no they don't--the extensions and improvements have been totally half-assed at every step. To say nothing of yields, it is a total PITA as a developer having to check individually for 107 different features which all should have been standard a decade ago.

      It is a total joke that it has taken this long to get FMA support, and they still don't have a proper vector permute instruction.

  12. Re:Would that improve hashing speeds in, say, Bitc by Anonymous Coward · · Score: 3, Insightful

    Would that improve hashing speeds in, say, Bitcoin?

    Bitcoin is based on SHA256 hashing, which has zero floating point operations. So no, this will not impact Bitcoin mining at all.
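
    To make that concrete, here is a minimal sketch of the double SHA-256 that Bitcoin applies to block headers, next to one of SHA-256's internal round functions written out by hand. Everything in it is 32-bit integer rotates, XORs and masks. The input bytes are a placeholder, not a real block header:

```python
import hashlib

MASK32 = 0xFFFFFFFF


def rotr32(value, shift):
    """Rotate a 32-bit integer right by `shift` bits."""
    return ((value >> shift) | (value << (32 - shift))) & MASK32


def big_sigma0(x):
    """The SHA-256 'Sigma0' round function: integer rotates and XORs only."""
    return rotr32(x, 2) ^ rotr32(x, 13) ^ rotr32(x, 22)


def double_sha256(data):
    """SHA-256 applied twice, as Bitcoin does for block hashes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


header = b"placeholder bytes, not a real block header"
digest = double_sha256(header)
print(digest.hex())
print(hex(big_sigma0(1)))
```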

  13. Re:And that's why by noh8rz10 · · Score: 0

    "Tom's Hardware has published a lengthy article and a set of benchmarks on the new "Haswell" CPUs from Intel.

    Yes, but will it blend?

  14. 128 bit floats: when? by rmstar · · Score: 1

    While speed for single and double floats is all well and good, I wonder - when will there finally be hardware support for 128 bit (quadruple precision) floats?

    1. Re:128 bit floats: when? by godrik · · Score: 1

      What is the use for them? For "personal" use, floats are all you will ever need. Many physics computations stay in single precision to avoid doubling the memory usage. I guess fluid mechanics computations use double, but is there really a use for quad? Who needs that kind of precision?

    2. Re:128 bit floats: when? by Twinbee · · Score: 1

      I would have hoped more bits were given to the exponent in quad precision. It's given 15 bits compared to double precision's 11.

      So many bits, and it almost all goes to the fraction - a real shame.

      --
      Why OpalCalc is the best Windows calc
    3. Re:128 bit floats: when? by gnasher719 · · Score: 2

      While speed for single and double floats is all well and good, I wonder - when will there finally be hardware support for 128 bit (quadruple precision) floats?

      It was there on PowerPC for many years, and with Haswell it will be there for x86 as well. FMA is all you need for efficient 128 bit arithmetic.

    4. Re:128 bit floats: when? by Anonymous Coward · · Score: 0

      [10^-4932,10^4932] isn't a big enough range?
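
      That range follows directly from the 15 exponent bits. A rough sketch of the arithmetic, assuming the usual IEEE 754 layout where the largest finite value sits just under 2**(2**(exponent_bits - 1)); the helper names are made up:

```python
import math


def max_decimal_exponent(exponent_bits):
    """Largest power of ten below the overflow threshold 2**(2**(bits - 1))."""
    return int((2 ** (exponent_bits - 1)) * math.log10(2))


def decimal_digits(fraction_bits):
    """Approximate decimal digits carried by the significand.

    The significand has fraction_bits + 1 bits, counting the implicit leading 1.
    """
    return (fraction_bits + 1) * math.log10(2)


for name, exp_bits, frac_bits in (("double", 11, 52), ("quad", 15, 112)):
    print(name, "max ~1e%d," % max_decimal_exponent(exp_bits),
          "about %.1f decimal digits" % decimal_digits(frac_bits))
```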

    5. Re:128 bit floats: when? by Twinbee · · Score: 2

      It would avoid the need for some extra math with extra-high numbers (not just those that end on a high number, but where an intermediate calculation may be high, e.g. factorial math to find out the probability of something, if I recall). Plus, 96 bits is more than enough for the fraction if you ask me - very greedy in fact to take that to 112 at the cost of 16 bits the exponent could well do with.

      --
      Why OpalCalc is the best Windows calc
    6. Re:128 bit floats: when? by Anonymous Coward · · Score: 0

      Define personal use. Most people use floats through scripting languages. A 32-bit floating point object w/ a 23-bit mantissa is worthless, because more often than not you're doing integer arithmetic. Not being able to represent more than 8 million is pretty limiting. And believe it or not, a 53-bit mantissa isn't that much better.
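
      The integer limits are easy to poke at. A quick sketch, using `struct` to round-trip a value through a 32-bit float (the helper name is made up). Note that the implicit leading bit gives a 32-bit float 24 significant bits, so consecutive integers stay exact up to 2**24 (about 16.7 million), and up to 2**53 for a double:

```python
import struct


def to_float32(value):
    """Round a number to the nearest 32-bit float and return it as a Python float."""
    return struct.unpack("f", struct.pack("f", float(value)))[0]


limit32 = 2**24   # 23 stored fraction bits + 1 implicit bit
limit64 = 2**53   # 52 stored fraction bits + 1 implicit bit

print(to_float32(limit32) == limit32)          # still exact
print(to_float32(limit32 + 1) == limit32 + 1)  # no longer representable
print(float(limit64 + 1) == limit64 + 1)       # same story for doubles
```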

    7. Re:128 bit floats: when? by Anonymous Coward · · Score: 0

      I'm computing the zeros of the Zeta function in the region Im(z) > 10^4932 you insensitive clod!

    8. Re:128 bit floats: when? by ChrisMaple · · Score: 1

      Three years ago I was doing a SPICE simulation (SPICE uses doubles) for a radio receiver. The simulation ran into digital noise before the receiver would have, and it essentially ruined the critical part of the simulation. Software 128 bit floats are unacceptably slow.

      --
      Contribute to civilization: ari.aynrand.org/donate
    9. Re:128 bit floats: when? by Anonymous Coward · · Score: 0

      Is double-double not good enough? It's quite fast. There's even a quad-double library out there.
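
      For anyone who hasn't seen double-double: a value is kept as an unevaluated sum of two doubles (hi, lo), and the building block is an error-free addition. A minimal sketch of that idea, assuming round-to-nearest doubles; this is the textbook two-sum trick with a simplified add on top, not the API of any particular library:

```python
def two_sum(a, b):
    """Return (s, err) with s = fl(a + b) and s + err == a + b exactly."""
    s = a + b
    b_virtual = s - a            # the part of b that made it into s
    a_virtual = s - b_virtual    # the part of a that made it into s
    err = (a - a_virtual) + (b - b_virtual)
    return s, err


def dd_add(x, y):
    """Add two double-double values given as (hi, lo) pairs (simplified variant)."""
    s, e = two_sum(x[0], y[0])
    e += x[1] + y[1]
    return two_sum(s, e)         # renormalise so hi carries the leading bits


tiny = 2.0**-60
print(1.0 + tiny)                          # the tiny term is lost in a plain double
print(dd_add((1.0, 0.0), (tiny, 0.0)))     # the pair keeps it in the low word
```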

    10. Re:128 bit floats: when? by Jeremy+Erwin · · Score: 2

      here's an old paper describing octuple precision on the PowerPC G4

      Many problems in number theory and the computational and physical sciences, especially in recent times, require more floating point precision than is commonly available in fundamental computer hardware. For example, the new science of “experimental mathematics,” whereby algebraic truths are foreshadowed, even discovered numerically, requires much more than single (32-bit) or double (64-bit) precision.

      That paper references Bailey's 2000 paper on Quad double algorithms, which alludes to "pure mathematics, study of mathematical constants, cryptography, and computational geometry."

    11. Re:128 bit floats: when? by Anonymous Coward · · Score: 0

      Factorial math? You mean the gamma function? Or why are you using floating point numbers?

    12. Re:128 bit floats: when? by Anonymous Coward · · Score: 0

      For creating a space game with orbital mechanics.
      I guess right now I need to use 64 bit integers, which can handle 1mm precision over a 100AU range. That is just one silly solar system, what if you want to make a game spanning the galaxy (without resorting to hacks).
      64 bit floats don't even get close to the accuracy needed.

    13. Re:128 bit floats: when? by CSMoran · · Score: 1

      What is the use for them? For "personal" use, floats are all you will ever need. Many physics computations stay in single precision to avoid doubling the memory usage. I guess fluid mechanics computations use double, but is there really a use for quad? Who needs that kind of precision?

      Not all uses are personal and the fact that some physics calculations trade precision for memory doesn't mean that all of them do.

      One example could be matrix inversions with somewhat ill-conditioned matrices. When you know you're going to lose 14 digits of precision inverting the matrix, you'd better have a lot of headroom. Cue quad floats.

      The car analogy that comes to mind is people often do sound mixing with 32-bit audio even though 16-bit audio is perfectly fine for listening to the product.

      --
      Every end has half a stick.
    14. Re:128 bit floats: when? by Anonymous Coward · · Score: 0

      For creating a space game with orbital mechanics.
      I guess right now I need to use 64 bit integers, which can handle 1mm precision over a 100AU range.

      You're off by 3 orders of magnitude. 2^64 mm = 123309 AU = 1.95 light years.

      That is just one silly solar system, what if you want to make a game spanning the galaxy (without resorting to hacks).
      64 bit floats don't even get close to the accuracy needed.

      Sure they do, if you're smart. Who said you had to use a single coordinate system for the entire galaxy? It's a game. Newsflash for you: real game engines almost never implement insanely accurate physical models. They're piles of "good-enough" or "reality actually isn't fun" approximations. A local coordinate system (with a radius of nearly 1 ly!) centered on each solar system would be just fine. Call it a hack if it makes you feel better.

      And if you just have to have a single global coordinate system, go grab a bignum math library (or use a language with one built in) and be done with it. This application doesn't really need super high performance, so implementing big numbers in software is a viable option.

      (Which turns out to be the main reason why you don't see 128-bit FP in hardware. There's not enough demand for high performance high-precision math to justify putting it in.)
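
      The 2^64 mm figure, and what a plain double gives you at 100 AU, can be checked quickly. A sketch assuming the IAU value for the astronomical unit and a Julian-year light year; `math.ulp` needs Python 3.9 or later:

```python
import math

AU_M = 149_597_870_700                 # metres per astronomical unit
LIGHT_YEAR_M = 9_460_730_472_580_800   # metres per light year (Julian year)

# Range of an unsigned 64-bit integer counting millimetres.
span_m = 2**64 / 1000
span_au = span_m / AU_M
span_ly = span_m / LIGHT_YEAR_M
print(f"2^64 mm = {span_au:.0f} AU = {span_ly:.2f} light years")

# Spacing between adjacent doubles for a coordinate 100 AU from the origin.
double_step_mm = math.ulp(float(100 * AU_M)) * 1000
print(f"double spacing at 100 AU: {double_step_mm} mm")
```

      So a single double coordinate in metres resolves to about 2 mm at 100 AU, which is another argument for a local coordinate system per solar system.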

  15. Floating point apps are almost twice as a fast by Anonymous Coward · · Score: 0

    Link translated from the original Italian.

  16. Hmmmm by Anonymous Coward · · Score: 0

    "As you see in the red bar, the task is finished much faster on Haswell. It’s close, but not quite 2x."

    The RED bar is integer not floating point.

  17. Confused? by Narishma · · Score: 1

    The serious [floating point] performance increase has a few caveats: you have to use either AVX2 or FMA3,

    Isn't AVX2 just the integer version of AVX? Like SSE2 added integer versions of the SSE floating point instructions? If so, that sentence doesn't make sense.

    --
    Mada mada dane.
    1. Re:Confused? by godrik · · Score: 1

      No, there is more to it:

              * Expansion of most integer AVX instructions to 256 bits
              * 3-operand general-purpose bit manipulation and multiply
              * Gather support, enabling vector elements to be loaded from non-contiguous memory locations
              * DWORD- and QWORD-granularity any-to-any permutes
              * Vector shifts
              * 3-operand fused multiply-accumulate support

      source: wikipedia http://en.wikipedia.org/wiki/Advanced_Vector_Extensions#Advanced_Vector_Extensions_2

    2. Re:Confused? by Anonymous Coward · · Score: 0

      Isn't AVX2 just the integer version of AVX? Like SSE2 added integer versions of the SSE floating point instructions? If so, that sentence doesn't make sense.

      At a guess, the performance gains probably come from AVX2's introduction of fused multiply-accumulate, which can greatly speed up many common operations such as matrix multiplications.

      AVX2 also adds support for loading SIMD vector elements from non-contiguous memory locations, which is a potential win for certain types of applications.

    3. Re:Confused? by Anonymous Coward · · Score: 0

      AFAIK, you are correct and as usual, the summary is complete nonsense. It seems to me that the few new floating-point vector instructions would only improve performance in very isolated cases. E.g., if you have to load data from non-contiguous locations, you've mostly already lost the battle unless you are going to do a very large amount of work with that vector.

    4. Re:Confused? by Anonymous Coward · · Score: 0

      In fact, for fractals, the article shows a 2x speedup for integers and less for floating-point (with some cryptic explanation that follows).

    5. Re:Confused? by Anonymous Coward · · Score: 0

      ...and still no byte-granularity permute like altivec had, which could permute across two input vectors. Why have such a massive slew of specific instructions, when a few more powerful and general purpose ones would do?

  18. Error! by Anonymous Coward · · Score: 0

    "As you see in the red bar, the task is finished much faster on Haswell. It’s close, but not quite 2x." Sorry to ruin it for everyone but the RED bar is integer not floating point.

  19. Re:Would that improve hashing speeds in, say, Bitc by 0100010001010011 · · Score: 1

    Can the Jalapeno chips do anything else when the Bitcoin market crashes? At least with the video cards I can still drive video with them.

  20. ERROR by xlokix · · Score: 1

    "As you see in the red bar, the task is finished much faster on Haswell. It’s close, but not quite 2x." Sorry to ruin it for everyone but the RED bar is integer not floating point.

  21. Re:Would that improve hashing speeds in, say, Bitc by Anonymous Coward · · Score: 0

    Calculate password hashes? Or collisions?

  22. FMA4 by ssam · · Score: 3, Informative

    Pah. AMD has had FMA4 since 2011.

  23. Poor AMD by Billly+Gates · · Score: 0

    Their new Thunder and Durango APUs are rumored to finally get close to the i7s!

    This will crush them, as AMD's former strength was floating-point calculation, and today its strength is multithreading, or rather it can get close to Intel's performance in multithreaded workloads.

    1. Re:Poor AMD by dshk · · Score: 4, Insightful

      AMD already has FMA3. They also published great results. Of course nobody read it, at least I have seen mentioned it in the usual generic benchmark articles people like to refer (which does not use FMA3).

    2. Re:Poor AMD by dshk · · Score: 1

      I mean "...I have never seen mentioned..."

    3. Re:Poor AMD by Anonymous Coward · · Score: 1

      I thought AMD used FMA4, based on one article I read. But then again, I barely understand what FMA stands for, let alone 4 vs 3 and which one is better.

    4. Re:Poor AMD by eabrek · · Score: 1

      The "n" in FMAn refers to the number of register arguments:
      r1 = fma3(r1 + r2 * r3) vs. r1 = fma4(r2 + r3 * r4)

      FMA4 can save you a move or two, depending on where you've got the accumulator now, and where you want it to be.
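
      A rough sketch of that difference, using a dictionary as a stand-in register file. The function names and register names follow the notation in this comment and are illustrative only; rounding behaviour is not modelled here, just which register gets overwritten.

```python
def fma3(regs, dst, src1, src2):
    """Three-operand form: dst is both the accumulator input and the output."""
    regs[dst] = regs[dst] + regs[src1] * regs[src2]


def fma4(regs, dst, acc, src1, src2):
    """Four-operand form: the result goes to a separate destination register."""
    regs[dst] = regs[acc] + regs[src1] * regs[src2]


# Goal: r4 = r1 + r2 * r3 while keeping the old accumulator r1 intact.

# With the four-operand form this is a single operation.
regs_a = {"r1": 10.0, "r2": 2.0, "r3": 3.0, "r4": 0.0}
fma4(regs_a, "r4", "r1", "r2", "r3")

# With the three-operand form the accumulator would be overwritten,
# so it has to be copied first (the extra move).
regs_b = {"r1": 10.0, "r2": 2.0, "r3": 3.0, "r4": 0.0}
regs_b["r4"] = regs_b["r1"]          # extra register move
fma3(regs_b, "r4", "r2", "r3")
```

      If the old accumulator value is not needed afterwards, the three-operand form costs nothing extra, which is the common case in a multiply-accumulate loop.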

  24. The new min spec by Billly+Gates · · Score: 1

    To get Crysis 3 at 30 fps is here!

  25. lies and bullshit by decora · · Score: 1

    "hey kids, our CPU is twice as fast as the next guys!"*

    *(you must rewrite your code to do twice as much stuff at once)
    **(which has been true for like, 15 years ever since SSE + friends made it into the PC market)
    ***(which means developers have to spend time writing non-portable optimization code)

    1. Re:lies and bullshit by Anonymous Coward · · Score: 0

      And you can gain ten times the performance already by using older and cheaper graphics hardware if all you care about is optimizing floating point performance.

  26. Also by Sycraft-fu · · Score: 1

    Intel's C/C++ and FORTRAN compilers are exceedingly efficient at vectorization, and are of course updated to use their new instructions. Does take a bit for software to be compiled using it, but you can see some real gains in a lot of things without special work.

    I also think people who do GPGPU get a little over-focused on it and think it is the solution to all problems. You find that some things like, say, graphics rendering, are extremely fast on the stream processors that make up a modern GPU. However, you find other things gain far less, and can even be slower. Intel CPUs are very good at mixed tasks, and the better vector units only make that more true.

    1. Re:Also by Anonymous Coward · · Score: 1

      The downside of using Intel's compiler is that it will revert to using the 80286 instruction set if you happen to run the code on an AMD chip.

  27. GT3 by edxwelch · · Score: 3, Interesting

    AMD lost the CPU race a long time ago, but still beats Intel with integrated graphics. Now it looks like Haswell could win that battle too.
    The article shows GT2 to be 15% - 50% faster than the old HD4000. That's still a bit slower than Trinity, but GT3 has double the execution units of GT2, potentially blowing away anything that AMD could offer.

  28. Re:Tom's Hardware = official Intel PR outlet by Aardpig · · Score: 1

    No overclocking? Ermahgerd, that's a showstopper for those wanting to do HPC!!!!!!!!!!!!!!!!!!

    --
    Tubal-Cain smokes the white owl.
  29. Re:Would that improve hashing speeds in, say, Bitc by Anonymous Coward · · Score: 0

    And will you still be using your outdated video cards when that time comes? Perhaps, perhaps not. Sure, it could theoretically still drive video, but if it's not being used anymore, what's the difference?

  30. Re:Would that improve hashing speeds in, say, Bitc by viperidaenz · · Score: 0

    Bitcoins still hold no value to me. No one I deal with accepts them as currency, hence they hold no value.
    I can't pay my taxes with bitcoins, I can't buy food, I can't repay my mortgage, I can't buy petrol. What can I do with a bitcoin?

  31. Re:Would that improve hashing speeds in, say, Bitc by Anonymous Coward · · Score: 0

    I understand that a big question is whether the Butterfly Labs chips actually exist, let alone work.

  32. Morons like this are why you lose shuttles, etc by Anonymous Coward · · Score: 0

    Imagine employing someone as stupid as 'mozumder' in any mission critical situation. He is like the embodiment of the phrase "no-one ever gets fired for buying IBM". A brain-dead clod who is the joy of every shark selling any over-priced brand.

    Correctness in calculation is a computer science and maths discipline. It is NEVER achieved EVER, EVER by relying on the accuracy of any given piece of hardware. Indeed, any company doing real critical work would immediately FIRE any cretin like 'mozumder' who stated "we can trust this hardware".

    "I don't have to know how to do my job properly- that's why I bought a Mac Pro."

    For those who wonder, it is impossible to build a perfectly reliable CPU, and one shouldn't even try. Instead, you build 'good enough' hardware, and use correctly composed software systems to compensate for statistically rare anomalies. ECC memory is largely a marketing gimmick. There are, sadly, hundreds of thousands of places where 'data' can become corrupt in a CPU. Most errors of this type cannot feasibly be detected by built-in hardware solutions. ECC is used simply because it is trivial to add to memory blocks, which represent only the tiniest fragment of all possible logic errors.

    The greatest vulnerability in a modern system is in serial data transports, where the transmission line is driven as fast as possible. However, error correction is always used on these interconnects to enable such high speeds. Ordinary logic is clocked at vastly less troubling speeds, so that the likelihood of failure is statistically very low indeed. Any hardware errors that do then happen can be considered as unavoidable- to be countered by proper software procedures.

    Mission critical calculations MUST be subject to sanity tests. This may involve running the same calculation more than once, running different algorithms that should give the same result, using multiple computers, or calculating reasonable bounds for the expected results.
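
    As a rough illustration of such sanity tests, here is a minimal sketch for one very simple calculation, a sum of floats. The function names and tolerances are assumptions chosen for the example, not a recommended standard: the primary result is computed twice, compared against an independent algorithm (math.fsum), and checked against crude bounds.

```python
import math


def naive_sum(values):
    """Plain left-to-right accumulation (the 'primary' algorithm)."""
    total = 0.0
    for v in values:
        total += v
    return total


def cross_checked_sum(values, primary=naive_sum, secondary=math.fsum,
                      rel_tol=1e-9):
    """Return a sum only if repeated and independent computations agree."""
    if not values:
        raise ValueError("need at least one value")

    first = primary(values)
    again = primary(values)        # same calculation, run a second time
    other = secondary(values)      # different algorithm, same expected result

    if first != again:
        raise ArithmeticError("repeated runs disagree")
    if not math.isclose(first, other, rel_tol=rel_tol, abs_tol=1e-12):
        raise ArithmeticError("independent algorithms disagree")

    # Crude bounds: n * min <= sum <= n * max, with slack for rounding.
    lo = min(values) * len(values)
    hi = max(values) * len(values)
    slack = rel_tol * max(abs(lo), abs(hi), 1.0)
    if not (lo - slack <= first <= hi + slack):
        raise ArithmeticError("result outside expected bounds")
    return other


checked = cross_checked_sum([0.1] * 10)
```

    Running both algorithms on the same machine only catches some classes of fault; the multiple-computer variant mentioned above would need the same comparison done across hosts.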

    The idea that someone could say "Duh, I don't have to bother- we use a Mac Pro with ECC" is so terrifying, people expressing such opinions should probably be identified to ensure they aren't working somewhere where their idiocy and complete lack of maths skills may get someone killed.

  33. Re:Would that improve hashing speeds in, say, Bitc by Anonymous Coward · · Score: 0

    Can I buy gold with bitcoins?
    OMG yes!

  34. bs hype is what this is by Anonymous Coward · · Score: 1

    when avx came out, it was supposed to be a major speedup..
    guess what, lots of things are still faster in SSE2/3

    many of the new registers appear to speed things up, but what isn't readily apparent is there haven't always been improvements in memory ports.

    the major speedups are going to come from cleaning up the way instructions are handled and the memory lanes in the chip, not just throwing more registers at us

    This guy (Agner Fog) is the best reference on the net for what's going on in these chips:
    http://www.agner.org/optimize/blog/read.php?i=142

    1. Re:bs hype is what this is by Anonymous Coward · · Score: 0

      In the article you linked Agner Fog said many positive things about Sandy Bridge, few to no negative things, and began his concluding paragraph with the sentence "It has struck me that the new Sandy Bridge design is actually under-hyped." Yet somehow you have spun this into support for your overly-negative attitude towards AVX. Sad.

      (As Agner said, sometimes improvements in one area of a processor reveal another as a new bottleneck. That this sometimes happened in Sandy Bridge, coincident with the introduction of AVX, does not imply that AVX is useless. It is in fact a major speedup for many types of code, despite your pessimism.)

      p.s. Do try to understand what you're talking about before you criticize. You say "many of the new registers appear to speed things up", but AVX doesn't even introduce truly new registers, just widens existing SSE registers.

  35. You Obviously Never Used Sun Servers W/O ECC by raftpeople · · Score: 2

    In the early 2000s we had some; every week one of them would crash. All the other servers w/ECC: no crash. Hardly a marketing gimmick.

    1. Re:You Obviously Never Used Sun Servers W/O ECC by Anonymous Coward · · Score: 0

      I'm calling you on this one, sparky! You are a lying sonofabitch! I *did so* administer 400 Sun servers when I worked for a government spook house in the late 1990s. None had ECC memory. Everything was remotely administered (what is a network good for anyway?). At some point, an administrator with experience with Microsoft wanted the machines to be 'cycled'. It was mentioned that Unix systems don't normally need to be 'cycled'. Nevertheless, it was determined that any machine that had been running continuously for more than 1500 days would be 'cycled' at the next most opportune time. There were over 150 such machines that didn't need to be moved or have hardware reconfigurations. Occasionally the 'long duty cycle' would bite us in the butt though. There was one very old machine in operation that had been in continuous service for about 30 years. The boss was going away for a few weeks. The day he left, the hard disk crashed. It was an eventful day replicating what the old machine did on a new (much smaller) machine. It was powered off and removed (there was also a power supply issue with the machine, and the vendor had stopped stocking that part more than a dozen years before). The naval reserve took the drive, added it with other drives in a 45 gallon steel barrel, added concrete, and then went on maneuvers about 140 miles offshore, then sent it over the side.

    2. Re:You Obviously Never Used Sun Servers W/O ECC by Anonymous Coward · · Score: 0

      wouldn't it be much simpler to just melt drives at several thousand degrees in some iron smelter

    3. Re:You Obviously Never Used Sun Servers W/O ECC by MiSaunaSnob · · Score: 1

      I would think the naval reserve is likely a bit short on iron smelters, and flush with ships that can randomly drop a drum of concrete.

  36. Re:Would that improve hashing speeds in, say, Bitc by slashmydots · · Score: 2

    You can sell them on the exchange quickly and easily for USD (or 5 other major currencies)

  37. Re:Tom's Hardware = official Intel PR outlet by ChrisMaple · · Score: 1

    Reading your diatribe would lead the naive reader to believe Intel's processors' benchmarks are substantially inferior to AMD's. Now that's comedy.

    --
    Contribute to civilization: ari.aynrand.org/donate
  38. Re:Would that improve hashing speeds in, say, Bitc by slashmydots · · Score: 1

    They had officially classified it as a coffee warmer

  39. Re:Would that improve hashing speeds in, say, Bitc by viperidaenz · · Score: 0

    So I can sell them for less than the cost of power to mine them? There's also the loss associated with amortising and depreciating the hardware required to mine them as well.

  40. Re:Would that improve hashing speeds in, say, Bitc by Anonymous Coward · · Score: 0

    So if these ASICs are as good as claimed why sell them to the general population when they could simply mine their way to a tidy profit?

  41. Prove that and you'll be more famous than Turing by raymorris · · Score: 1

    If you can do that, you'll revolutionize computing. No, doubling the clock to send two ticks to the gates doesn't count - the real clock is defined by the gate speed.

  42. Re:Would that improve hashing speeds in, say, Bitc by Jeremi · · Score: 1

    I can't pay my taxes with bitcoins, I can't buy food, I can't repay my mortgage, I can't buy petrol. What can I do with a bitcoin?

    You can send them to me...

    --
    I don't care if it's 90,000 hectares. That lake was not my doing.
  43. Re:Prove that and you'll be more famous than Turin by Anonymous Coward · · Score: 1

    Who are you to define what counts and what doesn't?

  44. Compilers by fa2k · · Score: 1

    Will gcc use AVX or FMA3 if I write normal code in C++? How about Java and Python/NumPy? Could it be that Python actually gets faster than C++ if gcc doesn't take advantage of these technologies?

  45. Re:Would that improve hashing speeds in, say, Bitc by ultranova · · Score: 1

    Bitcoins still hold no value to me.

    It is interesting how every mention of Bitcoin attracts people saying how they're worthless, useless, or a scam that's about to collapse any second now. It's interesting, because people don't usually spend this much time hating something that wouldn't affect them in any way even if they were right. It's almost starting to seem like a FUD campaign, which leads to a question: who is behind it, the banks, the government, Visa or PayPal?

    --
    Forget magic. Any technology distinguishable from divine power is insufficiently advanced.

  46. Re:Would that improve hashing speeds in, say, Bitc by Anonymous Coward · · Score: 0

    At least with the video cards I cant still drive video cards with them.

    You can't drive video cards with video cards? Dawg, I heard you like video cards...

  47. Re:Would that improve hashing speeds in, say, Bitc by Anonymous Coward · · Score: 1

    So if these ASICs are as good as claimed why sell them to the general population when they could simply mine their way to a tidy profit?

    So if those pickaxes are as good as claimed, why sell them to the general population when they could simply mine their way to a tidy profit?

  48. Oh my! by Anonymous Coward · · Score: 0

    Sounds like someone is....butthurt

  49. Re:Would that improve hashing speeds in, say, Bitc by viperidaenz · · Score: 1

    The purple unicorns are behind it.

  50. Re:Would that improve hashing speeds in, say, Bitc by Anonymous Coward · · Score: 0

    combined, I believe that's 15,267.864x more efficient.

    I'm guessing you're not an engineer. Or a mathematician. Something in marketing, perhaps?

  51. Re:Prove that and you'll be more famous than Turin by Anonymous Coward · · Score: 0

    OP is making a trivial observation about pipeline length versus clock speed which any EE would understand. There's no revolution brewing, except perhaps in your understanding if you think about this for a while.

  52. Re:Would that improve hashing speeds in, say, Bitc by slashmydots · · Score: 1

    When bitcoins hit $3.60 ea and the difficulty was about 1/3 what it is now, I was spending $42 on electricity to get around $45 in BTC. Now the price is $47/BTC and it takes 1/250th the power to generate them 10x as fast but at 3x harder difficulty. Still a hell of a net gain.
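
    A back-of-the-envelope version of that arithmetic, under one possible reading of the figures: the same billing period, the same electricity price, hash rate up 10x, total power draw down to 1/250th, and coin yield scaling as hash rate divided by difficulty. The function and parameter names are made up for the sketch, and hardware cost is deliberately left out.

```python
def mining_estimate(old_price, old_power_cost, old_revenue,
                    new_price, speedup, power_fraction, difficulty_ratio):
    """Scale an old mining result to new hardware, price and difficulty."""
    old_coins = old_revenue / old_price               # coins mined before
    new_coins = old_coins * speedup / difficulty_ratio
    new_power_cost = old_power_cost * power_fraction
    new_revenue = new_coins * new_price
    return new_coins, new_power_cost, new_revenue


coins, power_cost, revenue = mining_estimate(
    old_price=3.60,          # USD per BTC back then
    old_power_cost=42.0,     # USD of electricity
    old_revenue=45.0,        # USD worth of BTC mined
    new_price=47.0,          # USD per BTC now
    speedup=10,              # "10x as fast"
    power_fraction=1 / 250,  # "1/250th the power"
    difficulty_ratio=3,      # "3x harder difficulty"
)
print(f"{coins:.2f} BTC, ${power_cost:.3f} electricity, ${revenue:.2f} revenue")
```

    Under those assumptions the electricity bill is negligible next to the revenue, so the open question raised elsewhere in the thread is the cost and availability of the hardware, not the power.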

  53. Re:Would that improve hashing speeds in, say, Bitc by viperidaenz · · Score: 1

    Half the bitcoins that will ever be mined have already been mined. If this were ever to be widespread, how would more than 21 million people be able to take part? That's only 0.3% of the world. Less than 10% of the USA. Less than one coin per Australian (there's 22 million of those buggers).

    As soon as you start using fractions of coins, you're introducing traditional banks into the picture. Single points of failure in what used to be a distributed system.
    Scams and fraud shouldn't be too hard either. If you hijack the local wifi spot while you're trading a coin with someone and you control all the peers accessible in that spot, who's to say the coin really changed hands? Remove knowledge of the transaction from the peers and no one will know about it until that other guy gets another connection. You could trade the same coin many times.

  54. Re:Would that improve hashing speeds in, say, Bitc by Anonymous Coward · · Score: 0

    Sometimes, they laugh at you because you really are Bozo the Clown.