Slashdot Mirror


Ask Slashdot: Why Are There No Huge Leaps Forward In CPU/GPU Power?

dryriver writes: We all know that CPUs and GPUs and other electronic chips get a little faster with each generation produced. But one thing never seems to happen -- a CPU/GPU manufacturer suddenly announcing a next generation chip that is, say, 4-8 times faster than the fastest model they had 2 years ago. There are moderate leaps forward all the time, but seemingly never a HUGE leap forward due to, say, someone clever in R&D discovering a much faster way to process computing instructions. Is this because huge leaps forward in computing power are technically or physically impossible/improbable? Or is nobody in R&D looking for that huge leap forward, and rather focused on delivering a moderate leap forward every 2 years? Maybe striving for that "rare huge leap forward in computing power" is simply too expensive for chip manufacturers? Precisely what is the reason that there is never a next-gen CPU or GPU that is, say, advertised as being 16 times faster than the one that came 2 years before it due to some major breakthrough in chip engineering and manufacturing?

75 of 474 comments

  1. One word by sl3xd · · Score: 5, Insightful

    Physics

    --
    -- Sometimes you have to turn the lights off in order to see.
    1. Re:One word by sl3xd · · Score: 5, Informative

      To elaborate: We can't reliably clock Silicon much faster than we're doing right now.

      There are other semiconductors (such as GaAs) which can operate reliably at higher frequencies, but they are absurdly expensive, produce too much heat, consume too much power, and so on -- not to mention the fact our tiny process sizes for silicon don't exactly work for entirely different materials (chemistry bites again).

      We're running into a similar wall for die shrinkage, on multiple fronts:

        - We're getting into the size territory where bits flip due to quantum tunneling, which tends to hurt reliability. Flash storage has started to reach that territory, if my colleagues working for ${SSD MANUFACTURER} are telling me the truth.
        - Yields of working units are going down significantly as the die shrinks, and it's taking a lot longer to figure out how to bring yields back up.

      In the end, every material has its limits, and we're starting to run into them with Silicon, and there isn't a material that 'stands out' as worth betting the business on.

      --
      -- Sometimes you have to turn the lights off in order to see.
    2. Re:One word by DontBeAMoran · · Score: 4, Insightful

      The problem is that each new generation of programmers is lazier than the one before them. All the increased CPU power is wasted on bloated libraries, OS processes, etc.

      --
      #DeleteFacebook
    3. Re:One word by Chronus1326 · · Score: 2

      No, the reason is marketing. If you were Intel and had created a processor 4-8 times faster in workload than current ones, would you sell it? Or would you sell a lesser model that was only 2x as fast, then next year release the 3x-as-fast edition? You can make this stretch out for 10 years on one development. And if the military makes it we'll never see it.

    4. Re:One word by mjwx · · Score: 2

      Physics

      Yes and no.

      Yes as in there is a limit to what we can do with silicon and transistors, but also no because of the way innovation tapers off after a few decades. It's the same reason that we don't see huge leaps in car, aeroplane and oven technology. It's because the design has matured to a point where for the most part we're just adding minor improvements to tried and tested designs. Intel/AMD/NVIDIA have pretty much reached this point and it will take a disruptive technology to change that.

      Said disruption will likely be non-silicon or transistor based computers. However it will take 10-25 years for it to go from working prototype to household device. Computers were first built back in the 40's... we didn't get them into the home until the 70's. Even then the diffusion of innovation meant it was another 20 years before they were commonplace. We're up to the 90's if you're not keeping score. 27 years later, there isn't really anything that can be done to radically change existing designs, the last big innovation was changing to 64-bit and that was done in the early 00's. Much like with cars, all they are doing now is making minor changes, however over the course of a decade, these minor changes tend to make a big difference.

      Also, this is why I didn't care about getting Skylake over Kaby Lake or a Geforce 10 over a Geforce 9 when I built my new gaming rig mid last year. I knew the difference would be minor (and it's easy to upgrade a GPU in 2 or 3 generations when a difference can be noticed).

      --
      Calling someone a "hater" only means you can not rationally rebut their argument.
    5. Re:One word by hairyfeet · · Score: 4, Informative

      I would also argue the money isn't there to make the insane investment required to make that "next leap" in chip tech.

      If you look at reports on how old the average PC in the field is? You are looking at 4-7 years and the reason why is obvious...software hasn't kept up with hardware. Even gamers these days can have 7+ year old CPUs and play the latest games at 1080P so there really isn't a huge market** just begging for a new CPU as sales from Intel and AMD have shown. Now if some new "killer app" like the next Lotus 123 comes up? This may change but so far I've seen nothing on the horizon that would fill that slot.

      **.- Before any of you that do some niche job like wave simulation or 3D rendering scream "But I needs moar power!"? You and your ilk are less than 2% max of the market, just too niche to be a big enough market to support the insane amount of R&D required to make that next leap.

      --
      ACs don't waste your time replying, your posts are never seen by me.
    6. Re:One word by Roger+W+Moore · · Score: 2

      Computers were first built back in the 40's... we didn't get them into the home until the 70's.

      The computers built in the 1940's used valves, not silicon. The first transistor-based computers were in the early 1950's so that's when the clock should start ticking since valve-based computers were clearly never going to be a consumer item. The same may be true of the next generation of computer technology - the current tech for quantum computers is not really consumer friendly if that turns out to be the next generation technological platform.

    7. Re:One word by geoskd · · Score: 5, Insightful

      The problem is everyone is hell bent on smaller for the sake of performance. and it's stupid. dont make smaller, make bigger.

      There are a whole host of problems with that.

      First and foremost, physics strikes again with the speed of light. Pretty much all modern processing is done synchronously, which means that it requires a clock signal that changes everywhere at the same time. As you expand the size of the processor, suddenly things get out of sync. There are ways to fight this, but they are tricky and don't scale well.

      Second, as die size increases, power consumption increases faster. All the current your processor draws passes through some parasitic resistance in getting there. The bigger the die, the more parasitic resistance. If you take a chip that draws 50 watts and put two of them on a die, the power draw is now 105 watts, because the new chip draws more than 50 watts (it has to pull power through a slightly longer set of wires, as does the original one).

      Third, cost. The single most important factor in processor cost is yield. Any given silicon wafer will have a certain number of defects on it that will render any chip at that location unusable. If you get on average two defects per wafer, and you have 100 chips on a wafer, then you get 98 good chips and two bad ones (98% yield). If you have two defects per wafer and there are only 10 chips on that wafer, you get 8 good chips and two bad ones (80% yield) (gross over-simplification).
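
      The yield arithmetic above can be sketched in a few lines of Python. This keeps the same gross over-simplification (each defect ruins exactly one distinct chip). The function names are made up for illustration, and the Poisson variant is a common textbook refinement added here, not something the comment itself uses.

```python
import math

def simple_yield(chips_per_wafer: int, defects_per_wafer: int) -> float:
    """Fraction of good chips, assuming each defect ruins exactly one distinct chip."""
    bad = min(defects_per_wafer, chips_per_wafer)
    return (chips_per_wafer - bad) / chips_per_wafer

def poisson_yield(chips_per_wafer: int, defects_per_wafer: float) -> float:
    """Fraction of chips with zero defects when defects land at random (Poisson model)."""
    defects_per_chip = defects_per_wafer / chips_per_wafer
    return math.exp(-defects_per_chip)

# Same wafer, same two defects, different chip sizes (100 small chips vs 10 big ones).
for chips in (100, 10):
    print(chips, simple_yield(chips, 2), round(poisson_yield(chips, 2), 3))
```

      Either way the trend is the same: the bigger each chip, the larger the share of the wafer a fixed number of defects throws away.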

      There are a whole cadre of other issues that chip designers and manufacturers have to deal with such as interconnects and shared resources, etc...

      --
      I wish I had a good sig, but all the good ones are copyrighted
    8. Re:One word by geoskd · · Score: 4, Interesting

      Something about less distance making for faster signaling

      Actually, it has very little to do with the distance. The single biggest speed improvement in die shrink comes because the gate capacitances are smaller due to smaller footprint, and as such the gate charge / discharge time is shorter. The shorter distances do have a small effect as well, but the primary effect is due to the gate capacitance.
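
      A back-of-the-envelope sketch of that effect, in Python. It treats the gate as a parallel-plate capacitor charged through a fixed resistance. The dimensions, the 10 kOhm drive resistance and the function names are all assumptions chosen for illustration, and real transistors do not scale this cleanly.

```python
import math

EPS0 = 8.854e-12  # vacuum permittivity in F/m

def gate_capacitance(width_m, length_m, oxide_thickness_m, rel_permittivity=3.9):
    """Parallel-plate estimate: C = eps * area / thickness (3.9 is roughly SiO2)."""
    return rel_permittivity * EPS0 * width_m * length_m / oxide_thickness_m

def charge_time(r_ohms, c_farads, fraction=0.5):
    """Time for an RC node to reach `fraction` of its final voltage."""
    return -r_ohms * c_farads * math.log(1.0 - fraction)

# Hypothetical gates: halve both width and length, keep the oxide thickness fixed.
big = gate_capacitance(200e-9, 90e-9, 2e-9)
small = gate_capacitance(100e-9, 45e-9, 2e-9)

# Drive resistance held constant purely to isolate the capacitance effect.
print(charge_time(10e3, big), charge_time(10e3, small))
```

      Halving both gate dimensions cuts the plate area, and so the capacitance and the charge time in this toy model, by a factor of four.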

      --
      I wish I had a good sig, but all the good ones are copyrighted
    9. Re:One word by mjwx · · Score: 2

      Computers were first built back in the 40's... we didn't get them into the home until the 70's.

      The first transistor-based computers were in the early 1950's so that's when the clock should start ticking since valve-based computers were clearly never going to be a consumer item. The same may be true of the next generation of computer technology - the current tech for quantum computers is not really consumer friendly if that turns out to be the next generation technological platform.

      Fair enough, it's 2:30 where I live and I didn't feel like reading the Wikipedia article on computers to get exact dates. However that's still 20 years from prototype to home product so I stand by my point.

      I forget where I read it, (back in high school, which is a while ago for some of us) but it takes 25 years from the point where a new technology becomes available for it to integrate into our lives. Replacements for silicon are largely still theoretical.

      --
      Calling someone a "hater" only means you can not rationally rebut their argument.
    10. Re:One word by narcc · · Score: 2

      So they claim... I've seen perfectly mundane software that's more than 100x larger than older software that still somehow manages to do less than older versions.

      That is, equal or lesser complexity, dramatically larger size, unimaginably worse performance.

      I blame the attention paid to "do everything" libraries and frameworks used because they're popular, not because they add value. The defense is always "don't reinvent the wheel" and "if we want to add this or that someday" or some variation of the two. If we didn't reinvent the wheel, we'd all be driving Flintstone mobiles. As for the defense against the future defense, it's not going to happen. That never happens. It never happens because the added unnecessary complexity is guaranteed to make your software less, not more, flexible. Stop doing that!

      Stick with small, special purpose, libraries. Your users will thank you.

    11. Re:One word by angel'o'sphere · · Score: 4, Interesting

      While the "end effect" is true, it has nothing to do with laziness.

      Paying a programmer is expensive. The employer would rather have you finish quickly and sell your work early, with "drawbacks" such as more memory usage and less speed.

      And the real culprits are the marketing droids who think programs and OSes need a new UI experience every few years. A great deal of programming effort is wasted on that, and the resulting bloat does not bring any value to the users.

      --
      Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.
    12. Re:One word by thinkwaitfast · · Score: 4, Interesting

      It's really capacitor charge time. In CMOS technology, you basically have a metallic plate (the gate) sitting on some semi conductor (separated by an insulator).

      As electrons flow into the 'plate', they accumulate. This creates an electric field which pushes electrons in the semiconductor away creating a channel of 'holes'. It's through this channel that electrons can flow (drain to source). Note that the electrons moving through the CMOS gate are typically sent to another transistor. And as soon as that plate fills up with electrons, current stops flowing through the device. And since power = current x voltage (IxV), you only dissipate power while the device is switching and this is why there is more current drain (and heating) the faster that you switch. Leakage current blah blah disclaimers.
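
      The "more heating the faster you switch" point is usually written as the CMOS switching-power relation P = a * C * V^2 * f. A minimal Python sketch follows. The capacitance, voltage and clock values are invented round numbers, not figures from any real chip, and leakage is ignored, as in the disclaimer above.

```python
def dynamic_power_watts(c_farads, v_volts, f_hz, activity=1.0):
    """Switching power of CMOS logic: activity * C * V^2 * f (leakage ignored)."""
    return activity * c_farads * v_volts ** 2 * f_hz

# Illustrative only: 1 nF of total switched capacitance at 1.0 V.
base = dynamic_power_watts(1e-9, 1.0, 3e9)   # 3 GHz clock
fast = dynamic_power_watts(1e-9, 1.0, 6e9)   # 6 GHz clock
print(base, fast, fast / base)
```

      Doubling the clock doubles the switching power in this model, and because voltage enters squared, any voltage bump needed to reach the higher clock costs more still.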

      CMOS Transistor

    13. Re:One word by thinkwaitfast · · Score: 3, Interesting

      The smaller the transistor, the smaller the metal plate (the gate), so it charges faster, creating the channel faster allowing for faster switching times.

    14. Re:One word by coastwalker · · Score: 2

      The power problem is not the cost of the power. The problem is that any machine that does work also generates waste heat according to the laws of thermodynamics. So a chip that uses more power generates more heat which has to be got from the chip to outside the package and dissipated into the environment. If the heat is not removed then the chip temperature increases to the point where so many thermal electrons are generated the chip no longer works.

      The issue is also affected by the linewidth shrinks used to reduce die sizes and increase the number of die per wafer. The reduced die area cannot handle the same power as the previous generation without getting hotter. So each new generation of chips has to use less power than the previous one. A great deal of the complexity of the manufacturing process and the materials used are involved in achieving this power reduction.

      The optical problem of printing smaller features is only a portion of the difficulty and cost of going from one chip generation to the next. The brick wall that silicon is facing ending Moore's law is as much due to the power dissipation problem as it is the quantum tunneling leakage problem at ever smaller linewidths. Otherwise we could just go 3d and stack up more transistors on top of each other.

      Gallium Arsenide chips have a higher electron velocity and mobility so they switch faster than silicon but going from silicon to GaAs is a single linear increase in performance. Moore's law is an exponential increase doubling compute power every eighteen months achieved largely from reducing linewidths. There is also the problem that GaAs cannot make CMOS circuit designs at the highest speeds so logic consumes more power than silicon. GaAs is used in applications where the 250GHz maximum speed is needed, silicon has a maximum speed of around 5GHz.
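
      To put the "single linear increase versus exponential doubling" comparison into numbers, here is a small Python sketch. The 250 GHz and 5 GHz figures and the eighteen-month doubling period are taken from the paragraph above as given; the function names are made up for illustration.

```python
import math

def moores_law_factor(years, doubling_period_years=1.5):
    """Cumulative growth from doubling once every `doubling_period_years`."""
    return 2.0 ** (years / doubling_period_years)

def years_to_match(one_time_speedup, doubling_period_years=1.5):
    """Years of steady doubling needed to equal a single fixed speedup."""
    return doubling_period_years * math.log2(one_time_speedup)

# 250 GHz vs 5 GHz are the figures quoted in the comment, not measured values.
one_time = 250 / 5
print(one_time, years_to_match(one_time), moores_law_factor(10))
```

      With those inputs, the one-off 50x gain is matched by roughly eight and a half years of eighteen-month doublings, after which the exponential keeps going and the material swap does not.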

      --
      Facts are history now plebs have politics for religion on social media.
    15. Re:One word by rtb61 · · Score: 2

      Not the physics you are thinking of, though. Let's use the favoured Slashdot car analogy. Why are cars no longer getting more and more powerful? Why is the average consumer car not capable of 300 km per hour? Because that speed cannot be safely or legally used and it is a waste of energy and resources; it serves no purpose except to allow someone to stroke the ego whilst stroking their private parts. Sure, there is a market for it, a very tiny market (heh heh), but not sufficient to continue development. Most people want more fuel efficiency, cheaper running costs and a more comfortable experience.

      So computers have hit the wall: extra computing performance in the consumer market only serves to play games, the only element of the consumer market that needs the power. (Moreover, realism has hit the wall too, because for the majority, yuck: too-realistic violence just puts people off, yet a more realistic environment is more fun. Think cartoonish beings in a realistic world.)

      So high-end desktop power in a small form factor (phone or tablet) and a low-cost, big-screen all-in-one are the only real targets left for the main consumer market. For the business market, a single current high-end gaming computer used as a server hooked to terminals would do for most small to medium businesses, for accounting and admin. The demand for power in the business market is in CAD, CAM, finite engineering, science, molecular engineering et al., a very limited market relative to the consumer market, and one cost-effectively served by hooking up elements of consumer PCs to combine power in a grid computing approach. (Does that or does it not mean Linux wins on the desktop, when you have tens of thousands of consumer machines hooked together in a grid, and many, many grid supercomputers, because a custom-built supercomputer is simply too expensive and if you want more power you just add more low-end desktops, well, elements of them?)

      Central high-power cloud machines are just a disaster waiting to happen; how many times does this have to be proven? A highly distributed computer grid is far more durable and reliable, with only tiny parts going down relative to the whole, rather than the whole central power unit collapsing. The cloud will die so badly and for so long when we get hit with the next solar storm. Utter disaster, and it could be avoided but for greed and stupidity.

      --
      Chaos - everything, everywhere, everywhen
    16. Re: One word by Miamicanes · · Score: 2, Interesting

      Moore's Law is merely an observation about the NUMBER OF TRANSISTORS on a die, not computing power.

      In theory, Intel and AMD could probably make 64-core CPUs a retail reality within a year or two... but with current programming languages, it would be almost pointless. Multithreading exists, of course... but few apps besides raytracing can genuinely put it to good use. As a practical matter, 99% of the benefit from having multiple cores comes from being able to run Windows UI threads on one core, and whatever app is in the foreground on the second. With Windows itself spinning off API calls from the app onto cores 3 and beyond when it gets the chance to do so.

      Windows does a decent job of passively putting multiple cores to good use, but its ability to do that mostly depends upon having access to the benefits of x86/AMD64 architecture at its disposal. Historically, Linux has done a TERRIBLE job of passively putting multiple cores to good use without explicit multithreading attempts by the app's programmer, due to two main lines of reasoning:

      a) If the developer intended for the program to multithread, he would have written it that way... IMHO, more of an excuse, because the fact is, current programming languages aren't all that great at handling concurrency without major gymnastics. 25 years ago, we had spaghetti code as an anti-pattern. Now, we have a rat's nest of spaghetti async threads that are almost impossible to grasp without referring to a wall-sized UML diagram.

      b) Many of the things Windows does to passively put multiple cores to good use depend upon the "strong" memory model of x86/AMD64 CPUs. I believe this is actually the biggest current reason why Linux doesn't try as hard as Windows to passively multithread apps written to be single-threaded... and the reason why there were several entire generations of multi-core Android phones that didn't actually do a single damn thing with the extra cores besides brag about them in the marketing literature. Basically, on x86/AMD64, if thread #1 updates a byte of ram, thread #2 running on another core attempts to read it, the CPU will automagically make the change instantly visible to the second core. On ARM, that's rarely/never the case. With a language like Java, x86/AMD64 hides lots of programming sins that will cause the exact same code to crash and burn on ARM.

      TL/DR: we COULD have CPUs with a lot more cores than 4, 6, or 8... but current software wouldn't put them to good use, so there's almost no real market for them besides servers and mainframes.

    17. Re: One word by Bing+Tsher+E · · Score: 2

      It really is transistors, all the way down. Unless you switch Vdd and Vss inadvertently, and then it's all carbon.

      Speaking of which, plug it in backwards and you, too, can have a Light Emitting EPROM.

    18. Re:One word by dbIII · · Score: 2

      While that is true there are a lot of things where working in parallel is trivial. Image processing is one that most readers will be familiar with. How hard is "apply this filter to every pixel"? You don't care what order it gets done in so long as you get the entire result dumped somewhere. There are a lot of things in science and engineering with that same sort of approach of applying the same transformation to every item in a dataset. While there is plenty of stuff that can't be done in parallel there is a lot of untouched "low hanging fruit" that is.
      Currently the situation seems to be you pick Xeons for single threaded jobs and AMD if you need a huge number of cores (64 cores and 1TB of memory per node is cheaper than you would think). For those lucky enough to have tasks that don't need a lot of memory the GPU stuff gives you a vast number of cores.
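
      As a rough illustration of the "apply this filter to every pixel" case, here is a minimal Python sketch. The tiny image, the invert filter and the function names are all made up for the example. It uses a thread pool to keep the sketch short; in CPython a CPU-bound filter would need `ProcessPoolExecutor` (same `map` interface) or a GPU to actually run on several cores.

```python
from concurrent.futures import ThreadPoolExecutor

def invert_row(row):
    """The 'filter': invert each 8-bit pixel. Each row is independent of the others."""
    return [255 - px for px in row]

def apply_filter(image, workers=4):
    """Apply invert_row to every row; the rows may be processed in any order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() hands rows to the workers and returns results in input order.
        return list(pool.map(invert_row, image))

image = [[0, 64, 128], [255, 10, 20]]
print(apply_filter(image))
```

      No row needs anything from any other row, so the work splits with no locks and no coordination beyond collecting the results.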

    19. Re:One word by TechyImmigrant · · Score: 3, Interesting

      - Yields of working units are going down significantly as the die shrinks, and it's taking a lot longer to figure out how to bring yields back up.

      In the end, every material has its limits, and we're starting to run into them with Silicon, and there isn't a material that 'stands out' as worth betting the business on.

      So, Moore's Law is dead.

      Moore's law remains a remarkably correct prediction. However the prediction is concerning both feature size and cost and it predicts the costs rising in pretty much the fashion they have. It's exponential.

      However in terms of computer power, the vast majority of the increases in computer power have been architectural, not from process improvements. If we stopped at 10nm and never went below that, computers would continue to get faster. I am aware of techniques that will continue to improve the processing speed of CPUs. They are not feature size improvements. They will come out in due course. But feature size is not limited by our ability to push feature size. It's limited by the cost of reducing it. Who's going to drop $100,000,000,000 on a fab in 5 years to get below 5nm? Other techniques become more effective per unit dollar.

      We push these things on all fronts. I've seen some pretty crazy schemes and I've seen some fail and some succeed.

      My personal opinion as someone who works on these CPUs is that the recent (4-5 years) slowing of CPU power increase (note that improvement in instructions per Joule hasn't slowed) is going to change. New things will come down the line that will dramatically increase the speed of doing stuff. It's happened with specific workloads like graphics, or crypto or RNGs or disk I/O. Other things will continue to improve as attention is spent on improving them.

      Notice how your CPU isn't awesome at DSP, but there are plenty of DSP oriented CPUs that blow any general purpose CPU out of the water on those tasks. There are datapath oriented architectures that can move data faster than any general purpose CPU sitting in big iron routers everywhere. As the demand for specific workloads change, the general purpose CPUs will follow.

      --
      I should use this sig to advertise my book ISBN-13 : 978-1501515132.
    20. Re:One word by dbIII · · Score: 2

      The critical flaw size decreases as you shrink the die.
      Thus with all else being equal more flaws. Those flaws that previously were too small to matter now do.
      I hope I made that clear enough.

    21. Re: One word by Cesare+Ferrari · · Score: 2, Insightful

      I'm not sure what world you are living in, but in the one I'm in, we have CPUs with a lot more cores than 4, 6 or 8.

      For starters, mainstream Intel dual socket supporting processors have 22 core options - E5-4669 v4 for example. So, you can get 44 cores into a dual socket machine.

      Sun/Oracle got into this game in a big way with their T series processors, and blurred threads vs cores (in a very interesting way), so produce things like the T5 with 16 cores and 128 threads - it's like hyperthreading, but very cleverly done, so instead of relying on out of order execution to keep the execution units humming, you use multiple threads. Of course you can get multi-socket machines for these too, so you can get a T5-8, so 8 sockets (128 cores, 1024 threads).

      So, high core count is out there, you just have to look a little further than Intel processors aimed at the desktop market.

    22. Re:One word by Blaskowicz · · Score: 2

      The original statement was more about bad yields on newer processes. I.e. if your yields are only about 10% on 10nm trying to make a desktop chip then it's a terrible yield no matter what. If your yield for the former process was 80% for the same area then even if your new die is a much smaller version of the former one, it will have a worse yield overall. Wafer cost and design cost also increase.

      The 10nm process has to be improved over time to get economical. That's business as usual but it's taking more and more time, perhaps years to get up to speed. Same thing happened with the 16/14 nodes. So, what comes out first are high end mobile chips - highest end phones and then Intel's Core M for high margin ultra thin laptop things. Meanwhile lower end phones still are on 28nm, not even 16-ish nm.

      Even flash memory improvements seem to be slowing down, but that may be that demand is huge and increasing.
      E.g. new phones are coming with 16GB flash, though 32GB would be faster and more reliable. (Some phones are sold in two variants: 2GB/16GB and 3GB/32GB. Not yet putting 32GB as standard.)

    23. Re: One word by Half-pint+HAL · · Score: 2

      Yes, but the inefficiency in the modern frameworks is the failure to prune unused code at compile-time, typically because the framework is delivered as a pre-compiled binary blob. If your stopwatch app includes an entire raytracing 3D engine that is capable of rendering massively detailed immersive worlds, and all you're using it for is to project a realistic shadow on the clock face from the hands, that's inefficient in terms of storage, even if it's efficient from the perspective of the labour required to produce it.

      --
      Got them moderator blues I blieve I walk out the do', With these mod-points I been gettin', I 'most never post no mo'
    24. Re: One word by dannys42 · · Score: 2

      FYI, Apple's languages/tools (like GCD, Swift, and OperationQueues) make it very easy and manageable to take advantage of concurrent programming. (At least compared to other systems I've seen.)

    25. Re:One word by AntiSol · · Score: 2

      The frameworks themselves aren't bloated because of laziness (generally, per se), but the programs using these frameworks are bloated due to laziness.

      e.g: You need to write a program which does 2 or 3 nontrivial but common tasks. You could write your own or research and use 2 or 3 lightweight and efficient libraries for those specific tasks, but that would be effort, so you use a framework you've worked with before which has the 2 or 3 things you need plus 50000 other features. And that's how you end up with a "hello world" program targeting .net and loading 100Mb or so of libraries into memory before it does anything.

      However the framework people aren't blameless: they've been lazy by not providing a mechanism to only include the parts of the framework that you want: rather than saying 'include framework', I should have to say 'include core; include crypto; include database'. And when packaging my program there should be a way to only bundle the bits I need. But this isn't the way we do it, because:

      - Making our framework do that is effort. And totes not sexy. I'd rather implement a new templating engine or a library for iThisMonthsFad(TM). Yeah, we already have one, but oh look - squirrel!
      - If we don't include them in the core, we'll get a bunch of (lazy) people complaining about how much effort it is ("and soooo mid-late 2000s lol!") to type 'include database'. Or they'll use some other framework.
      - Most programs use crypto and database anyway, right? So we should just include them in the core. It'll only add a megabyte or so to those lightweight programs which don't use it.
  2. Market by Shaman · · Score: 4, Informative

    Most likely, there is no major competition in the market, and PC sales on the whole have slowed considerably. A modern 6800K processor is as close as you'll come to a leap forward, but it's $1100 Canadian and requires a similarly expensive motherboard + memory. Same with similar chips.

    Meanwhile the cheapest system on the market is as fast as a moderately high-grade enthusiast computer from 2010 and probably has reasonable 3D graphics onboard. With an SSD it will feel quite snappy.

    So, a) not a lot of market demand for faster systems, b) lots of tablets and game consoles for entertainment out there, c) moderately faster systems exist but cost keeps them low-volume, d) very low-percentage demand for faster computers - definitely less than 1% that will pay a premium for it, e) the majority of gamers are young-ish and they play largely twitch games even on PCs which are more GPU limited than CPU limited.

    --
    ...Steve
    1. Re:Market by Billly+Gates · · Score: 4, Interesting

      Dude, gamer GPUs are increasing in performance incredibly fast. They double in speed every 2 years. The only reason the desktop is not innovating is because Intel has a monopoly and won. But that is changing starting with Kaby Lake thanks to AMD Ryzen. It is back to 15% every year again and maybe even more, as graphics shows no slowdown anytime soon.

      Shoot, for $185 you can get what a $399 card did in late 2014/2015 at all max settings in games.

    2. Re:Market by Kjella · · Score: 2

      Most likely, there is no major competition in the market, and PC sales on the whole have slowed considerably.

      Sorry, but I think this is plain wrong because they're always working to lower their own cost. Even in the absence of competition if Intel could make a processor twice as fast, they'd make it half the size and sell the same performance at a much higher profit margin. And while the PC market has shrunk it's still 270 million PCs/year or about 75% of its all time high, it's a huge market even if it's not a growth market anymore.

      --
      Live today, because you never know what tomorrow brings
    3. Re:Market by Mashiki · · Score: 2

      GPUs are increasing incredibly fast because of a couple of reasons. First, they're not anywhere close to the same die size as a CPU. They're roughly 2 generations behind CPUs in shrinking, which means the tolerances can be off without making a huge difference, and they can "run wild" without the danger of causing errors, while still benefiting from all the advances that AMD and Intel have gone through with each die shrink. The second is that GPUs are able to increase their die size and transistor count, as well as having very specific instruction sets compared to a CPU. They also don't have to have on-die caching, which takes up very valuable real estate on the silicon itself, so that extra space (up to 1/3) with each shrink can be dedicated to specialized instruction sets or more transistors, which further take the load off the CPU.

      --
      Om, nomnomnom...
    4. Re:Market by Blaskowicz · · Score: 3, Informative

      Sadly low power dedicated graphics cards aren't being made, due to integrated graphics removing the OEM market for it. The lone exception is geforce GT710 (and the GT610 before that) with a 19W TDP, and a somewhat rare nvidia GPU (GM108) on some ultrabooks.
      Either AMD or nvidia could make a low power GPU like that with the latest technology and some LPDDR or DDR4 memory, if they so wished.

      nvidia almost released a 15W graphics card with a Maxwell GPU
      http://wccftech.com/nvidia-gef...

    5. Re:Market by Macman408 · · Score: 2

      That's not true; GPUs basically always use the latest process technology available, just like CPUs. Recently, there have been some degenerate cases where a new process is (at least initially) slower and more expensive than the previous one; but in general, they always move to the latest and greatest process, once that process is capable of making a better product.

      As for die size, the big GPUs are way bigger than CPUs. A 22-core Xeon Broadwell E5 from 2016 is 7.2 billion transistors, and 456 mm^2. The NVIDIA GP100 chip (also 2016) is 15 billion transistors, and 600 mm^2. The AMD Ryzen (2017) info I can find says it's (probably up to) 4.8 billion transistors.
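
      For a rough sense of scale, the figures quoted above can be turned into transistor densities. This is only a back-of-the-envelope sketch in Python: the counts and areas are the ones from this comment, and the function name `density_mtr_per_mm2` is made up for illustration.

```python
# Figures as quoted in the comment: (transistor count, die area in mm^2).
chips = {
    "Xeon Broadwell E5 (22-core)": (7.2e9, 456.0),
    "NVIDIA GP100": (15e9, 600.0),
}

def density_mtr_per_mm2(transistors, area_mm2):
    """Return transistor density in millions of transistors per mm^2."""
    return transistors / area_mm2 / 1e6

for name, (count, area) in chips.items():
    print(f"{name}: {density_mtr_per_mm2(count, area):.1f} MTr/mm^2")
```

      On those numbers the GP100 comes out both larger and denser than the Xeon, though the two are built on different processes at different fabs, so the comparison is only indicative.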

      I have no idea what you mean by "tolerances". Maybe you mean "process variation", which is a natural part of any semiconductor manufacturing - and is controlled by the fab (TSMC, GlobalFoundries, Samsung, Intel), not the chip designers (Apple, NVIDIA, AMD, ummm Intel again). The design houses ship off the chip they want - and the fab produces it, with some chips a little hotter/faster than others. Over time, they can tighten up the process so it has less variation and higher yields, but nobody is "running wild" with anything.

      It's complicated too, because the node names are really just marketing hype. Just as "Kaby Lake" is a name that Intel gave to a collection of optimizations put in a single chip, or "Pascal" is a name that NVIDIA gave, or "Ryzen" is a name that AMD gave – 14 nm is a name that some fab gives to their latest collection of optimizations. There's no one measurement that corresponds with the marketing name any more, like there was until the early 2000s. [citation] The upshot of this is that Intel's 14 nm isn't the same as TSMC's 14 nm or GloFo's 14 nm, so you can't necessarily compare them. Intel does generally have an advantage in this space, however. That said, everybody pretty much uses the latest, greatest process technology available to them from the fab they have chosen. And it is often the case that a GPU is one of the first things manufactured in a new process at a fab, so they aren't benefitting from anybody prior - especially not at a different fab, because the fabs don't share their secrets, or even the same set of features (as noted previously).

      Also, with a brand new process, yields can be very low, so a given company may choose to reduce their risk by making their first chip on a new process either a die shrink of a previous chip, a minor revision to an existing architecture (Intel's "tick"), or a small low-performance chip. Once the kinks have been ironed out on one of those "easy" options, they can shift the bigger, higher-performance chips to the new process. But in some cases, if they started out on the big chips, the yield would be 0% - or if not 0%, the cost of an individual chip would be so high that no consumer would ever pay for it.

      And while I will grant you that GPUs have *less* cache, they do still have some caches and other memories. A GP100, for example, has 14 MB of register files, 4 MB of L2 cache, 3.5 MB of shared memory, and 1.3 MB of L1 cache. That's still well shy of the 22-core Xeon I mentioned earlier, which can have up to 55 MB of LLC, but it's a pretty good amount all the same.

      The real reason that GPUs have always outpaced CPUs is because they are inherently parallel. In addition to all the architectural optimizations that are made every year, they also add more cores every year; while most of us are still using something in the vicinity of quad-core CPUs, just like we were 5 years ago. Also, the parallelism of GPUs means that they have more freedom for architectural changes to yield throughput enhancements. A CPU is largely targeted at single-thread performance, so most of the optimizations they make will enhance that. A GPU architect can make similar optimizations to enhance a single thread's performance, but they can also make changes that only help parallel computation.

      So GPUs are arguably more advanced than CPUs, or at the very least on par with them - and they will continue to outpace CPU development for the foreseeable future as well.

  3. Business decision by BoFo · · Score: 3, Insightful

    Every advance has to be paid for by the consumer. Each incremental advance comes as the previous one is marketed.

  4. Limitations by fozzy1015 · · Score: 2, Informative

    Instruction-level parallelism in superscalar core designs has hit a limit. More pipeline stages become counterproductive when a misprediction requires a flush. Thread-level parallelism exploited by multi-core designs can only go so far; only certain tasks can exploit massive parallelism (e.g. ray tracing).
    Increases in clock speed have hit a wall with current silicon based semiconductors. Exotic semiconductors and incredible cooling systems aren't practical for the mass market.
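
    A toy model, not a description of any real core, shows why deeper pipelines stop paying off. It assumes the clock period is the logic delay split across the stages plus a fixed latch overhead, and that every mispredicted branch costs one full pipeline refill. All the constants (20% branches, 5% mispredicted, 100 units of logic, 2 units of latch overhead) are invented for illustration.

```python
def time_per_instruction(depth, logic_delay=100.0, latch_overhead=2.0,
                         branch_fraction=0.20, mispredict_rate=0.05):
    """Average time per instruction (arbitrary units) for a pipeline of `depth` stages."""
    # Deeper pipelines shorten the clock period, but never below the latch overhead.
    cycle_time = logic_delay / depth + latch_overhead
    # Each mispredicted branch flushes the pipeline: assume `depth` wasted cycles.
    cycles_per_instruction = 1.0 + branch_fraction * mispredict_rate * depth
    return cycles_per_instruction * cycle_time

for depth in (5, 10, 20, 40, 80, 160):
    print(f"{depth:3d} stages: {time_per_instruction(depth):6.2f} time units per instruction")
```

    With these made-up constants the gain from each doubling of depth shrinks, and past roughly 70 stages the flush penalty outweighs the shorter clock period.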

    1. Re:Limitations by Half-pint+HAL · · Score: 5, Interesting

      In a way, process limitations are a welcome obstacle, that should motivate reflection on legacy decisions, and perhaps finally allow the x86 architecture to be put to rest. Many consider x86 "good enough", but the problems with legacy hardware run a lot deeper than performance, and are largely responsible for the horrific state of computer security today.

      The main problem isn't legacy hardware, but legacy software. The x86 architecture is already dead, and most of what we see is a hardware translation of x86 to a CPU architecture that isn't accessible to the coder.

      I believe that the only way out of this is for us to start making more heterogeneous parallel chips. At the moment, this only really exists in the form of packages of CPU+GPU on a single chip. But if we had (for example) ARM+x86+GPU, we'd be able to run an ARM-based Linux or Windows environment, but power up the x86 core as required to run any vital legacy apps. This would mean it would slowly become more and more economical to develop for ARM (or whatever your chosen architecture is) and we'd be able to start thinking about retiring x86 sooner. And hell, it's not like even Intel are really fans of x86 themselves -- they've already tried to ditch it once (remember Itanium?), and in the end it was AMD who extended the x86 architecture to 64-bit, not Intel. Intel wants away from x86, the market wants a better architecture, we just need a stepping stone that guarantees legacy software compatibility, and when so many multiple cores lie idle, I don't see why heterogeneous multicore isn't recognised as the solution.

      --
      Got them moderator blues I blieve I walk out the do', With these mod-points I been gettin', I 'most never post no mo'
  5. Why Are There No Huge Leaps Forward In CPU power? by JoeyRox · · Score: 5, Insightful

    NVIDIA's 2016 Pascal architecture was significantly faster than their previous Maxwell architecture.

    "Relative to GTX 980 then, we're looking at an average performance gain of 66% at 1440p, and 71% at 4K. This is a very significant step up for GTX 980 owners,"

    http://www.anandtech.com/show/10325/the-nvidia-geforce-gtx-1080-and-1070-founders-edition-review/32

  6. Breakthroughs are NOT plannable projects by redelm · · Score: 5, Insightful

    The poster asks a question that assumes breakthroughs can be planned just like any other development project. But breakthroughs are not, or rather, those that can be planned and worked already have been. The computer science field has been operating awash with funding for at least 55 years.

    I'm not saying there are no breakthroughs out there. What I'm saying is that our current project methodology has already discovered all it can, and most future breakthroughs will come from some other methodology.

    The target, CPU/GPU power, is also not especially compelling: compared to the past, there is much less pressure to increase performance, and considerable uncertainty about how the increase will be helpful.

    1. Re:Breakthroughs are NOT plannable projects by sl3xd · · Score: 3, Insightful

      I'd mod you up if I could... at this point, it's starting to look like we need a material breakthrough - Silicon appears to be reaching its limits.

      --
      -- Sometimes you have to turn the lights off in order to see.
  7. Intel just got faster by Billly+Gates · · Score: 5, Informative

    The sole reason Kaby Lake ran hot and clocked so fast is that AMD was just around the corner, and it worked to beat Ryzen. I expect the CPU race to heat back up again, as physics has not killed innovation yet.

    Proof is that GPUs and phones are still improving at breakneck speed. It is only because of an Intel monopoly that the desktop has come to a standstill.

    1. Re:Intel just got faster by coastwalker · · Score: 2

      I think you will find that Intel is paddling as fast as it can; Qualcomm, among others, is snapping at its heels.

      --
      Facts are history now plebs have politics for religion on social media.
  8. Most People Only Want a Window to the Internet. by DatbeDank · · Score: 5, Insightful

    Right about 2008/2009 computer hardware became "good enough" to appeal to people's basic needs which really only centered on having a simple window to the internet. Netbooks became available and smartphones started to become good enough to browse the internet on their own. Consumers at the end of the day really only want a platform that's able to view into the internet.

    Someone can correct me, but I believe such innovation is still occurring for server technology and niche fields like a/v production, cad, and animation. Though, I do yearn for the olden days when consumer technology was cool and exciting. Being a tech nerd in the 90s was something else!

  9. Re:milking it by msauve · · Score: 3, Insightful

    ...because of software inefficiency and planned obsolescence. Ever wonder why current Windoze takes about the same time to boot as Win 3.1 running on a 486? It's not because Windoze does 10,000 times more (useful stuff) today. (486DX2 ~25 MIPS, i7 5960X ~240K MIPS).
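
    The "10,000 times" figure is just the ratio of the two MIPS numbers quoted in parentheses. A quick check in Python, taking those rough figures at face value:

```python
# Approximate MIPS figures as quoted in the comment (not measured here).
mips_486dx2 = 25
mips_i7_5960x = 240_000

ratio = mips_i7_5960x / mips_486dx2
print(f"raw instruction throughput ratio: about {ratio:,.0f}x")
```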

    --
    "National Security is the chief cause of national insecurity." - Celine's First Law
  10. There used to be by Ramze · · Score: 2

    I remember when Pentiums were first coming out. P75, P90, P100, P133, P166. They were faster than the 386s and the 486SX and 486DX models. The P166 was noticeably more than twice as fast as the P75 on lots of tests. The MHz and GHz races are over.

    We can't just ramp up cycles anymore with silicon. It puts out too much heat. Multicore doesn't magically make programs faster unless they lend themselves well to parallelization & are coded properly for it. New architectures have been tried, but ultimately fail because they're costly or proprietary. ARM was a pretty good leap forward for mobile use. New instructions are being included in CPUs all the time, especially on ARM. Try to play an HEVC 1080p video on a 2013 tablet vs one today... you'll notice a difference right away. Check the CPU usage: one's at 100% and dropping frames left and right while the other barely nudges past 15%.

    Intel or AMD could sell you a chip with 256 cores on it, but unless you do a lot of video encoding or physics rendering, it'd be wasted on you... and super expensive b/c they have no incentive to make it in volume. Maybe when VR or AI becomes commonplace, you'll drive demand for such architectures.

    CPUs are fast enough for just about anything one could think to do with them at a consumer level. GPUs can be made better, but market forces push for low power that's "good enough" for most users. CPUs and even GPUs aren't the bottlenecks anymore -- it's RAM, SSD, PCI-express lanes, various busses like USB, thunderbolt, HDMI, SATA, etc. Doesn't do much good to stuff a really fast CPU or GPU into a system if you can't feed it data fast enough to max it out. Most CPUs already have several layers of cache as well as branch prediction to help with the crippling latency from other I/O, but it's still not enough.

    Changes are usually evolutionary, not revolutionary... and we've tweaked so much with CPUs and GPUs, you're not going to see a big bump until we move away from silicon and PCB to say... diamond or carbon nano-wires and optical computing.

  11. Because there's no such thing as one "performance" by imgod2u · · Score: 5, Informative

    CPU architect here. I'll try to provide some insight.

    Performance for CPU/GPU or any computational tool isn't exactly just a number you hit. It's not like bandwidth for storage or communications nor is it like a battery's capacity.

    A CPU, and to a lesser extent a GPU, is able to perform all sorts of (all logical) computational functions. Each of these involves different usage patterns of the different computational paths inside a piece of silicon. And thus, speeding up each of these usage patterns requires different structures.

    A single piece of code running something complex like launching an app or opening a webpage will generate hundreds of millions of instructions with lots of different patterns. Think about all those APIs you call. How much code do you think is similar between them?

    And thus the problem of improving "performance". The goalpost is a shifty one. Speed up one code pattern, and you risk your changes hurting another. Or you can spend extra transistors making a specialized accelerator for that code pattern. But then...it'll be idle 95% of the time.

    And if you speed up a particular function by 1000x (it's happened), your average speed increase for a typical benchmark or API call will still be 0-1%. Because that function is only a small piece of the larger codebase.

    Think about how many non-similar libraries and functions there are in typical software, and ask whether there's any way to speed them *all* up. You can make memcpy or memset (malloc uses these) faster by 5x and that'll speed up JavaScript processing by... 0.01% or so.
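
    This is essentially Amdahl's law. A minimal Python sketch, where the time fractions (1% for the accelerated function, 0.01% for memcpy/memset) are assumptions picked to match the rough numbers above, not measurements:

```python
def overall_speedup(fraction, local_speedup):
    """Amdahl's law: whole-program speedup when `fraction` of runtime gets `local_speedup` times faster."""
    return 1.0 / ((1.0 - fraction) + fraction / local_speedup)

# Hypothetical cases: (fraction of total runtime, speedup of that part).
cases = {
    "1% of runtime made 1000x faster": (0.01, 1000.0),
    "0.01% of runtime (memcpy/memset) made 5x faster": (0.0001, 5.0),
}
for label, (fraction, factor) in cases.items():
    gain_percent = (overall_speedup(fraction, factor) - 1.0) * 100.0
    print(f"{label}: {gain_percent:.4f}% faster overall")
```

    However large the local speedup, the overall gain can never exceed 1 / (1 - fraction), which is why a 1000x accelerator for a small slice of the code barely moves a benchmark.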

    The reason "performance" doesn't increase as drastically in the computer world is because computing "performance" is very very multifaceted. Much like how "intelligence" can't just be increased by 5x -- someone can get 5x better at specific tasks, like memorizing or image recognition, but that doesn't make them 5x more "intelligent".

    Compare this with a simple metric like 0-60 acceleration or network bandwidth.

  12. Re:Why Are There No Huge Leaps Forward In CPU powe by Misagon · · Score: 4, Informative

    Architecture-wise, Pascal was mostly an incremental upgrade to Maxwell.
    The big difference from Maxwell to Pascal was a process upgrade from 28 nm to 16/14 nm which allowed the clock speed to bump 50% from around 1 GHz to around 1.5 GHz.
    Couple that with improved memory and a good balance of different types of units for the best performance in typical games of its time.

    --
    "We mustn't be caught by surprise by our own advancing technology" -- Aldous Huxley
  13. No context by RubberDogBone · · Score: 3, Interesting

    This question lacks context. In terms of desktop PCs and common everyday usage, we don't NEED more speed or power. Nothing is going to speed up webpages or Facebook or whatever people typically do on their PCs. And even if you did, then you become constrained by the speed of the internet and there won't be much perceived benefit.

    On the mobile side, there is room for more speed but it comes at the expense of power and is still constrained by connection speeds and website performance on mobile devices, which often sucks. Throwing faster and more processing isn't necessarily the fix that is needed.

    There are cases where rendering and other heavy duty uses might benefit but the vast majority of people never use those things. Even gaming is usually constrained by other things like the GPU, the game engine, connection speed, and human performance.

    The major places where computing power is much more important are in things like supercomputing but those machines don't run desktop programs and don't work the same way. Only the people directly using those machines would ever have any idea how fast they are or how much faster they wish they could be.

    So, to recap, desktop PCs are adequate, mobile devices are still finding a balance between power and power usage, gamers are off on their own island but sheer CPU isn't a magic fix, and supercomputing, where extra power would matter, is so far removed from everyday users, there is no way to relate to it.

    --
    Sig for hire.
    1. Re:No context by Lumpy · · Score: 3, Insightful

      You need a netbook.

      I need a 6 GHz 8-core because I do actual work on the computer, like compiling and rendering.

      PCs are not adequate because software today is complete shit; almost none of it is written well for multithreading.

      Again, mostly because programmers coming out of colleges are poorly trained, and then companies want them to bang out trash and not well optimized code that takes advantage of the hardware.

      --
      Do not look at laser with remaining good eye.
  14. Re:One word [Physics] by imgod2u · · Score: 2

    Speed of electrons or even light isn't the problem. It's the capacitance. The destination transistor feels the voltage change at the speed of light, but it doesn't change its own stored charge fast enough to register a "0" or "1". This has much more to do with intrinsic resistance of the material locally than how far the signal has to travel.

    The problem is that a material that's a semiconductor will typically straddle some range between conductance and resistance (by definition). So conductance is hard to increase without impacting the resistive "mode" it needs to be set in. This is the problem with graphene and carbon nanotubes. They're really conductive, but not terribly resistive when we want them to be in the "off" mode.

  15. Gate tunnelling current by swm · · Score: 5, Informative

    Moore's law had a great run: ~40 years from early 60s to early 00s.
    During that time, every generation boosted density, gate count, clock speed, and value per dollar.
    The (exponential!) rule of thumb was 2x more every 18 months.
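
    To see what that rule of thumb compounds to, here is a small Python sketch. The 18-month doubling period and the 40-year span are the figures above, and `growth_factor` is just an illustrative helper:

```python
def growth_factor(years, doubling_months=18.0):
    """Total multiplier after `years` of doubling every `doubling_months` months."""
    doublings = years * 12.0 / doubling_months
    return 2.0 ** doublings

for years in (3, 10, 20, 40):
    print(f"{years:2d} years: x{growth_factor(years):,.0f}")
```

    Forty years at that pace is about 26.7 doublings, a factor on the order of a hundred million.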

    Everyone knew it had to stop sometime: you can't make things smaller than atoms.
    What finally did stop it (considerably north of atom-scale) was gate tunnelling current.
    In a MOSFET, the gate is separated from the channel by an insulator (SiO2).
    As you scale the transistor down, that insulator gets thinner, along with everything else.
    When the insulator thickness is less than the wavelength of an electron, you start to get significant tunnelling current.
    This acts like a short circuit from power to ground.

    The technology hit the wall around 2003.
    Gate tunnelling current was then over half of total power dissipation.
    The power density of the CPU chip was 150 W/cm^2 (like a stove top),
    and going further was clearly impractical.

    As it happens, the clock speed at that design node was 3 GHz,
    and that's pretty much where we are today.
    Everything since then has been building bigger, not faster: multi-core, caches, SoC;
    plus architecture tweaks and optimizations, like pipelining and super-scalar.

    It was a great run while it lasted, but it's over,
    and we're not getting another one without a fundamental scientific/technological breakthrough,
    on the order of coal, or steel, or quantum mechanics.

    1. Re:Gate tunnelling current by Anonymous Coward · · Score: 5, Funny

      Excellent (and accurate) observations, but
      can I just say?
      The way you did your line-breaks
      made me think at first glance that you had written your
      Comment in verse. Maybe,
      "An Ode to Moore's Law"? :)

    2. Re:Gate tunnelling current by Raenex · · Score: 2

      It's nice to get the real answer amidst all the bullshit. I experienced nearly 20 years of those processor speedups, and it was glorious. Too bad it came to an end. If the trend had continued, we'd all be using some terahertz CPUs by now.

  16. Re:The Once and Future CPU by ClickOnThis · · Score: 2

    Moore's Law is an observation made by its namesake that the density of transistors on a chip doubles approximately once every 18 to 24 months. Gordon Moore first made the prediction in 1965 and it held fairly well until recent years (roughly after 2012.)

    Processor speeds, although they have increased significantly over the same time period, have not doubled every 18 to 24 months.

    --
    If it weren't for deadlines, nothing would be late.
  17. Risk Averse CEOs are holding us back by LeftCoastThinker · · Score: 5, Informative

    Risk-averse CEOs don't want to sink in the R&D to make carbon-based chips because there is a risk of it not working.

    A synthetic diamond transistor was first built and tested over 13 years ago at 81GHz: http://www.geek.com/blurb/81gh...

    More recently they developed a 300GHz Graphene transistor, but that was still 7 years ago: https://www.bit-tech.net/news/...

    The technology is there and proven, but scaling it up to processor scale would be a massive investment and a big risk.

    --
    If you disagree, please post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like
    1. Re:Risk Averse CEOs are holding us back by gantry · · Score: 3, Informative

      The chip manufacturers are funding research on these and other technologies, but they are all a long way from viability. It is easy to forget that silicon CPUs with a billion transistors are the outcome of 60 years' research, development, and investment.

      Silicon processing is made easier because silicon's oxide is an extremely good insulator. For diamond and graphene, the oxide is a gas, and so insulating areas cannot be created by oxidising the material: another substance must be deposited.

    2. Re:Risk Averse CEOs are holding us back by Goldsmith · · Score: 4, Interesting

      The timeline for carbon electronics is really, really long, predating transistors and silicon by decades. Carbon based electronics has had more than enough R&D for us to understand the basic properties and scaling challenges. The proof of this is that there are commercial products out there using these materials, made in commercial fabs. You just don't hear about them, because they have very little to do with the digital world (right now). Typically, you'll find these products in sensors and analog components. The particular strengths of carbon based electronics are an ability to carry lots of current in small channels (this is not just about resistivity, but also relates to chemical stability and thermal conductivity), and an ability to integrate seamlessly with biological material (this was initially just about carbon-carbon chemistries, but has grown to also encompass superior integrations of electronics with living systems).

      These are different kinds of transistors, and don't operate the way (digitally) MOSFET silicon transistors do.

      Diamond is a wide bandgap semiconductor (that's physics for insulator). In special conditions, it can perform well, but those conditions (ranges for temperature, humidity, and field strength) are not practical for consumer devices. Doping diamond is possible, but very difficult, and it still results in a material that is a pretty good insulator. Sorry, it's going to be a lab toy for a long time.

      Graphene is a zero-bandgap semiconductor. That means that it never turns off, it just has varying amounts of "on." It's got great numbers on paper (resistivity, mobility). Doping graphene is something immoral scientists talk about doing. The reality is that doping graphene creates a different material that lacks the speed and chemical stability of normal graphene. Your conduction mechanism changes, your gating mechanism changes, your noise sources change. It's a mess. Also, it's really easy to dope graphene on accident and lose your high-end performance. It's the newest material in this space, and the one least understood in the manufacturing realm (despite that, it forms the basis for the commercial product linked above, so obviously it's understood well enough).

      You didn't mention carbon nanotubes, but I will, because what was the point of getting a PhD in carbon nanotube electronics if I can't talk about them on Slashdot?! Carbon nanotubes remain the unattainable holy grail of digital electronics. You can have it all: the speed of graphene, the on-off ratio of silicon, low power requirements... It's just that you almost need to assemble your circuit by hand. It's been >25 years we've been working with these materials, and we still don't know how to properly control where they go on a wafer (well, maybe these guys know). The problem is that nanotubes want to make a heterogeneous mixed metal-semiconductor plate of spaghetti on the wafer, when you want clean rows of uniform semiconductor. The best guys in the world at this are up to producing postage stamp sized patches in the middle of the wafer. So... there's some work to be done there before anyone starts designing a processor.

  18. Re: Why Are There No Huge Leaps Forward In CPU pow by GrahamJ · · Score: 2

    Most of Pascal's increases come from dropping to a much smaller node size which allowed them to add a lot more cores in a smaller thermal envelope. That's why it bugs me that they jacked up the prices and are fusing them off to create artificial tiers - it's mostly more of the same. And they'll continue to be able to do that because there is almost no limit to the number of cores you can throw at the types of problems GPUs are used for.

  19. Re: milking it by lgw · · Score: 5, Insightful

    My Chromebook takes mere seconds to boot, whereas an IBM AT could easily take minutes. And of course, my modern device performs tasks that would have been the domain of supercomputers in the past.

    Time to take off the rose colored glasses. I did live through the eighties and nineties, and computing was pathetic back then ... we just didn't know any better

    My Commodore 64 took about 0.1 seconds to boot. We just suck at "fast" these days.

    --
    Socialism: a lie told by totalitarians and believed by fools.
  20. Weak process improvement/Few ideas waiting by erice · · Score: 5, Informative

    This kind of thing was rather common until about 2000. Each process node was better in every way than the last. Big jumps in performance at each node advance. Power went down too. And, of course it was much cheaper per gate. You could get doubled performance and 1/4 the cost by just porting over the same design, trace for trace, to the next full node. These "die shrinks" were quite common. Through the 90's you got an extra bonus for new designs. That is because the industry was brimming with ideas that were known to work but were just not practical to implement because they took too much silicon area.
    First the idea spigot sputtered. The good mainframe ideas had already been implemented. It was no longer clear what to do with all those gates. New ideas were tried. Some worked. Some didn't. Also, about this time, complexity started to threaten the ability to make chips that actually worked. Bugs became more common. Design progress slowed.

    Then process started acting up. Power scaling stopped. More transistors were available but if you used them, your chip consumed proportionally more power. Run the transistors faster and you had the same problem, only worse. A hot chip was no longer a marketing problem, it was a chip that would not work. More effort and more complexity were needed to tame power. A simple die shrink wouldn't do that much.

    Then process started getting messier. The new nodes were not better in every way. Leakage current went up instead of down. Variability went up. Performance scaling slowed. Getting any improvement at all required more development time and money. Progress always slows when development time and cost rise.

    Then 20nm planar came and it was awful. Terrible leakage. Required double patterning. Double patterning means more masks, which means more expense up front and during manufacturing. It actually cost more per transistor than 28nm. What was the point, really?

    That is pretty much the mess we are in now. Can't significantly increase clock rate. Can't throw gates at the problem and wouldn't really know what to do with the gates if we had them. FinFETs temporarily tamed power but are only available in nodes hobbled by the need for multi-patterning.

       

  21. Re:milking it by Karlt1 · · Score: 4, Informative

    My SSD based laptop boots a lot faster than Windows 3.1.

    As far as "planned obsolescence", I'm running Windows 10 on a Core 2 Duo 2.66 GHz laptop with 4 GB of RAM, a computer that was first sold in 2009. It runs my Plex Server and my PlexConnect server.

    My mom still uses my 2006-era Mac Mini (Core Duo 1.66) with Windows 7, Office, and Chrome. It has 1.5 GB of RAM. When I go home and use it, it's not unusable as long as you don't try to run too many things at once.

    My secondary laptop that I keep upstairs is a circa 2009 2 GHz Pentium Dual Core with 4 GB of RAM running Windows 7. In day to day use, the only thing wrong with it is a battery that won't hold a charge.

    You can accuse MS of a lot of things, but not optimizing Windows to run well on fairly old hardware isn't one.

  22. Mill Computing and Wintel by Misagon · · Score: 3, Interesting

    For a long time, Intel and Microsoft Windows have ruled the computing world. At the bottom of the platform has been Intel's instruction set architecture.
    Intel leaped from 16-bit to 32-bit architecture and then from 32-bit to 64-bit, but the basic execution model remains the same. Most of the advances that Intel has made from the Pentium onwards, starting in the early '90s, have been stopgaps to get as much as possible out of the execution model while still being limited by it.

    There are other processors out there, DSPs, that are much faster than x86 at specialized tasks by making them pipelined and parallel. GPUs could be seen as massively parallel DSPs.
    But raw computing power is not the problem. The problem is to run general-purpose code well - and general-purpose code has many branches between code paths and that can't be parallelized.

    A company called Mill Computing is working on a general-purpose CPU architecture inspired by DSPs and by what they think the Intel IA-64 (Itanium) should have been.
    By being vastly different in several significant ways from x86, they claim to be able to achieve a significantly higher performance per watt and performance per clock overall than Intel and AMD's x86.

    --
    "We mustn't be caught by surprise by our own advancing technology" -- Aldous Huxley
  23. Playing too much Civilization by anvilmark · · Score: 2

    The CIV games make young minds think that technological breakthroughs are simply a matter of money and time, then BANG tech advance!
    Somebody needs to start airing "Connections" again: http://topdocumentaryfilms.com...

  24. Everything ... everything is conspiring. by NothingWasAvailable · · Score: 4, Interesting

    The gates are now so small that the electron wave function has a pretty high probability of being "on the other side" of the gate. As gates shrink, leakage power goes up very rapidly. Even when they're "off", the gates are consuming too much power (leaking it to ground.)

    Also, think about 5 GHz, IBM's fastest chips. At 5 GHz, the clock period is 200 picoseconds, and a 10 deep pipeline can allocate about 20 ps to each gate transition. That's a lot to ask, given that resistance and capacitance don't scale down linearly with dimensions. You also have to populate your chip with a lot of decoupling capacitors in order to hold the charge locally for each transition (because you can't get the power from off chip in 20 ps.) To fight the increased RC load (proportionally) you're putting in more buffers (big amplifiers).
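
    The timing budget is simple arithmetic. A Python sketch using the numbers above (5 GHz, 10 steps per cycle), with helper names that are only illustrative:

```python
def clock_period_ps(frequency_ghz):
    """Clock period in picoseconds for a frequency given in GHz."""
    return 1000.0 / frequency_ghz

def budget_per_step_ps(frequency_ghz, steps_per_cycle):
    """Time available to each of `steps_per_cycle` sequential steps within one clock cycle."""
    return clock_period_ps(frequency_ghz) / steps_per_cycle

print(clock_period_ps(5.0))          # 200.0 ps per cycle
print(budget_per_step_ps(5.0, 10))   # 20.0 ps per step
```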

    As if that weren't enough, you have the fact that a 14 nm gate is about 20 silicon atoms across. When you start doping the substrate, your actual behavior is all over the place because one or two more dopant atoms represent a 10-20% shift, up or down (total shifts of 40-50%.)

    So, your gates are too small, they all behave differently, they have to drive a relatively larger load, and the suckers are too hot.

  25. Intel's shady tactics by bongey · · Score: 5, Interesting

    Intel is up to their shady tactics again with AMD's new Ryzen release. Maybe not outright paying off computer makers; now they are just sponsoring reviewers. The reviewers jump through all kinds of hoops to make sure that Intel is on top of the benchmark graphs, and their reviews read like an Intel marketing brochure. None of the reviewers disclose that they are sponsored by Intel.
    Examples of oddities from reviewers that are sponsored by Intel:

    1) Tom's Hardware: Complains about the power consumption being higher than spec, leaves out that the result was from an overclocked test and an MSI board that has an additional CPU power.
    2) GamersNexus (one of the worst of them)
    a) Had to compare the 1800X to 6 different Intel processors that were overclocked, with the 6900k overclocked by 700 MHz.
    b) Only one AMD processor was OC'd, by -100 MHz (yep). Their OC vs. stock results were almost exactly the same.
    c) Makes the 6900k pop on the top of the benchmarks.
    d) 1800X only loses 6 vs 8 to the Intel 6900k at stock speeds, with only 2 benchmarks where the 1800X loses by more than 7 fps.
    e) Pretty much all benchmarks by the same author never included OC tests, but suddenly he had to compare it to 6 different OC benchmarks. http://www.gamersnexus.net/gam... http://www.gamersnexus.net/gam...
    f) Outright lied, saying AMD told him not to benchmark Ryzen at 1920x1080. AMD just asked him to benchmark at multiple resolutions, not just 1080p.

  26. C versus SQL. SQL is understandable, and parallel by raymorris · · Score: 4, Interesting

    > trying to teach some of the programmers out there how to program effectively on the various parallel platforms is harder than trying to alter physics.

    Which could also be phrased as:
    So far, many of the parallel platforms available are much harder to learn.

    Programmers can and do learn new and different ways of working, provided that the new ways don't suck.

    C, Java, etc. are all imperative, scalar and object based languages. SQL is a completely different paradigm, declarative and set-based. In other words, in most programming languages the programmer tells the computer how to do some task, with some value. In SQL, the programmer tells the computer what the result must be - without specifying how to do it, and all fundamental operations work on sets, not individual values. Yet most programmers can and often do learn the declarative, set-based way of programming just as well as they learn the classic imperative way. They learn two very different ways of thinking and programming, because SQL is reasonably good - it's quite learnable, with or without understanding the underlying mathematical concepts.

      There's no fundamental reason you can't have a parallel programming language or library for general purpose programming that's roughly as easy to use as SQL. In fact, SQL may point the way in many respects - besides being a learnable paradigm, it's fundamentally parallelizable precisely because the fundamental operations all use sets as input and output. All the major operations could easily be completely parallelized behind the scenes and the user (programmer) wouldn't have to know or care.

    Maybe that's the way to go, since we know programmers can and do use sets - introduce a set-based general purpose language. To avoid leading programmers into temptation, the language should have no loop constructs. With no capability to run this:
    foreach blah in group {
          result[i++] = do_stuff(blah);
    }

    programmers will quickly learn to instead write:
    results = do_stuff(group);
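
    For comparison, here is roughly what that set-style call could look like in an existing language. This is a sketch, not the proposed language: apply_to_group and do_stuff are hypothetical names, and the thread pool stands in for whatever the runtime would do "behind the scenes". In CPython, threads mainly help with I/O-bound work, so treat this as an illustration of the programming model rather than of a speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def do_stuff(x):
    """Placeholder per-element operation (hypothetical)."""
    return x * x


def apply_to_group(func, group, max_workers=4):
    """Apply func to every element of group; the caller never writes a loop.

    Elements are handed to a worker pool, and results come back in input order.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(func, group))


results = apply_to_group(do_stuff, [1, 2, 3, 4])
print(results)
```

    The caller only states what is applied to what; how the work is split across workers is left to the library.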

  27. Re:C versus SQL. SQL is understandable, and parall by FrankSchwab · · Score: 2

    But you don't have to look to future software for this.

    ASIC design languages create designs that are explicitly parallel, and they do it easily. Sure, there are synchronizations that have to happen, but that may not apply to much of the design. They are explicitly event-oriented, and combinational (when this event occurs, do one of the following things depending on the state of these other two signals). I have sometimes been amazed at how quickly, in how small a description, and with a full test suite, a good digital designer can implement some algorithms compared with an embedded 'C' programmer.

    --
    And the worms ate into his brain.
  28. Re: milking it by lgw · · Score: 2

    Well, the C64 didn't really do anything on boot - mostly initialize the 40 character x 25 line display and jump to Basic and start executing. The kernal was custom written for one hardware config, didn't work with thousands of different pieces of hardware. No internet, no services at all to run (because no multi-threading). Those machines were extremely simple, and really can't be compared to today's Mac, Linux, or Windows OS's.

    But modern machines are about 10000x faster. Needless complexity aside, it's just not that much more complicated. Whatever is hardware-specific, cook that up when the hardware changes - how often does that happen? - and park it ready for fast boot again.

    We just suck at "fast".

    --
    Socialism: a lie told by totalitarians and believed by fools.
  29. Re: milking it by Anonymous Coward · · Score: 2, Interesting

    My girlfriend asked what laptop she should buy. There was a time when I would have had all kinds of answers, maybe even fix up her old laptop with Linux or something to squeeze a couple more years out of it. That was then.

    To save trouble, I just gave her a Chromebook. I know very little about them. But I know they just work, at a fraction of the cost of anything else. She can check her work schedule, do online shopping, watch Netflix, etc. And I don't have to be bothered!

    I don't have to mansplain to her, figure out why her network connection wasn't working, or how to install extensions so she can browse safely, or one of a million things that happen when an ordinary person uses a real computer and real OS. I could have given her a top of the line, tricked out Dell, or Asus, or whatever. She wouldn't have been any happier or any more satisfied.

    So now, when anyone (other than a STEM student) asks what computer they should buy, my stock answer is Chromebook.

  30. Vectorization by JBMcB · · Score: 3, Informative

    For certain operations, AVX made a huge difference. AVX2 made an even huge-r difference. Depending on what you're doing, you can see a 2x to 10x speedup on the outside vs. using a chip without AVX2 with similar performance characteristics.

    --
    My Other Computer Is A Data General Nova III.
  31. Breakthroughs allow continued development by ET3D · · Score: 2

    There have been many breakthroughs in the PC industry, incredibly clever inventions which allowed things to move forward. And that's the thing, the smartest things in the industry don't make for a huge processing leap, they enable making progress at all. Each of these developments takes years. Ideas may be simple, but implementing them, especially at the level required for mass production, is hard. Each development also requires more accurate tools. Also, complexity is now so high that, as imgod2u said, even a huge change in some part leads to an overall small change.
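
    The last point, that a huge change in one part gives a small change overall, is what Amdahl's law describes. A minimal sketch follows; the 20% and 10x figures are made-up inputs for illustration, not measurements of any real chip.

```python
def overall_speedup(fraction_improved, local_speedup):
    """Amdahl's law: overall speedup when only part of the work gets faster.

    fraction_improved: share of total run time spent in the improved part (0..1)
    local_speedup:     how many times faster that part becomes
    """
    return 1.0 / ((1.0 - fraction_improved) + fraction_improved / local_speedup)


# Hypothetical: a unit responsible for 20% of run time becomes 10x faster.
speedup = overall_speedup(0.20, 10.0)
print(f"overall speedup: {speedup:.2f}x")
```

    With those inputs the whole system gets only about 1.22x faster, even though one part improved tenfold.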

    So as others have said, physics, but I think the above is a more nuanced answer. I remember when people said that it wouldn't be possible to make transistors under a micron in size. The very fact that we've come so far is miraculous.

  32. It DOES happen by SoftwareArtist · · Score: 3, Informative

    It happened about ten years ago with the rise of GPUs for general purpose computing. Suddenly we could do a lot of things 10-100 times faster than before. You program GPUs really differently than CPUs, so we had to rewrite a lot of code and design new algorithms. But the benefit was huge.

    It may be happening again with specialized chips for deep learning, like Google's TPU. These chips are designed for just one class of applications, but it's a really important class, and they can be 10x faster or more efficient for those applications.

    There've been other times when a new generation brought a sudden major improvement in speed, like with vector units or multicore CPUs. But always at the cost of having to rewrite how your code works.

    Now if you want new chips that work just like the old ones and run the same programs as before, just 10x faster, sorry. That isn't likely to happen. Huge jumps like that require major changes of approach.

    --
    "I'm too busy to research this and form an educated opinion, but I do have time to tell everyone my uninformed opinion."
  33. Re:C versus SQL. SQL is understandable, and parall by Anne+Thwacks · · Score: 2
    Algol68 was extremely easy to use, and allowed programmers to use parallelism with only a very limited amount of learning. And that was in 1968! You did, of course, have to understand the problem you were coding, but if you don't understand that, then your program will probably fail in bizarre ways anyway.

    It did indeed have a construct like:

    foreach blah in group {
    result[i++] = do_stuff(blah);
    }

    Unfortunately, it was not American.

    --
    Sent from my ASR33 using ASCII
  34. Because we're already close by psmoot · · Score: 3, Informative

    I think the real issue is, semiconductors are so competitive, the current shipping product is always very close to the state of the manufacturing and physics arts. Intel, AMD, nVidia, Samsung, Toshiba, Apple, and others spend billions pushing the processes and architectures to the limit in every product so it stays competitive as long as possible.

    To get a 4x or 8x improvement in size, power, or speed would imply there's a revolutionary way to do things that we just don't quite know yet. And it better be something which can be quickly turned to production because Moore's Law hasn't stopped yet. If you have a 4x improvement idea but it takes five years to release, it won't get funded. Plain CMOS silicon has too good a chance of catching up.
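
    The funding argument can be put in numbers. This sketch assumes, purely for illustration, that mainstream silicon doubles in capability every 2 years; the real cadence varies, and the function name is made up.

```python
def baseline_gain(years, doubling_period_years=2.0):
    """Improvement factor of the mainstream process after `years`,
    assuming a steady doubling every `doubling_period_years` (an assumption)."""
    return 2.0 ** (years / doubling_period_years)


moonshot_gain = 4.0   # the hypothetical "4x improvement idea"
years_to_ship = 5.0   # the hypothetical time to bring it to production

gain = baseline_gain(years_to_ship)
print(f"mainstream gain after {years_to_ship:.0f} years: {gain:.2f}x")
print("moonshot still ahead" if moonshot_gain > gain else "mainstream has caught up")
```

    Under that assumption the mainstream part is about 5.7x faster by the time the 4x idea ships, so the idea arrives already behind.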

    There have been plenty of times people rolled the dice on processor moon shots. I was at HP when Itanium was first developed (~95). We thought we'd have working silicon in a few years (~98 or 99) at the astounding clock rate of 500 MHz (oh, and that was potentially retiring something like 6 to 12 instructions per cycle, I forget the details). This was when a good Pentium processor ran at around 45 MHz. We thought Itanium was going to be so frickin' fast there was no way Intel could compete. Then AMD started a clock rate war, x86 got faster really fast, Itanium took much longer to produce than we anticipated, and the rest is history.

    I think the bottom line is, it's really hard to produce a system which really is even 2x faster than the competition. 4x is incredible and 8x probably has never been done.

    As an analogy, consider cars and mileage. My car, a diesel Passat (which shortly will not be road legal :() actually exceeds 50 MPG on a good day. What would it take to make a car which gets 100 MPG with a 600 mile range? How about 200 MPG? With no compromises? And a sales price of $28k? It's pretty hard to imagine.

  35. Plastics? by Tenebrousedge · · Score: 2

    Central high power cloud machines are just a disaster waiting to happen, how many times does this have to be proven.

    Once would be a good start. Do you really think that people are not designing fault-tolerant network infrastructure?

    --
    Those who advocate genocide deserve every protection afforded by law, and none afforded by common human decency.
  36. Laziness by Tenebrousedge · · Score: 4, Insightful

    Laziness is a virtue in a programmer.

    The whole point of this profession is to save labor. That includes programmer labor, especially because it's an expensive commodity.

    I don't know who has mod points today but this comment is frankly ridiculous.

    --
    Those who advocate genocide deserve every protection afforded by law, and none afforded by common human decency.
  37. Re:C versus SQL. SQL is understandable, and parall by Half-pint+HAL · · Score: 2

    Maybe that's the way to go, since we know programmers can and do use sets - introduce a set-based general purpose language. To avoid leading programmers into temptation, the language should have no loop constructs. With no capability to run this: foreach blah in group { result[i++] = do_stuff(blah); }

    programmers will quickly learn to instead write: results = do_stuff(group);

    I agree, but I think you've taken it a step too far here. Look back at maths and how things like sigma summation and similar things like the product function work. Because of the mathematical properties of these, they are order independent, and inherently parallelisable.

    Eliminating loops doesn't mean eliminating a "foreach" -- it just means treating each instance of the block as its own scope, and ensuring that no instance can access the variables of another instance. (Talking "instances" instead of "iterations" immediately says it's not a logical loop, even if the computer running it realises it as such simply due to lack of parallel capacity.)

    The problem with this is that you then have to combine the results, so you either need to treat the whole block as an inline procedure and end with a return statement, or you treat the block as a function, and now we're into functional programming.

    Basically, this sigma-style programming would be logically equivalent to carrying out a map followed by a reduce... and map-reduce has become such an important concept in server programming specifically because of this inherent parallelism. The thing is that current map-reduce renders code to the programmer in a totally different style to what they're used to. There are parallel programming environments that do render parallelised blocks in a C-inspired way, and surely that's the most obvious approach...?
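
    A sigma-style sum written as a map followed by a reduce might look like the sketch below. The helper names are hypothetical, and the thread pool is just one way to express that the map step has no ordering dependency; in CPython it will not speed up CPU-bound work.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
import operator


def square(x):
    """The body of the 'loop': depends only on its own element."""
    return x * x


def map_reduce(map_fn, reduce_fn, items, initial, max_workers=4):
    """Map each item independently, then combine the results.

    The map step may run in any order or in parallel; reduce_fn should be
    associative (like + or *) so the combining order does not matter.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        mapped = list(pool.map(map_fn, items))
    return reduce(reduce_fn, mapped, initial)


# Sum of i^2 for i = 1..10, i.e. the sigma notation written as map + reduce.
total = map_reduce(square, operator.add, range(1, 11), 0)
print(total)
```

    Swapping operator.add for operator.mul (with an initial value of 1) gives the product form mentioned above.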

    --
    Got them moderator blues I blieve I walk out the do', With these mod-points I been gettin', I 'most never post no mo'