Ask Slashdot: Why Are There No Huge Leaps Forward In CPU/GPU Power?
dryriver writes: We all know that CPUs and GPUs and other electronic chips get a little faster with each generation produced. But one thing never seems to happen -- a CPU/GPU manufacturer suddenly announcing a next generation chip that is, say, 4-8 times faster than the fastest model they had 2 years ago. There are moderate leaps forward all the time, but seemingly never a HUGE leap forward due to, say, someone clever in R&D discovering a much faster way to process computing instructions. Is this because huge leaps forward in computing power are technically or physically impossible/improbable? Or is nobody in R&D looking for that huge leap forward, and rather focused on delivering a moderate leap forward every 2 years? Maybe striving for that "rare huge leap forward in computing power" is simply too expensive for chip manufacturers? Precisely what is the reason that there is never a next-gen CPU or GPU that is, say, advertised as being 16 times faster than the one that came 2 years before it due to some major breakthrough in chip engineering and manufacturing?
Physics
-- Sometimes you have to turn the lights off in order to see.
Those leaps are in the works, in the form of spintronics, quantum computing, and photonics.
Most likely, there is no major competition in the market, and PC sales on the whole have slowed considerably. A modern 6800K processor is as close as you'll come to a leap forward, but it's $1100 Canadian and requires a similarly expensive motherboard + memory. Same with similar chips.
Meanwhile the cheapest system on the market is as fast as a moderately high-grade enthusiast computer from 2010 and probably has reasonable 3D graphics onboard, with a SSD drive it will feel quite snappy.
So, a) not a lot of market demand for faster systems, b) lots of tablets and game consoles for entertainment out there, c) moderately faster systems exist but cost keeps them low-volume, d) very low-percentage demand for faster computers - definitely less than 1% that will pay a premium for it, e) the majority of gamers are young-ish and they play largely twitch games even on PCs which are more GPU limited than CPU limited.
...Steve
Every advance has to be paid for by the consumer. Each incremental advance comes as the previous one is marketed.
Instruction-level parallelism in superscalar core designs has hit a limit. More pipeline stages become counterproductive when a misprediction requires a flush. Thread-level parallelism exploited by multi-core designs can only go so far; only certain tasks can exploit massive parallelism (e.g. ray tracing).
Increases in clock speed have hit a wall with current silicon based semiconductors. Exotic semiconductors and incredible cooling systems aren't practical for the mass market.
But one thing never seems to happen -- a CPU/GPU manufacturer suddenly announcing a next generation chip that is, say, 4-8 times faster than the fastest model they had 2 years ago.
Just because they don't announce it doesn't mean that doesn't happen.
The Intel chips out right now are 2-3 generations old in so far as their R&D goes.
They simply have no reason to release more than they do since there is really little competition.
The RISC architectures combined with new chip technology were on the order of 10x faster than the previous chips back in the 1980s, but that was a rarity. Everyone is basically building the same architectures using the same technologies so you don't see too many 10x improvements.
NVIDIA's 2016 Pascal architecture was significantly faster than their previous Maxwell architecture.
"Relative to GTX 980 then, we're looking at an average performance gain of 66% at 1440p, and 71% at 4K. This is a very significant step up for GTX 980 owners,"
http://www.anandtech.com/show/10325/the-nvidia-geforce-gtx-1080-and-1070-founders-edition-review/32
CPU's and other chips used to double in speed about every 18-24 months, all the way from the early 1980s to about 2003. This is what used to be known as Moore's Law. It was long predicted that the standard semiconductor chips (CMOS etc.) would hit fundamental physics limitations when they reached very small feature sizes and very high clock rates, e.g. billions of cycles per second, some time in the early 2000s which is what appears to have happened. This is more generally known as the technology S curve where many technologies go through a period of very rapid, sometimes exponential improvement and then top out. This happened with propeller airplane engines from 1903 up to 1940s and then again with jet engines in the 1950's and 1960s. Once the top out is reached, usually a fundamental new technology such as jet engines in place of propeller engines, is needed to make a further leap: a 2X or better improvement. Probably a fundamentally new CPU technology, not current silicon chip technology, is needed to get to even higher clock speeds if it is even possible.
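To put the doubling rate in numbers, here is a rough sketch of the compounding, assuming a clean doubling every 18 or 24 months. The function names and the 1980 to 2003 window are illustrative choices based loosely on the dates above, not measured data. It also expresses the original question's "16 times faster in 2 years" as the doubling period it would require.

```python
import math

def growth_factor(years, doubling_months):
    """Cumulative speedup after `years` if performance doubles every `doubling_months`."""
    doublings = years * 12 / doubling_months
    return 2 ** doublings

def implied_doubling_months(factor, years):
    """Doubling period (in months) needed to gain `factor` in `years`."""
    return years * 12 / math.log2(factor)

if __name__ == "__main__":
    span = 2003 - 1980  # illustrative window, in years
    for months in (18, 24):
        print(f"doubling every {months} months for {span} years: "
              f"~{growth_factor(span, months):,.0f}x")
    # The "16x in 2 years" leap from the question, expressed as a doubling period
    print(f"16x in 2 years needs a doubling every "
          f"{implied_doubling_months(16, 2):.0f} months")
```

Seen this way, a 16x jump in two years would mean running three to four times faster than the historical pace, sustained for a full product cycle.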
The poster asks a question that assumes breakthroughs can be planned just like any other development project. But breakthroughs are not, or rather, those that can be planned and worked already have been. The computer science field has been operating awash with funding for at least 55 years.
I'm not saying there are no breakthroughs out there; what I'm saying is that our current project methodology has already discovered all it can, and most future breakthroughs will come from some other methodology.
The target, CPU/GPU power is also not especially compelling -- compared to the past, there is much less pressure to increase performance, and considerable uncertainty how the increase will be helpful.
The sole reason Kaby lakes got hot and clocked in so fast is because of AMD just around the corner and it worked to beat Ryzen. I expect the CPU race to heat back up again as physics has not killed innovation yet.
Proof is that GPUs and phones are still improving at breakneck speed. It is only because of an Intel monopoly that on the desktop it has gone to a standstill.
http://saveie6.com/
Right about 2008/2009 computer hardware became "good enough" to appeal to people's basic needs which really only centered on having a simple window to the internet. Netbooks became available and smartphones started to become good enough to browse the internet on their own. Consumers at the end of the day really only want a platform that's able to view into the internet.
Someone can correct me, but I believe such innovation is still occurring for server technology and niche fields like a/v production, cad, and animation. Though, I do yearn for the olden days when consumer technology was cool and exciting. Being a tech nerd in the 90s was something else!
...because of software inefficiency and planned obsolescence. Ever wonder why current Windoze takes about the same time to boot as Win 3.1 running on a 486? It's not because Windoze does 10,000 times more (useful stuff) today. (486DX2 ~25 MIPS, i7 5960X ~240K MIPS).
"National Security is the chief cause of national insecurity." - Celine's First Law
I remember when Pentiums were first coming out. P75, P90, P100, P133, P166. They were faster than the 386s and 486sx and 486dx models. The p166 was noticeably more than twice as fast as the P75 on lots of tests. The Mhz and Ghz races are over.
We can't just ramp up cycles anymore with silicon. It puts out too much heat. Multicore doesn't magically make programs faster unless they lend themselves well to parallelization & are coded properly for it. New architectures have been tried, but ultimately fail because they're costly or proprietary. ARM was a pretty good leap forward for mobile use. New instructions are being included in CPUs all the time -- especially ARM. Try to play a HEVC 1080p video on a 2013 tablet vs one today... you'll notice a difference right away. Check the CPU usage -- one's at 100% and dropping frames left and right while the other barely nudges past 15%.
Intel or AMD could sell you a chip with 256 cores on it, but unless you do a lot of video encoding or physics rendering, it'd be wasted on you... and super expensive b/c they have no incentive to make it in volume. Maybe when VR or AI becomes commonplace, you'll drive demand for such architectures.
CPUs are fast enough for just about anything one could think to do with them at a consumer level. GPUs can be made better, but market forces push for low power that's "good enough" for most users. CPUs and even GPUs aren't the bottlenecks anymore -- it's RAM, SSD, PCI-express lanes, various busses like USB, thunderbolt, HDMI, SATA, etc. Doesn't do much good to stuff a really fast CPU or GPU into a system if you can't feed it data fast enough to max it out. Most CPUs already have several layers of cache as well as branch prediction to help with the crippling latency from other I/O, but it's still not enough.
Changes are usually evolutionary, not revolutionary... and we've tweaked so much with CPUs and GPUs, you're not going to see a big bump until we move away from silicon and PCB to say... diamond or carbon nano-wires and optical computing.
How close are we to the theoretical physical limits in terms of what electricity can do? Can light or some other radiation theoretically be significantly faster?
And there's also heat dissipation. Even if we could build 3D chips, heat dissipation will be tricky. (Would we still call them "chips" if they were little boxes instead? "Borglets"?)
Does the quantum world offer significant potential improvements, or only incremental?
Are the current performance walls mostly limits in knowledge of how to tame and control materials and energy, or simply an inherent limit to how much energy can be controlled in a confined space?
Suppose one ignores manufacturing capability and designs a chip made up of any known substances. How much faster would it be compared to manufacture-able chips (based on simulations or calculations)?
Table-ized A.I.
CPU architect here. I'll try to provide some insight.
Performance for CPU/GPU or any computational tool isn't exactly just a number you hit. It's not like bandwidth for storage or communications nor is it like a battery's capacity.
A CPU, and to a lesser extent a GPU, is able to perform all sorts of (all logical) computational functions. Each of these involves different usage patterns of the different computational paths inside a piece of silicon. And thus, speeding up each of these usage patterns requires different structures.
A single piece of code running something complex like launching an app or opening a webpage will generate hundreds of millions of instructions with lots of different patterns. Think about all those API's you call. How much code do you think is similar between them?
And thus the problem of improving "performance". The goalpost is a shifty one. Speed up one code pattern, and you risk your changes hurting another. Or you can spend extra transistors making a specialized accelerator for that code pattern. But then...it'll be idle 95% of the time.
And if you speed up a particular function by 1000x (it's happened), your average speed increase for a typical benchmark or API call will still be 0-1%. Because that function is only a small piece of the larger codebase.
Think about how many non-similar libraries and functions there are in typical software, and think about how there's any way to speed them *all* up. You can make memcpy or memset (malloc uses these) faster by 5x and that'll speed up javascript processing by....0.01% or so.
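The arithmetic behind this is usually written as Amdahl's law. A minimal sketch follows; the run-time fractions in it are made-up illustrative numbers, not measurements of any real workload.

```python
def overall_speedup(fraction, local_speedup):
    """Amdahl's law: whole-program speedup when `fraction` of the run time
    is made `local_speedup` times faster and the rest is unchanged."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    return 1.0 / ((1.0 - fraction) + fraction / local_speedup)

if __name__ == "__main__":
    # Hypothetical: a routine that accounts for 1% of run time gets 1000x faster
    print(f"{(overall_speedup(0.01, 1000) - 1) * 100:.2f}% faster overall")
    # Hypothetical: a routine that accounts for 0.0125% of run time gets 5x faster
    print(f"{(overall_speedup(0.000125, 5) - 1) * 100:.3f}% faster overall")
```

However large `local_speedup` gets, the result can never exceed `1 / (1 - fraction)`, which is why a spectacular gain in one routine barely moves a broad benchmark.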
The reason "performance" doesn't increase as drastically in the computer world is because computing "performance" is very very multifaceted. Much like how "intelligence" can't just be increased by 5x -- someone can get 5x better at specific tasks, like memorizing or image recognition, but that doesn't make them 5x more "intelligent".
Compare this with a simple metric like 0-60 acceleration or network bandwidth.
Let me introduce you to the economics of the bell curve and the s curve.
AMD, to be fair, has pretty much done this just now with the Ryzen chips.
Who is this that even the wind and the waves obey Him? Surely this computer must submit also!
Architecture-wise, Pascal was mostly an incremental upgrade to Maxwell.
The big difference from Maxwell to Pascal was a process upgrade from 28 nm to 16/14 nm which allowed the clock speed to bump 50% from around 1 GHz to around 1.5 GHz.
Couple that with improved memory and a good balance of different types of units for the best performance in typical games of its time.
"We mustn't be caught by surprise by our own advancing technology" -- Aldous Huxley
The annual incremental improvement comes from silicon process technology (fab shrink), architecture, and/or optimizations. This is how Intel characterizes their development cycle over time. The "large scale" leaps in performance come from a fundamental shift in the underlying algorithms, not silicon hardware. For example, the clock frequencies associated with uprocessors have only been creeping upwards slowly over the past ten years (around 3+ GHz), in spite of multiple generations of silicon process shrinks. Instead of increasing speed (which turns out to be really hard due to a nasty thing called physics), better silicon process has been used to decrease power consumption and increase the number of parallel cores. So, you get more cores, or lower power, and a modest performance increase every year. You won't see a "global" performance increase, which would mean you'd have to double clock frequency, or somehow double instructions per clock cycles at 100% efficiency.
At an _algorithmic_ level, you do see 2-4X breakthroughs. A simplistic example would be the shift from a video compression algorithm designed to run on a single processor core (one thread) being mapped to eight processors in parallel. Or taking an algorithm and making it N times more efficient (a classic example is the discrete Fourier transform versus the fast Fourier transform, which revolutionized signal processing). Parallel mapping of algorithms turns out to be far harder than you might expect; the compilers are getting better, but still far from that good. Coming back to the video compression case for a moment, I have yet to see a massively parallel H.264 video compressor (NVidia GPU/CUDA on 1024 parallel cores) that can beat a single-threaded compressor in terms of quality at the same compression ratio. My understanding is that this is due to the fact that trying to "optimize" the image quality during compression requires the algorithm to look at large chunks of the image, whereas parallel processing demands that the image be broken up into little chunks with minimal ability to look at the big picture (so to speak).
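The DFT versus FFT example can be made concrete. The sketch below compares a naive O(N^2) DFT with a recursive radix-2 Cooley-Tukey FFT, which is one common FFT variant chosen here for brevity and not necessarily the one the comment has in mind. The test signal is arbitrary.

```python
import cmath

def dft(x):
    """Naive discrete Fourier transform: O(N^2) complex multiplications."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT: O(N log N). Length must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return list(x)
    even = fft(x[0::2])   # transform of even-indexed samples
    odd = fft(x[1::2])    # transform of odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

if __name__ == "__main__":
    signal = [1.0, 2.0, 0.0, -1.0, 0.5, 0.0, -2.0, 1.5]  # arbitrary test data
    slow, fast = dft(signal), fft(signal)
    # Both produce the same spectrum; only the operation count differs
    print(max(abs(a - b) for a, b in zip(slow, fast)))
```

For N = 1024 the naive version does on the order of a million multiplications while the FFT needs on the order of ten thousand, a gain no process shrink delivers in one step.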
This question lacks context. In terms of desktop PCs and common everyday usage, we don't NEED more speed or power. Nothing is going to speed up webpages or Facebook or whatever people typically do on their PCs. And even if you did, then you become constrained by the speed of the internet and there won't be much perceived benefit.
On the mobile side, there is room for more speed but it comes at the expense of power and is still constrained by connection speeds and website performance on mobile devices, which often sucks. Throwing faster and more processing isn't necessarily the fix that is needed.
There are cases where rendering and other heavy duty uses might benefit but the vast majority of people never use those things. Even gaming is usually constrained by other things like the GPU, the game engine, connection speed, and human performance.
The major places where computing power is much more important are in things like supercomputing but those machines don't run desktop programs and don't work the same way. Only the people directly using those machines would ever have any idea how fast they are or how much faster they wish they could be.
So, to recap, desktop PCs are adequate, mobile devices are still finding a balance between power and power usage, gamers are off on their own island but sheer CPU isn't a magic fix, and supercomputing, where extra power would matter, is so far removed from everyday users, there is no way to relate to it.
Sig for hire.
Just what do you mean by handle? I wouldn't mind a system that takes advantage of properties of the multiverse to bring me things that I want before it occurs to me that it could even exist. Bring on the singularity!
Speed of electrons or even light isn't the problem. It's the capacitance. The destination transistor feels the voltage change at the speed of light, but it doesn't change its own stored charge fast enough to register a "0" or "1". This has much more to do with intrinsic resistance of the material locally than how far the signal has to travel.
The problem is that a material that's a semiconductor will typically straddle some range between conductance and resistance (by definition). So conductance is hard to increase without impacting the resistive "mode" it needs to be set in. This is the problem with graphene and carbon nanotubes. They're really conductive, but not terribly resistive when we want them to be in the "off" mode.
Moore's law had a great run: ~40 years from early 60s to early 00s.
During that time, every generation boosted density, gate count, clock speed, and value per dollar.
The (exponential!) rule of thumb was 2x more every 18 months.
Everyone knew it had to stop sometime: you can't make things smaller than atoms.
What finally did stop it (considerably north of atom-scale) was gate tunnelling current.
In a MOS-FET, the gate is separated from the channel by an insulator (SiO2).
As you scale the transistor down, that insulator gets thinner, along with everything else.
When the insulator thickness is less than the wavelength of an electron, you start to get significant tunnelling current.
This acts like a short circuit from power to ground.
The technology hit the wall around 2003.
Gate tunnelling current was then over half of total power dissipation.
The power density of the CPU chip was 150 W/cm^2 (like a stove top),
and going further was clearly impractical.
As it happens, the clock speed at that design node was 3 GHz,
and that's pretty much where we are today.
Everything since then has been building bigger, not faster: multi-core, caches, SoC;
plus architecture tweaks and optimizations, like pipelining and super-scalar.
It was a great run while it lasted, but it's over,
and we're not getting another one without a fundamental scientific/technological breakthrough,
on the order of coal, or steel, or quantum mechanics.
Risk averse CEOs who don't want to sink in the R&D to make carbon based chips because there is risk of it not working.
A synthetic diamond transistor was first built and tested over 13 years ago at 81GHz: http://www.geek.com/blurb/81gh...
More recently they developed a 300GHz Graphene transistor, but that was still 7 years ago: https://www.bit-tech.net/news/...
The technology is there and proven, but scaling it up to processor scale would be a massive investment and a big risk.
If you disagree, please post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like
Most of Pascal's increases come from dropping to a much smaller node size which allowed them to add a lot more cores in a smaller thermal envelope. That's why it bugs me that they jacked up the prices and are fusing them off to create artificial tiers - it's mostly more of the same. And they'll continue to be able to do that because there is almost no limit to the number of cores you can throw at the types of problems GPUs are used for.
Moore's law ended in 2006 (heard it straight from an Intel engineer). In its place they have been focusing on multi-processing and power savings.* In doing so they learned they could make even more money through a much slower upgrade time-table. They do have tech on the back burner to roll out that will have huge improvements on performance (optical interconnects, for instance) but they are going to roll that stuff out like molasses going up a hill. Greed has really taken hold of everything these days.
* (Did you know half the cores on the latest chips have to be idle most of the time to prevent over heating?)
:T:R:A:N:S:
LOL
:T:R:A:N:S:
Massively parallel processors are readily available. In fact, that's exactly what a GPU is. But many tasks don't obviously lend themselves to being divvied up among this many cores. Single thread performance still matters a lot for a lot of problems. And that's much harder to scale up than just throwing more cores at it.
Well that, and all the difficulties with writing cache aware software. Modern CPUs are quite fast. But they spend most of their time waiting for memory. And there are limits to how fast we can make memory that needs to be accessed in a truly random access pattern.
GPUs tend to have more regular access patterns and insanely wide tightly coupled data buses
My Chromebook takes mere seconds to boot, whereas an IBM AT could easily take minutes. And of course, my modern device performs tasks that would have been the domain of supercomputers in the past.
Time to take off the rose colored glasses. I did live through the eighties and nineties, and computing was pathetic back then ... we just didn't know any better
My Commodore 64 took about 0.1 seconds to boot. We just suck at "fast" these days.
Socialism: a lie told by totalitarians and believed by fools.
Instead of thinking about processing power in terms of Hz, you should be looking at a CPU's/GPU's overall computational throughput. When you look at things that way, you will see there has been a massive uptick in processing power in GPUs. x86 CPUs have stagnated a bit due to lack of serious competition for the high-end but everywhere else it's thriving. Massive parallel processing is the real future of computing, so get ready for chips with thousands of sub-GHz cores running independent and identical tasks because that future is coming.
Anons need not reply. Questions end with a question mark.
That's a fair point, although the submitter didn't disqualify process improvements as a valid source of performance gains.
This kind of thing was rather common until about 2000. Each process node was better in every way than the last. Big jumps in performance at each node advance. Power went down too. And, of course it was much cheaper per gate. You could get doubled performance and 1/4 the cost by just porting over the same design, trace for trace, to the next full node. These "die shrinks" were quite common. Through the 90's you got an extra bonus for new designs. That is because the industry was brimming with ideas that were known to work but were just not practical to implement because they took too much silicon area.
First the idea spigot sputtered. The good mainframe ideas had already been implemented. It was no longer clear what to do with all those gates. New ideas were tried. Some worked. Some didn't. Also, about this time, complexity started to threaten the ability to make chips that actually worked. Bugs became more common. Design progress slowed.
Then process started acting up. Power scaling stopped. More transistors were available but if you used them, your chip consumed proportionally more power. Run the transistors faster and you had the same problem, only worse. A hot chip was no longer a marketing problem, it was a chip that would not work. More effort and more complexity were needed to tame power. A simple die shrink wouldn't do that much.
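The point about power scaling stopping can be illustrated with the usual first-order model for switching power, P = C * V^2 * f per transistor. This is a rough sketch under textbook assumptions (ideal scaling by a factor k per node, leakage ignored entirely), not a model of any real process; the function and parameter names are invented for the illustration.

```python
def power_density_ratio(k, voltage_scales=True, frequency_scales=True):
    """Relative power per unit area after shrinking linear dimensions by 1/k.

    First-order switching-power model: P = C * V**2 * f per transistor.
    Leakage is ignored, which flatters the result for newer nodes.
    """
    capacitance = 1.0 / k                        # smaller devices, less capacitance
    voltage = 1.0 / k if voltage_scales else 1.0
    frequency = k if frequency_scales else 1.0
    power_per_transistor = capacitance * voltage ** 2 * frequency
    transistors_per_area = k ** 2                # k times denser in each dimension
    return power_per_transistor * transistors_per_area

if __name__ == "__main__":
    k = 2 ** 0.5  # roughly one full node step (0.7x linear shrink)
    print("voltage still scaling:    ", round(power_density_ratio(k), 2))
    print("voltage stuck, clock up:  ",
          round(power_density_ratio(k, voltage_scales=False), 2))
    print("voltage stuck, clock flat:",
          round(power_density_ratio(k, voltage_scales=False, frequency_scales=False), 2))
```

While supply voltage shrank with the transistors, power density stayed flat. Once voltage stops scaling, every extra transistor and every extra megahertz shows up directly as heat per square centimeter.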
Then process started getting messier. The new nodes were not better in every way. Leakage current went up instead of down. Variability went up. Performance scaling slowed. Getting any improvement at all required more development time and money. Progress always slows when development time and cost rise.
Then 20nm planar came and it was awful. Terrible leakage. Required double patterning. Double patterning means more masks, which means more expense up front and during manufacturing. It actually cost more per transistor than 28nm. What was the point, really?
That is pretty much the mess we are in now. Can't significantly increase clock rate. Can't throw gates at the problem and wouldn't really know what to do with the gates if we had them. Finfets temporarily tamed power but are only available in nodes hobbled by the need for multi-patterning.
The people who are actually paying for the products are interested in
a) Power in: do the same amount of computation at half the power so my battery will last longer.
b) Power out: do the same amount of computation at half the power so I can use twice as many devices without blowing my power budget.
Data centers are limited by how much heat you can extract per square foot. Desktops are limited by how loud the fan is. Mobile is limited by the battery size.
Therefore, the designers are designing what people are actually willing to pay for.
Perhaps the article poster is not familiar with recent history? There have been significant gains in both CPU and GPU power, especially GPU. However, improvements tend to be focused where they are needed most, e.g. performance per watt.
I was booting computers in milliseconds in the mid 90's (to the point where user-space applications were getting scheduled time). It really depends on what you consider 'booted' and what hardware checks you are willing to skip. RAM test? Walking ones test? Read/write test?
Sometimes you have to set up a piece of hardware to fail and wait for it to time out to verify that a system is working and that alone can take an arbitrary amount of time. 40ms? 2 minutes? Depends on the hardware and what you're looking for. Eg, set something up so it overheats and BIT catches and shuts you down verifying that the hardware to catch overtemp works. Or maybe not do the test at all.
You're just looking in the wrong markets. If you're "just" looking at x86, obviously you have a blueprint you need to follow. Any breakthrough will take quite a few years in order to integrate and fab it. But even then, comparing 5 or 10 year old CPU's to now you can see quite a bit of new circuitry.
Look at AES acceleration and virtualization: we can now fully virtualize a machine, including its hardware, as if they were separate machines, including networking. There is quite a bit of logistics to make that happen in the CPU and attached chipsets and devices.
ARM has a bit more room to develop more quickly, plenty of breakthroughs in both CPU and GPU developments as are the developments in Power and other architectures.
Sure, incrementally, it doesn't look like much because 10yo CPU's are "decent enough" for most work, but if you're working on the high-end of the spectrum (calculations and large data storage) there are plenty of "breakthroughs", using the latest capabilities of chipsets, you do indeed get 4 times the jump but only for specific workloads.
Custom electronics and digital signage for your business: www.evcircuits.com
I think we haven't seen anything yet. Basic AI concepts have been known since at least the 1960s, but computing capacities are only now starting to be viable to actually implement these algorithms. Of course, with the availability of actually usable implementations, we also have a renewed interest in developing better algorithms.
We are starting to see the fruits of these efforts with good natural language recognition, comprehension, and translation. But expect to see a lot more mind boggling advances in the near future.
I used to think about the early eighties the same way you're fantasizing about the nineties. I now realize that every decade has only gotten more exciting
And more transistors means lower yield.
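One common first-order way to see this is the Poisson yield model, in which the chance that a die has no random defects falls off exponentially with its area. On a given process more transistors means a bigger die, so the same sketch covers the transistor-count point. The defect density and die areas below are hypothetical values picked only to show the trend.

```python
import math

def poisson_yield(die_area_cm2, defects_per_cm2):
    """Fraction of dies expected to have zero random defects (Poisson model)."""
    return math.exp(-die_area_cm2 * defects_per_cm2)

if __name__ == "__main__":
    defect_density = 0.2  # hypothetical defects per cm^2
    for area in (1.0, 2.0, 4.0):  # hypothetical die areas in cm^2
        print(f"{area} cm^2 die: {poisson_yield(area, defect_density):.0%} yield")
```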
Amdahl's law
My SSD based laptop boots a lot faster than Windows 3.1.
As far as "planned obsolescence", I'm running Windows 10 on a Core 2 Duo 2.66Ghz laptop with 4Gb of RAM - a computer that was first sold in 2009. It runs my Plex Server and my PlexConnect server.
My mom still uses my 2006 era Mac Mini (Core Duo 1.66) with Windows 7, Office, and Chrome. It has 1.5Gb of RAM. When I go home and use it, it's not unusable as long as you don't try to run too many things at once.
My secondary laptop that I keep upstairs is a circa 2009 2Ghz Pentium Dual Core with 4Gb of RAM running Windows 7. In day to day use, the only thing wrong with it is a battery that won't hold a charge.
You can accuse MS of a lot of things, but not optimizing Windows to run well on fairly old hardware isn't one.
In fact, process improvements are critical as a valid source of performance gains.
That's pretty much Intel's entire chip development model...
For a long time, Intel and Microsoft Windows have ruled the computing world. At the bottom of that platform has been Intel's instruction set architecture.
Intel leaped from 16-bit to 32-bit architecture and then from 32-bit to 64-bit but the basic execution model remains the same. Most of the advances that Intel have done from the Pentium onwards in the early '90s have been stopgaps to get as much out of the execution model, but still being limited by it.
There are other processors out there, DSPs, that are much faster than x86 at specialized tasks by making them pipelined and parallel. GPUs could be seen as massively parallel DSPs.
But raw computing power is not the problem. The problem is to run general-purpose code well - and general-purpose code has many branches between code paths and that can't be parallelized.
A company called Mill Computing is working on a general-purpose CPU architecture inspired by DSPs and by what they think the Intel IA-64 (Itanium) should have been.
By being vastly different in several significant ways from x86, they claim to be able to achieve a significantly higher performance per watt and performance per clock overall than Intel and AMD's x86.
"We mustn't be caught by surprise by our own advancing technology" -- Aldous Huxley
They are so cute!
Table-ized A.I.
...that /. was starting an 'Explain Like I am 5' section, just like Reddit.
The idea is not to develop a CPU chip for all performances but to develop a family of chips that each are specialized to a specific performance.
It's like building a computer, you could put all of the computer functions on one chip but they separate the functions onto multiple chips. Video controllers, bus controllers, memory controllers, cache controllers, I/O controllers and such.
What happens if you build several sub CPU architectures. They could be connected via an optical bus.
How about super fast memory architecture, super fast I/O, and super fast video. Maximize the reliability and predictability of each sub architecture. You could then incrementally change each sub architecture over time while retaining reliability. This flies in the face of the disposable device concept.
The main reason is money. Each generation costs billions to develop and produce, and manufacturers are going to make sure they get a return on their investment. These investments stretch back years, and designs have to be made with assumptions about what will be workable at the current process node at the time the chip is ready to produce. That said, not quite all the low hanging fruit has been picked yet. Ryzen could not carry a 50% IPC improvement over the FX if there was nothing left to work with. Maybe this means treating transistors as cheap and power consumption and time as the hurdles, and moving back into a true CISC paradigm. Less microcode, more dedicated logic circuits. There was a very long time when transistors were considered valuable, and designs tried to optimize so that they would all be in use as much of the time as possible. Now we have the reverse problem -- power is dear (on battery-powered devices), heat is a killer, but idle transistors are quite tolerable.
Meanwhile Intel chips away with 5% here, 8% there, and continues to make money hand over fist. Their main motivation has always been to make money, and since they have proven able to do so without amazing leaps, they'll ride the slow train of progress rather than staking the company on a complete overhaul the way AMD is forced to do every five years or so. I'm still sporting a desktop with a 1090T, and this is the first thing AMD has done since 2011 (when I built this) that actually makes me sit up and say "wow, I want that".
My laptop is on the slow side, but I didn't get it for heavy lifting. I don't see (currently, who knows down the line a bit) that AMD has done anything to make me want to change it. Nor has Intel. 5-8% improvement per generation, times four generations, would appeal if I needed the muscle, but I'm rocking a Haswell 1.4 GHz dual-core Celeron. Getting an i3 board (which can be had for about $100) instead would be much more cost-effective than buying new. Heat, noise, and battery life are all pretty well acceptable, even if they have continued to improve since.
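For a sense of what "5-8% improvement per generation, times four generations" adds up to, here is a quick sketch. It assumes the gains compound multiplicatively from one generation to the next, which the post does not state; the function name is made up for illustration.

```python
# Compound a fixed per-generation speedup over several generations.
# Assumption (not from the post): each generation's gain multiplies
# the previous generation's performance.

def compounded_speedup(per_generation_gain: float, generations: int) -> float:
    """Return the total speedup factor after `generations` steps."""
    return (1.0 + per_generation_gain) ** generations


low = compounded_speedup(0.05, 4)   # 5% per generation
high = compounded_speedup(0.08, 4)  # 8% per generation

print(f"5% x 4 generations: {low:.2f}x")
print(f"8% x 4 generations: {high:.2f}x")
```

Under that assumption, four generations come to somewhere between roughly a fifth and a third more performance, which fits the "would appeal if I needed the muscle" framing.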
Another major reason is that for massive number-crunching tasks, the CPU is no longer the most important part of the system. The GPUs (plural) are, and they continue to advance at a fairly impressive rate because they're several nodes behind. (Those old foundries have to do something.) When (not if) GPUs start hitting the process node wall the way CPUs already have, then they too will start to drag down the pace of improvement.
How is the Riemann zeta function like Trump rallies? Both have an endless number of trivial zeros.
I wouldn't say I use bloated crap out of laziness. At my work it's considered "best practice" to build software on top of a jenga tower of bloated, obtuse bullshit that neither they nor I could hope to understand.
The CIV games make young minds think that technological breakthroughs are simply a matter of money and time, then BANG tech advance!
Somebody needs to start airing "Connections" again: http://topdocumentaryfilms.com...
How do you get 240k MIPS for a modern CPU? That's 60 to 80 instructions per cycle.
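Spelling out the arithmetic behind that objection, as a rough sketch: the 3 GHz and 4 GHz clock rates are assumed here (the post gives none), and the MIPS figure is treated as if one instruction stream produced it.

```python
# Convert a MIPS figure into instructions per clock cycle.
# The clock rates used below are assumptions, not numbers from the thread.

def instructions_per_cycle(mips: float, clock_ghz: float) -> float:
    """MIPS is millions of instructions per second; GHz is billions of cycles per second."""
    instructions_per_second = mips * 1e6
    cycles_per_second = clock_ghz * 1e9
    return instructions_per_second / cycles_per_second


ipc_at_4ghz = instructions_per_cycle(240_000, 4.0)
ipc_at_3ghz = instructions_per_cycle(240_000, 3.0)

print(f"240k MIPS at 4 GHz: {ipc_at_4ghz:.0f} instructions per cycle")
print(f"240k MIPS at 3 GHz: {ipc_at_3ghz:.0f} instructions per cycle")
```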
The gates are now so small that the electron wave function has a pretty high probability of being "on the other side" of the gate. As gates shrink, leakage power goes up very rapidly. Even when they're "off", the gates are consuming too much power (leaking it to ground.)
Also, think about 5 GHz, IBM's fastest chips. At 5 GHz, the clock period is 200 picoseconds, and a 10 deep pipeline can allocate about 20 ps to each gate transition. That's a lot to ask, given that resistance and capacitance don't scale down linearly with dimensions. You also have to populate your chip with a lot of decoupling capacitors in order to hold the charge locally for each transition (because you can't get the power from off chip in 20 ps.) To fight the increased RC load (proportionally) you're putting in more buffers (big amplifiers).
As if that weren't enough, you have the fact that a 14 nm gate is about 20 silicon atoms across. When you start doping the substrate, your actual behavior is all over the place because one or two more dopant atoms represent a 10-20% shift, up or down (total shifts of 40-50%.)
So, your gates are too small, they all behave differently, they have to drive a relatively larger load, and the suckers are too hot.
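The 5 GHz timing figures above work out as follows. This is only a sketch: it reads the "10 deep" figure as ten gate transitions that must fit into one clock cycle, which is an assumption about what the poster meant, and the function names are made up.

```python
# Timing budget at a given clock frequency.
# Assumption: "10 deep" means ten gate transitions share one clock period.

def clock_period_ps(freq_ghz: float) -> float:
    """Length of one clock cycle in picoseconds (1 ns = 1000 ps)."""
    return 1000.0 / freq_ghz


def budget_per_transition_ps(freq_ghz: float, transitions_per_cycle: int) -> float:
    """Time available to each gate transition if the cycle is split evenly."""
    return clock_period_ps(freq_ghz) / transitions_per_cycle


period = clock_period_ps(5.0)
budget = budget_per_transition_ps(5.0, 10)

print(f"Clock period at 5 GHz: {period:.0f} ps")
print(f"Budget per gate transition: {budget:.0f} ps")
```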
Competition (academic and free market) makes big jumps unlikely.
Most of the improvements that any one company is trying to make to get 2X or more performance have already been made, by the time they get to market, by other companies trying to beat them to market. Only a percentage of the things they manage to do differently (perhaps things that other companies didn't think were worth doing) differentiate the performance of any one company's product.
Yes, but I think you're missing the point that the OP is really making: they are asking why improvements to processor speed are so danged incremental. Processors are maybe 200 times faster now than they were 25 years ago, but the point is that we got here, so it was physically possible. What stopped us from condensing the last 25 years of progress into 5 years? Or 1 year? Why is the progress of Moore's Law supposedly so inexorable? Does this indicate a "learned helplessness" of the industry, transitioning from the view that Moore's Law was an interesting phenomenon that arose from the industry without collusion, to the point where it now dictates what the product targets should be for this year and next year? Why is nobody trying to dramatically outstrip Moore's Law? Is it even possible to jump more than one process node ahead at a time, or increase IPC by an order of magnitude at a time rather than by a small percentage?
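To put a number on "incremental", here is a small sketch that turns the "200 times in 25 years" figure into an average yearly factor and a doubling time, and shows what yearly factor the same gain would need if squeezed into 5 years. It assumes smooth exponential growth, which is a simplification, and the function names are made up.

```python
import math

# Average growth implied by a total speedup over a span of years.
# Assumption: growth is a constant exponential rate over the whole span.

def yearly_factor(total_speedup: float, years: float) -> float:
    """Constant per-year multiplier that reaches total_speedup after `years`."""
    return total_speedup ** (1.0 / years)


def doubling_time_years(total_speedup: float, years: float) -> float:
    """Years needed to double performance at that constant rate."""
    return years * math.log(2) / math.log(total_speedup)


actual = yearly_factor(200, 25)
doubling = doubling_time_years(200, 25)
condensed = yearly_factor(200, 5)

print(f"200x over 25 years: about {actual:.2f}x per year")
print(f"Doubling time: about {doubling:.1f} years")
print(f"200x over 5 years would need about {condensed:.2f}x per year")
```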
When too much money is invested in the status quo, you are much more likely to see a slightly improved status quo next year rather than something completely different. Look at the resistance to changing our health care, our education system, our infrastructure, our.... Only when some newcomer finds a new way to do something and starts cleaning their clocks...do the entrenched players try to switch gears.
Why should a company who did all the hard work face competition from new brands?
Former cpu and gpu staff starting their own brands?
The way to stop that is to control the entire sector. No advance game or codecs will be offered to support any new start ups.
Anything tech that is useable and considered free will be open sourced by the original brand to control, brand and shape the free end of the market.
Zilog https://en.wikipedia.org/wiki/... pricing spreading around the world was the reason why the CPU and GPU market is very careful about advances.
Domestic spying is now "Benign Information Gathering"
Intel is up to their shady tactics again with AMD's new Ryzen release. Maybe not outright paying off computer makers; this time they are sponsoring reviewers. The reviewers jump through all kinds of hoops to make sure that Intel is on top of the benchmark graphs, and their reviews read like an Intel marketing brochure. None of the reviewers disclose that they are sponsored by Intel.
Examples of oddities from reviewers that are sponsored by Intel.
1) Tom's Hardware: Complains about the power consumption being higher than spec, leaves out that the result was from an overclocked test and an MSI board that has an additional CPU power.
2) GamersNexus (one of the worst of them)
a) Had to compare the 1800X to 6 different Intel processors that were overclocked, with the 6900K overclocked by 700 MHz.
b) Only one AMD processor was OC'd, by -100 MHz (yep). Their OC vs. stock results were almost exactly the same.
c) Makes the 6900k pop on the top of the benchmarks.
d)1800X only loses 6 vs 8 to the Intel 6900k at stock speeds. With only 2 benchmarks with the 1800x losing by more than 7fps.
e)Pretty much all benchmarks by the same author never included OC tests, but suddenly he had to compare it to 6 different OC benchmarks. http://www.gamersnexus.net/gam... http://www.gamersnexus.net/gam...
f) Outright lied, saying AMD told him not to benchmark Ryzen at 1920x1080. AMD just asked him to benchmark at multiple resolutions, not just 1080p.
I use the computer in such a manner as I routinely run into the Windows out-of-memory error on the default settings. www.bing.com/search?q=desktop+heap+out+of+memory Without making available hardware that can allow users to change their behavior more readily much fewer advances will be made as the two drives reinforce each other.
Now with my point out of the way, I like keeping large numbers of browser tabs and windows open and navigate between them at my leisure. I also have a huge amount of bookmarks that I navigate through via the toolbar.
You are the definition of "A little knowledge is a dangerous thing."
Smart enough to be cynical, but not smart enough to offer any evidence. If you worked in silicon, I'd like to hear your story.
Otherwise, shut your piehole and let the adults talk. Want to blame it on the illuminati or aliens? Stuff it up your arse.
In the 1990s, we had the megahertz wars. Beginning in the 2000s, we had the core wars. Now, we have the wattage wars. Other performance measures have stagnated as manufacturers try to reduce power consumption to give us laptops and tablets that go for 10+ hours on a charge.
Sent from my iPhone
Honestly, go read textbooks. This isn't some big cover-up, increasing performance is _hard_, it takes hard work, there's not some Slashdot poster who knows the magical answer. If you literally can't spend the hour and a half to read Ars Technica articles about the complicated GPU or CPU pipelines, then it's not like a pithy three-sentence Slashdot post is going to enlighten you.
He is comparing historical mips with modern.
In linux they call it "bogus MIPS" ... a computer in 1990 did perhaps 100 "bogus MIPS" and a computer now does 500,000 ...
The literal meaning of MIPS however is million instructions per second ...
Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.
> trying to teach some of the programmers out there how to program effectively on the various parallel platforms is harder than trying to alter physics.
Which could also be phrased as:
So far, many of the parallel platforms available are much harder to learn.
Programmers can and do learn new and different ways of working, provided that the new ways don't suck.
C, Java, etc. are all imperative, scalar and object-based languages. SQL is a completely different paradigm, declarative and set-based. In other words, in most programming languages the programmer tells the computer how to do some task, with some value. In SQL, the programmer tells the computer what the result must be - without specifying how to do it, and all fundamental operations work on sets, not individual values. Yet most programmers can and often do learn the declarative, set-based way of programming just as well as they learn the classic imperative way. They learn two very different ways of thinking and programming, because SQL is reasonably good - it's quite learnable, with or without understanding the underlying mathematical concepts.
There's no fundamental reason you can't have a parallel programming language or library for general purpose programming that's roughly as easy to use as SQL. In fact, SQL may point the way in many respects - besides being a learnable paradigm, it's fundamentally parallelizable precisely because the fundamental operations all use sets as input and output. All the major operations could easily be completely parallelized behind the scenes and the user (programmer) wouldn't have to know or care.
Maybe that's the way to go, since we know programmers can and do use sets - introduce a set-based general purpose language. To avoid leading programmers into temptation, the language should have no loop constructs. With no capability to run this:
foreach blah in group {
result[i++] = do_stuff(blah);
}
programmers will quickly learn to instead write:
results = do_stuff(group);
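For what it's worth, the `results = do_stuff(group)` idea can already be approximated with a library. What follows is only a sketch in Python: `apply_to_set` is a made-up helper name, `do_stuff` is a stand-in body, and a thread pool is used purely so the example runs anywhere. For CPU-bound work in CPython you would likely swap in a process pool.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def apply_to_set(func: Callable[[T], R], group: Iterable[T], workers: int = 4) -> List[R]:
    """Apply func to every element of group; the caller writes no loop and no index.

    How the work is spread over workers is hidden behind this call.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in input order, whichever task finishes first.
        return list(pool.map(func, group))


def do_stuff(blah: int) -> int:
    """Hypothetical per-element work; here it just squares its input."""
    return blah * blah


group = [1, 2, 3, 4, 5]
results = apply_to_set(do_stuff, group)
print(results)
```

The catch is the same one SQL has: this only parallelizes cleanly when `do_stuff` doesn't touch shared state, which is exactly what the `result[i++]` loop invites you to do.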
Instinct tells me we're nearing peak optimization in this industry, so it's not really possible to realize gains of that magnitude without creating a new industry (e.g. going from binary computing to quantum computing).
So I tried to think what other kinds of industries are making announcements every 2 years showing 4x-8x gains, and I can't think of any... So, why is this an expectation here? Where else is this happening? medicine? transportation? agriculture? Are comedians 4x as funny as they were two years ago? Are hamburgers 8x as satisfying as they were two years ago? Is NASCAR finishing races 6x sooner?
Is Vodka making you 4x as drunk?
The first poster gave the answer to all this:
Physics!!!
Why don't you read on Wikipedia how a processor is made? You probably grasp immediately that we are right now at the point where we can not make them smaller, hence we can not make them faster.
Oh .... I did not read this line from you till now, forget my comment above:
Why is nobody trying to dramatically outstrip Moore's Law?
Because no one is working on flying faster than the speed of light, too.
Is it even possible to jump more than one process node ahead at a time, or increase IPC by an order of magnitude at a time rather than by a small percentage?
No, it is not. How would you accomplish something you don't know how to accomplish? Huh?
Make a 100 yards sprint. Measure your time. ... good luck. Or explain to me how you plan to be twice as fast in a year. A human being that can sprint, simply can not double its speed, regardless how long and hard it tries ....
Then explain to me how you plan to be twice as fast in a week
What are you comparing to, for these leaps of functionality, ability and/or speed? Although many of the modern enhancements are simple 100MHz increments. Which is more than CPUs used to be able to do. We started PCs at 4MegaHertz. We got to 100MHz with Pentium-2 and Pentium-Pro (I think) then 1GHz with the Piii. We can make 10GHz systems today. No users (well, normal people) will buy them. Because they don't want to pay $50k for the latest and greatest processor. Not to mention having the 30-ton air conditioning in their purpose built computer room and having the power company install a SECOND connection just for the room as well. Oddly enough, except for the super computer business and the military there are really a limited number of customers for such. Satellite companies routinely use the really expensive stuff but they are used to paying tens of millions of dollars for one unit, indeed up to hundreds of millions. And their gear is notoriously hard to service.
The public (as in customers) accepted the move to more cores as the solution, as they were unable to accept the massive heat from 5GHz and up. Also, look into ECC 'buffered' RAM. Great for enterprise, less so for power, heat and cost - for you, me and everybody we know.
Now we are easily running at multiple GHz. TODAY you can buy a 4GHz system with 8/16GB of RAM, a reasonable video card & 1TB HDD for around $1K. I spent $6k on a PC-AT. Count your blessings (and get off my lawn ;-).
Multiple cores, really fancy floating point, huge graphic gains and all running at lower power than just five years ago. I had friends that were heavy into graphics and they routinely bought 1000Watt PC power supplies because they had to. Now, they're not necessary unless you are doing multiple cards or some other form of heavy computing.
Past that, read up on physics and semi-conductor processes. Electro-migration is a fun one. Atoms just moving around because they want to (actually quantum mechanics) at the finest layers. Remember that each new process essentially enables creating a new layer underneath the existing version. Things get tiny real fast that way.
But you don't have to look to future software for this.
ASIC design languages create designs that are explicitly parallel, and they do it easily. Sure, there are synchronizations that have to happen, but that may not apply to much of the design. They are explicitly event-oriented, and combinational (when this event occurs, do one of the following things depending on the state of these other two signals). I have sometimes been amazed at how quickly, in how small a description, and with a full test suite, a good digital designer can implement some algorithms compared with an embedded 'C' programmer.
And the worms ate into his brain.
Cores are a foreign concept to you, obviously.
"National Security is the chief cause of national insecurity." - Celine's First Law
A modern processor has many different parts and technologies in it. You might make a huge leap in one area - lithography, reducing internal resistance or gate switching time - but it won't increase your overall performance by very much, because one of the other parts will then become the bottleneck.
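The bottleneck point can be put in numbers with Amdahl's law, which the post does not name, so take this as one way of framing it rather than what the poster had in mind. The 30% share below is a made-up figure for illustration.

```python
# Amdahl's law: overall speedup when only part of the work gets faster.
# The fraction and speedup values used below are illustrative assumptions.

def overall_speedup(fraction_improved: float, local_speedup: float) -> float:
    """Total speedup if `fraction_improved` of the run time becomes `local_speedup` times faster."""
    untouched = 1.0 - fraction_improved
    return 1.0 / (untouched + fraction_improved / local_speedup)


tenfold = overall_speedup(0.30, 10.0)       # a 10x leap in a part that is 30% of run time
limit = overall_speedup(0.30, 1_000_000.0)  # an almost infinitely fast version of that part

print(f"10x leap in 30% of the work: {tenfold:.2f}x overall")
print(f"Near-infinite leap in that 30%: {limit:.2f}x overall")
```

Even a breakthrough that makes one part effectively free cannot push the whole chip past the share of time spent everywhere else.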
Well, the C64 didn't really do anything on boot - mostly initialize the 40 character x 25 line display and jump to Basic and start executing. The kernal was custom written for one hardware config, and didn't have to work with thousands of different pieces of hardware. No internet, no services at all to run (because no multi-threading). Those machines were extremely simple, and really can't be compared to today's Mac, Linux, or Windows OS's.
But modern machines are about 10000x faster. Needless complexity aside, it's just not that much more complicated. Whatever is hardware-specific, cook that up when the hardware changes - how often does that happen? - and park it ready for fast boot again.
We just suck at "fast".
Socialism: a lie told by totalitarians and believed by fools.
My girlfriend asked what laptop she should buy. There was a time when I would have had all kinds of answers, maybe even fixup her old laptop with Linux or something to squeeze a couple more years out of it. That was then.
To save trouble, I just gave her a Chromebook. I know very little about them. But I know they just work, at a fraction of the cost of anything else. She can check her work schedule, do online shopping, watch Netflix, etc. And I don't have to be bothered!
I don't have to mansplain to her, figure out why her network connection wasn't working, or how to install extensions so she can browse safely, or one of a million things that happen when an ordinary person uses a real computer and real OS. I could have given her a top of the line, tricked out Dell, or Asus, or whatever. She wouldn't have been any happier or any more satisfied.
So now, when anyone (other than a STEM student) asks what computer they should buy, my stock answer is Chromebook.
Have a look at the progress in semiconductor process size. To you as an end-user, this looks like a fairly smooth curve. What's hidden behind that is tens of thousands of engineering breakthroughs, as the physics change radically as you go down the size. The second thing is that going from a great idea to a mass market product takes time as well. There are many ideas for radically more efficient technologies, but it takes years, or even decades, to create the tools, fab lines, and expertise to produce them.
Even when there are no engineering obstacles, there are often other barriers to adoption, such as education, preferences, and backwards compatibility. It took decades for technologies like garbage collection, OOP, and runtime typing to be accepted by industry, and even today, many developers are still reluctant to adopt functional programming. And declarative programming, logic programming, and FPGA programming are still niche technologies.
For certain operations, AVX made a huge difference. AVX2 made an even huge-r difference. Depending on what you're doing, you can see a 2x to 10x speedup on the outside vs. using a chip without AVX2 with similar performance characteristics.
My Other Computer Is A Data General Nova III.
Your mistake was clicking the links on Drudge to infowars.com. Alex Jones isn't worth paying any attention to. I finally fixed that problem by blocking infowars.com at my router, because I was tired of looking up from the nuttiness of the text body to the top of the browser window and seeing I was at *that* site again.
There have been many breakthroughs in the PC industry, incredibly clever inventions which allowed things to move forward. And that's the thing, the smartest things in the industry don't make for a huge processing leap, they enable making progress at all. Each of these developments take years. Ideas may be simple, but implementing them, especially at the level required for mass production, is hard. Each development also requires more accurate tools. Also, complexity is now so high, that, as imgod2u said, even a huge change in some part leads to an overall small change.
So as others have said, physics, but I think the above is a more nuanced answer. I remember when people said that it wouldn't be possible to make transistors under a micron in size. The very fact that we've reached so far is miraculous.
It happened about ten years ago with the rise of GPUs for general purpose computing. Suddenly we could do a lot of things 10-100 times faster than before. You program GPUs really differently than CPUs, so we had to rewrite a lot of code and design new algorithms. But the benefit was huge.
It may be happening again with specialized chips for deep learning, like Google's TPU. These chips are designed for just one class of applications, but it's a really important class, and they can be 10x faster or more efficient for those applications.
There've been other times when a new generation brought a sudden major improvement in speed, like with vector units or multicore CPUs. But always at the cost of having to rewrite how your code works.
Now if you want new chips that work just like the old ones and run the same programs as before, just 10x faster, sorry. That isn't likely to happen. Huge jumps like that require major changes of approach.
"I'm too busy to research this and form an educated opinion, but I do have time to tell everyone my uninformed opinion."
It did indeed have a construct like:
Unfortunately, it was not American.
Sent from my ASR33 using ASCII
I think the real issue is, semiconductors are so competitive, the current shipping product is always very close to the state of the manufacturing and physics arts. Intel, AMD, nVidia, Samsung, Toshiba, Apple, and others spend billions pushing the processes and architectures to the limit in every product so it stays competitive as long as possible.
To get a 4x or 8x improvement in size, power, or speed would imply there's a revolutionary way to do things that we just don't quite know yet. And it better be something which can be quickly turned to production because Moore's Law hasn't stopped yet. If you have a 4x improvement idea but it takes five years to release, it won't get funded. Plain CMOS silicon has too good a chance of catching up.
There's plenty of times people rolled the dice on processor moon shots. I was at HP when Itanium was first developed (~95). We thought we'd have working silicon in a few years (~98 or 99) at the astounding clock rate of 500 MHz (oh, and that was potentially retiring something like 6 to 12 instructions per cycle, I forget the details). This was when a good Pentium processor ran at around 45 MHz. We thought Itanium was going to be so frickin' fast there was no way Intel could compete. Then AMD started a clock rate war, x86 got faster really fast, Itanium took much longer to produce than we anticipated, and the rest was history.
I think the bottom line is, it's really hard to produce a system which really is even 2x faster than the competition. 4x is incredible and 8x probably has never been done.
As an analogy, consider cars and mileage. My car, a diesel Passat (which shortly will not be road legal :() actually exceeds 50 MPG on a good day. What would it take to make a car which gets 100 MPG with a 600 mile range? How about 200 MPG? With no compromises? And a sales price of $28k? It's pretty hard to imagine.
$$$ - If you jump from 50nm to 5nm you get paid well once. If you go 50nm to 45nm to 40nm - you get paid every single year.
In another 18 months, the whole planet will own a computer, and we have not got a practical way of exporting to the Klingons yet.
yield is more closely related to area than gate count.
Complete nonsense, competition in electronics hardware is blood curdling. There are no monopolies that are not five minutes from extinction. Companies that last more than a few years are as rare as rocking-horse droppings. Apple is admittedly a bit of an exception but even Apple is probably about to fall off the edge, after all they are just Nokia with better marketing.
Facts are history now plebs have politics for religion on social media.
And my IBM x3650 takes more than five minutes, testing hardware, looking for boot devices, and enumerating raid drives. But that's not a problem, nor the issue here.
Functions that do the same thing as functions did in the past now run slower, even with much faster hardware. We need Moore's law just to compensate for the increased bloat.
Bring out a Windows 98 disk and install it in a VM. Marvel at how snappy it is with modern CPUs, even though it can't take advantage of the extra instruction sets and only uses a single core. Then compare similar programs. You don't have to go farther than minesweeper, where the old Windows version is instantaneous, and modern implementations that have simpler graphics (metro, material design, call it what you like - shading and textures are gone) are dead slugs. Similar for simple command line programs, where you find examples that were a few hundred bytes and ran near instantaneously, while the modern equivalents that can't do more take up dozens of megabytes and you get to sip your coffee between hitting return and something happening.
Moore's law just isn't able to keep up with the bloat acceleration.
If they push too far they risk not being able to jump again in time before profits die. Incremental upgrades provide steady income.
The Luxor ABC80 took less than a second to become ready back in '79.
If builders built buildings the way programmers wrote programs, then the first woodpecker would destroy civilization.
What I want to know is why no-one is trying to make computers with thousands / millions / billions / trillions of processors and similarly large numbers of connections.
The manufacturing would have to be very different, possibly self replicating processors, biologically inspired.
One of the reasons computers aren't good at what humans can do: image / speech recognition / language processing etc, is that they literally don't do nearly as much processing.
There's only so much speed, disk space, and internet connectivity you can throw at the problem to make do while still using shortcuts and picking low-hanging-fruit problems.
The 80/20 rule is fine. But if you want 100 percent of the results, you need to do 100 percent of the work.
At this point, fast, faster, fastest processors is a linear solution (or shallow enough), that they still won't get close to the processing power of our brains with 100s of billions of connections for a while yet.
Even our ears take thousands of audio inputs from tiny hairs before our brain starts audio processing. Microphones still work with a single membrane right?
I think in part that the processes are now so complex, you actually need the previous generation of CPUs / GPUs to design the next generation. It would be hard to skip generations if your current generation can't even run the CAD software, let alone validate a design before putting it on a wafer.
Central high power cloud machines are just a disaster waiting to happen, how many times does this have to be proven.
Once would be a good start. Do you really think that people are not designing fault-tolerant network infrastructure?
Those who advocate genocide deserve every protection afforded by law, and none afforded by common human decency.
Laziness is a virtue in a programmer.
The whole point of this profession is to save labor. That includes programmer labor, especially because it's an expensive commodity.
I don't know who has mod points today but this comment is frankly ridiculous.
Linux Bogomips are just a measure of how fast a single core runs a delay loop. The kernel uses it to busy wait for short intervals. It doesn't scale with number of cores, and is usually close to the nominal clock rate.
I'm quite familiar with cores, thank you. That CPU still cannot retire 10 IPC per core.
Of course, you used DMIPS without saying so, which is only the most common synthetic measure of CPU integer performance, so Intel has had 3 decades of experience gaming its results.
> Maybe that's the way to go, since we know programmers can and do use sets - introduce a set-based general purpose language. To avoid leading programmers into temptation, the language should have no loop constructs. With no capability to run this: foreach blah in group { result[i++] = do_stuff(blah); }
> programmers will quickly learn to instead write: results = do_stuff(group);
I agree, but I think you've taken it a step too far here. Look back at maths and how things like sigma summation and similar things like the product function work. Because of the mathematical properties of these, they are order independent, and inherently parallelisable.
Eliminating loops doesn't mean eliminating a "foreach" -- it just means treating each instance of the block as its own scope, and ensuring that no instance can access the variables of another instance. (Talking "instances" instead of "iterations" immediately says it's not a logical loop, even if the computer running it realises it as such simply due to lack of parallel capacity.)
The problem with this is that you then have to combine the results, so you either need to treat the whole block as an inline procedure and end with a return statement, or you treat the block as a function, and now we're into functional programming.
Basically, this sigma-style programming would be logically equivalent to carrying out a map followed by a reduce... and map-reduce has become such an important concept in server programming specifically because of this inherent parallelism. The thing is that current map-reduce renders code to the programmer in a totally different style to what they're used to. There are parallel programming environments that do render parallelised blocks in a C-inspired way, and surely that's the most obvious approach...?
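As a rough illustration of the "sigma is a map followed by a reduce" point, here is a sketch in Python. The names `sigma` and `product` are made up for the example, and it uses integers, where addition and multiplication are exactly associative (floating point is only approximately so).

```python
from functools import reduce
import operator
from typing import Callable, Iterable

# Sigma- and pi-style operations written as a map followed by a reduce.
# Each call to `term` sees only its own index, so instances cannot
# touch each other's variables.


def sigma(term: Callable[[int], int], indices: Iterable[int]) -> int:
    """Sum term(k) over the indices: map each index, then reduce with +."""
    return reduce(operator.add, map(term, indices), 0)


def product(term: Callable[[int], int], indices: Iterable[int]) -> int:
    """Multiply term(k) over the indices: map each index, then reduce with *."""
    return reduce(operator.mul, map(term, indices), 1)


def square(k: int) -> int:
    return k * k


whole = sigma(square, range(1, 5))                           # 1 + 4 + 9 + 16
in_two_chunks = sigma(square, [1, 2]) + sigma(square, [3, 4])  # partial sums combined
factorial_5 = product(lambda k: k, range(1, 6))

print(whole, in_two_chunks, factorial_5)
```

The chunked version giving the same answer is the whole trick: because the combining step is order independent, each chunk could be handed to a different worker and the partial results merged afterwards.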
Got them moderator blues I blieve I walk out the do', With these mod-points I been gettin', I 'most never post no mo'
Physics is the answer.
You cannot change the way the light used for the lithographic processes is bent and diffracted ...
If you want to make smaller masks, you need shorter wavelengths to do the "photographic" processes to mold the chips.
We are on the edge that we are using UV and X-Rays ... there is simply no real way of improvement anymore in making gate sizes smaller.
Go figure and read some articles ... sigh: PHYSICS!
You have provided no sound explanation as to why engineering processes,
Why should I? It is completely clear from wikipedia articles.
time and experience are exactly stuck in lock-step with Moore's Law.
No, they are not. "Moore's Law" stopped a decade ago, probably 2 decades.
This is so ignorant, I don't even...
Image processing. Other types of >1D signal processing. Neural nets. Basically any task that benefits from more than one worker.
I can't believe you were modded up. Are all the mods here today script kiddies?
Your next CPUs are going to be made of coal. Screw this silicon crap. We are going to have clean coal, fast coal, edible coal, fried coal, chocolate coal, coal coal...
"TV, a medium as it is neither rare nor well done." Ernie Kovacs
Note how that last comment refers to "the average cloud-ready server"...
NO competent large scale developer would ever even think in terms of a "cloud-ready server"! That's exactly what I meant by technological refactoring. It IS happening but we're not bothering to notice it. (Other than some fat-fingering maintenance at Amazon last week) We have uptime expectations, performance expectations, that were impossible a few years ago.
As younger generations of developers move in to replace older ones, the loss of implicit and limiting assumptions of the older ones will allow for newer ways of thinking about the problem space. That is where the stepwise improvements will come from, just as they have been arriving all along.
They could have to change completely how the processor is made and use new materials and that's too expensive to experiment with when slight increases are a sure thing aka the safe bet.
Bad code, and bad coders, sure are very common. I can absolutely relate to what you said.
Bad code isn't in any way limited to declarative programming, or imperative, or procedural, functional, object-oriented ...
Poorly educated programmers can make a mess in every paradigm, and those who continually study for many years to become highly competent can be highly competent with any paradigm. The language or paradigm isn't what makes the difference.
Right now, at work, I'm fixing some bad SQL written by the founder of the company. (Who wrote a huge system by himself in a hurry, with limited programming knowledge.) Some of the SQL is pretty bad. I'm also fixing his bad Perl, which is even worse. Procedural programming (Perl, Java) didn't make him any better or worse than he was using SQL.
If a programmer can learn to use C, Perl, Java, or Erlang well, they can learn to use SQL well. If they can learn to use C, Perl, Java, or Erlang poorly, they can learn to use SQL poorly.
My old Acorn booted to the desktop in 5 seconds, give or take, back in 1992. There's a lot to be said for storing the OS on a ROM chip but these days we have SSDs instead.
If God forks the Universe every time you roll a die, he'd better have a damned good memory.
While we're talking about sad xmas future, I should probably have mentioned the dull tinsel of Intel's leaked Optane SSD datasheet, which gives Optane an implied durability of 32,000 write cycles.
From the back of a recent napkin:
I estimated a price on the (rumoured) Optane DC P4800X of $3/GB and got $16/day as the cost of sustaining a peak 2 GB/s write bandwidth. (Took no account of write amplification, which is probably very low.) Unfortunately, on this workload (not warranted) Intel's new shiny will need to be replaced every 70 days.
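The napkin arithmetic above can be reproduced roughly as follows. This is only a sketch: the comment does not state the drive's capacity, so the 375 GB used here is an assumption (picked because it lands on the quoted ~70 days), and write amplification is ignored, just as in the comment.

```python
SECONDS_PER_DAY = 86_400

def endurance_days(capacity_gb, write_cycles, write_gb_per_s):
    """Days until the rated write cycles are used up at a sustained write rate."""
    total_writable_gb = capacity_gb * write_cycles
    return total_writable_gb / write_gb_per_s / SECONDS_PER_DAY

def cost_per_day(capacity_gb, price_per_gb, days):
    """Purchase price spread evenly over the drive's service life."""
    return capacity_gb * price_per_gb / days

CAPACITY_GB = 375  # ASSUMPTION: capacity is not given in the comment

days = endurance_days(CAPACITY_GB, write_cycles=32_000, write_gb_per_s=2.0)
daily = cost_per_day(CAPACITY_GB, price_per_gb=3.0, days=days)
print(f"replace after about {days:.0f} days, about ${daily:.2f} per day")
```

Note that the capacity cancels out of the cost-per-day figure (it reduces to price per GB times bytes written per day divided by write cycles), so only the replacement interval depends on the assumed 375 GB.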
The benchmark result I'm waiting to see is serving NFS from a ZFS server with all NFS traffic set to synchronous write (as the specification requires, but many fast and dirty and redhatty OSes kind of ignore). The low write latency at low queue depth ought to be a godsend in this application.
Users will see little difference—except when their files fail to corrupt or disappear at the previously established baseline rate. No, first-generation Optane SSD can't do it faster than 3D Flash under current administrative practice, but perhaps it can do it faster while remaining correct.
It's weird in this world how figures of merit squish sideways.
ZFS servers are often tuned to sustain a high fraction of a pool's peak write bandwidth (reads are assumed to be heavily cached in memory and ARC).
On current specifications, Optane SSD is worthless for a ZIL SLOG write cache. Just not enough write cycles.
Carbon nanotube–based Nantero NRAM has demonstrated 10^12 write cycles in the lab, and that's just the present lower bound (testing takes a while at this elite altitude).
Notes from a recent napkin:
* Essentially zero power consumption in standby mode and 160x lower write energy per bit than flash.
* Small number of process steps.
* Read/write same speed as DRAM.
* Memory retention > 1000 years at 85C.
But the early devices will be limited to 32 MB (not GB) and the only application presently profitable for such a small device is embedded SOC.
This technology is probably a real thing, any day now.
That still leaves a giant hole where ZFS ZIL SLOG roams the earth. The bidding would start roughly here:
* 16 GB
* 2 GB/s sustained write bandwidth
* 10^6 guaranteed write cycles
* memory retention > 10 years at 85C
Ideally, that would be delivered from a single chip.
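For a rough sense of what that opening bid buys, here is a small sketch of how long 10^6 write cycles on 16 GB would last. The duty cycle (the fraction of peak write bandwidth actually sustained) is a hypothetical parameter introduced for illustration; the comment gives no such figure, and write amplification is ignored.

```python
SECONDS_PER_DAY = 86_400

def days_until_worn_out(capacity_gb, write_cycles, peak_gb_per_s, duty_cycle):
    """Days of service before the guaranteed write cycles are exhausted.

    duty_cycle is a hypothetical knob: the fraction of peak write
    bandwidth that the log device actually sustains.
    """
    total_writable_gb = capacity_gb * write_cycles
    gb_written_per_day = peak_gb_per_s * duty_cycle * SECONDS_PER_DAY
    return total_writable_gb / gb_written_per_day

for duty in (1.0, 0.5, 0.1):
    lifetime = days_until_worn_out(16, 10**6, 2.0, duty)
    print(f"duty cycle {duty:.0%}: about {lifetime:.0f} days")
```

With these figures, writing at full peak around the clock would use up the guaranteed cycles in roughly three months, so the real lifetime depends heavily on how close to peak the log device runs.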
Proper NFS semantics over a snapshotty FS pretty much demand something like the ZFS ZIL SLOG (or buttery replacement). That's why I nominated this a primary holding in my bulging portfolio of persistent-memory desideratas. It's a real thing.
[*] snapshotty: also known as "ransomware resistant"
Seems to be a move towards low power devices. Cheap, low power, single board computers, like Raspberry Pi, and of course, mobile devices.
Maybe the industry is concentrating more on that sort of thing now?
We are at the end of what CMOS on silicon can deliver. Something new is needed but no one has had the nerve to commit to finding it. Because you will have to risk billions and you may never find the solution. Small steps instead of giant leaps because you don't know where you are leaping to.
I must say that FPGA design (using Verilog) helped me understand multithreaded programming enormously. Physically parallel circuit design is unforgiving, compared to multithreaded code that might work despite lacking the proper rigour.
OTOH, as a physicist I was already used to languages with inherent parallel math. For example Fortran has native vector/matrix math that proper compilers can parallelize automatically. I guess the physics background also helps to think about vectors as inherently parallel beings, because nature does not loop over dimensions.
Escher was the first MC and Giger invented the HR department.
As such, there are only small possibilities for further speed improvement, except in special scenarios. GPUs may still have a few generations with significant speed increases, but CPUs do not. There is room for optimizing cost and power consumption, but once that has happened, that is it.
Think of this like a hammer or an ax as we know them today: They are finished and there is no way to make them better with reasonable effort. However, do not forget that the steel used in them is the result of a few thousand years of optimization and that they are a very sophisticated product, simple though they may appear to be.
Incidentally, CPUs have not had any dramatic speed increases for about half a decade.
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
Oh, sure you can *ask* the guys at the factory to lift the lightspeed limit, but Nooooo.... "We'll ruin all the timing in the cards... management says it's the *rules*...we shed too much energy when we leave the dev room and get cold at lunch. Marketing doesn't like the blue glow of cherenkov radiation around the cards."
Whine, whine, whine....
Please do not read this sig. Thank you.
Not enough filthy rich to fund research & development of next generation computers
Not enough folks around who have the money to buy next generation computers
Now if some new "killer app" like the next Lotus 123 comes up? This may change but so far I've seen nothing on the horizon that would fill that slot.
A possible candidate for such a "killer app" for more powerful CPU and GPU would be AI.
There's a crazy amount of interesting stuff that can be done by training deep neural nets to do your work.
Some of this interesting stuff is more geared toward R&D that would run the workload on a high performance cluster anyway.
But some of this has very practical applications in everyday life (think RNN-LSTM in automatic natural speech transcription, think of all the assistants such as Siri, Cortana, Alexa, OkGoogle, Soundify, etc.)
The problem is that these neural nets have huge processing requirements to train, and even using them requires some processing power.
So currently, most of these AI assistants run remotely in the cloud.
But having more computing power locally could make such assistants more usable offline.
"Sufficiently advanced satire is indistinguishable from reality." - [Tips: 1DrYakQDKCQ6y52z6QbnkxHXAocMZJE61o ]
Conceivably, you could save the machine's status in flash each time there's been some sort of meaningful change, and reload that at turnon. You wouldn't have checked hardware for malfunctions, you wouldn't be connected to the internet, and your hard disks wouldn't be ready. Nonetheless, if someone designed a system properly, that system could at least pretend to be ready for use in a second or so.
How long did your C64's monitor take to produce a display? Unless the filament in the CRT was kept warm all the time, it wasn't ready in 0.1 seconds.
Contribute to civilization: ari.aynrand.org/donate
Only to its BASIC. :P
Ant(Dude) @ Quality Foraged Links (AQFL.net) & The Ant Farm (antfarm.ma.cx / antfarm.home.dhs.org).
That's nonsense. One critical factor is lithography. Optics, resists, etching processes, more precise steppers, etc. all need to be developed and in production before ICs get to a new process node. None of that requires state-of-the-art processors to design.
Things like Cadence's design software can run adequately on hardware several generations back. The slowness of old hardware is a damned nuisance, but it's not prohibitive.
Contribute to civilization: ari.aynrand.org/donate
https://www.researchgate.net/figure/238594798_fig1_Fig-1-The-number-of-transistors-per-microprocessor-chip-versus and many similar graphs show that Moore's observation of transistor count has been maintained at least through 2011.
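To relate that trend to the original question, here is a small sketch of how steady doubling compounds. It assumes the commonly quoted two-year doubling period for transistor count; that period is an assumption for illustration, and transistor count is not the same thing as speed.

```python
import math

DOUBLING_PERIOD_YEARS = 2.0  # ASSUMPTION: commonly quoted doubling period

def growth_factor(years, doubling_period=DOUBLING_PERIOD_YEARS):
    """Multiplicative growth after `years` of steady doubling."""
    return 2.0 ** (years / doubling_period)

def years_for_factor(factor, doubling_period=DOUBLING_PERIOD_YEARS):
    """Years of steady doubling needed to reach a given growth factor."""
    return doubling_period * math.log2(factor)

print(growth_factor(2))      # growth over one two-year generation
print(years_for_factor(16))  # years needed for a 16x increase
```

Under that assumption one generation gives a factor of two, and the 16x jump the submitter asks about corresponds to four doublings, or eight years, rather than a single two-year step.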
Contribute to civilization: ari.aynrand.org/donate
GPUs are still seeing notable performance increases because the problems you solve with a GPU are embarrassingly parallel. CPU progress has largely stalled because it's hard to get additional per-thread performance without clocking higher; the low-lying instruction level parallelism fruit is all gone. And the physics of the situation doesn't allow continuing to scale clock speeds the way they scaled from 1994-2002.
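One common way to put numbers on this contrast is Amdahl's law, which the comment does not name; it is brought in here only as an illustration. The parallel fractions below are made-up examples, not measurements of any real workload.

```python
def amdahl_speedup(parallel_fraction, workers):
    """Upper bound on speedup when only part of a job can run in parallel."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)

# Hypothetical workloads: mostly serial, mostly parallel, almost fully parallel.
for fraction in (0.50, 0.95, 0.999):
    speedup = amdahl_speedup(fraction, workers=1024)
    print(f"parallel fraction {fraction}: at most {speedup:.1f}x on 1024 workers")
```

A job that is only half parallel can never quite reach 2x no matter how many workers are added, while a job that is almost entirely parallel keeps benefiting from more of them. That is the gap between typical CPU and GPU workloads described above.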
There are design-related gains we know could be achieved without any new materials. In particular, clockless processors could be a huge jump forward. But designing a clockless processor is extremely difficult. A lot of the methods, tools, and engineering that have been developed over the last 50+ years to allow a relatively small team of people to manage the complexity of billions of transistors simply don't apply any more when you're dealing with a clockless processor.
The available wavelengths of light in the EM spectrum have not changed since the dawn of Moore's Law. Only our ability to use them for lithography has changed. It's engineering, knowledge and skill, NOT PHYSICS. What you are referring to as "physics" is simply "standing on the shoulders of giants". There is NO principle of physics that limits us from jumping multiple process nodes ahead, EXCEPT for the fact that we seem only able to do incremental development. You're still missing the point of the OP's question.
Nitpicking at its finest again ... and pretty dumb. Do I repeat myself? Strange, I have a feeling of déjà vu ...
Lithography only works if the chemicals involved can "react" to your wavelength. That is Physics. Facepalm.
We are not able to "jump ahead" "nodes" ... no idea why everyone today says "node"; a chip manufacturing process is not called a "node".
If mankind was able to "jump ahead" we would do that, facepalm.
You're still missing the point of the OP's question.
He had no point. He was an idiot. Thinking you can just tell someone: "I need a warp drive. Better tomorrow than next year."
Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.
Then there must be some serious miscommunication here since it was looking like you'd missed something that we used to tell all of the first year engineering students in introductory materials science.
I meant the size at which the flaw matters (becomes critical) and not an actual technical term like "critical crack size" which is not the same and is about fracture. What the paper you linked does not go into is that there will be defects in the zone refined ingot of a wide range of sizes but below a certain size they just do not matter. That certain size is going to change if all of your traces are much smaller, which is beyond the scope of that paper because it's not about changing processes.
Is that a way of saying "at the same process size"? I have been writing about how smaller flaws matter when going from, say, 32 nm to 14 nm.
If so, sorry about the confusion.
If you turned the monitor on first, it was ready by the time your hand found the power switch on the C64.
Why should storage take time to "become ready" - it's not like we still need spinning rust for home systems.
Not sure why "having internet" would be slow - you have your IP address and hostname, what more do you need?
Can we not check hardware for malfunctions from time to time in an active system these days?
We do all this the slow way because we're accustomed to slow. There's no inherent reason for any of it to be slow.
Socialism: a lie told by totalitarians and believed by fools.
I take it you aren't old enough to remember Lotus 123?
I'm young enough to *have been conceived* more or less at that time. (And anyway, my parents were living on the other side of the Iron Curtain back then).
OTOH when Lotus 123 hit? ZOMFG suddenly EVERYBODY had to have Lotus 123, from the smallest business to the biggest corp you weren't shit if you weren't using 123.
Though I can't remember a period I didn't live through, I do have some general culture.
Yup, I know that Lotus was one of the first massively successful spreadsheet programs.
It owes that to a few key things, among them the fact that everybody can see the use of making some calculations (be it for business use as you mention, or any other use. Lab technicians love to have their protocols in the form of spreadsheets that can magically auto-adapt given a few key parameters), and that it was relatively simple to use (as opposed to writing your own data manipulation script, e.g. using the BASIC embedded in ROM) so that even non-technical people could use it.
And it was very well marketed.
God that makes me feel ancient, but the reason your AI wouldn't do it is because it will always be a niche application as there is a very limited subset of people that are gonna actually WANT an AI on their PC, most will think its creepy or weird and say "do not want".
First, I'm not saying that AI as it is deployed NOW is a killer app.
I'm only saying that it has the potential to soon evolve into a must-have technology because it helps automate tons of tasks. (I used the word "candidate").
It's a bit like how, for some time, apps were just considered a "fad". But now some of them are slowly becoming "must haves" and are driving hardware sales (you need a good enough phone to be able to use said app. Your old EPOC-powered Ericsson or Palm OS-powered PDA won't cut it, etc.) And a platform that isn't part of any large app ecosystem is probably going to die.
(case in point: Microsoft's Windows for Phone only runs Microsoft apps and never managed to break into the current Google/Apple duopoly)
(counter example: Jolla's Sailfish OS. It's a fairly small effort, barely known outside geek circles. Besides being a fairly standard GNU/Linux distro for smartphones, it also features various solutions to run Android apps, so its users aren't left aside. As such it has still managed quite a few successes at the level of manufacturers (Intex - the Indian electronic devices manufacturer, not the US swimming pool manufacturer) or even states (Russia and China have shown interest in the platform). That's quite an achievement for such a small team).
My opinion is that even if currently it's only tech demos that creep the shit out of end users (Do you want me to automatically tag all your friends in all your facebook photos ? Do you want me to pre-write draft of your most likely answer on Google Allo ?) the technology has overall good potential to be useful for end users (e.g.: better voice recognition and thus better dictation than the statistical methods used up until now).
The best part is that this is a much more autonomous learning technology, relying less on smart algorithms to process the data and more on the neural net simply "learning the data" and how to make sense of it (like a very young child who mostly "learns" the world around them simply by experiencing it).
Just like VisiCalc/Lotus 123/etc. made data processing much more accessible to people without programming skills, maybe AI could make big data analysis much more accessible to people who aren't necessarily good at developing subtle data processing algorithms and/or complex statistical methods.
Depending on how the technology is used maybe within a decade AI *will* be a must have that you can't do without.
And if that becomes the case, you'll eventually need good hardware to be able to run the deep neural nets locally.
But indeed, nobody wants the current AI that automatically recognises all your friends puking on your Facebook photos.
"Sufficiently advanced satire is indistinguishable from reality." - [Tips: 1DrYakQDKCQ6y52z6QbnkxHXAocMZJE61o ]
Companies that do that get their lunch eaten by their competitors
https://hardware.slashdot.org/story/17/03/06/1420243/ibm-will-sell-50-qubit-universal-quantum-computer-in-the-next-few-years
It's coming. It will change everything*
* where 'everything' is probably the wrong word. Maybe 'lots' is better.
Have you tried booting Windows 3.1 off an SSD? I can bet it's faster.
"It is no measure of health to be well adjusted to a profoundly sick society." - Jiddu Krishnamurti
Exactly! Same here. Chromebook is just no trouble and a smooth ride. Privacy issues aside.
"It is no measure of health to be well adjusted to a profoundly sick society." - Jiddu Krishnamurti
Except my link already said just that:
"Starting in 2014, Intel introduced "Refresh" cycles after a tock in form of a smaller update to the microarchitecture. It is said that this is done because of the expanding times to the next tick... In March 2016 in a Form 10-K report, Intel announced that it had deprecated the Tick-Tock cycle in favor of a three-step "process-architecture-optimization" model..."
Did you even read it?