The Legacy of CPU Features Since the 1980s
jones_supa writes: David Albert asked the following question:
"My mental model of CPUs is stuck in the 1980s: basically boxes that do arithmetic, logic, bit twiddling and shifting, and loading and storing things in memory. I'm vaguely aware of various newer developments like vector instructions (SIMD) and the idea that newer CPUs have support for virtualization (though I have no idea what that means in practice). What cool developments have I been missing?"
An article by Dan Luu answers this question and provides a good overview of various cool tricks modern CPUs can perform. The slightly older presentation Compiler++ by Jim Radigan also gives some insight on how C++ translates to modern instruction sets.
The first large-scale availability of virtualisation was with the IBM 370 series, dating from June 30, 1970, but it had been available on some other machines in the 1960s.
So the idea that "newer machines have support for virtualisation" is a bit old.
Watch this Heartland Institute video
Also served as a space heater.
He wrote, "introduced to x86 since the early 80s include paging / virtual memory, pipelining, and floating point." We know that some platforms had some of these features earlier than x86, but he was speaking to those who had been programming on the x86 platform. Of course, this ignores the x87 math coprocessor, but I digress.
Gamingmuseum.com: Give your 3D accelerator a rest.
Did you read the bit at the *top* of the article:
"Everything below refers to x86 and linux, unless otherwise indicated. History has a tendency to repeat itself, and a lot of things that were new to x86 were old hat to supercomputing and HPC."
The latest generation of CPUs have instructions to support transactional memory.
Near future CPUs will have a SIMD instruction set taken right out of GPUs where you can conditionally execute without branching.
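For anyone wondering what "conditionally execute without branching" looks like, here is a rough scalar sketch in C. It is my own illustration, not something from the article, and `select_u32` and `clamp_negative_to_zero` are made-up names. The idea is to compute both outcomes and pick one with a bit mask; SIMD mask registers apply the same trick to every lane at once. Whether the compiler actually emits branch-free machine code for this is up to the compiler.

```c
#include <stdint.h>

/* Hypothetical helper: returns a if cond is nonzero, otherwise b.
   The choice is made with a mask instead of an if/else. */
uint32_t select_u32(int cond, uint32_t a, uint32_t b) {
    /* mask is all ones when cond is true, all zeros when it is false */
    uint32_t mask = (uint32_t)0 - (uint32_t)(cond != 0);
    return (a & mask) | (b & ~mask);
}

/* Hypothetical example use: replace negative elements with zero,
   with no branch inside the loop body. */
void clamp_negative_to_zero(int32_t *v, int n) {
    for (int i = 0; i < n; i++) {
        v[i] = (int32_t)select_u32(v[i] >= 0, (uint32_t)v[i], 0u);
    }
}
```

A vector unit with predication would evaluate the comparison for all lanes, keep the result as a mask, and apply the masked operation in one go.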
You are overreacting.
There must be computers with registers far wider than 64 bits today. The article simply stated the most common capabilities.
The IBM 360/370 line and its successors have had decimal arithmetic (in addition to binary and after the 370/158 floating point) since the 1960/70s. Others have had these also.
Actually there are also random number generators and all kinds of controllers (memory controller etc.) in many CPUs nowadays. But yes, mostly still binary math.
There was a period in the '00s when PC processors were good for cooking eggs. You had to be careful with the AMD ones though, they had a tendency to burn the egg quickly.
Then it's settled. Your edge case must apply to everybody. The article is wrong.
I'm not a nerd. Nerds are smart.
I remember when VMWare first came out, and there was all this amazement about all the cool things you could do with Virtual Machines. Very little mention anywhere that these were things you could do for decades already on mainframes.
Same thing with I/O offloading (compared to mainframes, x86 and UNIX I/O offload is still quite primitive and rudimentary), DB-based filesystems (MS has been trying to ship one of those for over 20 years now; IBM has been successfully selling one, the AS/400 / iSeries, for 25), built-in encryption features, and a host of other features.
Current 64-bit path/register CPU architecture will satisfy most computing requirements for some time to come. The only real reason to increase data path width is to address more data. Until we have need to address 16 exabytes, 64 bit will remain in favour everywhere because $$$.
I am staying away from your lawn, that's for sure. If my frisbee lands over there, you can keep it; you've earned it.
The whole evolution of multi-level caches has gotten a bit crazy. If Intel can put 2 billion transistors in a processor, how about instead of piling more cores in, just do 1 or 2 cores and a massive L1 SRAM cache instead? Memory latency is more of a bottleneck than anything just now.
We just had a story about low-level improvements to the BSD kernel, and now we get an article about chip-level features and how compilers use them?
Is this some sort of pre-April-Fools /. in 2000 joke? Where are my Slashvertisements for gadgets I'll never hear about again? My uninformed blog posts declaring this the "year of Functional Declarative Inverted Programming in YAFadL"? Where the hell are my 3000-word /. editor opinions on the latest movie?
If this keeps up, this site might start soaking up some of my time instead of simply being a place I check due to old habits.
Last post!
I have had arguments over this. People in various fora have asked what programming languages they should learn. I always put assembly in my list. But is it really important enough to learn these days? Is hardware still relevant?
putting the 'B' in LGBTQ+
What does that even mean? Pipeline-able executions? How much data? There is no context.
I think we can safely assume "yes" and "one machine word" here
[rants how crude the author's understanding of the matter is without giving a grain of indication that he's got a better understanding]
Speaking of people who sound like they're in junior high.
CLI paste? paste.pr0.tips!
If you want to see more Slashdot-in-2000 style posts, and you have access to the sort of articles that Slashdot-in-2000 might have posted, Slashdot welcomes your submissions. You could even become a "frequent contributor".
If you want to get the most out of an 8-bit microcontroller, you'll need assembly language. Until recently, MCU programming wasn't easily accessible to the general public, but Arduino kits changed this.
For example, I worked for a decade in the Linux kernel and low-level userspace. Assembly definitely needed. I tracked down and fixed a bug in the glibc locking code, and you'd better believe assembly was required for that one. During that time I dealt with assembly for ARM, MIPS, PowerPC, and x86, with both 32 and 64-bit flavours of most of those. But even there most of the time you're working in C, with as little as possible in assembly.
If you're working in the kernel or in really high-performance code then assembly can be useful. If you're working with experimental languages/compilers where the compilers might be flaky, then assembly can be useful. If you're working in Java/PHP/Python/Ruby/C# etc. then assembly is probably not all that useful.
The author states right at the beginning of the article that he's focusing on x86.
That's what makes the article absurd. The original question, to which the article is a response was:
So the moron then proceeds to blather on about Intel x86, quoting as "cool developments" stuff that has been around for forty odd years.
You are a retarded idiot. The author states right at the beginning of the article that he's focusing on x86. In the (late) 80s, most people had an IBM PC, if they had anything.
"Most", maybe. But the late 1980s were the heyday of the Macintosh, Amiga and Atari ST.
Come to think of it, I'm not even sure of "most" outside the business world. The Commodore 64 and Apple computers fit in there somewhere.
"His name was James Damore."
I haven't seen the article or video. But for 99% of developers, I'd say the only CPU-level changes since the 8086 that matter are caches, support for threading and SIMD, and the rise of external GPUs.
Out-of-order scheduling, branch prediction, VM infrastructure like TLBs, and even multiple cores don't alter the programmer's API significantly. (To the developer/compiler, multicore primitives appear no different than a threading library. The CPU still guarantees microinstruction execution order.)
Some of the compiler optimization switches have become more complex, and perhaps a few coding idioms are now deprecated/encouraged so that compilers better understand what you intend (so you don't make their job unnecessarily harder).
But overall, almost all developer techniques don't benefit from changes to CPU microarchitecture after 1990, aside from caches, SIMD, and GPUs.
And of course, ever since the 80486 (1989), all CPUs support floating point instructions.
I'm not a CPU expert so feel free to take my opinions below with a grain of salt... (grin)
The biggest change to processors in general is the increased use and power of desktop GPUs to offload processing-intense math operations. The parallel processing power of GPUs outstrips today's CPUs. I'm sure that we will be seeing desktop CPUs with increased GPU like parallel processing capabilities in the future.
http://en.wikipedia.org/wiki/G...
http://www.pcworld.com/article...
The GP said decimal arithmetic. Those of us that know about processors and electronics - including the GP, but apparently not you - know *exactly* what he meant by that.
Hint: go look up the instructions that deal with "binary coded decimal" for x86 or 680x0.
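To give a rough idea of what that kind of arithmetic involves, here is a C sketch of adding two packed BCD bytes (two decimal digits per byte). It is my own illustration and does not reproduce the exact semantics or flag behaviour of any particular instruction; `bcd_add` is a made-up name, and it assumes both inputs are already valid BCD.

```c
#include <stdint.h>

/* Hypothetical helper: add two packed-BCD bytes, e.g. 0x38 means decimal 38.
   Returns up to three BCD digits; bit 8 is the decimal carry out. */
unsigned bcd_add(uint8_t a, uint8_t b) {
    unsigned lo = (a & 0x0Fu) + (b & 0x0Fu);  /* ones digits */
    unsigned hi = (a >> 4) + (b >> 4);        /* tens digits */

    /* Decimal adjust: a digit above 9 wraps and carries into the next one. */
    if (lo > 9) { lo -= 10; hi += 1; }

    unsigned carry = 0;
    if (hi > 9) { hi -= 10; carry = 1; }

    return (carry << 8) | (hi << 4) | lo;
}
```

So `bcd_add(0x38, 0x45)` gives `0x83`, which reads as decimal 83, where a plain binary add of the same bytes would give `0x7D`.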
I'm not sure SIMD really falls outside of a "1980s" model of a CPU. Maybe if your model means Z80/6502/6809/68K/80[1234]86/etc, rather than including things like Cray that at least some students and engineers during the 80s would have been exposed to.
von Neumann execution, but Harvard cache became commonplace in the 1990s. Most people didn't need to know much about the Harvard-ness unless they needed to do micro-optimization.
Things don't get radically different until you start thinking about Dataflow architecture and Transactional memory. I'm not sure if Dataflow will ever come back, but transactional memory is here and pops up from time to time and I think it will get big pretty soon as it moves beyond being some small obscure part in a processor core.
(Transport Triggered Architecture is equivalent to von Neumann for software when abstracted out in a macro assembler, so I don't count TTA as something new, plus it was pretty uncommon)
“Common sense is not so common.” — Voltaire
while (hitCount < arraySize) {
    i = rand() % arraySize;
    if (hitArray[i] == 0) {
        sum += array[i];
        array[i] = 0;
        hitArray[i] = 1;
        hitCount++;
    }
}
https://www.eff.org/https-everywhere
And of course, ever since the 80486 (1989), all CPUs support floating point instructions.
486 SX chips had the FPU disabled or absent. So not all CPUs (or even all 80486 CPUs). As far as I'm aware, the Pentium (586) did not have a model without FPU support (although in the MMX models, you couldn't use MMX and the FPU at the same time).
He effected a bored affect.
I used to sort of understand how a computer works. Not anymore. It's just magic.
Guy who doesn't understand how CPUs work amazed about how CPUs work. /thread
It must have been something you assimilated. . . .
Can someone explain why in the example race condition code given, the theoretical minimum count is 2, and not n_iters?
A lifetime of stereotyping will do that to you.
Also note that rather recently Intel drastically dropped the accuracy of their FPUs in order to make the performance numbers look better... don't expect 80-bit precision even when explicitly using the x87 instructions now. It's now been documented that this is the case, but for a few years Intel got away without publicly acknowledging the large drop in accuracy.
if you understand scalar assembly, understanding the basic "how" of vector/SIMD programming is conceptually similar
Actually, if you think back to pre-32-bit x86 assembler, where the X registers (AX, BX) were actually addressable as half-registers (AH and AL were the high and low sections of AX), you already understand, to some extent, SIMD.
SIMD just generalizes the idea that a register is very big (e.g. 512 bits), and the same operation is done in parallel to subregions of the register.
So, for instance, if you have a 512 bit vector register and you want to treat it like 64 separate 8 bit values, you could write code like the following:
C = A + B
If C, A, and B are all 512 bit registers, partitioned into 64 8 bit values, logically, the above vector/SIMD code does the below scalar code:
for (i = 0; i < 64; i++) {
    c[i] = a[i] + b[i];
}
If the particular processor you are executing on has 64 parallel 8-bit adders, then the vector code
C = A + B
can run as one internal operation, utilizing all 64 adder units in parallel.
That's much better than the scalar version above, a loop that executes 64 passes.
A vector machine could actually be implemented with only 32 adders, and could take 2 internal ops to implement a 64 element vector add... that's still a 32x speedup compared to the scalar, looping version.
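If you want to play with the idea without any vector hardware, here is a small C sketch that fakes it on an ordinary 64-bit register: eight 8-bit lanes packed into one `uint64_t`, added with a few plain integer operations (the trick is sometimes called SWAR, "SIMD within a register"). This is only an illustration under that assumption, it is not how a real 512-bit unit is programmed, and the function names are made up.

```c
#include <stdint.h>

#define LANES 8  /* eight 8-bit lanes packed into one 64-bit word */

/* Scalar reference: one add per loop pass, each result wraps modulo 256. */
void add_scalar(const uint8_t *a, const uint8_t *b, uint8_t *c) {
    for (int i = 0; i < LANES; i++) {
        c[i] = (uint8_t)(a[i] + b[i]);
    }
}

/* "Vector" version: all eight lanes are added by a handful of 64-bit ops. */
uint64_t add_packed(uint64_t a, uint64_t b) {
    const uint64_t HIGH = 0x8080808080808080ULL;  /* top bit of every lane */

    /* Add the low 7 bits of each lane; the sum fits in 8 bits,
       so no carry can leak into the neighbouring lane. */
    uint64_t sum7 = (a & ~HIGH) + (b & ~HIGH);

    /* Fold the top bits back in with XOR, which adds them while
       discarding the carry out of each lane. */
    return sum7 ^ ((a ^ b) & HIGH);
}
```

Real SIMD hardware does not need the masking, because the lanes have separate adders with no carry chain between them, but the effect is the same: one instruction stream, many small additions at once.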
The Cray 1 was an amazing machine. It ran at 80 MHz in 1976.
http://en.wikipedia.org/wiki/C...
According to WP, the only patent on the Cray 1 was for its cooling system...
My opinions are my own, and do not necessarily represent those of my employer.
True. The one article linked is very specifically x86-oriented (all hail to the monoculture). There really are far, far too many people out there still who act as if microcomputers were the beginning of computer history.
There is some interesting stuff. But it mostly boils down to ways to optimize code. The older chips may have had the idea for something but didn't implement it due to the enormous cost. Sometimes it's handy to have just a couple of instructions to help out rather than add a giant feature, as in having no floating point or multiplication (early RISC machines) but having an instruction to find the first or last bit set, which makes the software library that does this much faster.
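As a rough illustration of why a single "find first set bit" instruction helps, here is a C sketch comparing a plain software loop with a compiler builtin. The function names are made up; `__builtin_ctz` is a GCC/Clang builtin (undefined for an input of zero, hence the guard) that those compilers typically lower to a single instruction on hardware that has one.

```c
#include <stdint.h>

/* Software fallback: 0-based index of the lowest set bit, or -1 if x is 0.
   Takes up to 31 shift-and-test passes. */
int find_first_set_loop(uint32_t x) {
    if (x == 0) return -1;
    int i = 0;
    while ((x & 1u) == 0) {
        x >>= 1;
        i++;
    }
    return i;
}

/* Same result, but lets the compiler use a dedicated instruction
   where one is available. Falls back to the loop elsewhere. */
int find_first_set_fast(uint32_t x) {
#if defined(__GNUC__) || defined(__clang__)
    return x ? __builtin_ctz(x) : -1;
#else
    return find_first_set_loop(x);
#endif
}
```

Software floating point and multiply/divide routines spend a lot of time normalising values, which is exactly this kind of bit hunt, so replacing the loop with one instruction adds up.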
There are instructions to help out cryptography, which I don't think any computer in the 60s was concerned enough about to devote expensive hardware to it. Instructions to support atomic operations even within a multiprocessor environment are present in many modern CPUs too, whereas in the past if there were multiprocessors there would usually be some roundabout way to do this. As processors got more complex with out of order execution and delayed writes, there was a need for instructions to synchronize operations, such as the "EIEIO" instruction on the PowerPC. Possibly some of this was present on early supercomputers but today these are present in mainstream processors.
It's not that we're all crotchety though. But these articles are like going to a history class where you're taught that everything before 1960 isn't relevant, so just assume that JFK was the first president. There's a monoculture out there with the PC, but it's not representative of the state of the art, in the past or the present. It's not the most common chip, it's not the best designed chip, it's not a good chip for learning architecture with, there's nothing much to recommend it for except that it's compatible with Windows and Mac OS and the volume shipped makes it relatively inexpensive.
If you don't learn history you're doomed to repeat it.
You can have memory bus interfaces wider than 64 bits. This has nothing to do with word size or address space size, but with the fact that reading and writing more bits at once is a big speed increase. E.g., DIMM memory with a 64-bit interface (more if you count ECC) was common long before there were 64-bit PCs.
There are also floating point representations in common use with 80 bits, and a 128 bit format is in use but less common.
8 bit was not the most popular back then. And 64 bit is not the most popular today. This may be true back then if you consider only the newly created microcomputer segment, and it may be true today if you consider only the PC & Mac segment.
When 8 bit CPUs were new and thus few in number we had lots of computers already with 16 bits and more. The 8 bit CPUs were primarily used by hobbyists at the time, or as support for larger computers.
Today the x86-64 is not the most common chip, because the majority of chips out there are on embedded systems rather than PCs and Macs. Count up all those phones which probably have more than one CPU on them, the CPU controlling your fuel system in the car, the CPU in your microwave, television, router, etc. The most common word widths for embedded systems are 16 and 32 bits, and I think even 8 bit embedded systems outnumber the 64 bit ones.
The article does not feel like it's been written by a junior high school student, I agree. But it does feel like it's written by an undergraduate student in the middle of the curve.
Moderation is our form of peer review.
>> If you don't learn history you're doomed to repeat it.
“What we learn from history is that we do not learn from history”
Benjamin Disraeli
I have plenty of common sense, I just choose to ignore it. -- Calvin
What about that part of the question?
The things I’m most interested in are things that programmers have to manually take advantage of (or programming environments have to be redesigned to take advantage of) in order to use and as a result might not be using yet. I think this excludes things like Hyper-threading/SMT, but I’m not honestly sure.
That is what the article is really about and what it is answering.
The answers seem comprehensive and useful, i.e. you can switch page size to reduce pressure on the TLBs if your application benefits from it, you can ignore the branch penalties due to the hardware being so efficient at running branches.
Your old rat's nest mainframe did not run at over 1GHz with a million or however many transistors dedicated to OoOE and branches you dumbfuck.
Your old rat's nest mainframe did not run at over 1GHz with a million or however many transistors dedicated to OoOE and branches you dumbfuck.
The CDC 6600, around 1964, had partial out of order execution. The IBM 360/91, 1966, had out of order execution.
Twat.
You might like the second sentence of the article:
things that were new to x86 were old hat to supercomputing, mainframe, and workstation folks.
Just curious, did you read one sentence of the article before commenting on it?
At least he didn't stop at "asked" and immediately fired off a diatribe against Ask Slashdot.
Of course news about a fake are Fake News.
Given that we had 80-bit extended precision FPU registers in 8087 chips 30 years ago, I don't think the 64-bit path/register assertion holds any water. I have lots of code that uses 128 bit registers and runs on pretty boring consumer CPUs. The reason to increase data path width is not to address more data, but to increase the throughput. I use 128 bit registers with code that uses no virtual memory and runs with a couple MBytes of RAM.
A successful API design takes a mixture of software design and pedagogy.
Moreover: which cycles? Core? FSB? Memory? Eh? Under what test conditions?
Strictly speaking, there's no reason to use binary other than that it makes some things a lot easier, while it makes other things a bit more difficult.
For example, during the early time of electronic engineering, the Russians/Soviets experimented with ternary computers, the "SETUN" while the USA had the "Ternac". Both had more complicated hardware than a binary computer, but were a lot more efficient at processing arithmetic instructions.
See: http://en.wikipedia.org/wiki/T...
And who knows, in a few decades, people might think binary to be quaint and outdated, given that qubits are so much, much more efficient.
So, what, just double or triple the voltage?
This comment is my opinion and does not represent an official position of Donald Trump or others I do not work for
... where you're taught that everything before 1960 isn't relevant, ...
where you're taught that everything before 2000 isn't relevant, ...
Fixed that for you. 8-)
Current 64-bit path/register CPU architecture will satisfy most computing requirements for some time to come. ...
"There is no reason that anyone would need more than 640KB."
"The maximum number of computers needed in the United States is five."
But yeah, 64bit is probably enough for a little while... 8-)
I am staying away from your lawn, that's for sure. If my frisbee lands over there, you can keep it; you've earned it.
Thank you. But I usually throw the Frisbees back, when I walk the dog. 8-)