PowerPC Assembly Language
Josh Aas writes: "I've been looking for a way to learn PowerPC assembly language for a while now. My search for books only led to extremely out-of-date publications, and the whole ordeal was generally frustrating. I was amazed at the lack of documentation. Even Motorola and IBM's documentation resources (on the web) were lacking anything of use to me. However, it turns out that Apple provides a pretty good free tutorial on the subject. It's tailored for coding in Mac OS X, but I imagine it would be just as useful in any PowerPC environment. For some reason it includes instructions for the Intel architecture. Perhaps this has to do with the fact that Darwin runs on x86 as well."
It's available here.
The real use of learning assembly is to validate your compiler output. It's rare to want to code much from the ground up in assembly anymore, save hardware-banging code, but knowing assembly still makes all the difference.
Yes, most compilers will perform 1,001 optimizations, but the final bit of optimization you can eke out demands that you profile, examine the hot spots, and replace them with assembly or restructure your C with a good idea of the code you really want it to generate.
You'll still find things that the compiler engineers haven't thought of, or were lazy about implementing, or operations which implement unnecessary logic/precision.
For example, sometimes it's best to hoist an expensive calculation -into- a loop because it fits where the processor otherwise stalls, but most compilers take the opposite approach. It's tough to force a C compiler to do this, because most want to hoist code -out- of loops when possible, assuming it's cheapest.
Dependency on non-native signedness, float format or type sizes can mean that shifts, transcendental math, type conversions, etc. are being done in software when a single opcode is what you expected.
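To make the signedness point concrete, here's a small C sketch (mine, not from any particular compiler's output). Unsigned division by 8 is a single shift, but signed division has to truncate toward zero, so the compiler emits a shift plus a fix-up (on PowerPC, typically srawi followed by addze). The function names are made up for illustration, and the hand-written version assumes a 32-bit int and an arithmetic right shift on negative values, which C leaves implementation-defined.

```c
#include <assert.h>

/* Unsigned: x / 8 is exactly a logical right shift by 3. */
unsigned div8_unsigned(unsigned x) { return x / 8u; }

/* Signed: C requires truncation toward zero, so the compiler
 * cannot emit a bare shift here. */
int div8_signed(int x) { return x / 8; }

/* A bare arithmetic shift rounds toward negative infinity instead,
 * which gives the wrong answer for negative inputs (-9 gives -2). */
int div8_bare_shift(int x) { return x >> 3; }

/* The fix-up spelled out by hand.
 * ASSUMPTIONS: 32-bit int, arithmetic >> on negative values. */
int div8_signed_by_hand(int x) {
    assert(sizeof(int) == 4);
    int bias = (x >> 31) & 7;   /* 7 if x is negative, 0 otherwise */
    return (x + bias) >> 3;     /* now matches x / 8 */
}
```

If the value can never be negative, declaring it unsigned lets the compiler drop the fix-up entirely, which is exactly the kind of thing you only notice by reading the generated code.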
Understanding the code will tell you if something unusual is happening, such as non-native sized bools or less than optimal variables being turned into register variables.
Some operations are simply not available to the C-only programmer. Subsequent lookups to sin/cos operations can be combined into a single opcode, but I've yet to see it done by any compiler. It's only available in assembly.
The PowerPC offers a rough inverse/division approximation which is faster than a real divide, and is good enough for operations only requiring low precision or an approximation... also completely inaccessible to C code.
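Here's a rough C sketch of how such an estimate usually gets used. The instruction I have in mind is presumably fres (floating reciprocal estimate single). Portable C can't issue it, so rough_recip() below is a made-up stand-in that fakes a low-precision result by throwing away mantissa bits; the eight good bits are my assumption, not a quote from the manual. The refinement step is ordinary Newton-Raphson, which roughly doubles the number of correct bits each time you apply it.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* HYPOTHETICAL stand-in for a hardware reciprocal estimate.
 * Computes 1/d, then keeps only the sign, the exponent and the top
 * 8 of the 23 mantissa bits. Assumes IEEE 754 single-precision float. */
float rough_recip(float d) {
    assert(sizeof(float) == sizeof(uint32_t));
    float r = 1.0f / d;
    uint32_t bits;
    memcpy(&bits, &r, sizeof bits);
    bits &= 0xFFFF8000u;            /* clear the low 15 mantissa bits */
    memcpy(&r, &bits, sizeof bits);
    return r;
}

/* One Newton-Raphson step for 1/d: x1 = x0 * (2 - d * x0). */
float refine_recip(float d, float x0) {
    return x0 * (2.0f - d * x0);
}

/* Approximate n / d as n * (refined estimate of 1/d). */
float approx_div(float n, float d) {
    return n * refine_recip(d, rough_recip(d));
}
```

If low precision is all you need, you'd skip refine_recip() and use the raw estimate, which is where the speed win over a real divide comes from.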
Knowing a variable is guaranteed to be within a certain range or of a fixed set of values at a certain point can let you get away with all kinds of murderous assumptions which it's impossible to express in C.
That said, I haven't seen a single PPC assembler reference that was half as good as just looking at the code. Look at code, look at a lot of code, and past that, just look at system implementors' documentation. x86 through Pentium III aside, most current assembler books are just fluff, are wrong about half the pipelining, omit a million useful optimizations and don't cover the real story at all. It's really gotten to the point where the only real way to learn is by doing and doing and doing.
I'm working on an embedded PowerPC 603e and was able to find some decent documentation on the 32-bit PowerPC instruction set in general from Motorola's home page. I know you said you checked there, but I had a difficult time tracking it down myself. So in case you missed it, the name of the document is PowerPC Microprocessor Family: The Programming Environments for 32-Bit Microprocessors. I ordered a printed copy from their literature center, and a week later I got a nice little green book that has already proven to be indispensable.
A completely different reason for wanting to know assembly language is not related to software. If you are a hardware design engineer, assembly language can be invaluable. I work for a company that makes logic analyzers, and the most popular embedded processor family is the PowerPC. Our logic analyzers have a built-in inverse assembler (IA). Many, many times hardware engineers will look at the IA output to determine what was going on on the bus. Granted, they don't care what C statements were being executed. They want to know, at the roots, what the processor was doing.
I think it's very unlikely nowadays that someone would write a computer application in assembly language. It would take way too long. But in the embedded world there are tons of reasons to know assembly. At the very least, everyone should know how to read it....
Technical Library
PowerPC FAQ
Lightsoft: Beginners Guide to PowerPC Assembly Language
March 95 - Balance of Power: Introducing PowerPC Assembly Language
PowerPc Architecture
PowerPC Library
The Metrowerks CodeWarrior documentation also has some helpful stuff. I found a copy online a while ago, but it's gone now.
I thought the Power PC was x86 compatible.
RISC instruction sets assume you don't mind having slightly larger code segments if it means that most of your instructions execute in one clock cycle and you can up your clock rate (this is why the Pentium family uses RISC cores). They also assume you don't want to pay in transistors (and power and heat) for rarely used instructions, and that you have compilers that are good enough that you don't pay too high a penalty for not having those scan opcodes, having to work with only a couple of addressing modes, etc.
VLIW takes this one step further. You don't mind increasing your code size because it means that your compiler has lots of time (comparatively) to figure out how to do multiple simultaneous issues in order to maximize your performance per transistor by explicitly putting each issue into the very long instruction. This assumes that you have very smart compilers. Unfortunately, it also means that your instructions are big. If clock multipliers continue to climb, you may see VLIW instruction sets paired with decompressors on the die in order to increase the apparent memory bandwidth.
Note that this progression in instruction sets represents a progressive moving of certain aspects of CPU complexity from the CPU core into the compiler. This allows the limited die real estate to be used for more ALUs, more cache, smarter branch prediction, "SMP on a chip", etc. while continuing to allow the price of CPUs to fall.
You can do on-chip emulation of the legacy instruction sets (like on the P4 and even the Itanium), but it's baggage you have to drag around even when running your native code. Software emulation of the legacy hardware seems more appropriate. If we all switched to some VLIW CPUs now, in another 3 years (being pessimistic), Bochs (or another emulator) running on those CPUs would run the legacy code as fast as today's machines, and would run native code much faster.