Linux Gains AltiVec Support
Anonymous Coward writes: "Terra Soft today [Note: Thursday] announced development support for AltiVec (a.k.a. "Velocity Engine"), saying that Black Lab Linux running on a PowerPC G4 may offer up to a '150-300% increase [in performance], with some Linux applications running in excess of 10 times (1,000%) their normal performance.'
The AltiVec-enabled Black Lab Linux offers the GCC compiler with support for the AltiVec C and C++ extensions, as well as Linux-kernel run-time support for AltiVec enabled applications."
I have seen a bunch of literature that changes the I to Independent from inexpensive. They like to charge more.
Customer: How much for drive?
Vendor: $4000
Customer: I thought you said they were inexpensive.
Vendor: No, I said independent.
Customer: Independent disks in an array. COOOOOL!!!
I'll take two.
only some Display PostS#t they call Quartz
Actually, it's Display PDF, not Display PS.. Mac OS X Server uses DPS, and last time I played with it, there was an X11 Server available for it.. Unfortunately, it was somewhat buggy. There may not be an X server shipped by Apple, but I doubt it will take long for someone to get one working..
Personally, I'd rather see Apple include a means of sending the PDF display back to a remote machine than fool around with X11. I'm not sure, but I'm guessing that the vector graphics could be streamed over a modem, or across a network, much easier than an xsession...
ACs don't have to worry about karma.
Where are the benchmarks?
Distributed.net recently (a couple months ago) released an altivec enabled version of their client for the MacOS. The result of including calls to the Altivec unit amounted to a 300% increase in keyrate. I didn't believe it until I threw both the non-altivec and altivec clients on one of the G4s we have kicking around..
Since that little demo, we've entered Absoft's beta testing program to get a copy of their Altivec enabled F90 compiler and are rewriting all our CFD codes to include the Altivec calls..
As for MMX, 3dNow, etc.. Who cares.. I can't use MMX calls on a PPC machine. I'm sure you'd see an improvement in performance by properly using the SIMD units on Intel and AMD processors, but I have no idea if you'll see the same sort of performance jump..
Depends on what you are doing... Sun has the advantage that you can get more hardware... i.e. more processors, more ram, yada yada yada. You can build a bigger system. But for smaller systems, Sun doesn't really stack up. We recently did some tests with nbench and several in house apps, and the performance of RedHat 6.1 on a PIII 667 absolutely killed a dual processor 440 MHz Ultra10, which is quite nice especially when you consider that the PIII cost several grand less.
Perhaps he did not want to login, jackass. RIAA is hardly a new abbreviation (RIAA is not an acronym, an acronym is a word formed with the abbreviations).
Did I mention that you are a jackass?
Summer grits, make me feel fine!
>>Trust me. I could have done better, especially with a troll like you.
>>So hows your mom?
Is that an example of how you "could have done better"? You are such an ass-fuck! Come on, let's hear you rip me with one of those classic mom jokes, you little 13 year shit sack. Did I mention that you are a jackass? Summer grits, make me feel fine!
>>On your left, we see a gathering of trolls, who have sunk to the level of pointing out verbal faux-pas.
>>Although, i'll concede one thing -- the "Summer grits, make me feel fine!" made me laugh.
You have sunk to the level of a jackass!
YEEEHAH!
Sun has a well thought out directory structure, while a distro like Red Hat often makes sys admins pull their hair out with the default placement of files in directories.
The problem isn't the processor in particular.. actually it's the other hardware... Apple keeps the rest of it secret... and Be does not want to design their own hardware (never did... they built the bebox for development purposes only)... The only reason linux runs is a bunch of really dedicated linux hackers... Gotta have a lot of respect for them...
Motorola fails to benchmark (publicly displaying them that is) any of their latest CPUs (PPC 7400) but their older ones are benchmarked (like the 604e). As for the IBM RS6000 series, the processor is based on the Motorola 604e series (which is 4 years old?) - chosen because the 604s have better fpu/MHz performance over the ppc 750s+ (not sure about the ppc 7400). I think Motorola wants to give all of their processor specs after that processor is outdated... anyhow, I don't think anyone really knows the "power" a ppc 7400 can execute (except for Motorola)... the RS6000 series doesn't mean anything to it...
That rules! Too bad I'm out of moderator points :(. That doesn't really look like aalib. Hmm. Either way, it rocks.
Had enough of what? Have you started to do something? Let me know when you are ready (I would imagine that will be after you remove your head from your ass). Mom jokes are the lowest form of humor. Did I mention that you are a jackass? Yeehah, grits down the front of my trousers!
Oops, the Altivec unit does require a small kernel level thinger. Sorry, my bad.
on an absolutely fucking hilarious flamewar. Thank you.
First of all, why is AltiVec considered to be 'proprietary'?
Secondly, how is using the entire instruction set considered bloat? Should we now start only using half of the instruction set of any given chip in order to combat 'bloat'?
And thirdly, I believe the majority of this work (if not all of it) went into gcc, not Linux.
Hey, I like this thread. Really. Please go on.
Hmm yes this is unfortunate. One can only hope that this is just an initial step, and someone will then work on incorporating this into something that's actually useful. It is a nice first step, though, you must admit.
Christ if it wasn't for stupid people, my friends with marketing degrees would all have to flip burgers for a living.
Willamette may have a full-fledged Vector unit in it -- there have been rumors.
Does this mean that Absoft's LinuxPPC Fortran compiler is AltiVec enhanced, or is it just their MacOS version? (there is no AltiVec symbol on the Linux PPC page on their web site and the info does not make it clear)
Number one is that all of the instructions and registers are always available to an application. Which means that there are no special restrictions on their use.
Second the FPU is still useful when using the AltiVec instructions.
Most important though is the lack of restrictions, so to say that the VECTOR unit is only for Vector Processing is simply wrong. While the AltiVec unit is at this time an extremely good value for what used to be DSP work, the usage of those instructions and registers is up to the programmer and the operating system.
Dave
A marketing claim coming from Apple is automatically suspect.
The same claim from a Linux distributor is automatically credible.
Both claims should be handled with the exact same level of suspicion, i.e. a lot !
Note: I find the 7400 to be a very good processor, very competitive wrt. 1GHz PIII & Athlons. However I can spot marketing exaggerations from all sides. I tend to tolerate Apple marketing lies, as they suffered from the competition FUD for too long. It's nice to see them fighting back, as long as I have enough technical background to spot the truth.
POAG, that was weak.
What is a POAG? Penis on a goat? Now that is god damned funny!
Any POAG is an excellent example of an acronym!
Yeehah! Grits to my left and grits to my right!
And fuck karma!!!
and on your right, we have POAG making momma jokes. real cool, Propaganda boy
PENIS ON A GOAT! LOL!! fucken POAG...
It didn't work.
WTF do you expect from a press release? Go to www.altivec.org and read the papers...
You can use many 3Dfx cards and Voodoo2 cards in a Mac using Griffin's NE3D adapter and drivers
Why
AltiVec support in GCC
AltiVec enabled GCC patches
As previously posted, an elegant way to do this is to use C++, not C, and parameterize STL's std::vector for vector when libstdc++ knows that target==G4 or has_altivec_unit. Configure hackery could be done to compile this into the standard library for these targets. The dudes doing libstdc++ aren't stupid: I'm sure they've thought of doing this.
heres one!
Cool, I'll see you there then.
Well, this isn't strictly true. An application which relies on AltiVec enhanced system services or dynamic libraries will get a performance boost without necessarily needing to be updated itself.
Where's the OS with these "altivec enhanced" system services? Heck, they just finally got rid of 68k emulation in 8.6.
well... right now there is a parts shortage stopping them...
Just curious, and maybe I'm missing something- BeOS can't run on G3/G4 based Power Macs because Apple would not release the specs needed for Be to compile their code so it could run on a G3/G4. How is it that Linux can do this? Did Apple release the specs, or is it reverse engineering?
A company making PPC motherboards will need dough and insight to stay in the black...IBM has yet to sell 7400 computers you know.
How would Linux need to mature? Why would you prefer FreeBSD specifically, have you installed Linux and NetBSD? Do you use FreeBSD? What amount of time did you use each? FreeBSD is an advanced BSD UNIX operating system for "PC-compatible" computers
Get Linux/PPC. If you must use BSD, why not NetBSD?! NetBSD has better source organization. NetBSD/powerpc, NetBSD/macppc, NetBSD/prep and NetBSD/ofppc are in the tree.
DANGER! DANGER! running sleep more than 162 seconds on Altivec can cause CPU to smoke.
chicken and fucking egg.
how the hell do you always manage to completely miss the point and post completely mindlessly more than anyone else except for maybe WillAffleck? just a question.
Later on in the thread, you call someone a troll. Yet here you are saying "i'm absolutely sure that nobody knows what the hell an 'AviTec enabled' application is, or cares". Or, put another way "AltiVec? Never heard of it, but I don't care and neither do you! It's clearly stupid, anyway!" Forgive me, Pot. You seem to have mistaken me for a kettle of some sort.
AltiVec is the vector execution unit that is included with all currently-shipping G4 processors. Unlike the FPU or IU, AltiVec is designed to do very few things, but do them very well. Intel has taken a stab at this with MMX and KNI, grafting the illusion of a vector unit over the existing MMX unit. Short form: it makes things like Photoshop filters, MP3 playing/encoding, compression, encryption, rendering, etc run INSANELY fast. Ever see a distributed.net client run on a G4?
I know you don't care, but I hope this explanation has done a bit to explain things. It's important, considering that every new processor being released seems to have something similar being included. E
Because Stile is a fucking cuntwad.
And so is Sharkey, and wrongforum, and Solo, and the whole rest of that fucking bunch.
you need to calm down jackass.
the facts are in front of your face and everyone elses...
linux outperforms sun by over 5000% in single cpu operations, it outperforms sun by about 1800% in SMP operations. the same goes for IRIX/AIX/etc.
that it performs 1000% better on a G4 is no surprise.
Linux is the internet innovator, get on the train or get run over.
whats up POAG?
Sun cannot compete with linux/intel servers, they try to make up for incompetence and poor performance with marketing and FUD.
linux/intel is a more scalable architecture, it is more secure and linux is powering the enterprise datacenters where sun is failing.
look for sun to be out of business in 6 months to a year as linux kills it.
This is well and good, but this sort of thing is eventually going to bring about the death of Linux. MMX, 3DNow, SSE, and now AltiVec? How many more proprietary extensions will we have to support?
Let's just face facts here: the Linux kernel can only get so big. There's only so much bloat a software project can stand before it just collapses in on itself. And here we are adding all sorts of little niggling different unnecessary bits to a kernel source tree that's already too big for its own good.
Is there any way that the gains from this sort of mentality offset the damage that it does to a project? I think not. For the sake of argument, let's take the Slashdotter's favorite bad example, Windows. It tries to do everything, and we all know that it fails miserably at most of it for precisely that reason. There are at least two mutually incompatible ways to do everything: Windows Media Architecture and DirectSound, OpenGL and Direct3D, and, for crying out loud, FAT and FAT32.
We have to fight this kind of creeping proprietary featurism in Linux lest it go the same way as Microsoft, and I think not including this new kernel support for AltiVec would be the place to start.
>>actually, the altivec vector units can do quite a bit. 160 instructions, iirc.
Is this in response to what I said about AltiVec being able to do a few things very well? I don't follow what you're saying...having 160 instructions doesn't really mean much if those are all aimed at special purpose things. If you're drawing a correlation between instruction count and usefulness, you're on the wrong page. Those 160 are in the _Vector_ unit. Necessarily, they are used for _Vector_ operations. Since there are a limited (but highly important) number of these applications, the actual instruction count doesn't tell the whole story. Look at Cray computers. Hmm...vector computers...lots of instructions..weren't exactly general purpose were they? And this is not just because of price restrictions: they really were NOT general purpose machines. They did a few things, but they did them VERY well. Just like "AviTec", or whatever the hell the thread calls it.
I laugh. I laugh hard. You honestly rate all of these things as being equal? What on _earth_ is wrong with you? Do you not read the (very informative) posts that have been placed so far? Do you not keep up with microprocessor technology? What's that you say? You don't? Then why post uninformed trash? *mutters to self*
+5 insightful and wrong. Intel's FP SIMD extensions have their own registers, and can execute concurrently with the FPU. You're thinking of the integer only MMX.
As for your optimized X "no-brainer" I doubt you can find a G4 without a graphics accelerator, so it buys you nothing. you're still limited by your AGP bus bandwidth for mem->screen blits, so it doesn't much matter if you're trying to do them normally or 128bits at a time.
Slashdot moderators appear to be just as dumb as slashdot posters.
Okay...this is a good development.
Now, where can you get a hold of a G4-based computer that ISN'T a Mac, hmm?
Where's the commodity PPC motherboards?
I might be wrong, but I've heard that when running two AltiVec enabled applications, there will be some corruption. This makes me wonder, how will this new linux system work without corrupting...?
This has to be moderated up as funny... hilarious... something like that!
I have worked on many platforms and have found that High-End hardware will kill Intel any day of the week. I don't know how Merced will stack up but I do know as things stand now I would take an AIX box over linux any day of the week. On the other hand, I would take a Linux box over an HPUX box any day of the week. If you want to do something that matters get both boxes and test for a week. Any vendor that will not let you test their equipment before a purchase isn't worth dealing with. Know what you are buying from first hand experience. Take the test drive.
Better to incinerate in hell than to bow before your jackass god in heaven or suck christian dick while still on earth.
Not that it matters as we're all wormfood in the end anyhow. I can only hope that a dog perpetually pisses all over your rotting corpse.
Umm, there's the AGP memory-handling interface in the 2.3.x series (also available as a patch to 2.2.13 and .14). This enables the OpenGL drivers to use system memory (without allocating a huge chunk of it at boot-time, aka the "mem=" hack) for system-side texture memory and command DMA. So yes, the hardware accelerated OpenGL drivers may enlarge the kernel, but the "newagp" interface only takes about 11K in kernel memory after the module is loaded. Which is a lot less than AGP support in wind'ohs for sure.
No, this doesn't have anything in particular to do with the original story. So what?-D
Still, I'd like to see automatic integer-SIMD (aka MMX, on i386) support in egcs.
LOL! funny POAG! I anonymized myself on purpose, i'm just having a good time here. I like the IQ thing though, i'm going to use that tonight, unless you've patented it.
>http://www.sun.com/servers/highend/10000/spec.html
Cool machine. And it's cool that linux can run on it ...
they added new primitive types and storage classes, like "vector", rather than bother to do loop vectorization in a compatible way. Nice of them, eh? But hey, you now only have to #ifdef every single piece of altivec code if you want to be portable.
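For anyone wondering what that #ifdef dance looks like in practice, here is a minimal sketch. It assumes GCC defines `__ALTIVEC__` when AltiVec code generation is switched on and that `<altivec.h>` supplies `vec_ld`, `vec_add` and `vec_st`. The function name `add4` is made up for illustration, and the AltiVec path assumes all three pointers are 16-byte aligned.

```c
#ifdef __ALTIVEC__
#include <altivec.h>
#endif

/* Add four floats element by element.
 * Assumption: a, b and out are 16-byte aligned (needed by the AltiVec path). */
void add4(const float *a, const float *b, float *out)
{
#ifdef __ALTIVEC__
    /* AltiVec path: one vector load per input, one add, one store. */
    vector float va = vec_ld(0, a);
    vector float vb = vec_ld(0, b);
    vec_st(vec_add(va, vb), 0, out);
#else
    /* Portable fallback for every other target. */
    int i;
    for (i = 0; i < 4; i++)
        out[i] = a[i] + b[i];
#endif
}
```

Every routine that touches the vector types needs a pair of bodies like this if the same source is supposed to build elsewhere, which is the portability cost being complained about.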
You know..I used to think that I was falling behind the times when acronyms and buzzwords like RIAA and B2B were popping up. But with this, i'm absolutely sure that nobody knows what the hell an "AviTec enabled" application is, or cares.
I feel better.
Bowie J. Poag
a G4 system without the OS forcibly "bundled"?
--
E2 IN2 IE?
Mike Roberto (roberto@soul.apk.net) - AOL IM: MicroBerto
Berto
http://www.sun.com/servers/highend/10000/spec.html
Check out that URL and see why. How does scalability and clustering sound or internal bandwidth? Not to mention 4 meg of cache on each CPU. Hot swapping of nearly everything.
Only the State obtains its revenue by coercion. - Murray Rothbard
Why don't you try this?
This is perhaps a little off topic, but can anyone tell me the major advantages/disadvantages of a Linux server compared to Sun? Why is it that a similarly equipped Linux server always costs significantly less than its Sun counterpart? Could less be more?
---------------
JavaScript tutorials scripts
I heard that it is so fast that even the "sleep" command is 10 times faster!
Does GCC support 3dfx and SIMD yet? I know some MMX support is there, but when can x86 see some speedup due to the new instructions?
How is the compiler going to know what is to be manipulated as a vector? Which matrix will benefit from SIMD optimizations? These kinds of structures require definitions to distinguish them from ints and floats so that the compiler can generate the appropriate code. It will introduce incompatibilities, but the tradeoff is worth it, especially for things like scientific applications and games.
So many questions and so little information.
Back when I was a Mac guy, I quizzed Apple reps about the amount of 68k code left in the Mac OS. Apple's line was that most of their effort was going into converting the most-used paths over to PPC, since it was the most productive use of programmer time. They never expected the Mac OS to be completely 68k free, since some of the code would never be worth replacing.
The acronym is SIMD. Which stands for Single Instruction Multiple Data.
http://www.whatis.com/simd.htm
Graphics programming uses lots of matrices and vectors to represent geometric elements. Often you have to scale a vector by a certain factor which involves multiplying each element by that factor.
[ 2 4 5 6 ] * 2 = [ 4 8 10 12 ].
Instead of multiplying each value by 2 using a separate instruction you can multiply the entire vector by 2 with just one instruction.
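As a rough sketch of the difference, here is the same scaling written both ways in C. The second version leans on the GCC/Clang `vector_size` extension rather than any particular chip's instructions, so whether it really becomes a single multiply depends on the target. The names `scale_scalar`, `scale_simd` and `v4si` are invented for this example.

```c
/* Scalar version: the loop issues one multiply per element. */
void scale_scalar(int *v, int n, int factor)
{
    int i;
    for (i = 0; i < n; i++)
        v[i] *= factor;
}

/* Four 32-bit ints packed into one 128-bit value (GCC/Clang extension). */
typedef int v4si __attribute__((vector_size(16)));

/* SIMD-style version: the whole vector is multiplied in one expression,
 * which the compiler may turn into a single vector instruction. */
v4si scale_simd(v4si v, int factor)
{
    v4si f = {factor, factor, factor, factor};
    return v * f;
}
```

Feeding [ 2 4 5 6 ] and a factor of 2 to either function gives [ 4 8 10 12 ], matching the example above.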
In a nutshell
cost savings to you? $0
why? because both the machine and software are made by Apple, it's not like Apple spends any money giving you the OS on your machine, so BFD.
In that case I'm sorry. However, if that was what you were trying to put across, what got quoted from the press release was badly worded. As for Apple, I'm sorry, I've just had bad experiences with them.
(currently testing something about signatures here)
One hopes that they get a performance increase just by rewriting certain standard libraries like the X libs and Mesa.
One also hopes these people remember that all of those products, and gcc, are under the gpl, and that they're not the only ones with the right to use it. (Although if they have finished products now, this implies that Apple's been letting them fool around with it for longer than they've had the modifications public. Yet more fodder for the idea that they had more help from Apple, because they tend to stick more to Linux as a server rather than a desktop OS).
True, and remember that MMX was first...
Stating on Slashdot that I like cheese since 1997.
pgcc can generate MMX code now. Check out The PGCC FAQ
#define X(x,y) x##y
Peter Cordes ; e-mail: X(peter@cordes ,
It's quite easy to put 3dnow instructions in your C source with gcc because GAS, the assembler knows about 3dnow instructions e.g. to add a couple of floats using 3dnow:
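/*int foo;*/
/*foo=0;*/
#include <stdio.h>
float a,b,c;
int main(void)
{
a=1.0;
b=2.0;
c=0.0;
asm("femms;"); /* clear MMX/3DNow! state before use */
asm("movd a,%mm0;"); /* movd moves 32 bits, the size of one float */
asm("movd b,%mm1;");
asm("pfadd %mm0,%mm1;"); /* mm1 = mm1 + mm0, as packed floats */
asm("movd %mm1,c;"); /* store only the low float; movq would write 8 bytes into the 4-byte c */
asm("femms;"); /* clear the state again so the FPU can be used */
printf("%f\n",c);
return 0;
}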
I'm new to inline assembler, and not very experienced, but that hack seems to work.
I'm out of my tree just now but please feel free to leave a banana.
A stupid question for the hardware gurus out there: what's a vector register?
___
If you think big enough, you'll never have to do it.
I know this was posted by an AC, but this is obviously not a troll--this is humor. Get with it.
Not all PowerPCs have these registers, but the G4s do.
Mod down posts with a "Free Mac Mini/iPod" sig, they're spam!
VLIW/EPIC is a BAD idea. i'm not sure what you mean by "schedulers in RISC chips having trouble keeping up". VLIW is a crippled architecture; no runtime scheduling, only static compile time (note that i'm ignoring majc+java right now -- majc is actually a very interesting arch). also, ia-64 is seriously flawed, in that it directly exposes the implementation to the isa (that's just one of many bad design decisions with ia-64, though).
Hold the phone! GCC had all the changes made to it. Linux is made to be a PORTABLE OS, and with a bit of work, will operate on any hardware with an MMU and (usually) an FPU. With your argument, there's no point in having Linux work on a Pentium chip, since it's nothing more than extra cache and some instructions more than two 386 chips wired in parallel. We want Linux to work as speedily and efficiently on the chip we're using it on as we can. And since we can omit unneeded code at compile time, bloat is impossible. Yes, the kernel tree can get a bit bigger, but it's not out of control. We're not trying to DO everything, we're trying to have the CAPABILITY to do ANYthing.
~Donald / Just RTFM
This is due to use of the vector registers, right? Then does it really work on _all_ G4s? I seem to have some vague memory that not all G4 CPUs have those registers.
/das Ix
We have an old Fujitsu here at work, it does 40 Mflops without the vector registers enabled and 1500 Mflops with.... drool
This is my sig, show me yours
You should do a little research before insinuating that someone else is a bad programmer next time. The vector datatypes being talked of here are VERY different from the STL vector template class. These are datatypes that represent the fundamental 128-bit data in the Altivec instruction set, much like double typically means an IEEE 64-bit floating point number.
Altivec adds the following data types to C/C++:
vector unsigned char
vector signed char
vector bool char
vector unsigned short -- a.k.a. vector unsigned short int
vector signed short -- a.k.a. vector signed short int
vector bool short -- a.k.a. vector bool short int
vector unsigned int -- a.k.a. vector unsigned long or vector unsigned long int
vector signed int -- a.k.a. vector signed long or vector signed long int
vector bool int -- a.k.a. vector bool long or vector bool long int
vector float -- 4 single-precision floats
vector pixel -- 8 1/5/5/5 bit pixel elements (for graphics)
The elements of the bool types can only be all zeros or all ones. These vectors are usually used as masks or selectors in certain Altivec calls. The pixel type is for representing 16-bit color pixels and handles overflow within the 1/5/5/5 portions of the pixel.
This can all be found on pg 21-22 of the Altivec Technology Programming Interface Manual, which can be found on Motorola's site here.
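To show why all-zeros/all-ones elements make handy masks, here is a plain C emulation of the compare-then-select idiom on four 32-bit elements. It is only a sketch of the idea, not AltiVec code: `cmpgt4` and `sel4` are hypothetical helper names, and the loops stand in for what the vector unit would do in a single instruction each.

```c
#include <stdint.h>

/* Compare step: each mask element becomes all ones where a > b, else all zeros. */
void cmpgt4(const int32_t *a, const int32_t *b, uint32_t *mask)
{
    int i;
    for (i = 0; i < 4; i++)
        mask[i] = (a[i] > b[i]) ? 0xFFFFFFFFu : 0u;
}

/* Select step: take the bits of b where the mask bit is 1, the bits of a where it is 0. */
void sel4(const int32_t *a, const int32_t *b, const uint32_t *mask, int32_t *out)
{
    int i;
    for (i = 0; i < 4; i++)
        out[i] = (int32_t)(((uint32_t)a[i] & ~mask[i]) | ((uint32_t)b[i] & mask[i]));
}
```

Chaining the two gives a branch-free element-wise maximum: build the mask with `cmpgt4(y, x, m)`, then call `sel4(x, y, m, out)` to pick y wherever y was larger and x everywhere else.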
If it's for-profit but free, you're not the customer -- you're the product (e.g., the Slashdot Beta's "audience").
That is, certain PPC Linux apps with Altivec run at 10 times (1000%) the speed they manage without Altivec.
What did you think they were talking about?
This has been gone over before. The reason they won't do it is because they are afraid of being sued for the inclusion of copyrighted, patented, or trademarked material into Darwin without the ability to pull it.
Take the DeCSS thing. If Apple had been the originators of code that had had DeCSS tacked in, without the ability to perform fire control and remove the offending code without possibility of someone having said code with Apple's permission (as given in the GPL), then Apple could be sued for their open sourced code. Linux, as a system with more decentralized ownership over the code is a much harder to hit target than a large money-rich corporation like Apple. The potential legal losses outweigh the benefits. This way, they get much of the benefit of an open source model without the risk of being burned.
If I remember what I read a year or so ago, the compilers and libraries recognize C calls that look a lot like the assembler calls. In effect, you can call the assembly instructions like C functions. I believe there are also some libraries to do certain common vector tasks purely in Altivec. There are also additional data types to cover the different kinds of Altivec vectors (16 8-bit, 8 16-bit, 4 32-bit integer vectors and 4 32-bit FP vectors). At least, that's what I remember of the modifications done to Apple's exceptional MrC optimizing PPC compiler.
I'm not sure that any of the kernel is enhanced, unless they've found a way to have the compiler parallelize some of the code, but this has been shown in the past to be a monstrously difficult task to accomplish, and is usually only applicable to small sections of the code.
It's not like we tell Motorola what to say in *their* quote ;) Anyway, my point was that Motorola and TSS worked on this. In fact, I think Apple's own AltiVec egcs/gcc patches are from Motorola (plus their own additions I suppose).
:P
By the way, what do your personal experiences with Apple have to do with Apple helping or not helping Linux companies? All I can figure is that you're a BeOS user
Regards,
Dan
Dan Burcaw
um, MMX is also x86 SIMD. It's not floating point, tho... and neither of KNI or 3DNOW are clones of the other... Vector processors have been around for a while.
Oh, and KNI is 128 bit, while 3d NOW is 64 bit - in this case, twice as many bits is twice as fast.
Become a FSF associate member before the low #s are used
I coulda sworn that it operated on 128 bit quadfloats... but I could be wrong. The way it does it is irrelevant.
Ask abit or asus to make one. No one is stopping them. See here for more info.
--
Don't lead me into temptation... I can find it myself.
SPECINT95
Compaq Computer AlphaServer ES40 Model 6/667
Result: 40.0 Baseline: 35.6
(DEC Alpha 21264A 667 MHz, 4GB RAM, Tru64)
Digital Equipment AlphaStation 200 4/166
Result: 2.31 Baseline: 2.31
(DEC Alpha 21064 166 MHz, 64MB RAM, Digital UNIX)
Dell Computer Dell Dimension XPS Pro200n
Result: 8.08 Baseline: 8.08
(PPro 200 MHz, 64MB RAM, NT4)
Dell Computer Precision WorkStation 420
Result: 38.9 Baseline: 38.2
(Intel Pentium III "Coppermine" 800 MHz, 256MB RAM, NT4)
Dell Computer Precision Workstation 610
Result: 24.3 Baseline: 24.3
(Intel Pentium III Xeon 550 MHz, 256MB RAM, NT4)
Intel Corporation Intel VC820 motherboard
Result: 38.4 Baseline: 37.9
(Intel Pentium III "Coppermine" 800 MHz, 128MB PC800 RAMBUS RIMM, NT4)
Sun Microsystems Ultra 80 Model 1450
Result: 19.7 Baseline: 16.2
(450 MHz UltraSPARC-II, 512MB RAM, Solaris 7)
IBM Corporation RISC System/6000 H70
Result: 16.0 Baseline: 13.7
(340 MHz PowerPC RS64-II, 2496MB RAM, AIX 4.3.2)
IBM Corporation RS/6000 44P-170
Result: 25.3 Baseline: 23.5
(400 MHz PowerPC-II, 1GB RAM, AIX 4.3.3)
Source: specbench.org
SPECINT2000 is too new. There aren't enough submissions yet.
This is all for single CPU workstations. I dunno. Motorola doesn't seem to believe in submitting benchmarks to SPEC, so I had to use some older RS6000 systems running AIX. IBM doesn't seem overly interested in submitting benchmarks, either.
For my money, I think I'll go with an Intel or Compaq/DEC solution. Sure, the Sun and IBM workstations scale like hell, but they cost ten times as much as an Intel solution. I couldn't possibly see using Intel boxes as enterprise servers, but for workstations, they seem to be tops. If the DEC Alpha was cheaper, I'd go with that. As it is, I just bought a brand new Multia (166 MHz DEC Alpha 21064) for $150. It's hard to beat that. 64 bit computing at the speed of a Pentium 100 (integer) or 200 (floating point), for practically nothing. It should be upgradable to the 233 MHz 21064, as well. We'll see...
Of course, Intel systems suck at floating point, so I didn't bother to cut and paste that. We all know that Intel would come in dead last in that benchmark. Your only choice is the Alpha.
I'm not quite sure where the new PowerPC processors fall. They're more expensive than Intel Coppermine chips, but there's little chance they can scale or perform better than the other entry-level solutions.
I agree. Quake 3 is a great benchmark if you're mainly going to be playing Quake 3. It's also a very good benchmark of total system performance: video, CPU, memory, etc.
I'm most interested in pure CPU speed, though. Given a PCI motherboard, I can put whatever hardware I want in it. I feel kind of sorry for the Mac owners, locked into Apple/ATI hardware. It's really quite sucky. I just want the Motorola CPU. I couldn't care less about the rest of the Macintosh. I would just throw everything but the CPU (and maybe motherboard) into the trash.
It's probably best to forget about Motorola hardware and save up for your very own Compaq/DEC Alpha 21264. Those fuckers are expensive!!
Okay, so where do I buy a single or dual processor Motorola PowerPC motherboard with 5 or 6 PCI slots, 2 serial, 1 parallel, and perhaps a USB or Firewire port?
I don't see any of them on the market...
Motorola doesn't seem to believe in submitting benchmarks to SPEC, so I had to use some older RS6000 systems running AIX.
The SPEC benchmark is for complete systems, not the CPU. Motorola doesn't make any computers, so they can't submit scores. It is Apple who should do that work.
Kjetil T.
ummm.. KNI is actually two 64 bit functional units operating in parallel. Intel had to do this so that they kept compatibility with MMX instruction registers... so calling it 128 bit is like calling a dual processor PentiumPro machine a 64bit architecture....
I'm designing a new language (for a school/Intel Science Competition project, but I hope it'll be good...), and I'm wondering how well people would accept an extensible language that allows vector support... but would be a departure from C. There's no way to extend C enough to elegantly support vectors; is it time to move on?
(Yes, that's AltiVec assembler in my sig, it's a quine):
Where is my mind?
mfspr r3, pc / lvxl v0, 0, r3 / li r0, 16 / stvxl v0, r3, r0
Check out Project Upper/Mute, an all-around awesome compiler fra
Ey!
I've still gotta go explain that pseudocode I left off on, in the middle of the message...
I'll go do that.
You don't have to. AltiVec allows fp and int operations to be performed on vec registers. You are allowed to take a value in an int register and replicate it to fill all the spots on a vec register, tho.
I didn't realize mot makes stuff like that - that's one small-ass mobo!
if they can just get rid of these friggin tiny keyboards
I'm still using my old-skool ADB keyboard with my G3. Before the iMac, Apple shipped two types of keyboard, Design and Extended - basically the same thing. There was very little market for third party makers because unless you busted yours, it worked just fine.
I personally feel that they got paid off by some of those third party manufacturers, who came out with standard USB keyboards that let you dump the cruddy iMac-style board.
this seems to be the thing most complained about with the current Macs, that and the long wait for OS X client
- passion
C++ has the STL vector types (also a matrix type, right?)
:(
C++ has a vector, in the sense of a variable-size array, and also a valarray, which acts like a mathematical vector. But valarray sucks hard (the design is based on F77 and gives quite poor performance on modern CPUs). There is no matrix type in the ISO libraries; however, Blitz++ is a (big, complex) math library in C++ - it does matrix and vector operations, all kinds of weird functions that I don't want to know about, etc, etc. You can find it on Google, it's very well known (it's GPL/Artistic, BTW).
I'm sure that if this became well known and popular, the libstdc++ and Blitz++ people would add support for it in their code.
Damn - 32 128 bit registers! I fscking hate x86!! I'm so jealous!
There's a way more powerful and .. efficient API. I think efficient would be the best way to put it, because from what I know, the libraries are way cleaner and do not produce crap that you'd get from something like some x86 libraries out there. These puppies can also take things like recursion and clean it up better than anything else as well.
Mike Roberto (roberto@soul.apk.net) - AOL IM: MicroBerto
Berto
Sorry, I had to say it.
--Have a Johnsonville brat.
The instruction definitions for MMX and 3DNow instructions are already there in gas. You can make use of them by means of the asm() feature in gcc and egcs. I've written several MMX-enhanced programs using egcs. But these compilers will not themselves issue the MMX instructions. Also, you need to be careful to manually maintain the proper relationship with other FPU usage.
Problems with this approach are:
- gdb does not understand the FPU registers. Debugging MMX code is a real chore. You need to store things into memory before gdb can see them.
- It is up to you to decide when and how FPU registers need initialization.
- You are working in assembler and need to understand how to properly use asm(), __volatile__, and the like.
But it definitely works. I got reasonable speedups. MMX, 3DNow, etc. are noticeably inferior to AltiVec as instruction sets, but that has nothing to do with gcc and egcs. The asm() integration with the rest of the egcs compiler does make short MMX sequences quite reasonable. For longer code sequences it is better to write separate modules in assembler.
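To make the asm() approach concrete, here is a minimal sketch of a short MMX sequence embedded in C. The function name `add8_bytes` is made up for illustration, and the sketch assumes GCC-style extended asm with AT&T syntax on an x86 target with MMX enabled; on anything else it falls back to a plain loop. It is not taken from any of the programs mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Add eight unsigned bytes lane by lane (wrapping modulo 256).
   Hypothetical helper; the MMX path is only compiled on x86 with MMX. */
void add8_bytes(const uint8_t *a, const uint8_t *b, uint8_t *out)
{
    assert(a != NULL && b != NULL && out != NULL);
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__)) && defined(__MMX__)
    __asm__ __volatile__(
        "movq  (%0), %%mm0\n\t"   /* load 8 bytes of a into mm0        */
        "paddb (%1), %%mm0\n\t"   /* add 8 bytes of b, one per lane    */
        "movq  %%mm0, (%2)\n\t"   /* store the 8 sums to out           */
        "emms"                    /* hand the registers back to the FPU */
        : /* no register outputs; results go to memory */
        : "r"(a), "r"(b), "r"(out)
        : "memory", "mm0");
#else
    /* Portable fallback with the same wrapping behaviour. */
    for (size_t i = 0; i < 8; i++)
        out[i] = (uint8_t)(a[i] + b[i]);
#endif
}
```

The `emms` at the end is the manual bookkeeping mentioned above: the MMX registers alias the x87 FPU stack, so the code has to release them before any floating-point code runs.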
There you are!!!
:)
What's up?
I'll get back to you about AC/EC&Upla in a while, but for now my mind is fried.
Dilbert: I have become one with my computer. It is a feeling of ecstasy... the blend of logic and emotion. I have reached...
There's a good article at Ars Technica on SIMD architectures, including Motorola's Altivec.
This is actually really cool.
The one thing that has really bothered me about Apple was their marketing claims, since apps had to be specially written to get the performance gain.. Now that you can get it under Linux.. hmm..
thats all
*Not a Sermon, Just a Thought
*/
A reasonably accurate description of vector operations (or more accurately Single Instruction, Multiple Data operations), but is it truly necessary to bring in the linear algebra concept of a vector space? Strictly speaking, your definition is not even entirely correct or complete. You define b and c as doubles, while b and c are formally scalar quantities; it is entirely acceptable for b and c to be defined over the scalar field of the complex numbers. Moreover, it is not entirely true that all vector spaces can be represented as an ordered list of numbers. For certain vector spaces, some structure (i.e. existence of an inner product) is lost when the representation of the vector space is coerced into such a form. You also fail to present the closure properties of formal vector spaces with regard to scalar multiplication and addition as defined over the vector space. In the future, please karma whore in a more accurate fashion.
Correct. Just as sales of Micro$oft Office fund development of the Sindows OS, sales of Mac hardware subsidize Mac OS 10. I refuse to call it X because it doesn't come with an X server, only some Display PostS#t they call Quartz.
Will I retire or break 10K?
A vector space is a set of objects for which the following are true for all b, c, x, y, z:
- double b, c; vector x, y, z;
- x + y == y + x;
- x + (y + z) == (x + y) + z;
- x + (vector)0 == x;
- x + -x == (vector)0;
- 1.0 * x == x;
- (b * c) * x == b * (c * x);
- c*(x + y) == c*x + c*y;
- (b + c)*x == b*x + c*x;
All vector spaces can be represented by an ordered list of numbers. A typical "vector register" holds a four-dimensional vector as an array of four scalars (ordinary numbers). A vector execution unit in a processor can do the same thing to all four components of a vector, or do other predefined transformations. For example:
- x + y is defined to be [x[0]+y[0], x[1]+y[1], x[2]+y[2], x[3]+y[3]]
- c*x is [c*x[0], c*x[1], c*x[2], c*x[3]]
- x dot y is x[0]*y[0] + x[1]*y[1] + x[2]*y[2] + x[3]*y[3]
Essentially, vector hardware increases the speed of doing the same thing to a lot of data. If you still need help, look for "linear algebra" on Google or any other search engine.
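The three operations listed above can be written out in plain C. This is only a scalar sketch of what a vector unit does to all four lanes in one instruction; the names `vec4_add`, `vec4_scale`, `vec4_dot` and `VEC_LANES` are invented for the example and are not part of any AltiVec API.

```c
#include <assert.h>
#include <stddef.h>

#define VEC_LANES 4   /* a 128-bit register holds four 32-bit floats */

/* out = x + y, component by component. */
void vec4_add(const float x[VEC_LANES], const float y[VEC_LANES],
              float out[VEC_LANES])
{
    assert(x != NULL && y != NULL && out != NULL);
    for (size_t i = 0; i < VEC_LANES; i++)
        out[i] = x[i] + y[i];
}

/* out = c * x, the same scalar applied to every component. */
void vec4_scale(float c, const float x[VEC_LANES], float out[VEC_LANES])
{
    assert(x != NULL && out != NULL);
    for (size_t i = 0; i < VEC_LANES; i++)
        out[i] = c * x[i];
}

/* x dot y: multiply matching components, then sum the four products. */
float vec4_dot(const float x[VEC_LANES], const float y[VEC_LANES])
{
    float sum = 0.0f;
    assert(x != NULL && y != NULL);
    for (size_t i = 0; i < VEC_LANES; i++)
        sum += x[i] * y[i];
    return sum;
}
```

On scalar hardware each loop runs four iterations; a vector unit would do the add or the scale in a single operation over all four lanes.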
Pentium III's KNI (your x86 simd stuff) is a (badly done?) clone of 3DNow!.
Did they do it the same way as the MacOS compilers? When the G4 Powermacs first came out, I took a quick look at some sample Altivec code on Apple's developer website, and thought the way they handled the vectors was pretty nasty looking... like to initialize a vector variable, you did something like vector v = (vector)(0x50147242, 0x72353233, 0xbedac0ed, 0x3aa10dab);
Wasn't exactly like that, but the gist of it was that it looked like a cast of a list of constants separated by the comma operator. Eew :)
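For comparison, here is a sketch of vector initialization and arithmetic using GCC's generic vector extension (`__attribute__((vector_size(16)))`, which Clang also accepts). To be clear, this is a portable stand-in and not the AltiVec `vector` keyword or Apple's sample code; the type name `v4u` and the helper names are made up for illustration. The constants are the ones quoted above.

```c
#include <assert.h>

/* Four 32-bit unsigned lanes packed into 16 bytes.
   Generic GCC/Clang vector type, NOT the AltiVec "vector" keyword. */
typedef unsigned int v4u __attribute__((vector_size(16)));

/* Brace initializer: one constant per lane. */
v4u make_example(void)
{
    v4u v = {0x50147242u, 0x72353233u, 0xbedac0edu, 0x3aa10dabu};
    return v;
}

/* Lane-wise add; the compiler is free to emit a single SIMD add here. */
v4u add_lanes(v4u a, v4u b)
{
    return a + b;
}

/* Read one lane back out as an ordinary integer. */
unsigned int lane(v4u v, int i)
{
    assert(i >= 0 && i < 4);
    return v[i];
}
```

With this extension the initializer is an ordinary brace list, so it reads like an array rather than a cast of comma-separated constants.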
This story is about Linux, remember? If the kernel starts using AltiVec for memcpy, TCP checksumming, etc. all apps will benefit.
Likewise, if some of the crucial libraries like libart and libjpeg get AltiVectorized then many apps will get faster with no changes.
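To show why something like TCP checksumming is a candidate, here is a plain C sketch of the Internet checksum from RFC 1071. It is a scalar reference version, not kernel code and not AltiVec code; the function name `inet_checksum` is made up. The point is that the main loop does the same add to every 16-bit word, which is the kind of work a SIMD unit can do several words at a time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Internet checksum (RFC 1071): one's-complement sum of big-endian
   16-bit words, folded to 16 bits and then complemented. */
uint16_t inet_checksum(const uint8_t *data, size_t len)
{
    uint64_t sum = 0;
    assert(data != NULL || len == 0);

    /* Same operation on every word: the part that vectorizes. */
    while (len > 1) {
        sum += ((uint32_t)data[0] << 8) | data[1];
        data += 2;
        len -= 2;
    }
    if (len == 1)
        sum += (uint32_t)data[0] << 8;   /* odd trailing byte, zero padded */

    /* Fold the carries back into the low 16 bits. */
    while (sum >> 16)
        sum = (sum & 0xFFFFu) + (sum >> 16);

    return (uint16_t)~sum;
}
```

Because one's-complement addition can be done in any order, the partial sums can be accumulated in separate lanes and folded together at the end, which is what makes a vector version plausible.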
Here you go
Of course, since Moto is not in the desktop market, these are embedded mobos...
An STL or Java vector is something that's only slightly related to vector processing.
an STL vector is basically an array that can grow dynamically.
vector processing is when you want to do the same operation to each member of an array.
Will MPPC chips become a commodity in the next year? One can hope so. I'm a big fan of the PPC architecture, and I'd love to see it become a little more widespread. It would be nice to click through buycomp.com and see MPPC 750s and 7400s along with motherboards for them. I think a great use for G3/G4 MPPC chips on Linux (or just about any other free Unix) would be media production. High-power graphics workstations are getting more common but they are still high-priced pieces of equipment; the media companies have only now been able to afford them in larger numbers due to their relative success. The free Unices make a real good bed for media to come visit. Open source kernels lend themselves to a good deal of optimization, which will result in faster system performance in an area where time is money. I don't know if Linux is mature enough yet, but FreeBSD on a render farm of G4s would kick some ass, like the one from The Matrix but larger (and on MPPC 7400s).
I'm a loner Dottie, a Rebel.
Most cool!! Linux has Supercomputer Power now!!
(tongue-in-cheek, yah, but cool nonetheless)
I'm wondering... what do the C extensions look like? C++ has the STL vector types (also a matrix type, right?) but C just has arrays of int/float/double. Is there an API reference anywhere? Is an API even involved? What would AltiVec-enabled C code look like?
iSKUNK!
Can AltiVec do register moves between the GPR, FPR and AltiVec units without having to do a store/load to memory/cache? One of the real pains of PPC is that it can't do a direct 64-bit single-beat read/write without using the FP registers, but you can't directly manipulate them; you have to do two 32-bit stores, then a double float load, then a double float store to your 64-bit bus device.. it's slow. PPC EC parts don't have floating point and have no method of accessing 64-bit devices except through cacheline fill/castout bursting, not very helpful for I/O devices.
Starman97@Gmail.com (bring it on spammers)
Where do you get this cruft? Do you see an Apple quote in this PR? Nope. Motorola has been working on AltiVec patches for gcc too, you know. (It is their technology). And yes, we helped them (Motorola) out some.
Go download the gcc patches and put them in "Phil-14's Linux OS". The GPL allows that, and we welcome it.
Regards,
Dan
Dan Burcaw
The .ppc.rpm files on altivec.org are from us and are what is shipping with Black Lab Linux.
.ppc.rpm's should show our information in the Vendor and Distribution fields.
altivec.org is basically the starting point for everything AltiVec, so we're putting the RPMs there and linking that site to our web page, etc.
rpm -qi on those
Regards,
Dan
Dan Burcaw
heh, with the X Project, i don't think ANYTHING seems like a no-brainer!
Mike Roberto (roberto@soul.apk.net) - AOL IM: MicroBerto
Berto
IBM is working on their new PowerPC Open Platform boards, which are very cool. Capability for an arbitrary number of CPUs and runs on a PCI bus.
Not 3dfx. (3dfx makes the Doodoo, erm, Voodoo graphics cards. At least they open sourced Glide.)
YM 3DNow! the streaming SIMD extended instruction set AMD added to the K6 chips and that Intel copied in Katmai/PIII.
BTW, SIMD = single instruction multiple data. First, instruction decoding limitations produced RISC (reduced instruction set computer). Then the increasing popularity of graphics apps brought about SIMD (apply the same filter to a whole bunch of pixels). Clock speeds rose so much that even the scheduler in a RISC chip was having trouble keeping up, leading to VLIW (very long instruction words) used in Intel's Merced Itanium and (internally) in Transmeta's Crusoe.
How easy are these c and C++ libraries to use?
Are they saying any vector-type processing can be easily rewritten, and so lots of apps can be enhanced?
I vaguely know altivec is cleaner than the x86 simd stuff, but can the same thing be applied to mmx, 3dnow etc... ?
what parts of their kernel gain performance?
You are either a bad C++ programmer or just haven't heard of these things: 'vector' is a member of the C++ standard template library and is therefore not added especially for AltiVec. The STL seems to be a good place to insert these assembler optimizations. Since the class abstraction is pretty high, you can do a lot of speed-increasing operations in the dark dwellings inside the classes. All applications written in standard C++ will benefit from this.
War is one of the most horrible things a human can be exposed to. And one of the worlds largest industries.
Altivec support has been in all of the 2.3.x kernels, but it hasn't done much yet -- only #ifdef'ed in a handful of lines of code. This is really quite cool; I'm already running Linux on a PowerPC 750 (the G3). My next machine will likely be a G4 or whatever's next.
There's a good bit of info on the alti-vec and the G4 in this Ars Technica article (that was slashdotted a while back).
John
Please, if anyone can flame my data and correct it I beg of you to do so ;) but I'm not a bit surprised that G4s are doing this. Altivec lends itself to big data operations, not just vector processing. Memory moves are faster 128 bits at a time, and so on. Screen blitting, likewise. I wouldn't be a bit surprised if someone is working on an optimized X that uses G4 altivec acceleration- that would seem to be a no-brainer.
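As a rough illustration of the "128 bits at a time" idea, here is a portable C sketch of a copy routine that moves fixed 16-byte blocks. It does not use AltiVec instructions; the assumption is that a compiler targeting a machine with 128-bit registers may turn each fixed-size block copy into one wide load and store. The function name `copy_by_16` is made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy n bytes from src to dst in 16-byte (128-bit) blocks.
   The regions must not overlap. */
void copy_by_16(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    assert((dst != NULL && src != NULL) || n == 0);

    /* Main loop: one fixed 16-byte block per iteration. */
    while (n >= 16) {
        memcpy(d, s, 16);   /* fixed size, so it may become one wide move */
        d += 16;
        s += 16;
        n -= 16;
    }

    /* Tail: whatever is left, one byte at a time. */
    while (n > 0) {
        *d++ = *s++;
        n--;
    }
}
```

A hand-tuned version would also have to deal with alignment, which this sketch ignores entirely.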
I had a lot of trouble trying to actually find this code. It may be in the yellowdog cvs but the server seems to be down, as is the ftp server.
They do say to go to altivec.org to download the gcc and binutils. It's in the tools section behind a "you must sign up for our email forum" form. The packages there include a new binutils, gcc, gdb, and libc to support the altivec extensions.
Here are the direct links, for the curious: