SuSE Submits Enhancements for AMD Hammer
ackthpt writes "SuSE has this press release, as they are submitting enhancements to the Linux kernal specific to AMD's x86-64 processor instruction set. Anticipated for the 2.6 kernel, some enhancements may appear in 2.4, as development is only beginning on 2.5. AMD's take on the announcement as well." nik notes that SuSE joins NetBSD in having ports to Hammer. Usenix members can see the paper Wasabi's Frank van der Linden wrote about the porting effort.
Hammer is definitely gonna be an interesting and very cool set of chips! Glad to see someone is working on enhancing linux for it. Especially since the big bad wolf in Redmond hasn't yet even done a beta of 64-bit XP for the Hammers.
Derek Greene
This is truly a great move in the right direction, but we also need to see something like GCC support and optimization for this new architecture. AMD, please: you are the expert on your chips. As Intel made its own free compiler, so too can you. Ideally, release your compiler under the MIT License, LGPL, GPL, or something similar; releasing an optimization for GCC would blow my mind.
Use my userscript to add story images to Slashdot. There's no going back.
FreeBSD is working on an x86-64 GCC! Actually AMD itself has sponsored this! Take a look at the link!
Any technology distinguishable from magic, is insufficiently advanced.
To straighten things out:
Commodore machines have a kernal (Keyboard Entry Read, Network, And Link), linux has a kernel.
To make life more complicated: if you want to run a Unix-like OS on a machine with a kernal (like the C64), it is not going to be Linux but LUnix (http://lng.sourceforge.net/).
Take a look at GCC main page and you'll see a note on the x86-64 port contributed by SuSE.
--
The Cap is nigh. Time to get a fresh new account.
Are Hammers available right now? If so, where can I get one? Strictly for research purposes, of course...... ;)
Because Intel put their Itanium 64-bit egg in the Windows XP64 basket.
For a decimal example, multiply 123,456 by 2 to get 246,912. Imagine your old number system was limited to max. 999. With the new system (max. 999,999) you've effectively multiplied 123*2 = 246 and 456 * 2 = 912 by a single instruction. Of course you'll have to separate the resulting numbers at the end, but you might get improvements if you do multiple instructions in succession.
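The decimal example above can be sketched in C with a 64-bit word standing in for the "new number system". This is only an illustration under assumptions: the helper names (`pack2`, `packed_scale`, `lane_hi`, `lane_lo`) are made up here, each value is assumed to fit in 16 bits, and each gets a 32-bit lane so a product by a 16-bit coefficient cannot spill into its neighbour.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers: two small values share one 64-bit word,
   one per 32-bit lane. */
static uint64_t pack2(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 32) | lo;
}

static uint32_t lane_hi(uint64_t w) { return (uint32_t)(w >> 32); }
static uint32_t lane_lo(uint64_t w) { return (uint32_t)(w & 0xFFFFFFFFu); }

/* One 64-bit multiply scales both lanes at once. This is only valid
   while every lane product stays below 2^32, which holds when the
   packed values and the coefficient all fit in 16 bits. */
static uint64_t packed_scale(uint64_t packed, uint32_t k)
{
    assert(k <= 0xFFFFu);
    assert(lane_hi(packed) <= 0xFFFFu && lane_lo(packed) <= 0xFFFFu);
    return packed * k;
}
```

With these helpers, `packed_scale(pack2(123, 456), 2)` leaves 246 in the high lane and 912 in the low lane, which is the "separate the resulting numbers at the end" step.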
No, SuSE is submitting enhancements for Linux for the AMD Hammer. Made me think they were actually making suggestions to the chip design for a second.
Imagine Dr Torvalds claiming Suse's patches stink and need to be re-worked from ground-up.
Imagine.
http://www6.tomshardware.com/cpu/02q1/020227/
:o
Interesting - they tested one of the Hammer CPUs on Suse, but they only ran XP in 32-bit...
Just a nit-pick, but Intel's compilers actually cost money: $500 for the Linux C/C++ compiler ($125 academic).
Intel does provide a number of free open source products, including an Itanium assembler, library routines, vision routines, and a network performance analyzer.
HIV Crosses Species Barrier... into Muppets
You will recall that when AMD demoed hammer recently, they showed a 32-bit Windows system and a 64-bit Linux system. People were commenting on AMD preferring Linux over Windows, therefore showing a more powerful Linux demo than a Windows demo.
The truth is that there is not a 64-bit version of Windows for the Hammer. AMD was able to modify the existing Linux code to create their own 64-bit version of Linux. This is the best example of the freedom granted by the GPL that I have seen in months. AMD is releasing a new product at the end of the year, and they are able to create a demand for it NOW by having software for it NOW.
Do you remember the lag between the introduction of Intel's Itanium and a Windows version for Itanium? It was not well coordinated. AMD has done the opposite: they created a demand and a use several months before the release, and it's working. We are all drooling over a 64-bit architecture, and we will have 6-8 months to think about (and save up for) the purchase of a Hammer.
This is the freedom to innovate that is granted by the GPL and denied by the MS EULA. GPLed software is going to make AMD some money.
I feel all warm and fuzzy inside.
While Hammer will fly on 32-bit code, the 64-bit code will really differentiate the processor. Two-way ClawHammer Beowulfs should be a huge business. But the differentiation will not really show on Windows until (unless) they develop an x86-64 Windows. I wouldn't count on them doing that until Intel comes out with their version of x86-64 (note that I didn't say if). There will be great pressure to recompile and reoptimize open software to take advantage of the Hammer.
I think this is a wonderful advancement. I run SuSE on an Athlon now, and will run SuSE on a dual Hammer in probably a year and a half (I can't afford to be bleeding edge). I can't find many optimizations for the Athlon in compilers and such. However, with the Hammer, the optimizations will be out there. Not only will the compilers have flags, but entire distributions will likely be built with recompiled applications. That would be something I would pay more for.
I agree with you. If AMD were to release its own set of x86-64 optimizations for GCC, very few of them would find themselves in the GNU release of GCC. HOWEVER, I am suggesting an AMD-GCC distribution/supplement; this would be released as a set of diffs on the current GNU GCC and neither "waste" GNU developers' time nor "bloat" the standard GCC distribution.
When was the Itanium released? Where is Windows for the Itanium?
--
Internet Explorer (n): Another bug -- that is, a feature that can't be turned off -- in Windows.
Anyway, who's to say AMD don't have a demon proprietary compiler for x86-64 up their sleeve for just this purpose?
Any sufficiently advanced technology is indistinguishable from a rigged demo
--Andy Finkel (J. Klass?)
Multiplication is a bad example, but it is possible to multiply several numbers at the same time by one or more coefficients. This usually isn't worth it unless the numbers are very small compared to the word size, e.g. 4 bits vs 32 bits.
However - there are a lot of operations which can be dramatically improved by packing data without any extra SIMD hardware. For example, you can perform some tricks with bit shifting to do pixel masking 32 bits (or 64!) at a time. You can do addition/subtraction trivially with the only thing to watch out for being the carry.
Whether it's worth it is a case-by-case decision. Sometimes the packing/unpacking/carry correction takes longer than the performance gain.
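For the addition case, here is a minimal sketch of one common way to handle the carry, assuming four 8-bit values packed into a 32-bit word and wrap-around (modulo 256) arithmetic per byte. The function name `add_bytes` is made up for illustration; it is one possible approach, not the only one.

```c
#include <stdint.h>

/* Add four 8-bit lanes packed in a 32-bit word. Each lane wraps
   modulo 256 and never carries into its neighbour. */
static uint32_t add_bytes(uint32_t a, uint32_t b)
{
    const uint32_t low7 = 0x7F7F7F7Fu;  /* low 7 bits of every lane */
    const uint32_t top  = 0x80808080u;  /* top bit of every lane    */

    /* Add the low 7 bits of each lane. Any carry lands in that lane's
       own top bit, which was cleared, so it cannot reach the next lane. */
    uint32_t sum = (a & low7) + (b & low7);

    /* Fold the original top bits back in with XOR, which adds them
       without producing a carry out of the lane. */
    return sum ^ ((a ^ b) & top);
}
```

The extra masking and XOR are the "carry correction" cost mentioned above: roughly five operations for four additions, so the win depends on how many lanes fit in a word.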
And here's an example where there's definitely a performance increase! I've used the code below to do motion blur in the past. It's slower than using MMX, but not by much. I wrote it so long ago that I don't have any comparative figures, though.
The idea here is that the framebuffer persists the image. The input and output buffers are 8 bits per primary. Now, you could do this a single byte at a time, but that would suck for speed. Instead, 4 bytes are computed at once. The formula for each output byte is based on:
out = (out * 3 + in) / 4
This is actually performed here slightly less accurately:
out = out / 2 + out / 4 + in / 4
I remove some of the visible artifacts in practice by a post-processing stage where 1 bit of noise is added.
The bit masks are applied to prevent the shifts "leaking" into the next byte in the word. Now, on the topic of 64-bit: the above can be performed on 64-bit words at no extra cost per operation, so each operation handles 8 bytes instead of 4 and it goes twice as fast. You'd be silly to do this on an architecture with SIMD instructions designed to do exactly this job, though.
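A minimal sketch of the technique as described, written from the formula above rather than taken from the poster's own listing. It assumes 8-bit channels packed four to a 32-bit word (eight to a 64-bit word); the names `blur4`, `blur8` and `blur_frame` are made up for illustration, and the noise post-processing step is left out.

```c
#include <stddef.h>
#include <stdint.h>

/* out = out/2 + out/4 + in/4 on four 8-bit channels at once.
   Each shift drags bits from the byte above into the top of the byte
   below; the masks clear those leaked bits. The per-byte maximum is
   127 + 63 + 63 = 253, so the additions never carry between bytes. */
static uint32_t blur4(uint32_t out, uint32_t in)
{
    return ((out >> 1) & 0x7F7F7F7Fu)
         + ((out >> 2) & 0x3F3F3F3Fu)
         + ((in  >> 2) & 0x3F3F3F3Fu);
}

/* Same operation count, but eight channels per step on a 64-bit word. */
static uint64_t blur8(uint64_t out, uint64_t in)
{
    return ((out >> 1) & 0x7F7F7F7F7F7F7F7FULL)
         + ((out >> 2) & 0x3F3F3F3F3F3F3F3FULL)
         + ((in  >> 2) & 0x3F3F3F3F3F3F3F3FULL);
}

/* The framebuffer keeps the running image; each new frame is blended in. */
static void blur_frame(uint32_t *framebuffer, const uint32_t *input, size_t nwords)
{
    for (size_t i = 0; i < nwords; i++)
        framebuffer[i] = blur4(framebuffer[i], input[i]);
}
```

Because every term is truncated separately, a channel that is 255 in both buffers settles at 253 rather than 255, which is the "slightly less accurate" part.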
On architectures without SIMD, tricks like this can give you a severalfold speed increase. If anyone's interested in any other tricks, I can put some code on a web page somewhere.