AMD Unveils SSE5 Instruction Set

Posted by CowboyNeal on Thursday August 30, 2007 @05:14PM from the new-and-improved dept.

mestlick writes "Today AMD unveiled its 128-bit SSE5 instruction set. The big news is that it includes three-operand instructions, such as floating-point and integer fused multiply-add, and permute. AMD posted a press release and a PDF describing the new instructions."
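
For readers wondering what "fused" buys you: a fused multiply-add computes a*b + c with a single rounding at the end, instead of rounding the product and then rounding the sum. The sketch below only illustrates that general idea in Python using exact rational arithmetic; it assumes SSE5 follows the usual fused multiply-add definition, and the function names are made up for this example rather than taken from AMD's spec.

```python
from fractions import Fraction


def fused_multiply_add(a: float, b: float, c: float) -> float:
    """Model of a fused multiply-add: exact a*b + c, rounded once."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)  # no rounding yet
    return float(exact)                              # the single rounding step


def separate_multiply_add(a: float, b: float, c: float) -> float:
    """Ordinary multiply then add: the product is rounded before the add."""
    return a * b + c


# Hypothetical inputs chosen so the exact product is 1 - 2**-60,
# which a double cannot represent and rounds to exactly 1.0.
a = 1.0 + 2.0 ** -30
b = 1.0 - 2.0 ** -30
c = -1.0

print(separate_multiply_add(a, b, c))  # the tiny term is lost to rounding
print(fused_multiply_add(a, b, c))     # the tiny term survives
```

The "three-operand" part is a separate convenience: the result can go to a register other than the two inputs, so neither source has to be overwritten.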

Well, I'm excited. I think. by Harik · 2007-08-30 17:39 · Score: 4, Insightful

So, where's the analysis by people who write optimized media encoders/decoders? How useful are these new instructions, or are they just toys? How well did they handle context switching? What's the CX overhead? Is there a penalty for all processes, or only when you are switching to/from an SSE5 process? Will this be safely usable under all operating systems, or will they need a patch?
Re:...or are they just toys? by theGreater · 2007-08-30 17:58 · Score: 5, Funny

It ROUNDSS! It ROUNDSS us! It FRCZSS! Nasty AMD added to it.
It's a couple links deep... by SanityInAnarchy · 2007-08-30 18:29 · Score: 5, Informative

Read this interview with Dr. Dobb's:

A floating-point matrix multiply using the new SSE5 extensions is 30 percent faster than a similar algorithm

I believe this helps gaming and other simulations.

Discrete Cosine Transformations (DCT), which are a basic building block for encoders, get a 20 percent performance improvement

And then we have the "holy shit" moment:

For example, the Advanced Encryption Standard (AES) algorithm gets a factor of 5 performance improvement by using the new SSE5 extension

If I get one of these CPUs, I'll almost certainly be encrypting my hard drives. It was already fast enough, but now...

As for existing OS support, it looks promising:

We're also working closely with the tool community to enable developer adoption -- PGI is on board, updates to the GCC compiler will be available this week, and AMD Code Analyst Performance Analyzer, AMD Performance Library, AMD Core Math Library and AMD SimNow (system emulator) are all updated with SSE5 support.

So, if you're really curious, you can download SimNow and emulate an SSE5 CPU, try to boot your favorite OS... even though they say they're not planning to ship the silicon for another two years. Given that they say the GCC patches will be out in a week, I imagine two years is plenty of time to get everything rock solid on the software end.

--
Don't thank God, thank a doctor!
1. Re:It's a couple links deep... by funfail · 2007-08-30 23:24 · Score: 4, Funny
  
  Why? Recovery is 5 times faster now.
  
  --
  Search RapidShare and MegaUpload!
Foundations for the GPU+CPU assimilation... by WoTG · 2007-08-30 18:44 · Score: 4, Insightful

I'm not really qualified to offer an opinion on this, but my guess is that these instructions will prove increasingly useful as AMD integrates the GPU and CPU. To me, it looks like they plan to make accessing what was traditionally part of the GPU a simple process (relative to accessing a GPU directly through its own pseudo-CPU APIs).

It'll take a couple years for "SSE5" to show up in AMD chips... which happens to coincide nicely with their Fusion (combined CPU+GPU) product line plans.

Will Intel pick up on these instructions? Maybe not. Does that mean they die? No, the performance benefits for those areas where this will make the most difference will make it worthwhile. At the very least, AMD can sponsor patches to the most popular bits of OSS to earn a few PR points (and benchmark points).
Re:Can someone explain please by forkazoo · 2007-08-30 21:27 · Score: 5, Informative

The 64-bit designation refers to the width of the address bus*. For example, IA-32 processors have been able to handle 64 bit integers for ages.. so a 64-bit address-capable processor handling 128 bit numbers is nothing new.

Technically, the "bit designation" of a platform is defined as the largest number on the spec sheet which marketing is convinced customers will accept as truthful. Seriously, over the years different processors and systems have been "16 bit" or "32 bit" for any number of odd and wacky reasons. For example, the Atari Jaguar was widely touted as a 64 bit platform, and the control processor was a Motorola 68000. The Sega Genesis also had a 68k in it, and was a 16 bit platform. The thing is, Atari's marketing folks decided that since the graphics processor worked in 64 bit chunks, they could sell the system as a 64 bit platform. C'est la vie. It's an issue that doesn't just crop up in video game consoles -- I just find the Jaguar a particularly amusing example.

But, yeah, having a CPU sold as one "bitness" and being able to work with a larger data size than the bitness is not unusual. The physical address bus width is indeed one common designator of bitness, just as you say. Another is the internal single address width, or the total segmented address width. Also, the size of a GPR is popular. On many platforms, some or all of those are the same number, which simplifies things.
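
To make the "larger data size than the bitness" point concrete, here is a rough sketch of how a 64 bit add gets done with nothing wider than 32 bit pieces. It mirrors the add-then-add-with-carry pair a compiler typically emits on 32 bit x86; the function name and the Python modelling are just for illustration.

```python
MASK32 = 0xFFFFFFFF  # keeps a value within one 32 bit "register"


def add64_with_32bit_halves(a: int, b: int) -> int:
    """Add two unsigned 64 bit values using 32 bit halves and a carry."""
    a_lo, a_hi = a & MASK32, (a >> 32) & MASK32
    b_lo, b_hi = b & MASK32, (b >> 32) & MASK32

    lo_sum = a_lo + b_lo
    carry = lo_sum >> 32        # 1 if the low halves overflowed 32 bits
    lo = lo_sum & MASK32        # like ADD on the low words

    hi = (a_hi + b_hi + carry) & MASK32  # like ADC on the high words
    return (hi << 32) | lo      # result wraps modulo 2**64


print(hex(add64_with_32bit_halves(0xFFFFFFFF, 1)))
```

Same trick scales up: 128 bit math on a 64 bit machine is just two 64 bit halves and a carry.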

An Athlon64, for example, has 64 bit GPR's, and in theory a 64 bit address space, but it actually only cares about 48 bits of address space, and only 40 of those bits can actually be addressed by current implementations.
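
The "only cares about 48 bits" part works through what x86-64 calls canonical addresses: bits 63 down to 47 of a virtual address all have to match, so the pointer is effectively a sign-extended 48 bit value. A minimal sketch of that check, with a made-up helper name and the 48 bit width as the default assumption:

```python
VIRTUAL_BITS = 48  # implemented virtual address bits (assumed, per the comment above)


def is_canonical(addr: int, virtual_bits: int = VIRTUAL_BITS) -> bool:
    """True if the top bits of a 64 bit address are all copies of the highest implemented bit."""
    addr &= (1 << 64) - 1                    # treat as an unsigned 64 bit value
    upper = addr >> (virtual_bits - 1)       # bit 47 and everything above it
    all_ones = (1 << (64 - virtual_bits + 1)) - 1
    return upper == 0 or upper == all_ones


print(is_canonical(0x00007FFFFFFFFFFF))  # top of the lower half
print(is_canonical(0xFFFF800000000000))  # bottom of the upper half
print(is_canonical(0x0000800000000000))  # falls in the unused hole
```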

A 32 bit Intel Xeon has 32 bit GPR's, but an 80 bit floating point unit, the ability to do 128 bit SSE computations, 32 bit individual addresses, and IIRC a 36 bit segmented physical address space. But Intel's marketing knew that customers wouldn't believe it if they called it anything but 32 bit since it could only address 32 bits in a single chunk. (And, they didn't want it to compete with IA64!)
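
If the bit counts above feel abstract, this quick sketch turns an address width into the amount of memory it can reach. The helper names are made up for the example; the widths are the ones mentioned in this comment.

```python
def addressable_bytes(address_bits: int) -> int:
    """Number of distinct byte addresses for a given address width."""
    return 1 << address_bits


def human(n: int) -> str:
    """Format a power-of-two byte count using binary units."""
    units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"]
    i = 0
    while n >= 1024 and n % 1024 == 0 and i < len(units) - 1:
        n //= 1024
        i += 1
    return f"{n} {units[i]}"


for bits in (32, 36, 40, 48, 64):
    print(f"{bits} bit addresses reach {human(addressable_bytes(bits))}")
```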