Socket Athlons by early next year?
webslacker writes "That's what it looks like, according to the private eyes over at Sharky Extreme. The Athlon Select series, as it will be called, will be aimed at the low end and will use a new ZIF standard called Socket 423 (the number of pins). Oh, and get this... plans are being laid to integrate an 8MB L2 cache. "
The Register is reporting a rumor that AMD has hired a top Alpha engineer to guide the development of the 64bit K8 which may be demo'd as soon as the Microprocessor Forum early next month in San Jose.
Sun sells workstations that will beat the G4 handily except for Altivec stuff, where the G4 is probably faster. But they also have the Enterprise 10000, which can be equipped with up to 64 processors and will leave a G4 in the dust.
SGI sells Onyx2 InfiniteReality2, which will beat probably anything else on heavy-duty visualisation stuff, and can be equipped with up to 128 processors.
HP makes the J-5000 workstation, which will also beat a G4 on most tasks, as well as big-ass servers with up to 128 processors.
IBM makes RS/6000 workstations and servers, which can scale up to 128 processors.
Compaq sells XP1000 workstations with a 667MHz Alpha 21264 processor, which will beat the G4 on anything that can't make very good use of Altivec, and there are places that sell dual 667MHz 21264 workstations. Compaq also has the AlphaServer GS line, which can take up to 14 21264's, probably beating the G4 on anything.
Furthermore, the Athlon probably beats the G4 on stuff that doesn't parallelise well, and an 8-way Xeon should be faster for most, if not all, things.
Unfortunately all the systems here, except the Athlon, are far, far more expensive than a G4. But you can get faster systems if you're willing to pay the price. Oh, and all of those run some Unix variant, as well as Windows NT for Alpha and Athlon/PIII.
Also, when it comes to the speed of the G4, it all depends on how useful Altivec is for your app. If it isn't useful, the G4 isn't that impressive. If it is, the G4 should be very good value for money, if Altivec is anywhere near as good as the hype claims it is.
Just what we need to fragment the market a little further... another bloody CPU mounting standard.
Just like in the 70's when semiconductor manufacturers were putting out thousands of different kinds of transistors, these CPUs are meeting different needs. More and bigger busses, I/O, control pins, etc. It's just evolution. Sure, it means more choice and more incompatibilities. Just as with the millions of transistor types, where we now have cross-reference manuals to allow substitutions from a small stock, I'm sure we will soon have a market for cheap CPU adaptors.
This is incorrect. The Athlon die, coupled with 8mb of L2 cache would simply be enormous, even on a .18um process. Also, Sharky Extreme misreported this information. The socketed version of the Athlon is the Athlon Select, but this version is the low-end one. Only the high-end version, the Athlon Professional, will have these cache sizes. The Athlon Professional will also be in the Slot B (that's the current Alpha standard) format. Sharky Extreme is about as unreliable as The Register when it comes to processor information. For a better explanation of some of the problems with Sharky Extreme, take a look at http://www.jc-news.com/pc/.
Now, what I'd like to see is someone running an operating system *entirely* out of cache. 8Mb should easily be enough to run a cut-down Linux, and definitely enough for the earlier MS operating systems...
Question is, is it possible? I suppose if you're never getting any cache misses then it won't have to access any external memory, but I'd imagine that there's a whole load of problems to do with memory mapped I/O and booting...
anybody with a little more technical knowledge care to comment?
The chips are indeed hard to get, but it's not impossible. However, the mainboards are currently in short supply.
------------------
You may like my a cappella music
On a related note: The Register has a story about the K8 (note that The Register appears to be down right now, but the link is on the main page). Not very much information, but interesting nevertheless.
Another plus of the new architecture? Each CPU maker will use the technology they know how to use best. I wouldn't expect it any other way. I wouldn't force an artist who uses oil paints to move over to anything else if that is what he knows best.
sporty
---
Use the force lunk!
-
ping -f 255.255.255.255 # if only
Take this APIC patent, for example: it's the APIC protocol itself that is patented. Yet another case where the US patent system prevents competition and causes inflated prices. (Just check out how much a Xeon CPU costs - ridiculous.)
--Coke
Cache is not something you get to control directly in code.
...read the rest of the cache line. Assuming your code is well optimized for cache performance, the next things you read should already be in the cache.
It can be, in a lot of newer architectures. Prefetching instructions are available in the instruction sets of many RISC, and even x86, processors. The main problem is that most optimizing compilers cannot do a good job of cache prefetching since there is no corresponding construct in high-level languages, i.e. you do not have a C construct which says "get this memory chunk into the cache". However, for bold people who like to play with assembly language, the instructions are there. Operating systems generally do not like these instructions, though.
Embedded system programmers, particularly those lucky enough to use modern CPUs with cache controllers, can use these tricks.
Of course, this assumes that the optimization mentioned here is simply making sure that the current working set fits in the cache. That's not usually the case with modern software, and you will always have a lot of conflict misses.
IMHO, the previous poster makes a valid point, and using prefetching techniques it might be possible to get the whole OS into the cache. This might be interesting to play with, but I'm not sure it would have any significant advantages. You might be better off loading the application and its working set into the cache rather than the OS (unless all you do in your application is a bunch of system calls).
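For what it's worth, here is a rough sketch of what a software prefetch hint can look like from C rather than assembly. It leans on `__builtin_prefetch`, which is a GCC/Clang compiler extension and not part of standard C. The function name `sum_with_prefetch` and the `PREFETCH_DISTANCE` value are made up for illustration, and the prefetch is only a hint that the CPU is free to ignore, so whether it helps at all depends on the processor and the access pattern.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tuning knob: how many elements ahead to prefetch. */
#define PREFETCH_DISTANCE 16

/* Sum an array while hinting that data further ahead will be needed soon. */
long sum_with_prefetch(const int *data, size_t n)
{
    long total = 0;

    assert(data != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
#if defined(__GNUC__)
        /* Compiler extension (GCC/Clang), not standard C.
         * Arguments: address, 0 = prefetch for read, 1 = low temporal locality. */
        if (i + PREFETCH_DISTANCE < n) {
            __builtin_prefetch(&data[i + PREFETCH_DISTANCE], 0, 1);
        }
#endif
        total += data[i];
    }
    return total;
}
```

The result is the same with or without the hint; only the timing can differ, and for a simple linear walk like this one the hardware may already do just as well on its own.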
Zigbee Central: A Zigbee weblog
It all depends on the benchmarks you use and the associativity of the cache, which the designers use to come up with a suitable cache line size. On simulators using RISC instruction sets, 2-4MB caches seem to perform very well in published studies, and Alpha in particular seems to be employing them well. Unfortunately you don't see many studies using the x86 instruction set, since no graduate student or professor in his right mind will try to write an x86 instruction simulator. So all you have is studies based on collected traces; I remember seeing several that ran Windows applications and found that these applications benefit immensely from larger caches. That makes sense given the recent increase in code and working set sizes of available software, and the decreasing price of storage, which makes such large working sets feasible.
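To make the trace-driven idea a bit more concrete, here is a toy sketch of a set-associative cache with LRU replacement that counts misses over a list of addresses. Everything in it (the `cache_t` layout, `cache_init`, `cache_access`, `count_misses`, the size limits) is invented for illustration; real simulators used in published studies are far more detailed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_SETS 256
#define MAX_WAYS 8

typedef struct {
    unsigned line_bytes, sets, ways;
    uint64_t tag[MAX_SETS][MAX_WAYS];
    uint64_t last_used[MAX_SETS][MAX_WAYS]; /* 0 means the way is empty */
    uint64_t clock;                         /* counts accesses, used for LRU */
} cache_t;

void cache_init(cache_t *c, unsigned line_bytes, unsigned sets, unsigned ways)
{
    assert(line_bytes > 0);
    assert(sets > 0 && sets <= MAX_SETS);
    assert(ways > 0 && ways <= MAX_WAYS);
    memset(c, 0, sizeof *c);
    c->line_bytes = line_bytes;
    c->sets = sets;
    c->ways = ways;
}

/* Returns 1 on a hit, 0 on a miss. On a miss the line is installed,
 * replacing an empty way if there is one, else the least recently used. */
int cache_access(cache_t *c, uint64_t addr)
{
    uint64_t line = addr / c->line_bytes;
    unsigned set = (unsigned)(line % c->sets);
    uint64_t tag = line / c->sets;
    unsigned victim = 0;

    c->clock++;
    for (unsigned w = 0; w < c->ways; w++) {
        if (c->last_used[set][w] != 0 && c->tag[set][w] == tag) {
            c->last_used[set][w] = c->clock;
            return 1;
        }
        if (c->last_used[set][w] < c->last_used[set][victim]) {
            victim = w;
        }
    }
    c->tag[set][victim] = tag;
    c->last_used[set][victim] = c->clock;
    return 0;
}

/* Replay a trace of addresses and count the misses. */
size_t count_misses(cache_t *c, const uint64_t *trace, size_t n)
{
    size_t misses = 0;
    for (size_t i = 0; i < n; i++) {
        if (!cache_access(c, trace[i])) {
            misses++;
        }
    }
    return misses;
}
```

Replaying the addresses 0, 128, 0, 128 through a direct-mapped cache with four 32-byte lines misses every time, because both addresses land in the same set. A two-way cache of the same total capacity (two sets of two lines) misses only on the first two accesses, which is the conflict-miss effect that associativity is meant to reduce.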
Don't get me wrong, I'm all for competition in the processor market, and 8MB of cache sounds like the chip would fly!
But why do AMD and Intel insist on this "war" about the socket architecture? One of the best things about Super Socket 7 was that you could buy a motherboard, and then slap in a Cyrix chip, a standard Pentium, a K6-2 or a K6-3. This gave people on a low budget a nice clear upgrade route from a cheap processor to something more worthwhile.
All this customising of sockets is good for performance, but why don't they take the cost of upgrading systems into account? Not everybody can afford to shell out for a new processor every 6 months, let alone a new processor AND motherboard.
Manic.
If you ever drop your keys into a river of molten lava, let'em go, because, man, they're gone.
8mb, hmm.
Remember that cache is a fix: it makes up for the shortfall in the speed of memory and the bus architecture. I think this just indicates how far the rest of the x86 system is falling behind processor development. I'd much rather have faster main memory and a better bus architecture than masses more cache. Cache is expensive and doesn't always give the benefits you would think. Remember the Celeron had only a quarter of the L2 cache of the comparable PII (128KB against 512KB) but could equal it in performance because its cache ran at full processor speed.
It's not size that matters, it's how you use it and what you put around it that counts
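The size-versus-speed tradeoff can be put into rough numbers with the usual average access time estimate: hit time plus miss rate times miss penalty. The sketch below is only that formula. The function name `avg_access_cycles` is made up, and the figures in the note after it are hypothetical, not measurements of any real Celeron or PII.

```c
#include <assert.h>

/* Average access time in cycles = hit time + miss rate * miss penalty.
 * All three inputs are assumptions supplied by the caller. */
double avg_access_cycles(double hit_cycles, double miss_rate, double miss_penalty_cycles)
{
    assert(miss_rate >= 0.0 && miss_rate <= 1.0);
    return hit_cycles + miss_rate * miss_penalty_cycles;
}
```

With invented numbers, a small full-speed cache at 4 cycles per hit and a 6% miss rate comes out ahead of a larger half-speed cache at 8 cycles per hit and a 3% miss rate when a miss costs 60 cycles (about 7.6 cycles against about 9.8). Change the miss rates or the penalty and the ranking can flip, which is why it depends so heavily on the workload.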
Cache is not something you get to control directly in code.
When you read a dword from main memory (i.e. not already in the cache) and then do operations on it, the cache controller takes advantage of a free memory bus to go ahead and read the rest of the cache line. Assuming your code is well optimized for cache performance, the next things you read should already be in the cache.
If you're doing a lot of kernel stuff, large chunks of the kernel will be in the cache, as you would want. And if you're running Quake 3, Quake 3 will be in the cache. It's exactly what you want.
There are just so many to choose from :-)
:-)
:-)
:-)
Seriously:
"Socket 7"
Gee, for all you with Pentium 1s, Pentiums w/ MMX, and older K6s. Super 7 (just a minor mod) for K6-2 and -3. I expect the genuine Socket 7s are dead now, with the Super 7s gone by next year.
"Slot 1"
It's already dying because the Pentium IIs/IIIs are outrageously expensive compared to their performance (especially those damned PIIIs with their serial number ickyness). Celeron is in the cheaper Socket 370, and you know people love those things.
"Slot 2"
If you think a PIII is too cheap, buy a Xeon PIII and one of these babies. Considering Intel's SMP design forces the CPUs to share the same bus, Xeons with 4mb of cache will not scale well past 4 or so CPUs, so why bother with the expense when Athlons are cheaper? This spec can die like the "Socket 8" of the PPro.
"Socket 370"
Perhaps useful, but the Celerons are ludicrously locked at a 66MHz front side bus. I mean, Intel is embarrassing enough because their first-string processors (PIIs/IIIs) have a half-clocked L2 cache. Pathetic! They've hobbled the Celerons, and are just trying to prove they control the customer's demands.
"Slot A"
Well, seems OK. I mean, you can plug an Alpha processor package or an Athlon package into the same Slot A, and you do get the benefits of fast bus speed, chip-speed L2 cache, etc.
"Socket 423"
I guess this was inevitable. I doubt you'll be able to plug an Alpha into this, but the PGA format is a bit cheaper to make than ye olde cartridge (can you say SNES cartridge looking?) CPU packages. They are probably cheaper, and I know they're probably easier to stick into one of those wonderful Kryotech units.
Anyways, I know I'll be buying more AMD. I love that company
--
Internet Explorer (n): Another bug -- that is, a feature that can't be turned off -- in Windows.