Dual Pentium III Xeon Review
Sander Sassen writes: "Intel has recently released its new line of Pentium III Xeon CPUs, based on their new .18 micron process. HardwareCentral takes a look at its performance, utilizing a dual CPU configuration on an Intel i840 platform with 256 MB of Rambus memory as a testbed. This Dual Pentium III Xeon review has all the details of their findings."
I doubt you'll be able to execute code from the scratch area.
The only way I can see a virus trying to use it is as a method of hiding the bulk of its code. Just think of a little micro-virus that is even harder to detect than regular viruses (because it is so small it doesn't have much of a signature) and works by loading itself from the EEPROM when it is executed. This might be viable if the virus detection companies don't think to check accesses to memory the same way they check accesses to the hard drive(s). Of course it would be rather difficult to spread this virus, as everyone would need a brand new ultra-expensive processor in their computers, and the people who tend to buy these things tend to know how to avoid getting viruses.
I read the internet for the articles.
You think I'm kidding...:-)
At a guess, anyone that needs to use a Xeon as opposed to a regular PIII.
"The invisible and the non-existent look very much alike." -- Delos B. McKown
If you read the full article it says the processor is only $50 to $100 more than its Slot 1/Socket 370 counterpart. The big difference is the management functions that are part of the processor housing, such as temperature monitoring, 2 EEPROMs, and so on... Given the size of the cartridge and the metal back plate I would imagine that it also cools better, and for a server that is good.
Another issue is the whole Slot 2 thing. A lot of the i840 motherboards that are in production/planned are Slot 2, making this processor necessary if you don't want to use a 550MHz processor.
"... That probably would have sounded more commanding if I wasn't wearing my yummy sushi pajamas..."
-Buffy Summers
Goodbye Iowa
Well, maybe EMS wasn't as ugly as some other hacks, but it messed up the programs. I never used it on anything less than a 386, and when I got a compiler that could do protected-mode DOS programs, I promptly forgot all about it. And then I installed Linux instead, and got a flat address space (no need to keep the structures below 64K in size - Yay!).
And yes, XMS was hard to use, but when that's what you had... Again, flat address space rocks :)
And I can also fill you in on the HMA stuff. The extra memory space comes from the segmented memory addressing, with a 16-bit segment and 16-bit offset. The true address is calculated by (segment << 4) + offset, which in general creates a 20-bit address, capable of addressing one meg. However, note the overlapping parts of the segment and offset - if you put the value 0xffff in the segment, and anything greater than 0x000f in the offset, you will overflow the address space on a 20-bit bus, and wrap back to low addresses. What they did was to disable the address wrapping / aliasing, and instead merrily continue up into high memory. Wrap or no wrap was selectable by the "A20 enable" thingy (don't remember exactly how it worked).
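For anyone who wants to poke at the arithmetic, here is a minimal Python sketch of the (segment << 4) + offset calculation, with and without the wrap. The function name and the a20_enabled flag are made up for illustration; this only models the address math, not how the A20 gate is actually switched in hardware.

```python
def real_mode_address(segment, offset, a20_enabled):
    """Linear address for a 16-bit segment:offset pair (illustrative model)."""
    linear = ((segment & 0xFFFF) << 4) + (offset & 0xFFFF)
    if not a20_enabled:
        # Only 20 address lines are in play: anything past 1 MB wraps to low memory.
        linear &= 0xFFFFF
    return linear


# Highest address reachable from real mode with the wrap disabled.
TOP = real_mode_address(0xFFFF, 0xFFFF, a20_enabled=True)

# Bytes above the 1 MB line that become reachable: just under 64k.
HMA_BYTES = TOP - 0x100000 + 1

print(hex(real_mode_address(0xFFFF, 0x0010, a20_enabled=False)))
print(hex(real_mode_address(0xFFFF, 0x0010, a20_enabled=True)))
print(HMA_BYTES)
```

With the wrap on, FFFF:0010 lands back at address 0; with it off, the same pair points at the first byte past 1 MB. The extra region comes out 16 bytes short of a full 64k, which is where the "1088k" figure comes from once you round.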
CISC instructions (fwik) rarely do more than comparable RISC ops. example: x86 mov will do register load/stores and mem->mem copies, but it can't do all at once. the reason x86 code is smaller is due to its variable opcode length. the fact that x86 ops can each do more than one job is just namespace (opcode) compression, and it necessitates microcode and complicates pipelining.
With 256KB RAM, this is clearly intended for the Workstation Market.
:p
Geez, nice workstation!
My nintendo has more than that!
What else are they going to do? They are working in other areas: advanced architecture (Willamette, Itanium) and MHz (0.18 process). If increasing the cache also gets them a speed boost, however small, they'll do that too. You will always have a set of customers that are screaming for any amount of speed, regardless of the cost. Xeon is for them.
In addition, I'm sure it doesn't hurt them when comparing against Sparc, Alpha, PowerPC, etc. These all have a ton of cache (8MB!! in the case of high-end Sparcs) and, as we have discussed, more cache implies more speed to most people.
AMD is working on the next generation Athlon chipset, known as the AMD 760. Following shortly after this chipset will be the AMD 770, which will be the same as the 760, but with SMP. What I have read puts these chipsets coming out next year. Via, Ali, etc may come up with something sooner, but it is doubtful.
Intel uses a proprietary "standard" for their SMP implementation. This forced AMD, Cyrix, etc. to invent an open standard, OpenPIC(?). OS's have written SMP drivers for Intel's standard, but since there has never been an OpenPIC SMP motherboard, there are no drivers for OpenPIC.
So, before SMP Athlons, you need a chipset, a motherboard, then drivers. Sounds like a long, sad road to me. I want one too...
once you look at xeons you get into the price range of the good stuff (IBM, Alpha, SUN etc.)
While true, the "good stuff" doesn't run NT.
Warning: long technical post ahead.
It wasn't an ugly hack at all. It was a way of sending 64k "frames" back/forth over the ISA bus to an add-on memory board. It was SLOW, but not ugly. It supported 32 MB (an arbitrary number IIRC) of RAM on a machine with a 20-bit address space (8088/8086).
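To make the "frames" idea concrete, here is a rough Python model of that kind of bank switching. It assumes the 64k frame is split into four 16k slots (that is how I remember the LIM spec, so treat it as an assumption), and the class and method names are invented for the example. It models the memory mapping only, not the real driver API.

```python
PAGE_BYTES = 16 * 1024   # assumed size of one mappable page
FRAME_SLOTS = 4          # 4 x 16k = the 64k window the CPU can see


class ExpandedMemory:
    """Toy model: a big pool of pages, seen through a small fixed window."""

    def __init__(self, total_pages):
        self.pages = [bytearray(PAGE_BYTES) for _ in range(total_pages)]
        self.frame = [None] * FRAME_SLOTS  # which page each slot shows

    def map_page(self, slot, logical_page):
        """Point one slot of the window at a page on the memory board."""
        if not 0 <= slot < FRAME_SLOTS:
            raise IndexError("no such slot in the page frame")
        if not 0 <= logical_page < len(self.pages):
            raise IndexError("no such logical page")
        self.frame[slot] = logical_page

    def _locate(self, frame_offset):
        if not 0 <= frame_offset < PAGE_BYTES * FRAME_SLOTS:
            raise IndexError("offset outside the page frame")
        slot, offset = divmod(frame_offset, PAGE_BYTES)
        if self.frame[slot] is None:
            raise RuntimeError("nothing mapped in this slot")
        return self.pages[self.frame[slot]], offset

    def read(self, frame_offset):
        page, offset = self._locate(frame_offset)
        return page[offset]

    def write(self, frame_offset, value):
        page, offset = self._locate(frame_offset)
        page[offset] = value


ems = ExpandedMemory(total_pages=8)
ems.map_page(0, 5)
ems.write(10, 0xAB)      # lands in page 5
ems.map_page(0, 2)       # same window address now shows page 2
print(ems.read(10))
ems.map_page(0, 5)       # swap page 5 back in
print(ems.read(10))
```

The program only ever touches offsets inside the 64k window; which physical page answers depends on the last map_page call, which is why the pool can be far larger than the address space.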
Anyway, 80286 had 24-bit addressing (16 MB) out of the box but few early motherboards supported it (although my PS/2 model 30 is expandable to 16 MB), and besides the cards were a lot cheaper than SIMMs. Plus you could of course only access the full 16 MB from the 286's broken "protected mode".
EMS was also easy to use from Real Mode. Of course XMS could be used too but it was a lot less clean than the EMS API (software int 0x67, IIRC).
Do not confuse EMS with EMM386. EMM386 emulated "hardware" EMS by providing a layer over top of XMS. It wasn't nearly as slow, and used the same clean API.
However, circa DOS 5.0, people stopped using EMM386 for EMS. Since EMM386 needed to fake its own adapter space (the original adapter boards used an address like 0xE0000 for the 64k "frames"), it contained code to support UMBs (Upper Memory Blocks). You could use UMBs to allocate unused memory in the adapter space between 640k and 1024k in Real Mode.
I would consider UMBs an ugly hack (remember MemMaker? LoadHigh? DeviceHigh? InstallHigh?). They're still in use by Win95/98/ME to hold things like the Real Mode mouse driver etc.
HMA was the ultimate ugly hack. That upped the Real Mode address space from 1024k to 1088k on the 80286 and higher. I have no idea how it works, but you may notice that your Win9x virus scanner looks at a full 1088k of "conventional memory" during boot.
Hands in my pocket
Intel has Physical Address Extensions for 36-bit addressing for a limit of 64GB. Win2k supports this in their Server version. I believe Linux support is done or coming soon. I also think SCO supports it. I have no idea about Solaris and various BSDs. Over at Unisys we have a monster system called ES7000 that supports 64GB. It also supports 32 processors and 96 PCI slots.
IA-64, AKA Merced, AKA Itanium supports full 64-bit addressing for a whopping 16 EB (exabytes!). Microsoft currently claims that Win64 will only support up to 64 TB, although that may only be in Data Center Server. Anybody know what the other IA-64 projects will support?
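If you want to check those limits yourself, the numbers fall straight out of the address width. A quick Python sketch, assuming byte addressing and 1024-based units (the helper names are just for this example):

```python
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB"]


def addressable_bytes(address_bits):
    """Bytes reachable with the given number of address bits."""
    return 1 << address_bits


def human(size):
    """Format a power-of-two byte count using 1024-based units."""
    unit = 0
    while size >= 1024 and unit < len(UNITS) - 1:
        size //= 1024
        unit += 1
    return f"{size} {UNITS[unit]}"


for label, bits in [
    ("8086 real mode", 20),
    ("80286", 24),
    ("plain 32-bit", 32),
    ("36-bit PAE", 36),
    ("full 64-bit", 64),
]:
    print(f"{label:>15}: {bits} bits -> {human(addressable_bytes(bits))}")
```

A 64 TB ceiling like the one quoted for Win64 corresponds to using 46 of the 64 address bits.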
-- soldack
Yea, but I doubt management features will carry the load of a large web server. I'm saying that if they intend to put this in the normal Xeon market, then they need more cache, or else people are going to continue to use the older proc, or they will get whopped by AMD and its 8meg cache Athlon.
A deep unwavering belief is a sure sign you're missing something...
The early K6's supported OpenPIC, but it was dropped in the later models.
The Athlon uses the Alpha EV6 bus protocol. The multiprocessing is point to point, and will be done with a protocol format, much like a switching hub.
The first chipset to support it should be the AMD 760, which should be out later this year. By this time, the Thunderbird processor, with full speed cache, should be out as well. This will force Intel to change the Xeon's price/performance model to compete.
It's more a northbridge issue than an OS issue. In a point to point protocol, each processor needs its own northbridge chip on the motherboard, which gives each processor full bandwidth to the rest of the system. Once the multiple northbridges are present, implementing multiprocessor support in the OS should be trivial. The code should be essentially identical to supporting a dual P3, except the performance characteristics will be completely different because of the full bandwidth of the multiple northbridge chips. A multiple P3 uses a single northbridge chip and GTL bus, which gives diminishing returns on more than a couple of processors.
--- "So THAT's what an invisible barrier looks like!" - Time Bandits
This is probably because the PIII/Aluminummine processors use a fully associative cache, instead of a 4-way or an 8-way associative cache. This basically means that the cache is very good at optimizing itself for the most commonly accessed data, but also means that the cache doesn't scale as well to large amounts. I don't know what the cache associativity is on the G4 (I could go look, but I'm lazy, so I'll leave that for whoever isn't) but that probably has something to do with this issue.
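For reference, the difference between N-way and fully associative comes down to how many places a given memory line is allowed to live. A small Python sketch of the usual textbook indexing scheme follows; the 256 KB size, 32-byte lines and the function name are assumptions picked for illustration, not a claim about how any particular chip is wired.

```python
def cache_slot(address, cache_bytes, line_bytes, ways):
    """Return (set_index, tag, num_sets) for an address in an N-way cache.

    A line may sit in any of the `ways` entries of its set, so a lookup
    has to compare `ways` tags. Fully associative means one giant set.
    """
    num_lines = cache_bytes // line_bytes
    num_sets = num_lines // ways
    line_number = address // line_bytes
    return line_number % num_sets, line_number // num_sets, num_sets


CACHE = 256 * 1024   # illustrative cache size
LINE = 32            # illustrative line size
ADDRESS = 0x12345

# 4-way and 8-way: the address picks a set, then a few tags are compared.
print(cache_slot(ADDRESS, CACHE, LINE, ways=4))
print(cache_slot(ADDRESS, CACHE, LINE, ways=8))

# Fully associative: ways == number of lines, so every tag is a candidate.
print(cache_slot(ADDRESS, CACHE, LINE, ways=CACHE // LINE))
```

The trade-off shows up in the last case: with a single set, the hardware must compare against every line in the cache on each lookup, which gets expensive as the cache grows.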
I have one word to add - Thunderbird. Full speed cache in good amounts, for about 10% of the price of a Xeon. Also, faster clockspeeds (somewhat relevant in this case) - the Thunderbird 1250 should be out in the next couple of months. Multiprocessing won't be available until the AMD 760 chipset comes out later this year, but if you can wait, I think the AMD Thunderbird is going to be a good choice over the Xeon.
--- "So THAT's what an invisible barrier looks like!" - Time Bandits
Or... you can go with real time hardware encoding, like a Matrox RT2000 or Digisuite card. The encoding is done on the video card so the processor really just has to handle the command stream. One of these types of cards is somewhere in the $1200 range, but that's still less money than a higher range Xeon.
--- "So THAT's what an invisible barrier looks like!" - Time Bandits
Re:Two Words For Ya! (Score:3)
by garver on 10:16 AM April 18th, 2000 EST (#36)
As always, when looking at cache, you compare bang for buck. Adding cache costs money, lots of money sometimes. Some processor architectures get more mileage out of added cache than others.
For example, the G4 seems to love cache and screams faster and faster as you add it. Apple/Motorola have found the 1MB cache level to be their sweetspot, most bang for buck. On the other hand, the PIII is not as cache loving. Giving it another 0.75MB doesn't do it all that much good, so why waste the money? Their sweetspot seems to be 0.25MB.
Then why dump so much more onto their high end systems if the performance increase is negligible?
Just fleecing their corporate customers?
I still feel that 1MB of cache would do a bit towards increasing the performance of x86 processor-based machines.
Kintanon
Check out JoshJitsu.info for Brazilian Ji
Not impressed.
Yippy skip, for 6K$ extra I can drop another 1.75 megs of fullspeed cache on a processor. Gee, big surprise, that increases the performance, whoulda thunkit?!
What I want is for the X86 processor makers to catch up with Motorola and put 1m of full speed cache on their regular processors. I have a hard time finding a processor with 512K of cache, WTF is the problem here?
This is just Intel slapping the market around to try and increase their profits without actually giving us anything new.
Kintanon
Check out JoshJitsu.info for Brazilian Ji
Way to read the article you're commenting on.
"The Iwill DCA-200 motherboard is a pinnacle of stability and performance, but doesn't come cheap. The same applies to the 256 MB of ECC PC800 RDRAM; it is fast, very fast even, we've never seen memory scores this high, but will set you back considerably."
AND
"RDRAM finally showed some of its muscle here, with the highest memory throughput we've ever seen on any memory architecture. The dual RDRAM channels on the i840 chipset really show off its benefits and low latency."
Say it with the group "low latency"
Oh, and about Tom - when you have cancer, do you go to a systems engineer? Then why do you go to an MD for your tech information? Pabst has had some good info over the past several years, but he's also had some very questionable conclusions, and he has been getting more, shall we say, touchy, since the video benchmark fiasco.
THE YEAR WAS 2081, and everybody was finally equal...
I'll agree that it isn't the most useful set of benchmarks, but I disagree with your server only comment.
The Xeon has been marketed as a Workstation/Server chip, and has seen its way into the SGI NT workstations, etc. With 256KB RAM, this is clearly intended for the Workstation Market. The wider cache bus and the new motherboard are nice additions for the workstation market, but I think that the server market would give up some MHz for the larger caches of the "old" Xeons or get the "new" Xeon in the 512KB, 1M, or 2M (if available?) versions. I mean, 256KB of L2 cache is going to be useless in a large database server, as you'll never make a cache hit, while a larger cache is useful if most of the accesses are within a general range.
However, I agree that this was mostly a stupid review. Testing it against obviously inferior hardware wasn't interesting. I mean, testing it against dual 800 MHz P3s or 1GHz P3s would give an understanding as to what the new cache system does. Testing it against processors from the same family at 2/3s the speed and shouting, wow, it's fast, is kinda silly.
Alex
You are right, 256 MB is a little weak. My personal computer has 384 MB of RAM... The motherboard they used for the test was an Iwill DCA200. This board will support up to 2 GB of RAM. I think the reason that they only used 256 MB was because that much RDRAM memory runs about $1,100. Penguin Computing has an 8-way Xeon system that will support up to 32 GB of ECC SDRAM memory. I am sure there are other x86 based machines like this, but I don't know of any offhand.
.18u my rosy red arse! For those of you not in the know, the .18u measure is the smallest feature measure, or the Lambda of the chip. Every other dimension on the chip is a multiple of that number. It is the distance across the gate of a transistor from source to drain. Now, when they bake the chips, that distance shortens by a few microns. Unfortunately, the marketing dept. got wind of this and took off with it. Now, they measure the shortest distance from source to drain right near the gate, because the further from the gate the measurement is taken, the wider the gap is. (Sort of a curve...) So in reality, those .18u chips are actually .20u or .21u. It doesn't sound like much, but when you're talking about millions and millions of transistors, that's a lot of space. (But probably still no more than the head of a pin.)
"I threw up my hands in disgust and wondered if it had been such a good idea to have eaten my hands in the first place."
Yeah, no one would be able to beat you to first post.
My other
--I was going to go for the quad setup but I found that two asbestos leg protectors was cost prohibitive.
--these are the same as Celerons right?
This
They just want to give an idea of raw processor performance. What you claim (and I agree with you on the fact an Oracle benchmark would be much more significant to most of us) is a benchmark measuring the overall system performance and no longer just the CPU performance. So, it may not be possible to claim significant performance improvements from such a benchmark, since the result will not depend on sole CPU performance, but rather on the complete disk subsystem performance, memory performance, database tuning, etc.
Bottom line: You are always on your own when the time comes to figure out performance in real life situations.
who gives a shit about Dhry stone and Whet stone? i want the Q3Arena benchmarks. mp3s prOn and Q3 are the only thing i use a computer for...
That said, there is a great site that compares the servers and databases you mention, and will likely give you the stats you are looking for. It's www.tpc.org.
No, Thursday's out. How about never - is never good for you?
Anyone know how much RAM you can put into one of these? They tested the system with 256 MB, which is a spit in the ocean for high-end systems nowadays (well, I might be exaggerating a bit...). I think it might be possible to use more than 4GB physical mem by some page table magic, but the per-process limit might be restricted to 4GB... Wait, maybe not - anyone remember LIM EMS? (*) Although, that is very ugly indeed.
As I see it, this is what they have to solve, and solve it pretty quick, if they want to continue selling 32-bit processors. Today, there are lots of people running their programs on supercomputers, only because of the large memory, not because they need the processing power. It would be possible to save millions if the high-end PC class desktop systems could be fitted with, say, 24GB mem.
But the built-in EEPROM was cool, I wonder if you can trick it into using that for booting, a la Sun's OpenBoot prom...? One can always dream.
(*) To those who are too young to remember, EMS was an ugly hack by Lotus, Intel and Microsoft to be able to use more than 1 MB of memory on the 8086 / 80286.
From the article:
Has anyone considered that this could be used to store virii? It'd be a pain - but if manufacturers can use it to keep info about usage data, no doubt it's re-writable.
Just a tiny thought.
Whatever you do... don't read this.
Rumor has it that dual Athlon boards will be coming out within the next few months.
There is a hitch though: Linux will not support them! That's right, Linux will not support SMP Athlons today. FreeBSD will not either. The good news is at least NT will not have support.
I think you see the problem: nobody will support them, so they won't sell, so nobody will add support, so . . .
I hope that someone overcomes those problems (and I think the board manufacturers are working on NT support before they release something). When it works, the Athlons will do much better SMP than any Intel offering. Seems that AMD, Cyrix (do they make processors anymore?) and the like got mad at Intel's SMP scheme and created a better one. The K5 and Cyrix chips supported it, but nobody made a board to support it. I don't know if Athlon uses the same older spec or a new (Alpha compatible?) spec, but I do know the Athlons all support an SMP standard better than Intel's.
I suspect that a Linux implementation of Athlon SMP will happen when boards are available. AFAIK AMD is not hiding the specs.
So while the test may have been somewhat entertaining it is completely useless. The benchmark isn't anything I recognise as an accurate simulation of a server environment and there are no real life tests. Show me a test comparing this to a Sun box running Oracle and 500GB of data and I might be interested.
Is it just me, or does Intel's new "use one die" for everything seem to have gotten them into a little trouble? I read the article, look for how exactly the new Xeon is different from a Coppermine PIII. Isn't the whole point of a Xeon the large full speed L2 cache? With the PIII having a 256K full speed cache, isn't a 256K Xeon, well, redundant? I do hope there are 2 meg integrated Xeons coming soon, because otherwise, you pay more for almost exactly the same processor.
A deep unwavering belief is a sure sign you're missing something...
Tom's Hardware Guide just had an article which convinced me to stick with SDRAM for quite some time to come. Maybe for highly memory intensive long processes RDRAM is worth it, but how many of us will find that worthwhile?
The power of accurate observation is commonly called cynicism by those who have not got it. - G.B. Shaw