Intel Pushes Pentium 4 Past 3 GHz
denisbergeron writes "Yahoo has the news about the new P4, which will run at no less than 3.06 GHz. But the big advance will be the hyperthreading technology (already present in Xeon) that allows multiple software threads to run more efficiently on a single processor."
Yahoo has the news about the new P4 who will run at nothing less than 3.06mhz.
umm... I've got an XT clone that's faster than that... wanna buy it for about $600?
(/sarcasm)
Yeah, but what's its top speed?
Hmm, this raises a point for me. Of course they claim it is faster, but when exactly?
...
I mean, is it faster when doing stack swaps or when using the TSS to multitask? *BSD uses the TSS to multitask, taking advantage of the i386's way of quickly swapping registers and stack. Windows doesn't do this.
So, from a purely technical point of view, how does it work? Did they just make TSS switches faster? Some OSes benefit highly from that, but others, well, don't.
Unreal Tournament 2003 just kicked my 1.0 GHz machine in the nuts and then made fun of me. If for no other reason, I'm glad to see this announcement, because I can expect a price drop on the 2.6 GHz and 2.8 GHz chips.
Hyperthreading works well for certain types of software, and badly for others.
Here's an article from Ars Technica on HT/SMT.
I think you oversimplify today's word & document processing.
:)
For example, we use Microsoft Word with built-in Excel spreadsheets and ODBC queries that update charts in real time from an Oracle database, as well as Visio stencils and other good stuff. This is a 40+ meg file in raw format, and a lowly 1.5 GHz with 512 megs of RAM takes time to redraw it. We saw a huge performance increase going from 1.5 GHz to 2.4 GHz. Maybe "hyperthreading" will help out even more.
BTW, it is about the same performance under Linux using StarOffice or Corel Office. KDE Office is even slower, so I know it's not just the tools.
For people who *WORK* using their PC, you can never have "too much" power. It's like race cars: maximizing performance for the job at hand.
I was torn between building another dual-CPU box (currently on twin 533 MHz Celerons with an ABit BP6 board) and going the small form-factor route. Now I can do both.
More at Shuttle's site.
Cheers,
Ian
Hyperthreading is a complex proof of the limitations of today's CPU architectures. I believe in a CPU architecture containing many small CPU cores on one chip, instead of just multiplying the issue and commit parts and sharing the execution units.
It would be more scalable and easier to implement to use several complete CPUs. The biggest drawback (compared to hyperthreading) would of course be that in special situations some CPU cores would be idle, but this simply corresponds to pipeline bubbles in the hyperthreaded case. This is easily compensated for by two facts: 1) multiple CPUs can be made very scalable and 2) most computer systems today always run multiple threads (i.e. utilization will be good).
Of course, for Intel to maintain their market lead, everything has to be compatible, so they'll have to pay, time after time, for the errors they made in the eighties (the 286 segmentation + the CISC ISA). By breaking Amdahl's law time after time (SSE, MMX, etc.) they have made an even more complex beast. The only area where they really excel is the production process. They can squeeze out high frequencies and pack the transistors tight. For that, I'll give 'em cred. For their CPU ISAs, I'll just laugh...
3.06 milliHz! Wow! That means about ten clocks an hour! With the super deep P4 pipeline (20 deep IIRC), it means it will push some 200 "single clock instructions" in just an hour. But beware of pipeline stalls. They better have a solid branch prediction algorithm.
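For anyone who wants to check the joke's arithmetic, here is a rough sketch. It only converts a clock frequency into clocks per hour and into the time one instruction would need to walk through the pipeline. The 20-stage depth is the "IIRC" figure from the comment above, and one stage per clock is an assumption of this sketch, not a statement about the real chip.

```python
# Back-of-the-envelope check of the "3.06 milliHz" joke.
# PIPELINE_STAGES = 20 is the "20 deep IIRC" figure from the comment,
# and "one stage per clock" is an assumption made for illustration.

SECONDS_PER_HOUR = 3600.0
PIPELINE_STAGES = 20


def clocks_per_hour(freq_hz: float) -> float:
    """Number of clock ticks in one hour at the given frequency."""
    return freq_hz * SECONDS_PER_HOUR


def pipeline_latency_seconds(freq_hz: float, stages: int = PIPELINE_STAGES) -> float:
    """Time for one instruction to pass through every stage, one stage per clock."""
    return stages / freq_hz


if __name__ == "__main__":
    for label, freq in (("3.06 mHz", 3.06e-3), ("3.06 GHz", 3.06e9)):
        print(label,
              clocks_per_hour(freq), "clocks/hour,",
              pipeline_latency_seconds(freq), "s to traverse the pipeline")
```

At 3.06 mHz that works out to roughly eleven clocks an hour, and a single instruction would spend well over an hour just getting through the pipeline.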
The speed is for Clippy, not YOU... he now is 3D ray-traced and has more artificial intelligence built in!
If it wasn't for the idea of WYSIWYG and fonts, I'd still be doing my word processing on AppleWorks for the Apple ][.
Never hit your grandmother with a shovel, for it leaves a bad impression on her mind...
Securing the physical pathways that transport data on a computer's motherboard. This will sure help me against those tiny little hackers inside my computer stealing my data!
Oh wait, you mean this is to protect the data against me? Looks like we have about a year before this is built into the PC architecture. Plan your computer buying wisely.
Bastards.
It's not wasting time, I'm educating myself.
Maybe you don't want 3.06 GHz for what you're working on, but our "Enterprise Class Systems" (Win2k application servers) can use all the CPU we can throw at them. Everyone has different needs, and for a lot of folks, faster processors are a good thing.
Are they actually CPU bound, or are they slowed by memory access and bus bandwidth? Apart from certain numerical computations, I have rarely seen cases in which the CPU is really fully occupied, altho' the tools often report that it is. For example, tools will report when the CPU is idle waiting for a page fault to the swapfile, but not when it's waiting for data to get to or from main memory; in that case it just looks like the CPU is occupied.
Knowing what I know of Citrix, it alone is far bigger than the L2, and that's before even considering the user applications. It requires the CPU to switch context heavily, and constantly flush and reload its L1/2/3 caches. After all, if you need 4G of RAM to run the applications you are using, and you have say an 8M cache, the CPU is going to be spending a lot of time managing its cache rather than doing useful work. Given that, it is bound by memory access, not raw CPU.
Manufacturers, driven by consumer marketing which believes that higher MHz == better product, are optimizing in the wrong areas. If they want to talk numbers, they should be pushing fast memory and buses, which are actually a useful measure of a machine's performance, not CPU MHz, which isn't.
The advantage of Linux (and to a lesser extent W2K) and the low-end Solaris and AIX servers is that for the first time it was sensible to scale horizontally, so rather than have 1 box that did everything à la a mainframe you'd have 10 that shared the work, then you'd add 5 more. And because the real bottlenecks now are disk and other IO issues, you start using things like EMC, cached RAID disks and lots of other very expensive storage.
But if you are scaling an application horizontally, the last thing that matters these days is processor speed. Sure, the heavy-duty maths is still sitting on a mainframe and your ERP is still on an AS400, but that is more about reliability than power. Intel boxes fail, period, so having one box isn't a smart move; having 10 is a more sensible approach.
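The "have 10 boxes, not one" argument can be put in rough numbers. This is a minimal sketch, assuming every box fails independently with the same probability and that the service survives as long as any one box is up. The failure probability passed in is a made-up figure for illustration, not a measured one.

```python
def prob_service_up(box_failure_prob: float, boxes: int) -> float:
    """Probability that at least one of `boxes` machines is still running.

    Assumes (for illustration only) that failures are independent and that
    every box has the same failure probability.
    """
    if not 0.0 <= box_failure_prob <= 1.0:
        raise ValueError("box_failure_prob must be between 0 and 1")
    if boxes < 1:
        raise ValueError("need at least one box")
    # The service is down only if every single box is down.
    return 1.0 - box_failure_prob ** boxes


if __name__ == "__main__":
    hypothetical_failure_prob = 0.1  # invented number, not a real failure rate
    for n in (1, 2, 10):
        print(n, "boxes:", prob_service_up(hypothetical_failure_prob, n))
```

Real failures are rarely fully independent (shared power, shared storage, shared bugs), so treat this as an upper bound on what adding boxes buys you.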
Dual NIC, external disk via fibre channel. That is where I'll spend the cash. The processor just needs to be fast enough, and I'd like there to be at least two in the box. 2 Boxes doing everything, federated systems.
If you lob everything on one box, then yes you need all the processor speed you can handle, you also need to think about what happens when the box fails.
If Intel announced that this new processor could degrade its performance when issues arose then I'd be interested. Overheating? Turn off hyperthreading and drop the clock speed. Still got issues? Move down to minimum speed and start a shutdown process.
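The degrade-then-shut-down behaviour wished for above is basically a three-step state machine. A minimal sketch follows; the mode names and the `next_mode` function are hypothetical, invented here to illustrate the idea, and do not describe any actual Intel feature or API.

```python
from enum import Enum


class Mode(Enum):
    """Hypothetical operating modes, ordered from fastest to safest."""
    FULL = "full clock, hyperthreading on"
    REDUCED = "reduced clock, hyperthreading off"
    MINIMUM = "minimum clock, shutdown in progress"


def next_mode(current: Mode, still_overheating: bool) -> Mode:
    """Step one level down each time the problem persists.

    This sketch only degrades; recovering to a faster mode once the
    chip cools down is deliberately left out.
    """
    if not still_overheating:
        return current
    if current is Mode.FULL:
        return Mode.REDUCED
    return Mode.MINIMUM


if __name__ == "__main__":
    mode = Mode.FULL
    for reading in (True, True, True):  # three overheated readings in a row
        mode = next_mode(mode, reading)
        print(mode.name, "->", mode.value)
```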
I like servers that will run for 5-10 years with no down time. But with Intel/AMD boxen I'll stick with lobbing in lots on the basis that they'll fail.
An Eye for an Eye will make the whole world blind - Gandhi
Why not spend more R&D money in increasing the speed of the bus? It would give us way better performance.
Whoopie. Another EE student who has realized that the paper design of the PC architecture sucks wind and can't imagine that it works at all.
Don't worry folks. In a few years he'll graduate and get some real world experience. And then he'll probably realize that while the PC architecture does indeed suck on paper, in reality it's not all that bad. Could it be better? Sure. Should we throw the baby out with the bathwater? No way.
Compare the PC market to the rest of the computer market. Who's made more progress? Who has been rapidly pushing the niche markets into smaller and smaller niches as their "superior designs" find them running slower and more costly than the evil, horribly misdesigned PCs?
Coprocessors? Yeah... have you even bothered to look at a modern video card recently? The damn things are more complex and more powerful than the CPU. Modern audio boards are also powerful all by themselves. For the most part I/O is handled by separate chips as well.
The bus and memory interfaces on PCs could use some work. And that's happening, with 3GIO, PCI-X, and other buses being implemented in the next few years. There's some truly horrid cruft in the core too - the IRQs, DMA channels, etc. are still pretty godawful, but not nearly as godawful as they were back with the ISA bus. The issues haven't so much gone away as they've been hidden, but the performance limitations imposed really aren't all that absurd.
Design a better machine? Go for it. It'll die just like all the rest because while you may have a better electrical design, you've ignored the real world and the fact that people want to be able to make slow transitions from one architecture to another. Doing an all-at-once transition is not an option unless you control the entire market - which no PC manufacturer does (unlike Apple). Of course, the flip side of this is that the competition causes the current implementation to advance far more rapidly than would be otherwise possible. Which is why you can buy a $2000 PC that outperforms a $200,000 server.
Spreadsheets were the killer app that caused the PC to take off, and Lotus 123 came with a super-annoying floppy-based copy protection scheme. They intentionally misformatted the floppy, then the program verified that it was an original by doing low-level tricks with the floppy controller.
The most ridiculous and shortsighted part was that they used CPU-based timing loops to do the timing for their stupid floppy tricks. Of course, these were calibrated to the only CPU speed available at the time, 4.77MHz. As a consequence, if a PC was going to run Lotus 123, it needed to be able to slow down to the original 4.77MHz speed while it read the Lotus floppy. IIRC, Compaq had a nifty patent that automatically slowed the PC whenever the floppy controller was in use. Others had to make do with a manual switch.
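Why a CPU-based timing loop breaks on a faster machine is easy to show with arithmetic. The sketch below is not Lotus's actual code. The iteration count and the cycles-per-iteration figure are made-up illustrative values, and it also assumes a faster CPU needs the same number of cycles per iteration, which flatters the faster CPU's timing accuracy if anything.

```python
ORIGINAL_PC_CLOCK_HZ = 4.77e6  # the only CPU speed the loops were calibrated for


def delay_loop_seconds(iterations: int, cycles_per_iteration: int, clock_hz: float) -> float:
    """Wall-clock time a fixed-count busy loop takes at a given clock speed."""
    return iterations * cycles_per_iteration / clock_hz


if __name__ == "__main__":
    # Hypothetical loop: 1000 iterations at 17 cycles each (illustrative numbers).
    iterations, cycles = 1000, 17
    intended = delay_loop_seconds(iterations, cycles, ORIGINAL_PC_CLOCK_HZ)
    for clock_hz in (4.77e6, 8e6, 12e6):
        actual = delay_loop_seconds(iterations, cycles, clock_hz)
        print(f"{clock_hz / 1e6:.2f} MHz: delay is {actual / intended:.2f}x the intended time")
```

The loop count is baked into the program, so the delay shrinks in proportion to the clock speed, and the floppy timing it was supposed to produce no longer matches the disk. Hence the slow-down switch.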
The cost to society for this DRM fiasco, hundreds of millions of useless bezel switches, undoubtedly was far greater than any revenue that Lotus made by thwarting piracy. (In fact, their revenue from DRM might be negative, because they were eventually displaced by non-copy-protected competitors.)
Sorry, but your post reeks of "armchair CPU designer": it's all so clear and so obvious. I mean, it's not like Intel and AMD have a lot of extremely clever people who seek the best balance between all of the systems... is it?
Yes, they are slowly improving, but modern PCs are still behind where workstations were years ago, and a modern Intel based server is well behind a SPARC based machine.
Intel and AMD will spend their money on whatever generates the most ROI. They have collectively spent literally billions of dollars convincing Joe Public that CPU MHz is the best way to measure the speed of a system - they aren't going to throw that away. A competent manager with R&D dollars to spend will therefore spend them on increasing MHz.
Oh, and your post reeks of being underexposed to any architecture other than x86.
though the cost/benefit is out of whack. A P4 2.4 GHz with 2 MB of L2 would get trounced by a 2.6 GHz with 512 KB of L2 cache, disputing your claims that CPU speed doesn't matter. Large cache chips only make sense if you can't get a faster CPU:
Yes, assuming the code to run is 512k in size. If the code is ~2M, so it fits into L2 on the slower processor, then it will have the advantage, because the faster one will have to waste cycles moving the cache back and forth to main memory. Cache size is related to CPU speed only in terms of memory bandwidth: if your CPU cannot get data from main memory fast enough to keep it occupied, then you need faster memory closer to the CPU, which is what a cache is. If you are context switching, then you will have to keep dumping the cache and reloading it, which puts larger caches at a disadvantage.
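The trade-off can be sketched with the usual textbook estimate of average memory access time: hit time plus miss rate times the penalty of going to main memory. Every number below (hit cycles, miss rates, memory latency) is a hypothetical placeholder chosen to illustrate the argument, not a measurement of either chip.

```python
def avg_access_ns(clock_ghz: float, hit_cycles: int,
                  miss_rate: float, memory_latency_ns: float) -> float:
    """Average memory access time = hit time + miss rate * miss penalty."""
    if not 0.0 <= miss_rate <= 1.0:
        raise ValueError("miss_rate must be between 0 and 1")
    hit_time_ns = hit_cycles / clock_ghz  # cycles divided by GHz gives nanoseconds
    return hit_time_ns + miss_rate * memory_latency_ns


if __name__ == "__main__":
    memory_latency_ns = 100.0  # hypothetical main-memory latency

    # Working set fits in the big cache: few misses (hypothetical 1%).
    slower_big_cache = avg_access_ns(2.4, 10, 0.01, memory_latency_ns)
    # Working set overflows the small cache: many misses (hypothetical 10%).
    faster_small_cache = avg_access_ns(2.6, 10, 0.10, memory_latency_ns)

    print("2.4 GHz, large cache:", slower_big_cache, "ns per access")
    print("2.6 GHz, small cache:", faster_small_cache, "ns per access")
```

With these made-up inputs the slower clock wins, because the miss penalty is paid in memory time and the extra MHz does nothing to shrink it. Flip the miss rates (code that fits in both caches) and the faster clock wins instead, which is the other poster's point.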
Ultimately, caches are a hack; an elastoplast solution to the fundamental problem, which is the mismatch between the rate at which a modern CPU can process data, and the rate at which memory can supply it. In an ideal system, there would be no CPU caches at all, because the CPU could get data from main memory fast enough to keep it fully occupied. Systems used to be built like this, before the current obsession with clock speeds.
I know that there are some of you on here that will flame me saying that you DO use that power. And that's fine, you are the 1% of the population I mentioned earlier.
Everyone, of course, believes they're in that 1%.
I used to do commercial 3D video game development on a 450MHz P2. It was a bit slow when compiling, but acceptable otherwise. Then I upgraded to an 866MHz P3 and, even years later, it still feels like lightning. Compiles are quick. Everything is snappy. I've taken to writing tools in Perl and Lisp and Python, and they're snappy as well. I mean, geez, who would have thought ten years ago that you'd ever be able to write 3D geometry manipulation tools in Lisp and have no worries about performance?
Now, of course, you can buy a 2.5GHz P4 in an $800 PC. This is beyond ridiculous. Everything is three times faster than "beyond the point of caring"? I'm going to put C++ aside for almost everything, and just use whatever is the most abstract. Haskell? Yes, please.
Am I in the 1%? Certainly not.
It may help the economy in the short term, but you will just be wasting precious electricity (in this case gobs of it) just to say you have the latest and greatest. It's becoming a disease!
This bothers me, too. Yeah, people don't need all this performance, and that's okay. Who cares if your computer is too fast? But unfortunately you don't get all this performance for free. It's coming at the premature obsolescence of hardware and greatly increased power consumption. Hard drives and monitors are actually improving in this regard, especially with LCD monitors (awesome!). But now we have 70 watt processors and PCs that ship with five or more fans in them, and we're talking bottom end machines from Dell and Gateway here, not crazy high-end monsters. This is bad.