Slashdot Mirror


User: sampo

sampo's activity in the archive.

Stories: 0
Comments: 7

Comments · 7

  1. Nothing goes with a Catalyst 6513 quite like http://www.monstercable.com/productdisplay.asp?pin=1727

  2. Re:To quote somebody more intelligent than me... on Apple's Aperture Reviewed · · Score: 1

    This is patently false. Aperture will export your raw files non-destructively if you use the Export Masters command instead of Export Versions.

  3. Maybe they used an Intel Centrino laptop on HP Introduces New Technology to Save Mobile Battery Life · · Score: 2, Informative

    While HP was busy researching this, Intel actually implemented a wide array of display power-saving measures in its Centrino chipset, including Intel DPST, which can adjust the backlight on the fly depending on the screen content to save power.

    http://developer.intel.com/technology/itj/2005/volume09issue01/art05_perf_power/p04_gmch.htm
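
    The general idea of dimming the backlight based on screen content can be sketched roughly as follows. This is a toy model only, not Intel's actual DPST algorithm: the function name `adapt_backlight`, the 8-bit luma buffer, and the "perceived brightness is pixel value times backlight level" assumption are all hypothetical and chosen for illustration.

    ```c
    #include <stddef.h>
    #include <stdint.h>

    /*
     * Toy content-adaptive backlight (hypothetical, NOT Intel's DPST).
     * Assumes perceived brightness ~ pixel * backlight / 255.
     * Finds the brightest pixel, dims the backlight to that level, and
     * scales every pixel up so the product stays roughly the same.
     * Returns the new backlight level (0..255); pixels are rescaled in place.
     */
    uint8_t adapt_backlight(uint8_t *luma, size_t n)
    {
        uint8_t peak = 0;
        for (size_t i = 0; i < n; i++) {
            if (luma[i] > peak) {
                peak = luma[i];
            }
        }
        if (peak == 0) {
            return 0; /* all-black frame: nothing to light in this model */
        }
        for (size_t i = 0; i < n; i++) {
            /* Rounded integer scaling; the peak pixel maps to exactly 255. */
            luma[i] = (uint8_t)((luma[i] * 255u + peak / 2u) / peak);
        }
        return peak;
    }
    ```

    A dark frame whose brightest pixel is 128 would run the backlight at about half level in this model, with the pixel values stretched to compensate.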

  4. P4 performance thoughts on C`t Throws Athlons And P4s In The Gladiator Pit · · Score: 2

    I figured I'd get my say in about why the P4 currently performs the way it does. All of the arguments I'll make here are based on data from Intel's various spec sheets and documentation on the Pentium 4.

    1. Why doesn't the P4 blow away the P3 on normal integer code, despite having double clocked ALUs?
    Two reasons. First is the issue that everyone keeps bringing up: mispredicted branches causing part of the pipeline to be flushed. At 20+ stages, there's a fairly large penalty for this. I don't think this is as big of a problem as most people have been led to believe. The branch predictor is 33% better than the one on the P3 according to Intel, which helps a lot with this. With the use of a P4-aware compiler, branches are also laid out in such a way as to help out the static predictor (in the case the branch is not in the branch history table, or is unpredictable), and the P4 can use branch hints emitted by the compiler or assembly writer. So at least with newly compiled programs, branching shouldn't be a huge issue. With older programs, the improved branch predictor should help a lot.

    The real problem with the P4's integer performance (and why it performs dismally on RC5) is that shifts, rotates, and multiplies all have increased latencies compared to the P3 (although the throughput remains either the same or very close per clock). So code that expects to get the result of a shift or rotate back very soon is not gonna be happy when the P4 takes 4 cycles to do so. Small shifts can be replaced with a series of adds, which have both higher throughput and 8x lower latency than shifts on the P4, but the compiler has to know about this in order to optimize it.
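
    The shift-to-add replacement mentioned above can be sketched like this. The helper names are made up for illustration, and a real compiler targeting the P4 would make this choice itself; the point is only that a small left shift is a chain of doublings. Going by the latency figures quoted above (4 cycles per shift, adds 8x lower), even three dependent adds would finish sooner than one shift.

    ```c
    #include <stdint.h>

    /* Hypothetical helpers: small left shifts written as chains of adds. */

    /* x << 1: one add (x doubled). */
    static inline uint32_t shl1_add(uint32_t x)
    {
        return x + x;
    }

    /* x << 2: two dependent adds. */
    static inline uint32_t shl2_add(uint32_t x)
    {
        uint32_t t = x + x; /* x * 2 */
        return t + t;       /* x * 4 */
    }

    /* x << 3: three dependent adds. */
    static inline uint32_t shl3_add(uint32_t x)
    {
        uint32_t t = x + x; /* x * 2 */
        t = t + t;          /* x * 4 */
        return t + t;       /* x * 8 */
    }
    ```

    Unsigned addition wraps modulo 2^32, so these give the same result as the shift even when high bits fall off the top.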

    2. Why does the FPU perform so poorly in some things, and great in Quake3?
    My answer here would be that the FPU in the P4 is almost identical to the one in the P3 in terms of throughput, but the operations have a longer latency. So again, code that expects P3 or Athlon latency instructions will get a rude awakening. Quake3 (and this is total speculation) probably uses a large unrolled loop of FPU ops that would hide the latency issue, but benefit from the same per-clock throughput as a P3...at 1.5x the clock speed. If you check out JC News (www.jc-news.com/pc) there's someone claiming that Quake3 has no SSE/SSE2 optimizations whatsoever, so the FPU routines are probably just tuned to get as much throughput out of a pipelined FPU as possible. Use of the scalar SSE2 FPU ops on a P4 (by the compiler, or the assembly writer) can decrease latency, while using the SIMD ops can increase FPU throughput.

    Obviously, Intel has decided to make some tradeoffs here. Shifts, rotates, and multiplies are all slower, while add/sub/not/or/xor/and are all faster. The trace cache is perhaps the most interesting thing about the P4, as the x86 instruction set becomes a one-time cost for the core most of the time. It is, in essence, a very simple code morpher (x86 ops -> cached uOps).
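
    The kind of unrolled loop speculated about above for Quake3 might look something like the sketch below. This is not Quake3 code; the function names are invented, and summing an array is just the simplest case that shows the idea. With one accumulator, every add has to wait for the previous one, so FPU latency sets the pace. With four independent accumulators, a pipelined FPU can keep several adds in flight, so throughput matters more than latency.

    ```c
    #include <stddef.h>

    /* One accumulator: each add depends on the one before it. */
    float sum_serial(const float *a, size_t n)
    {
        float s = 0.0f;
        for (size_t i = 0; i < n; i++) {
            s += a[i];
        }
        return s;
    }

    /* Unrolled by four with independent accumulators (illustrative only). */
    float sum_unrolled4(const float *a, size_t n)
    {
        float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f;
        size_t i = 0;

        /* The four adds in each iteration do not depend on each other. */
        for (; i + 4 <= n; i += 4) {
            s0 += a[i];
            s1 += a[i + 1];
            s2 += a[i + 2];
            s3 += a[i + 3];
        }
        /* Leftover elements when n is not a multiple of four. */
        for (; i < n; i++) {
            s0 += a[i];
        }
        return (s0 + s1) + (s2 + s3);
    }
    ```

    Because floating-point addition is not associative, the unrolled version can round slightly differently from the serial one on general input, which is one reason a compiler will not always do this on its own.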

    Speculation time :-) ... I wouldn't be terribly surprised if a future P4 variant started making use of the huge number of transistors available on a 0.13 micron process to do some fairly massive optimization or reordering of code in the trace cache, similar to what Crusoe's code morpher does in software. Intel seems to have finally found a way to remove the "CISC penalty" of IA-32 code. I say good luck to them and AMD both. As long as they compete, we the consumers win.

  5. Re:Whooppee... on Intel Pushes Low-Power Crusoe Challenger · · Score: 1

    Yeah, the G4 is an inferior processor to your Athlon because the file system in a beta operating system can't keep up with NTFS. Wow, that's some solid evidence. Why don't you just say your P2 sucks, because it's running Linux 2.1.50 and your K6 kicks its ass with Linux 2.4.0-test. Perhaps if you actually watched the Jobs demos, you'd realize he was comparing Photoshop performance, not disk performance.

  6. Re:Let's compare this to when the P6 was introduce on Pentium IV Problems? · · Score: 1

    But damn does that 400HP lawnmower mulch well, and if you put it in fifth gear, you can take it out on the freeway. I mean, it's almost as cool as pouring hot grits down your pants.

  7. Re:Forget dual boot, think omniboot... on IBM To Demo Crusoe Thinkpad · · Score: 2

    This is incorrect. They made Macs which had PCs on PCI cards. Hitting command-return would switch between the PC's video display (hooked via a special connector to the Mac's monitor) and the Mac's video display. The keyboard and mouse input would automatically switch over, and you could cut and paste between them. The 486/66 was just as fast as a PC with a 486/66, and the Mac side was a PowerPC 601 at 66MHz as well. They made other PC Compatible versions, up through the 4400/200, I think, which had a P133 on its PC PCI card.