Slashdot Mirror


User: Anderson

Anderson's activity in the archive.

Stories: 0
Comments: 39

Comments · 39

  1. mmmmm, benchmark(et)ing :) on Interview: Debian Project Leader Tells All · · Score: 1

    Okay, I did some experiments on this end with bzip2, a completely integer code. I compared the default Debian compilation (flags -O2 -g, then stripped) against a couple of other compilation options. The test runs were to compress and decompress a 700K-ish file, connecting the two processes via a pipe, a la: bzip2 -c file | bzip2 -cd > file.0. I then compared file to file.0 to make sure that they were the same (and I found no errors in any of these runs). All runs were on my practically-idle P2-233 laptop.

    Flags -O2 -march=pentiumpro: 2.6% slower (a bit unexpected)
    Flags -O6: 1.2% slower (odd, again)
    Flags -O3 -march=pentiumpro -fomit-frame-pointer -funroll-loops -finline-functions -fstrict-aliasing: 3.6% faster

    I can't say these are scientific (although I did run each test four or five times, and took the mean score after throwing out the fastest and slowest), but they might help in gauging general performance increases for machine-specific optimizations. The majority of code is integer code not too unlike bzip2 ... the floating-point-intensive codes like xmms will probably benefit more, although I tried the same experiments with that and found similar results. (Hard to put real numbers on that since xmms is an interactive graphical program.) Ummm, I dunno. Since this is a Debian thread, another thing to note is that slink (Debian stable at the moment) was compiled with a completely different compiler than potato/unstable. So if you compare between the two, you'll probably find that potato (unstable) is a fair bit faster overall because of being compiled with the newer egcs compiler (which has the faster Intel-sponsored x86 code generation backend). As I've said before, for real (original) Pentiums the difference is probably significant, but for newer CPUs the optimizations get lost in the noise of everything else. Or that's how it seems to me.
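
The measurement procedure described in this comment (several runs, drop the fastest and slowest, average the rest, verify the round trip) can be sketched roughly as follows. This is a sketch under assumptions, not the commenter's actual script: it uses Python's in-process `bz2` module instead of the bzip2 command-line pipeline, the function names are hypothetical, and "percent faster" is assumed to mean the reduction in elapsed time relative to the baseline.

```python
import bz2
import time


def trimmed_mean(samples):
    """Mean after dropping the single fastest and single slowest run."""
    if len(samples) < 3:
        raise ValueError("need at least three runs to trim both ends")
    kept = sorted(samples)[1:-1]
    return sum(kept) / len(kept)


def percent_faster(baseline, candidate):
    """Positive when candidate takes less time than baseline (assumed definition)."""
    return (baseline - candidate) / baseline * 100.0


def time_round_trip(data, runs=5):
    """Time compress-then-decompress of data, checking the output on every run."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        restored = bz2.decompress(bz2.compress(data))
        timings.append(time.perf_counter() - start)
        if restored != data:
            raise RuntimeError("round trip changed the data")
    return trimmed_mean(timings)
```

To compare two builds you would collect one trimmed mean per build and feed both to `percent_faster`; a negative result corresponds to the "slower" rows in the list above.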

  2. Re:pentium optimization on Interview: Debian Project Leader Tells All · · Score: 1

    Yeah -- for more precise timing measures, you might try running xmms in a non-interactive mode (is that possible?) and using the "time" command to get better measures of CPU cycles used. For floating point (and I forgot to mention this, duh) the P6 processor series can *really* benefit from architecture-specific scheduling. It has to do with the floating-point pipelines having a shared multiplier, and good scheduling can really boost the floating-point performance of the P6-series CPUs. Silly me for forgetting that. For the context of this thread, you might try seeing if "-O2 -march=pentium" gives you much of a performance boost -- after all, I don't think there are any "Pentium Pro/II/III/Celeron Optimized" distributions out there, regardless of how much more efficient xmms is. :) Another (IMHO better) test would be to recompile something like the Python or Perl interpreter with these optimizations, and see what sort of speed benefit you get. gzip is also a good candidate for architecture-specific optimizations ... but in any case, test this on something that isn't so floating-point heavy, since you're right that the Celeron (being a P6-core processor) really likes good floating-point scheduling. That doesn't change my assertion that a "Pentium Optimized" distribution gains most of its speed from better non-architectural compiler optimization.
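
For the "time" suggestion, a scripted equivalent might look like the sketch below. It is only an illustration: `child_cpu_seconds` is a hypothetical helper, and it relies on `os.times()` reporting child CPU time, which it does on Unix-like systems (on Windows the child fields are always zero).

```python
import os
import subprocess
import sys


def child_cpu_seconds(argv):
    """Run argv to completion and return the user+system CPU seconds it consumed."""
    before = os.times()
    subprocess.run(argv, check=True, stdout=subprocess.DEVNULL)
    after = os.times()
    user = after.children_user - before.children_user
    system = after.children_system - before.children_system
    return user + system


if __name__ == "__main__":
    # Stand-in workload; substitute the command you actually want to measure.
    used = child_cpu_seconds([sys.executable, "-c", "sum(range(10**6))"])
    print(f"CPU seconds: {used:.3f}")
```

Measuring CPU time rather than wall-clock time keeps an otherwise busy machine from skewing the comparison, which is the point of reaching for "time" in the first place.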

  3. Re:pentium optimization on Interview: Debian Project Leader Tells All · · Score: 1

    The optimal instruction selection is, as expected, different for the different CPUs. I was lumping that into "good scheduling" rather imprecisely. There are additional instructions defined by the Pentium and P6-series processors -- for the Pentium MMX processors, this is mainly the MMX instructions. The problem is that MMX is only applicable to algorithms that do significant amounts of vector-like integer work, and is furthermore a little difficult to code in an automated manner (e.g. by a compiler). The Pentium Pro (Cyrix 6x86 and Athlon also have this, I believe) adds some conditional move (CMOV) instructions, which can be used to fold control dependencies (branches) out of the way. In a deeply pipelined processor like the PPro or Athlon it's a big win. I'm not sure of non-MMX instructions added by the Pentium, but there are probably a few. There are of course the 3DNow! instructions from AMD (and iSSE from Intel) for SIMD floating point that accelerate things much like MMX -- again, no one has put compiler support for those out in the world at large yet, to my knowledge.
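
The idea of folding a branch into straight-line data flow can be sketched in a few lines. This is only an analogy in Python, not what a compiler emits: `select_min` is a hypothetical helper that picks the smaller of two integers with a mask instead of an if/else, which is roughly the kind of rewrite a CMOV-style instruction makes possible.

```python
def select_min_branchy(a, b):
    """Reference version: the choice is a control dependency (a branch)."""
    if a < b:
        return a
    return b


def select_min(a, b):
    """Branch-free version: the choice becomes a data dependency.

    -(a < b) is -1 (all one bits) when a < b and 0 otherwise, so the mask
    either keeps or discards the a ^ b difference before it is applied to b.
    """
    mask = -(a < b)
    return b ^ ((a ^ b) & mask)
```

Both functions return the same value for any pair of integers; the second simply has no point at which the flow of control depends on the comparison.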

  4. Re:pentium optimization on Interview: Debian Project Leader Tells All · · Score: 2

    To elaborate on my other post below, while I don't have the numbers to back it up I'd be willing to bet that 95% of your performance difference comes from both the -O6 and the version of gcc you're using, not the -mpentiumpro switch (although that helps a bit). Perhaps Debian should be more aggressive in their use of compiler optimizations (standard Debian flags are -O2 -g, then strip the binaries), but again, I don't think architecture-specific optimizations are the real difference here. Especially not for Celerons, which have robust dynamic scheduling onboard.

  5. Re:pentium optimization on Interview: Debian Project Leader Tells All · · Score: 3

    Okay, here's some perspective on this from a computer architecture standpoint. "Pentium Optimization" is only some specialized scheduling to take into account the weird structural conflicts that arise in the original Pentium (and MMX) chip from Intel. This can give as much as a 30% performance increase (that's the number Intel likes to bandy about, so add salt) on integer code. But this -only- applies to the original Pentium and the Pentium MMX. Any other CPU out there gets no benefit, and some (e.g. 486's, AMD K5, Cyrix 5x86) actually slow down either from the increased memory bandwidth utilization, or because their own internal resource usage requirements conflict with those of the original Pentium.

    Now the fun part -- what about the K6, the Cyrix 6x86, Pentium Pro/II/III/Celeron, and the Athlon chips? Well, you can schedule specifically for them as well, but you won't see as large a performance gain. The reason is that all of these CPUs have much better on-chip dynamic scheduling (out-of-order execution, register renaming, speculative execution, etc.) and thus don't need really good scheduling back-ends to achieve fast performance. This is especially true of the Athlon and Intel P6 core chips -- if you didn't know, the Pentium Pro, Pentium II, Pentium III, and Celeron are all very close architecturally, and are known as the P6 series (and oddly, they have almost -nothing- in common with the original Pentium and Pentium MMX).

    When are these machine-specific optimizations important? Well, for compute-intensive stuff that you execute a lot. So if you want 99% of the benefit of using a compiler that schedules for your CPU, recompile your C libraries, your X server, and any large applications (KDE is a good one). I will say that a lot of the performance difference you probably see between a "Pentium optimized" distribution like Mandrake or Stampede and something like Debian is not really due to scheduling for the Pentium -- it's probably in the makefiles used for compilation, in the choice of compiler code optimizations like -O2 vs. -O3 (the fastest instructions, after all, are the ones you don't execute :), etc. That doesn't mean "Pentium optimization" doesn't help, but it only makes a major difference for the original Pentium and Pentium MMX, AFAIK. The dynamic scheduling hardware in more recent CPUs can in effect "Pentium optimize" (i.e. reschedule) any code they encounter on the fly.

    Also, don't discount the user perception optimization factor. Tiny differences in latency, load speed, and just the knowledge that you're using a "pentium optimized" distribution can make a large perceived difference in the speed of a system, regardless of the actual performance delta.

  6. Debian is not in crisis (historical perspective :) on VA, O'Reilly, and SGI Sponsor Debian in a Box · · Score: 4

    Well, for historical perspective, this is a normal Debian modus operandi. :) Potato has lots of outstanding release-critical bugs because there is a *lot* of distribution there. And typically Debian releases take forever, everyone bitches about that, and everyone including at least a few Debian developers claims that the project will never make it because of the slow release cycle. (Incidentally, some believe that no Debian release can happen without a large hardware failure among the Debian development and mirror machines. This has held true for every release so far.)

    But I wanted to point out that Debian is a free software project like any other, and that means that all the dirty laundry gets aired on public mailing lists. That may make some people uncomfortable, but it's not a sign of impending doom. In fact, it's remarkably similar to what happens every time Debian takes more than about a week to release a new distribution. (Yes, some people were complaining about the release cycle within a week or two after slink was released. Go figure.)

    On a different note, there has been a lot of discussion on the Debian mailing lists about fixing the long release cycle-time, and eventually things will probably change for the better. But with a few hundred vocal, independent-minded developers, there's a fair bit of organizational inertia to overcome.

    Debian as a whole is doing quite well, compared to some past crisis points. In answer to someone else's post, I wouldn't judge the suitability of a distribution based on "its future". Use what works now -- and with Debian you can either have the rock-solid stable distribution, an "unstable" distribution that is often as solid as some commercially-released distributions, or you can selectively pull the sources from the unstable archives to update your stable distribution (this is what I use for machines I'm paid to maintain). Debian includes some nice source management tools to help automate downloading and building updated versions of packages from source. Debian isn't for people who have never used Linux (IMO), but for those who have some experience and want a system they can abuse and that will rarely let them down, Debian's the ticket.

  7. black pots, kettles, and the Florida peninsula on Review: The Celebration Chronicles: Life in Disneyville · · Score: 1

    To cpt kangarooski ... I must say I never thought I'd come to defending Tallahassee on Slashdot. It just never occurred to me that the two would coincide. :) Tallahassee does have culture, and while geographically it might be the armpit of Florida, otherwise it's pretty decent (and getting better in some respects). So thanks for sticking up for the place, although it certainly has its faults. But a lack of culture (however ass-backwards it may be :) is not one of them...

    I find it ironic that the heaviest criticism of Tallahassee and the panhandle (which can be very different things or one and the same, depending on how you look at it/them) came from someone out of Orlando ... or was it Tampa/St. Pete? If the latter, okay, I'll take offense and go on with life. :) Orlando, though ... I lived for a few years in Tallahassee, and now live in Houston, TX. And hey, if you ever want a preview of what Orlando will be like if no one down there figures it out quickly enough, come to Houston. It's this huge city with a disproportionately small personality ... it really is Orlando on a massive scale. Not that I dislike Orlando -- I even like Houston. They just lack personality ... which is something you really can't say about the panhandle. Like it or not, the place definitely has a distinctive presence.

    In my book, even ass-backwards culture outranks no culture at all -- for me, Tallahassee was pretty okay. In fact, I'd say Tallahassee, Jacksonville, and Tampa/St. Pete are some of the best places to live in Florida (but not necessarily in that order), although for the net-savvy crowd that Slashdot is (mostly), the overall level of technical sophistication in Florida is a bit on the low side. Obviously, there are exceptions to this ... but the concentration of good technical companies and good technical people is quite low in Florida, considering that it's right behind New York, California, and Texas in population.

  9. It's not all execution units! on Athlon Reviews · · Score: 1

    If you look at a diagram of the layout, you'll see that a large portion of the transistors and chip area is in the L1 cache, all 128K of it. High speed L1 cache is just area-consuming and difficult to make, but it's an absolute must for scaling and high speeds. Really, the only difference between a decoupled x86 design (like K6, P6, and K7) and a "true RISC" (like there are any of *those*, laugh) design is the extra decoding and retirement logic. Internally, they're all just high-powered RISC-like machines.

    Saying "and it [an Athlon] only just beats a PIII" is really quite wrong -- in some areas, it completely dusts a PIII ... and in fact, in a lot of areas it dusts everything shy of the HP 8500 and the Compaq Alpha 21264. Perhaps you're referring to per-clock performance -- which is irrelevant, since if you can't (or won't) get to higher clocks, it doesn't matter if you can do twice as much at half the clock, at least not in a performance contest. The Athlon doesn't have all those transistors for no particular reason -- it has them because the design team focused on performance, somewhat at the expense of the compactness of the design. History (see Moore's Law) would say that they are correct in this decision ... the process will shrink to accommodate more transistors and lower the price, power consumption, etc. much faster than spending the extra design time. As the Athlon shows (beating the P3 in integer performance per-clock, and almost doubling the P3's double-precision floating-point performance, among other things), the x86 architecture has a fair bit of headroom left ... there are still many design tricks to pursue, as there are in all the other architectures "out there".

  10. Re:Linux on Athlon Reviews · · Score: 1

    Actually, the easiest and quickest (and most pronounced) change would be to add the Athlon MTRR (Memory Type Range Register, or the 'fastvid' thing for some of you ... :) control code into the kernel. The other optimizations can all be done with compiler tweaks (as you suggest). The MTRR change should be easy though -- apparently the Athlon MTRRs are compatible with the P6 versions. And as for me -- I'll be glad to code it whenever someone gives me an Athlon...

  11. Re:XFree4.0 on XFree86 News · · Score: 1

    And not only that, you forgot the large number of people with Matrox G200's. Those are accelerated, too -- they're actually quite fast under Linux (and cheap! :). So yeah -- only a few people with accelerated 3D? Maybe a year or so ago, but not now.

  12. Re:Linux support for this ? on Athlon Benchmarks Out · · Score: 1

    From what I heard, the boards should look like a standard chipset, from the OS programmer's point of view. So, really no adjustment would be necessary. Someone also noted on the linux-smp list that AMD supposedly said that the SMP support would be compatible (again, from a software standpoint) with the Intel SMP spec. So, in that case Linux would require *zero* modification to run on K7 platforms, SMP variants included.

  13. You're thinking of the K6, not the K7. on Athlon Benchmarks Out · · Score: 1

    First of all, as for 3DNow under Linux -- there is absolutely *no* reason you can't use 3DNow under Linux. In fact, the new version of the Mesa OpenGL-ish libraries can use 3DNow if you have it, and people have reported 10-15% speedups in 3D geometry setup (which is what it is designed to accelerate). There's also a patch to mpg123 to use 3DNow to knock down the CPU load playing mp3's, and I think one of the mp3 encoders also has a 3DNow patch. 3DNow is simply some special instructions for using the floating-point registers in a special, funky, parallel way -- there's no operating system support required (unlike SSE, which requires rewritten context switching code). In fact, the K6-3 is probably one of the best available Linux CPUs, because it's cheap, blazes on integer, and the on-chip 256K L2 cache helps it out a lot in multitasking performance.

    As for it being a headache to use AMD -- this is sometimes true. There were some Super7 motherboard compatibility issues (esp. with the TNT cards), and in general video card makers haven't made that big an effort to support AMD, thinking that most of their market was based on Intel processors. This is changing -- and if the K7 is as high-performing as it initially seems, then I would bet most video card makers will optimize for it, if for no other reason than to show off the performance of their high-end 3D accelerators.

    Your last question is about the K7 in 3D games -- if anything, the place the K7 should shine is floating-point performance. The K7 should wipe the floor with P3s in Quake and other 3D applications. I kid you not. And this is just straight, raw floating point power -- no 3DNow or SSE optimizations required (although they could make it go still faster, theoretically). If you want to go really fast come October, you're going to buy a K7-650 and a Matrox G400Max, and that will be faster than anything you have ever seen ... under Windows *and* Linux.

    If your thing is fast floating point, if gaming performance is what drives you, then the K7 is your chip.

  14. Re:*COUGH* BS *COUGH* on Athlon Benchmarks Out · · Score: 2

    As someone has pointed out below, SPEC benchmarks are designed to measure high-end performance, so they may not translate *directly* to Q3 frames-per-second. But they are proportional, to a large extent (at least on the x86 platform). And saying that CPU X beats CPU Y in Unreal FPS is a far cry from CPU X beating CPU Y in SPEC. The rules are a lot tighter for SPEC -- on the SPECbase numbers cited for the K7, there really isn't much room for compiler optimization or any other forms of cheating.

    SPEC is a collection of things like gcc, some heavy-duty simulations, other text processing, etc. designed to measure the integer (no extra "r", AMD!) and floating-point performance of a CPU-motherboard-memory-compiler system (a CS class in architecture will convince you that these are largely inseparable components, as performance goes). So while SPEC is "only a benchmark", it does determine that measurement using things we care about (e.g. gcc).

  15. I think you oversimplified that. on Athlon Benchmarks Out · · Score: 1

    The short answer is: sorta, but not like you're thinking. Anything purely integer based, probably about the same speed as an equivalently-clocked K6. (Remember that the K6 really doesn't get much above 500MHz on the 0.25-micron process, so ... it's hard to compare them clock-for-clock.)

    All the things you mention, though -- playing mp3s and mpeg2 movies -- are floating-point intensive. So, in that case the K7 should completely leave a K6 in the dust -- on the order of 100% (or more) faster at the same clock. (Pipelining is a wonderful thing.) More like "AMD K7 can play a 44100 16 bit mp3 using 5% *of* the cpu time (95% less) that a K6 uses", and the same for the movie. That's a substantial difference, if you ask me. Granted, maybe you don't need that much power -- the K7 is targeted at the high end engineering workstation market and the "enthusiast consumer" (people who live for Quake3, they mean :) ... if the K6 does all you need (and you wrote the above as if you already own one), then don't upgrade.

  16. P3 optimizations == K7 optimizations, in general on Athlon Benchmarks Out · · Score: 1

    You are correct at first when you say that the K7's double-precision floating-point is quite fast. Actually, the K7 on double-precision should be (relative to the P3's performance levels) much faster than in single-precision. This is relatively speaking -- meaning that the K7 should be *much* faster than the P3 on double-precision, and only a good bit faster on single precision (depending on the code mix, see below).

    Also, the K7 is (or should be, depending on your level of optimism) quite a bit faster in single precision, too. The main difference is that the K7 has two true multipliers: the PPro/2/3 only has one multiplier, so to get full utilization of the pipelined floating-point unit, you have to intersperse your FADD and FMUL instructions. This is the "Intel optimization" that makes such a difference for floating point on the P6-series, and seems to hurt the non-Intel CPUs. The K7 should run fast even on code that is Intel-optimized, in addition to being able to handle the cases that cannot be optimized to run quickly on a P6 processor. Furthermore, the K7 has a halfway-pipelined division unit (something that is not pipelined on the P6 series), so that it can do a division and a multiplication sorta in parallel -- the multiply doesn't have to wait for the divide to completely finish, they sorta share the multiply unit. (Maybe the best way of saying it is that the K7 has 3 and a half floating-point pipelines. Or maybe that's an oversimplification.) Last, the K7 has a dedicated load/store/housekeeping pipeline in the floating point unit (2 pipes for add/multiply, 1 for load/store/housekeeping), and that eliminates a lot of the penalties associated with the x86 stack-based floating point architecture. This is one of those hard-to-quantify (in terms of absolute cycles) things that should make the K7 blaze on floating point.
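
Why interspersing FADD and FMUL helps on a unit with a single multiplier can be shown with a toy issue model. Everything here is an assumption made for illustration rather than a description of any real chip: one shared floating-point issue slot per cycle, in-order issue, no data dependencies, and a multiplier that can only accept a new multiply every `mul_interval` cycles (2 by default). The function name `issue_cycles` is hypothetical.

```python
def issue_cycles(stream, mul_interval=2):
    """Cycles needed to issue stream in order through one shared FP issue slot.

    'A' is an add and 'M' is a multiply. One instruction issues per cycle,
    but a multiply must wait until mul_interval cycles after the previous
    multiply was issued.
    """
    cycle = 0
    next_mul_ok = 0
    for op in stream:
        if op == "M":
            cycle = max(cycle, next_mul_ok)   # stall until the multiplier is free
            next_mul_ok = cycle + mul_interval
        elif op != "A":
            raise ValueError(f"unknown op {op!r}")
        cycle += 1                            # this instruction occupies the slot
    return cycle
```

In this model `issue_cycles("MMMMAAAA")` takes 11 cycles while the interleaved `issue_cycles("MAMAMAMA")` takes 8, and setting `mul_interval=1` (a multiplier that never stalls) makes the ordering irrelevant. That is the shape of the argument in the comment, nothing more.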

    As for Intel-specific optimizations, the K7 and P3 are more alike than they are different: the K7 is just bigger (more instructions/clock) and easier to manufacture at higher clocks. (The deep buffering and scheduling depth doesn't hurt, nor does the 128K L1 cache ... :) So really, anything that's optimized to run fast on a P3 will also run fast on a K7, unless you're referring to SSE instructions. In the case of SIMD instructions, a P3+SSE and a K7+3DNow have the same theoretical throughput -- it's always up to the compiler/programmer to get the most out of that.

    There have been a number of different "third-party" (unnamed third parties, that is) confirmations of these numbers. Generally, they all seem to put the K7 and P3 about on par (with the K7 slightly edging ahead) in integer performance, and the K7 dusting the P3 by 40-50% in raw floating-point power. Furthermore, remember that the K7 has a number of design features (deep read/write buffers, a deep scheduler) that take better advantage of high clock rates -- so it should scale to higher clock rates better than a P3. How this all translates to real-world speed and real-world yield levels is yet to be seen, but the initial results would say the K7 will be the king of the x86 hill for the next year or so.

  17. Re:Which is faster? on AMD Athlon (K7) Ships · · Score: 1

    The simple answer: K7 (Athlon). Far and away.

    The real figures are about 12% faster in SPECint95, and about 50% faster in SPECfp95. That's comparing a P3-550 versus an Athlon-550. For the P3 Xeon-550, the Athlon beats it by about 5% in integer, and 40% in floating point.

    If you want the full Athlon story, go to JC's page. More Athlon info than you ever wanted to know, including the above spec numbers and where they were obtained.

  18. Re:K7 Great... But what about MainBoards & on AMD Athlon (K7) Ships · · Score: 1

    The word is that the chip only goes to OEMs for a while -- you'll have to buy a whole computer with an Athlon inside for the near future. Those whole systems start shipping in August (?), and it'll be some time after that before the CPUs and motherboards make it to the retail channel. Sorry -- I wish they were here sooner, too.

  19. Here, have a clue. on AMD Athlon (K7) Ships · · Score: 1

    I'm just hoping that you were ... ummm, being humorous in the above post. (It's hard to tell sometimes around here.) In case you weren't:

    1) The K6-3 is approximately equivalent to a same-clock *Xeon* on 32-bit integer code. Floating-point? Not a chance, but what does a file server care about floating point?

    2) I've *easily* run 30 people off of a K6-266 running Linux/Samba. The machine didn't even really get warm, much less break a sweat. Load average was about 0.1 most days, and while these weren't software developers, they weren't watching their screensavers all day, either. Anything over 200MHz for a fileserver is a genuine waste unless you have a serious disk subsystem and a well-built high-speed network.

    So, either I need to get a sense of humor or you need to get a clue. :)

  20. Re:PIII style promotion on AMD Athlon (K7) Ships · · Score: 1

    No, they're actually talking about cache prefetch instructions, also known as streaming memory instructions. When you know that you need something from main memory in about 30 or 40 cycles, you prefetch it, which means that when you get to the part where you need the data, you don't stall those big pipelines. I don't think AMD is trying to claim that the K7^H^H Athlon makes the internet come alive like Intel claims ... it's just some new instructions that help to mask main memory latency. Nice stuff, Intel has it in SSE, which may be the reason for the reference to the Intel marketing campaign. And yes, AMD's marketing department needs help.
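
As a rough worked example of the "30 or 40 cycles" remark: if a loop touches one new block of memory per iteration, the number of iterations ahead to issue the prefetch is the memory latency divided by the cost of one iteration, rounded up. The helper name and the 8-cycle loop body below are assumptions chosen for illustration.

```python
import math


def prefetch_distance(latency_cycles, cycles_per_iteration):
    """Iterations ahead to prefetch so the data arrives before it is needed."""
    if latency_cycles < 0 or cycles_per_iteration <= 0:
        raise ValueError("latency must be >= 0 and iteration cost must be > 0")
    return math.ceil(latency_cycles / cycles_per_iteration)


if __name__ == "__main__":
    for latency in (30, 40):
        ahead = prefetch_distance(latency, cycles_per_iteration=8)
        print(f"{latency}-cycle latency, 8-cycle loop body: prefetch {ahead} iterations ahead")
```

Prefetching too few iterations ahead still stalls the pipeline, and prefetching far too many risks the data being evicted before use, so the estimate is a starting point for tuning rather than a final answer.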

  21. K7 Spec numbers on 1GHz Alphas · · Score: 1

    JC over on JC's News managed to get spec marks on a K7-550 with 1/3 speed L2 (the shipping version will have 1/2 speed L2). Check out the numbers and other good stuff yourself, but it scores 25.1/22.5 on SPECint/fp. As you have noted, though, real world performance doesn't always scale with SPEC -- although it isn't totally out of touch with reality. :) And yes, the microwave frequency is right up there around 1GHz ... I think the K7 will be in the 60W range, which is certainly better than my microwave (~800W, if I read the back of it correctly). I wouldn't run a bare processor on my desk and stare at it, though.

  22. That was then. New FP unit, Toto. :) on K7 Renamed "Athlon" · · Score: 1

    A couple of points -- generally, you are correct about the K6-X series of processors. The K7^H^H Athlon is a different beast, though.

    However, first things first -- on anything *except* raw floating-point muscle, a K6-3 will take out a P3 at the same clock. The K6-3's caching system combined with relatively short integer pipelines makes it a mean integer machine. And actually, the Perl apps you're referring to will benefit a lot more from the fast integer performance of a K6-3 than the extra floating-point ooomph of a P3 or K7. (If you're writing fp-intensive stuff in Perl, you should have your head examined anyway. :) However, this is offset by the P3 being available at higher clocks than the K6-3, so it becomes something of a wash -- assuming you can afford a high-end P3.

    Now, the K7 is a different beast -- while it doesn't really change the world on integer, it should redefine x86 floating-point. (The integer performance may be due to having 1/2 speed L2 cache -- there may be some headroom there. Besides, the chip is still faster than a K6-X or P3-Xeon at the same clock in integer, it's just not revolutionarily faster.) The floating-point unit on a K7 is just ... wow, something to drool over. For comparison, a K7-550 was benched as having a SPECfp95 in the mid 20's. To provide a point of reference, that's about where the MIPS R10000 is right now. It doesn't touch a 21264 Alpha, but it torches a lot of other (even non-x86) processors. It especially takes out other x86 processors -- think 40-50% faster than a P3 on SPECfp95 at the same clock. I can't translate that directly into Quake framerates for ya, but trust me it's a good thing. The other thing is that the double-precision floating-point on the K7 is more pipelined than the P2/3's double-precision, so for high-end engineering, ray-tracing ... probably SETI@home, if you care ... the K7 should rock right along. For the near future, the situation of Intel having better FP per clock should be reversed ... if you want the best x86 floating-point per clock, you'll have to buy an AMD chip. (Whether you *need* that much power is something that is often discussed and even more often ignored. :)

  23. Re:Video in from G200? on Matrox Releases G400 Specs · · Score: 1

    I don't know of any plans in particular to support the Rainbow Runner -- I've been following the GLX development, mostly. However, I wouldn't be surprised if someone made a Video4Linux driver for the Rainbow Runner. Full-speed MJPEG capture is a pretty nice add-on, if you ask me. I suspect that since Matrox gave us the specs on the 3d engine, the specs on the video capture board won't be that much of a problem.

    Your last paragraph confuses me, though: why buy Metro OpenGL under Linux? There are 3d drivers *now* that support your G200 under Linux (using Mesa), and do it pretty well. Besides, doesn't Metro's OpenGL support only extend to the Permedia cards right now?

  24. How fast is your CPU? G400 is CPU-dependent. on Matrox Releases G400 Specs · · Score: 1

    > So, resolution: goes to Matrox
    > Speed: 3D speed goes to TNT2, 2D I'll bet goes to Matrox.

    The AC posting above this is pretty much right on about the 2D and 3D strengths/weaknesses of the two cards. I would only add that benchmarking so far has shown the G400 (and especially the G400MAX) to be very CPU-dependent. If you have a slower, older CPU, then the TNT2 will wax the G400. If you have something like a P3-500, it's a much more even competition -- and at high resolutions/bit depths the G400 starts pulling ahead. Oddly, I would vote the G400/G400MAX as more "future proof" but in the fast-changing world of 3d accelerators I doubt that's worth much. :)

    Again, under "normal" circumstances, the TNT-series of cards has much better driver support, etc. (I'm speaking of the situation under Windows.) However, if Linux 3D performance is your thing, my bet goes on Matrox for the best (OpenGL, ironically :) support under Linux, at least until/unless Nvidia puts together a publicly grokable specs booklet and releases it. It's rather ironic that pretty soon the OpenGL support for G2/400 under Linux will be better than it is under Windows.

  25. Re: point c on Matrox Releases G400 Specs · · Score: 1

    Oh, my bad. There were some Millennium II GLX drivers floating around out on the 'net that formed some of the basis of the G200 code, so I assumed those had either started from scratch or been built from SGI GLX code. I knew the two drivers shared a common heritage, but hmmm ... interesting to know that the Riva driver was the original. I wonder how the G200-dev folks got the Riva-less GLX code with hooks ... I'm thinking Terrence Ripperda is the likely suspect. :) (And, AFAICT, the G200 drivers are also under an XFree license, for easy integration into the XFree DRI source later this summer.)