Slashdot Mirror


Which Processor Is Best For Real-Time Computations?

NoWhere Man asks: "For the longest time my friends and I have been arguing over which processor is better (Intel or AMD). I know this is an ongoing battle everywhere as well, but it took an interesting turn the other day. Which processor would be better for realtime, high end mathematical computations? AMD's Athlon? P3 Xeon? or Dual Processors? If anyone could recommend system specs, keeping it cost effective at the same time, it would help."

21 of 232 comments (clear)

  1. Re:Depends on your software... by Malc · · Score: 3

    "Tyan 1832 dual motherboard [make sure it's latest-revision, to handle Cumines] ($180) "

    I have that board and I think it is excellent. You'll need a revision F for coppermine support. It comes at a good price as it doesn't have onboard SCSI like many dual systems. When I bought mine a few months ago, the best price/performance seemed to be P2 450's, two of which were costing less than a P3 500. It was $105 for an OEM P2 450 and $75 + $15 for a Celeron 466 + converter card (at that time, P3 500's were going for $280).

    Quake 3 with my system is faster under NT than it is under 98. It's nice, and the framerate is more stable.

    "Remember that the newest, fastest PIII's don't support SMP (yet, anyway); neither do the new Celerons; "

    Celerons are SMP capable. I don't know if Intel has completely disabled it in the new Celerons that they just announced. Either through a socket 370->slot 1 converter, or through some resoldering, or through a dual Celeron board, such as the Abit BP-6. I have heard that there are some stability problems with the Abit BP-6 that can take some effort to iron out.

  2. Alpha Alpha Alpha! by szyzyg · · Score: 3

    The alpha is basically twice as fast as an Athlon at the same MHz, or 3 times faster than a PIII of the same clock speed. Thir pfloating point performance is still better than anythign else out there. I use them for Orbital com[putations and they fly.

    We recently got a 250Grand 8 processor SGI number cruncher, for someone who wants to do MHD calculations, scarey big calculations. We did a benchmark on it on some of my code and found that 1 SGI CPU was 1/3rd of the speed of our 500Mhz DS20 CPU's..... we have a Dual processor DS20 which we acquired a year ago, for 20k, and this is 75% of the speed of our brand new 'Supercomputer'.......

  3. Re:real time, high end?? by dej05093 · · Score: 3

    I think the meaning of "real-time" in the original
    posting is "during data aquisition" with soft real time constrains instead of a calculation done after the experiment.
    In an industrial environment it might also be important to have an upgrade path and a processor
    family which will be supported for a long enough
    time (it is really expensive to switch from
    transputers to something else ...).

  4. Real-Time does not mean Real-Fast by Detritus · · Score: 3

    Real-Time is about predictability and timing guarantees, not high speed. Features that improve speed, such as cache and virtual memory, are drawbacks in a real-time system. They introduce uncertainty into the analysis of the system's timing.

    --
    Mea navis aericumbens anguillis abundat
  5. My vote by JohnZed · · Score: 3

    Well, I'd really like to vote for a dual P-III Xeon at 1 GHz with 1MB of L2 cache per chip, but Intel has been lagging behind with their newest Xeons. They used to give a 15% boost over a comparable non-Xeon, due heavily to their much larger cache, but Intel has had a hard time selling them because they cost SOOO much more than a comparable P-III standard, and the performance gap between the two models seems to be closing.
    Cache size (and speed) is a lot more important in these calculation-intensive benchmarks than it is in other uses, so it's really something you should look out for. It's also one of many small minusses that really make the dual celeron suggestion a less-than-optimal configuration for real scientific use. Today's best high-performance compilers do a reasonably good job of exploiting special P-III instructions and optimizing to squeeze the right data into the cache. Although the celeron will soon have SSE, it will still be configured with 128k of L2 cache, I believe (though I could be wrong. Anybody?). The real killer to the celeron idea is that you do still have to hit your RAM pretty often, and Rambus or DDR running on a fast bus can really, really help you here.
    So, if it comes down to the Athlon or the P-III (non-Xeon), I'd still have to go with the P-III. The biggest advantage is the ability to use multiple processors, as number-crunching code can REALLY benefit from SMP. AMD has been promising SMP Athlons for ages, but they're still basically vaporware. Another factor is the availability of (extremely pricey) Rambus RAM, while DDR is just starting to be accepted. Finally, Intel puts out some pretty fast compilers, while a lot of compiler developers fail to optimize for the Athlon as much as they could.
    Within the next year, however, we should see faster RAM for AMD chips, SMP Athlon boards, better compiler support (now that people no longer think of AMD as just a low-end provider), and a full-speed L2 cache on the Athlon. Then the chip's FPU can really shine.
    This is all assuming that we're talking about scientific crunching on x86 PCs. If you can go for an Alpha, you really should. Check out www.spec.org for benchmarks. Yes, we're all wary of benchmarks, but when a chip routinely beats its competition by a factor of 2 or more in a very respected, industry-standard benchmark, you have to assume that there really is a difference. I have a lot of hope for Intel's second-generation of IA-64 chips, though. They're doing some really interesting things with compiler/architecture design that could blow away the competition.
    --JRZ

  6. One word: Alpha by turne10 · · Score: 3
    Alpha still blows both Intel and Athlon out of the water, esp. on floating point. The best benchmark for such things is the SPECmark - see John DiMarco's handy SPECmark table, as well as the SPEC site itself for numbers, but the bottom line is that even a 500 MHz A21264 is about twice as fast on floating point than a 700-750 MHz PIII or Athlon, and DEC, er, Compaq is now shipping 667 MHz A21264's.

    Note that there is a new 1U rack version of the DS10, called the DS10L (code-named "Slate"), that is very attractive for highly compute-intensive tasks. There's a picture of a rack full of these in the Linux section of Compaq's web site.

    --
    NTAGARA
  7. Re:Thinking for difficult operations by wavelet2000 · · Score: 3

    1. Computationally intensive jobs mean lots of floating point operations. Here Athlon outperforms Pentium-III, but less so at higher Mhz, as L2 cache becomes more of a bottleneck. However, Alpha is (so far) the absolute best in floating point. in my experience (=floating point intensive programs in Fortran 90), alpha-21264-667 is 3.7 times faster than P-III-600. Given that one never gets linear speedup by going to more CPU, I predict you'd be better off getting single CPU alpha-21264 ($10K) than quad Xeon P-III, in performance and in price. And, of course, you want Unix/Linux with it, for you don't want results of months of computattion go down the drain due to OS crash :-) 2. Where computationally intensive jobs are required? NOT for web browsing, balancing checkbook, or even games. Hence you don't see either big iron hardware at local shops, and software beyond simple calc, anywhere except may be campus bookstores. Computationally intensive calculations are done in modelling structures, solving weather forecasts, nuclear research, aerodynamics, financial (options pricing), Monte-Carlo and optimization. So it is mostly done in research institutions. No surprize then software is written by Ph.Ds , texts are dry, market is narrow. 3. There still exist free or open source math programs/libraries beyond casual calc. Take a look at MuPad, Octave, LAPACK (http://www.netlib.org ) etc. 4. I am not aware of any incredibly easy book on this. The problem is extremely advanced math requires extreme effort to comprehend (or nobody cared to boil it down). In my area (econ), Judd "Numerical Methods in economics" is pretty good. 5. About breaking up calculations. Ultimately, any program (math or not) does exactly that. there are general principles embedded in compilers technology, and, at a higher level, Fortran or C/C++ programs split the atsk into subtask. But all-purpose ALGORITHM does not exist. During debugging you do the same - look if pieces of the program do what you want them to do. That is why debugging Mathematica programs is on the one hand easier (higher level of abstraction) and more difficult (you have to trust that subroutines you call are working all right) 6. For distributed memory coding, refer to Lusk et al "Using MPI"

  8. Not x86 by Skweetis · · Score: 3

    Newer x86 processors (Intel, AMD, Cyrix, etc.) aren't very good for real-time work. This is because they are so fast that they sit idle most of the time waiting for instructions to come from much slower RAM. They work, but it is overkill when you can get a 68K processor from Motorola for a few dollars, or an HC12 or something for a dollar, that will do the same thing with much lower power requirements. (By the way, just because I suggest Motorola chips doesn't mean I am endorsing Apple. I am trying to give you some information based on my experience in real-time computing, and an old 68000 is a good chip for this kind of thing.)

    For straight number-crunching, a fast x86 CPU is probably a decent choice. You will probably want to go SMP for maximum performance. (Are there SMP boards for a G4? That might be a good choice, as it has one of the best FPUs out there right now.) You might also want to look into an Alpha, they have very good floating point, and some really good SMP solutions exist for them.

    Another issue to keep in mind is the software side of things. You may want to get a processor that you can write, or can learn to write, assembly code for. You can write much more efficient code in assembly, and (and this isn't always a bad idea) if you do things this way, you can make things even more efficient and not run an OS at all. (Even a 12mHz 386 screams when you don't run an OS on it while you run your code.)

    The thing to keep in mind when approaching this problem is that there are a lot more solutions than Intel and AMD, and many of these are well worth investigating further. I am just trying to give you a few ideas here, you will probably want to do your own research and choose the solution that works best for you.

  9. Re:Intel by bmajik · · Score: 3

    You are 100% wrong. Intel has NEVER had a respectable FPU. Ever. AMD's Athlon FPU destroys the intel chips. The only thing you might be thinking of is how the early quake games required a Pentium CPU because the id boys hand tuned for the pentium FPU.. 486-class amd/cyrix chips obviously ran horribly here...(and were legitimately outclassed by the pentium FPU at the time)

    another thing.. what exactly is a "real time floating point operation" ? RT operation seems like it should have alot less to do with a CPU then it would a system, and more importantly, the software side of that system. RT computing has alot more to do with enforcable upper bounds on wall clock time for a given operation than it does with 'fast fpu'. How a CPU deals with interrupts and cache misses, and under what conditions, and how this interacts with FP ops _might_ make this "topic" relevant.. but i can't believe that some people are saying "AMD" and others are saying "intel" and not expanding any more..

    --
    My opinions are my own, and do not necessarily represent those of my employer.
  10. Here's what I just did by johndr · · Score: 3

    I recently put together a cluster of eight Gateway 2000 boxes with Athlon 800MHz proccessors. We loaded Red Hat 6.1 and are using one of the commercial Fortran compilers to do plasma simulations. The boxen are in a very basic Beowulf-type configuration.

    Our previous boxes were 600MHz DEC alpha stations running Dec's UNIX, OS/F or whatever it is. We find that the AThlon boxes, which are 32 bit of course as opposed to the 64 bit Alphas, are about as much faster as the clock speed would indicate, i.e. about 30%. As a result we increased our computation resources by a factor of four for less than $20K. We are very happy.

    I'm not sure SMP can be justified in this kind of case, as boxen that support it are typically way more expensive than our cheapo Gateways, and SMP generally does not increase speed by the factor you would think. However I'd be interested to hear any results to the contrary. When problems can be split between processors, performance per buck is what matters.

    The best solution really has to be tailored to the specific problem being solved; so for example 384MB or PC100 RAM was ample in our case but in a big 3D finite element case it gets to be a problem. As an example we also have an electromagnetic package that runs on NT; because of the software licensing we can't run more than one case at once, and that has to go on the huskiest system I can get the budget for, currently dual Pentium III 733s with 2GB RAMBUS memory. It is way less cost effective than our cluster system.

    Hope this helps.

    John

  11. the question is ill-posed by Signail11 · · Score: 3

    What exact question is it that you are asking? The answer depends to a great extent upon the specifics of your problem. For some possible usage scenarios, here are my suggestions (disclaimer: I have a preference for SGI machines, because those are what I primarily use from day to day)

    Standard office applications:Go for a cheap Celeron and lots of RAM. Most applications will be very responsive in any case.
    3D games: an Athlon would probably be your best choice. Decent FPU performance, good integer performance, won't cost you a bundle. Most games don't really benefit from SMP anyway.
    Render farms, other highly parallel, low internode communction applications: commodity x86 systems, the more the better.
    RT control of other systems/experimental setups: I personally prefer the StrongARM series of processors for this role, since the price/performance is practically unmatched, the documentation is through, and programming in assembly for the SA is truly a joy compared to the hideous mess that is the x86. Only problem is that there no FPU (it does have an integer multiply though).
    Data mining, data warehousing: I don't have any personal experience with these applications, but I have heard good things about Suns and the RS/6000's from IBM.
    Single-threaded or low parallelism scientific computations: Definitely Alphas. They blow any other processor away on floating point intensive operations. The only real drawback is lack of CCNUMA/massively parallel shared-memory systems. IIRC, they top off at 8 or 16 processors.
    Really big simultions, computational hydrodynamics, etc.: Keeping in mind my previous disclaimer, I would still have to suggest SGI Origin 2000 systems for this type of task. The out-of-box performance on a fully populated Origin 2000 is awe-inspiring. Another option might possibly be linked AS/400s or RS/6000s or even one of the Cray T3Es for vector oriented codes. A bit pricy, but if it's not your money...

  12. It's not only the processor, it's the OS by roman_mir · · Score: 3

    If computer science CSC468 Operating Systems course taught me anything, it's that no matter how great your hardware is, the software must be optimized to the maximum or your best processor with the largest bus and huge clock speeds and caches and number of registers will be worth NOTHING. The Operating System must be smart to use the cache in best ways, to maximize the performance of your IO devices, to manipulate all your threads in a smart way not to create thrushing.

    In fact I believe that if you simply need a calculator that is completely devoted to you, you should build your own very task specific operating system that is optimized for the task at hand. Half a year ago when I was implementing different parts of an OS, I remember that the most difficult challenges that stood in front of my group were the system optimizations for memory accessing, multitasking (paging), security and multitasking the IO devices.

    If all I wanted was a simple system for a single user I wouldn't need all those complicated algorithms, I could simply write a memory manager, and a simple IO and interrupt support and that would be the fastest way for a single user to operate. In a sence even DOS was too sofisticated for what you are asking, DOS had some TSR support and paging. (However its memory management was awfull.)

    So there you go, it's back to C and the Assembler time!

  13. Flamebait?? by www.sorehands.com · · Score: 3
    Is this a real question?

    This is one of the questions that are in the category of "Which is the best editor?" or "Which is the best Operating System"" or "How many angels can dance on the head of a pin?".

    Geeks have argued and fought over these (or at least the first two) for years, and will argue and fight over them until we are Borg.

  14. what "real time" means by jetpack · · Score: 4

    A couple of folks have alluded to this, but I'll reiterate in an attempt to make it clear; real time systems imply bounded response times. That does not directly imply a *faster* response time (altho the goal is to reduce the gauranteed bound on response times).

    Hence, the phrase "real-time mathematical computations" is almost an oxymoron ... if the computation by nature is unbounded, you wont get realtime performance out of any processor.

    Basically, I'm attempting to point out that "real time" and "fast" are not synonymous.

  15. Re:I don't think you understand what you are askin by costas · · Score: 4

    Excellent points; let me add a coupla more: I am not sure what 'real time mathematical computations' means here. CPUs that can crunch complex mathematical systems in real time, are usually developed by DARPA and cost millions per unit (I am referring to jet fighter avionics packages; usually TI or Rockwell chips).

    Now, if the poster is referring to mathematical problems (but not 'real time', more like 'solved in a reasonable time'), the above post is right on the money: do not buy an SMP system --at least not an x86 SMP system (I don't have any exprience with Alphas). The problem with x86 SMPs is bus speeds, i.e. communication between CPUs on the same board. Using fast network interconnects (gigabit speeds) you can usually get better performance between boxes than between CPUs for some SMP systems (particularly quads and 8-ways). Duals are not as problematic, so for compactness' sake they might worth the $$$...

    Keep in mind though, this is not the way things *should* be, it's just the state of the art --which sucks right now in x86-land. With better motherboards coming up (not to mention better SMP support by all the different OSes --the new Linux kernels seem to have solved context switching problems that were killing SMP machines, for example), eventually SMP machines will be the way to go...


    engineers never lie; we just approximate the truth.

  16. Architecture makes the difference by Microlith · · Score: 4

    If you're doing mathematical computations, your best bet would be a CPU good with floating point. Your best bet would be an SMP Alpha. With a better floating point unit than Intel and AMD, and an outright faster CPU, you'd get a lot more done in less time. Only problem in this case is cost, since the average Alpha system, IIRC, costs more than most x86 systems. That might not be true, so do your research.

  17. real time, high end?? by gargle · · Score: 4

    Which processor would be better for realtime, high end mathematical computations?

    High end mathematical computations are unlikely to run in real time on any processor. Do you really mean games?

  18. Athlon Has a Superior FPU by Zevez · · Score: 4

    The AMD Athlon has a superior FPU at the same clock speed, which is useful for scientific applications. Check this page here for further details. (Notice on the same page that the Athlon is not better at everything though).

  19. I don't think you understand what you are asking.. by slothbait · · Score: 5

    Are you referring to hard real-time applications? If so, you don't want an SMP system. Infact, you don't even want a system with a cache. Why? Because hard real-time applications optimize for worst case performance. Caches do well to improve average case performance, but usually hurt worst case performance. Thus, hard real time systems usually don't use caches.

    In general, real time apps (even soft real time, like video or audio decode) are concerned more with low latency then high throughput. As a result, you aren't going to want an SMP system. The complex caching systems in SMP's is going to make performance even *less* predictable which is precisely *not* what you want for real-time.

    If you just want a really fast media-cruncher, then you don't want to be running x86. If you're serious, you'll go for something like an Alpha, that will smoke any x86 in FPU. Besides, if you really want to get into real time media processing, you are going to need a great deal of bandwidth, and commodity x86 hardware isn't going to get you where you need to go.

    If what you really want is a budget box to run games on, then get an Athlon. A quick review of any games site in existence will tell you that Athlons beat Intel's offering in every regard these days. There's no reason to bother Slashdot with such common questions. Any of the DIY gamer sites will have a host of articles with benchmarks running Quake or Unreal or whatever it is kids play these days.

    I get the impression that this post is from someone who doesn't understand real time computation, and just through that phrase in there to make their question sound more sophisticated.

    --Lenny

  20. PPC 7400 by Pope · · Score: 5

    If you're doing Signal Processing or DSP-related tasks, the 7400 smokes.
    As always, it all depends on what numbers you're crunching, and for what purpose. The vector processing in the 7400 is pretty sweet if done right, and one of the Linux PPC variants has full support now.
    Just a though to get away from x86 ;)


    Pope

    --
    It doesn't mean much now, it's built for the future.
  21. The stupidest question I've ever heard by ShaggyZet · · Score: 5

    No, really, it is. What does real-time have to do with this? You can't solve an arbitrarily complex mathematical problem in a bounded amount of time. And that's what real-time is, bounded. Real-time has nothing to do with speed in the sense that this question is phrased. An example: A given problem must be solved in 15 seconds or less. Chip A can solve the problem in 1 second most of the time, but will take 30 seconds every once in a while. Chip B will usually take 5 seconds, but never more than 10. Guess which chip is can give you a real-time guarantee? That's right, even though Chip B is usually slower, it is the one you pick.