Slashdot Mirror


Which Processor Is Best For Real-Time Computations?

NoWhere Man asks: "For the longest time my friends and I have been arguing over which processor is better (Intel or AMD). I know this is an ongoing battle everywhere as well, but it took an interesting turn the other day. Which processor would be better for realtime, high end mathematical computations? AMD's Athlon? P3 Xeon? or Dual Processors? If anyone could recommend system specs, keeping it cost effective at the same time, it would help."

232 comments

  1. Re:Couldn't you just analyze the program? by Anonymous Coward · · Score: 1

    OK, programs are actually a series of many instructions. Couldn't the OS just be programmed so that, if you have 100 instructions, instruction 1 would run on processor one and instruction 2 on the second processor, or just divide the program into two halves to be executed on each?

    Unfortunately, you CAN'T do that. It's not just the instructions that get sent to processors, but the things inside the processor's registers (which can get modified after an instruction) and the cache. So while you can send instructions to two processors, they can't share registers (or if they can, you can't guarantee that they'll be accessed/modified in the right order).

    For those of you who aren't programmers/engineers, registers are what hold your immediate data inside the processor. Data which changes or is accessed frequently is stored in a CPU register. Access to a register is *much* faster than to memory. Each CPU has its own registers. At the lowest level, the entire program (thread/process) is controlled with these registers. Since registers aren't shared between processors, this sort of thing isn't possible.

    This is why a single thread can go only to one processor. If your program runs using multiple threads or processes, that's a different story.
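
    A minimal sketch of the same point in Python (illustrative only; the function names and the arithmetic are made up for this example). The first loop carries a value from one step to the next, much as a running thread carries its state in registers, so step N cannot start before step N-1 has finished. The second function has no such dependency, which is the case where splitting work across processors can pay off.

```python
def dependent_chain(x0, steps):
    """Each step consumes the previous result, so the steps must run in order."""
    x = x0
    for _ in range(steps):
        x = 3 * x + 1  # needs the x produced by the previous iteration
    return x


def independent_halves(values):
    """Each element is processed on its own, so the list can be split in two."""
    mid = len(values) // 2
    first = [v * v for v in values[:mid]]    # could run on one processor
    second = [v * v for v in values[mid:]]   # could run on another processor
    return first + second
```

    Here the two halves still run one after the other; the sketch only shows which kind of work could be handed to separate threads or processes.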

  2. Re:Couldn't you just analyze the program? by Anonymous Coward · · Score: 1
    OK, programs are actually a series of many instructions. Couldn't the OS just be programmed so that, if you have 100 instructions, instruction 1 would run on processor one and instruction 2 on the second processor, or just divide the program into two halves to be executed on each?

    Yeah. And with 9 women, you can make a baby in 1 month (something like: one does one leg, another does the internal organs, and so on...)

    Cheers

    --fred

  3. Consider timer interrupt frequency. by Anonymous Coward · · Score: 1

    The x86 processors have a timer interrupt every 10 milliseconds, the Alpha every millisecond. If one is using a non-real time OS, the greater interrupt frequency will give nearer real time performance (correct?).
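
    A rough way to put numbers on this, assuming (an assumption for this sketch, not something stated above) that a waiting task is only noticed at the next timer tick and that events arrive at random points within a tick. The function name is hypothetical.

```python
def tick_latency_ms(tick_period_ms):
    """Return (worst_case_ms, average_ms) wait until the next timer tick.

    Assumes the event can arrive at any point in the tick interval with
    equal probability and is only serviced when the next tick fires.
    """
    worst_case = tick_period_ms
    average = tick_period_ms / 2
    return worst_case, average


ten_ms_tick = tick_latency_ms(10)  # the 10 ms figure quoted for x86
one_ms_tick = tick_latency_ms(1)   # the 1 ms figure quoted for Alpha
```

    Under that model a shorter tick shrinks the worst-case wait in proportion, but it still gives no guarantee about how long the work itself takes.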

    PAC

  4. A Beowulf cluster of 65C02 emulators? by Mark+Edwards · · Score: 1

    We could run a beowulf cluster of thousands of 65C02 emulators, on a single Intel machine, and exceed even the power of our modern systems!

    Oh wait, today's the second...

    Mark Edwards
    Proof of Sanity Forged Upon Request

  5. Re:Alpha Alpha Alpha! (not $$$ SGI) by The+Man · · Score: 1

    The Onyx2 is a piece of crap. Try a Power Challenge or Origin system instead. I never have problems with mine. Alphas are nice, but there's nothing wrong with SGIs. Ultra 10? I stand before you, laughing hysterically in a mocking way. Let me guess, you run Solaris on it too. Free clue: don't venture off Slashdot; people elsewhere won't be so easily confused.

  6. If SMP did not work, then why clustering? by Jayson · · Score: 1

    Saying that the clustering is in a "basic Beowulf-type configuration" implies to me that you are doing parallel computation across them. If this works, then why would SMP not also work? It should be almost exactly the same.

    1. Re:If SMP did not work, then why clustering? by mikefe · · Score: 1

      I don't think he is using the "beowulf" term correctly.

      It appears that he is running parts of a computation on different computers, not splitting the full computation across the cluster dynamically.

      Mike

      --
      There: Something at a specific location.
      Their: Owned by someone.
      Please make sure your english compiles.
    2. Re:If SMP did not work, then why clustering? by johndr · · Score: 1

      I must misunderstand SMP. A lot of scientific programs that aren't written to be parallelized, or don't parallelize well, never fork the running process, so I don't see where the OS can come in and allocate part of the task to another processor. In fact, for a single computation running on the box in a non-parallelized program, the only advantage I can see is that any OS housekeeping would take place on the second processor. With a good OS I can't think this would lead to a huge improvement. If more than one instance of the program runs on the same box there will be an improvement, but consider that the processors now have to share the memory and I/O, so the actual performance would depend on a lot of things to do with caching, how much I/O is needed, etc. If one just buys two cheap boxes instead, this isn't an issue. Cheers, John.

  7. Non-x86 Architectures by John+Goerzen · · Score: 1

    I feel that a common fallacy, illustrated both in the question and on Slashdot as a whole, is ignorance of, and failure to consider, non-i386 architectures. In many cases, these architectures offer superior performance and a better design. Take a look at Sparc and Alpha options.

  8. Re:Alpha Alpha Alpha! (not $$$ SGI) by melkor · · Score: 1

    compaq will :-)

  9. Re:the 68000 BABY!! by neccoant · · Score: 1

    The processor with the best realistic price/performance ratio today is either the G3 (PPC 750) or the G4. The G3 is $25 each in quantities of 10000 for a 400 MHz chip, with 600 MHz on the way. The G4, if you want to do desktop work with one (two, three, four...9999) machine, has the best price/performance ratio. This chip, which I am writing this comment on now, doesn't play Quake 3 like a Pentium III or AMD K7 or K6, but 50 FPS in a new game is not important for the work you want to do. The G4 runs LinuxPPC wonderfully, cracks RC5 blocks at the fastest speed per MHz, and is also excellent for embedded work (if you write the software), since it emits 34 fewer watts of heat than its x86 cousins.

  10. the best system ever by VAXGeek · · Score: 1

    MicroVAX 3100 m38
    VMS 5.2
    8 megs of RAM

    you get one of these puppies, you be encoding mp3s at the rate of 1 every
    few days! i have used one of these for years and i must say it is simply the
    best.

    ps: VMS RULES YR WORLD
    pps: eat me.

    ------------
    a funny comment: 1 karma
    an insightful comment: 1 karma
    a good old-fashioned flame: priceless

    --
    this sig limit is too small to put anything good h
    1. Re:the best system ever by Issue9mm · · Score: 2

      What if we wanted to encode more than one mp3 every few days?

  11. Re:I don't think you understand what you are askin by GTM · · Score: 1

    I don't know if the poster of this question really masters this field, but it does make some sense to me. I have friends whose job is to validate satellites: they have a *huge* checklist, and each component is validated when the whole list is OK for this component. Since this includes satellite avionics, i.e. attitude and orbit control systems, they have to simulate the behaviour of the satellite in real time: this is done with a fast workstation that computes a complex physical model, and it must be done under hard real-time constraints. For instance, I saw them use a Sun Ultra5 operating in a real-time mode to simulate some parts of a satellite (I heard that you can load your own software and switch from Solaris to a privileged environment with a certain syscall). I know that they also use Alpha boxes for their most CPU-intensive needs.

  12. Re:the 68000 BABY!! by rve · · Score: 1

    10000 obsolete chips may give the best price/performance ratio in floating point operations per dollar, but the original poster wants to do 'realtime, high end mathematical computations'. The word 'realtime' is a bit confusing here... does he want an embedded number cruncher, or does he want a number cruncher that's so fast it can simulate a certain process in real time?
    Assuming it's the latter, he'd have to plug those 10000 processors into some other hardware to make them work. Even if it were possible, it would be very difficult and expensive. On top of that, the overhead of the communication between all those nodes would seriously diminish performance.

  13. Re: complaints from Tyan and Abit by unitron · · Score: 1

    I know Intel makes motherboards to sell under their own name and to OEM, but does AMD? I'd think that AMD would want their processors to work as well as Intel's, if not better, in as many different places as possible, to increase demand for them, so why wouldn't they bend over backwards to help motherboard makers bring boards to the market that help sell AMD processors?

    --

    I see even classic Slashdot is now pretty much unusable on dial up anymore.

  14. this just in-sorta related by unitron · · Score: 1

    The Register just added a story about long-time Intel-centric Dell buying 100,000 AMD Spitfire chips for desktop boxes. Maybe.

    --

    I see even classic Slashdot is now pretty much unusable on dial up anymore.

  15. Athlon or dual Celeron's by ZxCv · · Score: 1

    If you are only going to use a single-CPU system, or if your software wasn't written for multiple CPUs, the Athlon is definitely the way to go. In every benchmark I can remember seeing, the Athlon's FPU smokes the PIII. But if your software will utilize multiple CPUs, your best bet would be a multi-processor Celeron system. Obviously the FPU isn't up to snuff with the Athlon, but with their rather low cost and 2 or 4 of 'em crankin' away at the same time, you should have all the power you need at a decent price.

    --

    Perl - $Just @when->$you ${thought} s/yn/tax/ &couldn\'t %get $worse;
  16. Re:Intel by ZxCv · · Score: 1

    Do your homework.. the Athlon's FPU is light years ahead of any FPU AMD has ever put out and noticeably better than any FPU Intel has.

    --

    Perl - $Just @when->$you ${thought} s/yn/tax/ &couldn\'t %get $worse;
  17. Re:R.A.I.P. by Doctor+Memory · · Score: 1

    Keep in mind that some computations (e.g., solving systems of large matrices) are inherently single-threaded (since the results of one computation provide the inputs for the next). In this case, SMP will not help.

    Check out the new toys at Microway -- 750MHz 21264s, 48-node Beowulf clusters, woo hoo!

    --
    Just junk food for thought...
  18. Re:Broader View by Mr.+Frilly · · Score: 1

    Well, when the Athlon came out, people were really looking forward to an SMP board. The Slot A architecture (the same one used by the DEC Alpha) looks like it should perform much better than Intel's Slot 1, and scale well up to 8 processors (if I recall correctly).

    As far as I know, the Athlon chips are SMP capable. The problem is nobody has yet produced a chipset which supports multiprocessing Athlons.

    I think AMD is currently concentrating on the desktop market (where few people go for SMP), which is why they haven't been as aggressive as I'd like them to be in getting out an SMP motherboard.

    Initially (last summer), AMD's FAQ said to expect an SMP board in Q1 of this year. Obviously that hasn't shown up yet, but the last I've read is that the AMD SMP chipset (and MB) is due sometime in Q3.

    I hope they hurry; not being able to buy an SMP Athlon is the only reason I haven't upgraded my system yet.

  19. Re:I don't think you understand what you are askin by Steve+Bergman · · Score: 1

    I think you are being a bit overly sensitive on this one. There have been many times that I have seen a question that made me wonder if the poster is really asking the *right* question. The only way to find out is to supply some info and ask if this is perhaps *really* what was meant by the original question. To direct someone to the right question can be far more helpful than having an outright answer.

    It's kind of like "The Hitchhiker's Guide". We all know that the answer to the ultimate question of Life, the Universe, and Everything is '42'. The problem is that none of us seem to know what the actual question is... :-)

    On the other hand, the poster may be asking *exactly* the right question. In which case, I'm rather curious as to just what it is he *is* working on.

    Sincerely,
    Steve Bergman

  20. Re:multi-threading by Locutus · · Score: 1

    That depends on the threading model in the JVM you are running. The IBM JVMs on OS/2, Wintendo, and Linux use native threads, so you would see a difference with an SMP OS. IMHO.

    --
    "Anyone who stands out in the middle of a road looks like roadkill to me." --Linus
  21. Problem domain is what matters by SirTreveyan · · Score: 1

    I am rather surprised that everyone jumped to the hardware question without realizing that the problem domain was inadequately defined in the question. "Real-time high-end mathematical computations" can mean just about anything. Before hardware can be selected, the problem must be examined and properly identified. For example, are you modeling or performing data acquisition? Then the questions of interval granularity, acceptable hardware reset/refresh latencies, and permissible timing errors need to be addressed.

    Once these have all been determined, a hard look at alternatives should be taken. For example, is a high-end system required, or would a Beowulf cluster be able to handle the job? Can you get by with acquiring data and shoving it to disk on a real-time basis, and make your calculations later?

    Too many programmers today have little or no knowledge of directly manipulating the computer's hardware. Why buy a real-time system when all that is needed is to set the interrupt frequency of the programmable clock and chain your own interrupt handler to the current one? If your constraints aren't extremely tight, even Windoze can function adequately (no, not WM_TIMER either... the timer controls in the multimedia DLL work fairly well). This might not answer your question about what the best hardware is, but it might help you determine if you really need a real-time system.

    --

    SELECT * FROM User WHERE Clue > 0

    0 rows returned

  22. Re:But dosn't having one CPU keep latency down? by SirTreveyan · · Score: 1

    It all depends on the situation. You DO want to store your data, don't you? You do want to acquire your data also, don't you? If you are doing data acquisition, the bottleneck usually isn't the CPU; it's the latency the A/D converters require to perform successive reads. And god forbid you have to control a device in real time... because the latencies of D/As are longer. If you are doing modeling, then you have to worry about your hard disk seek times. In any event, you have to look at worst-case scenarios as far as acquiring and storing data. Usually the latency involved in these tasks far exceeds ANY latency the CPU might impose.

    --

    SELECT * FROM User WHERE Clue > 0

    0 rows returned

  23. Re:to ALL my brothers and sisters in latency by SirTreveyan · · Score: 1

    Right on mekkab.

    You're the first one I have seen in this post who understands what's involved. Who cares how fast your CPU is if your A/D converters can scan only 10 times a second? Trying to model processes that run at a rate faster than your hard drive's seek + write times is senseless in a real-time system... because it can't be done, unless you want to throw away all data except the final calculation. But if you do that... how do you verify your model is correct? Obviously, from this discussion, most /.ers are more interested in horsepower than in getting the right tools for the job.
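
    The bottleneck argument can be sketched in a few lines of Python. Only the 10 samples per second for the A/D converter comes from the comment above; the CPU and disk rates are hypothetical placeholders, and the function name is made up for this example.

```python
def slowest_stage(stage_rates_hz):
    """Return (name, rate) of the stage that limits a serial pipeline.

    A chain of acquire -> compute -> store can run no faster than its
    slowest stage, whatever the other stages are capable of.
    """
    name = min(stage_rates_hz, key=stage_rates_hz.get)
    return name, stage_rates_hz[name]


rates = {
    "adc": 10,           # scans per second, from the comment above
    "cpu": 1_000_000,    # hypothetical: samples processed per second
    "disk": 100,         # hypothetical: records written per second
}
bottleneck = slowest_stage(rates)
```

    With these numbers a CPU ten times faster leaves the result unchanged, which is the point being made.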

    --

    SELECT * FROM User WHERE Clue > 0

    0 rows returned

  24. Re:Couldn't you just analyze the program? by SirTreveyan · · Score: 1

    Data dependencies and branching currently prevent one from doing this. However, from what I understand, the forthcoming IA-64 architecture is supposed to implement this very scheme in the hardware. It is one of the reasons that the compilers for the IA-64 chips are going to have to be 'smarter'.

    --

    SELECT * FROM User WHERE Clue > 0

    0 rows returned

  25. Re:Architecture makes the difference by Seraph · · Score: 1

    Maybe it's just me, but I didn't see "embedded" mentioned in the article. My (loose) definition of a real-time system would be one in which all operations have a known upper bound on execution time. That certainly applies to embedded systems, but it also very much applies outside of that realm.

    Perhaps I should be a bit more specific. When I hear "real-time" I think "hard real-time." There is also "soft real-time," such as that used in multimedia apps, which basically means "as fast as possible."

  26. Neither; pick a DSP by Norman+Lorrain · · Score: 1
    You want to do math? *Real* time? Forget general purpose CPUs. DSPs are tuned to do math.

    You'll find them doing things such as encryption, compression, audio/video processing, yada, yada. All in real time.

    See this for more. (a bit dated but still relevant).

  27. Re:Discrete Event Simulation PIII -v- SPARC by um...+Lucas · · Score: 1

    But do SPARCs even exist that run at 500 MHz? And how much do they cost? And according to your calculations, which I recreated and got the same results with, you scaled the speed of the entire system, rather than just the speed of the processor. I don't think the system bus speed in a SPARC system scales linearly with the addition of faster processors, so anything that leaves the L1 cache would hit the memory bus and slow down a little bit.

    It's hard to predict the speed difference of machines unless you take everything into account. A faster CPU is just one part of the equation.
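
    A small sketch of why scaling the whole run time by the clock ratio is optimistic. It assumes, purely for illustration, that a run splits cleanly into a CPU-bound part that speeds up with the clock and a memory/bus-bound part that does not; the function name and all figures are hypothetical.

```python
def scaled_runtime(runtime_s, cpu_fraction, old_mhz, new_mhz):
    """Estimate run time after a clock change when only part of it scales.

    cpu_fraction is the share of the original run time spent in work that
    speeds up with the clock; the remainder (memory, bus, I/O) is held fixed.
    """
    cpu_time = runtime_s * cpu_fraction
    other_time = runtime_s - cpu_time
    return cpu_time * old_mhz / new_mhz + other_time


whole_system = scaled_runtime(100, 1.0, 250, 500)  # everything scales
cpu_only = scaled_runtime(100, 0.5, 250, 500)      # half the time is bus-bound
```

    Doubling the clock halves the run time only in the first case; in the second it drops from 100 s to 75 s.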

  28. 'real time' mathematics by evil · · Score: 1

    I understand the goals of a 'real time' system, and how they may conflict with doing 'arbitrary' mathematical computation. The former attempts to bound time taken to process, the latter is unbounded generically. That being said, this is a legitimate question for, e.g., real-time data analysis, i.e. one chunk of data is acquired and cooked while you wait for the next chunk to arrive.

    The question you need to be asking yourself is what real-time means for your application, and what calculations you need to perform in 'real time.' In the real-time data analysis example, if your data arrives every 10 ms, real time means cooking everything in 10 ms. Fine. In comparison to a wimpy old 386, an RS6000 performs many more ops per 10 ms cycle, and e.g. the FFT you need to take can be so much more accurate than it was before. If you intend on doing real-time mathematics, you should really be writing optimized assembly. Yep, instruction sets matter. And how your code is laid out matters. This will make much more of a difference than Athlon vs. PIII.

    In the end the choice of which x86 processor will be nearly irrelevant. The goal of real-time is to get some realistically accurate calculation fitting in a bounded time interval. Once the processor is 'fast enough' or your code is 'fast enough' it doesn't matter.
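
    The 'fast enough' test can be sketched as a budget check. The 10 ms period is taken from the example above; the operation count uses the common rule of thumb of about 5 N log2(N) floating point operations for a radix-2 complex FFT, and the sustained rates are hypothetical numbers picked only to show both outcomes.

```python
import math


def fft_flops(n):
    """Rule-of-thumb operation count for a radix-2 complex FFT of length n."""
    return 5 * n * math.log2(n)


def fits_deadline(flops_needed, sustained_flops_per_s, period_s):
    """Return (fits, compute_time_s) for processing one chunk of data."""
    compute_time_s = flops_needed / sustained_flops_per_s
    return compute_time_s <= period_s, compute_time_s


PERIOD_S = 0.010  # one chunk every 10 ms

# Hypothetical sustained rates for a slow and a fast machine.
slow_ok, slow_time = fits_deadline(fft_flops(1024), 1e6, PERIOD_S)
fast_ok, fast_time = fits_deadline(fft_flops(1024), 1e8, PERIOD_S)
```

    Once the check passes with some margin, a faster processor buys room for a longer FFT rather than a 'more real-time' result. A real design would use a measured worst-case time, not an average rate.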

    ps. I am about to flame you : )

  29. Re:The stupidest question I've ever heard by semis · · Score: 1

    Yes I agree. This is a stupid question, and it's even more stupid that /. posted it.

    I might add, why are they even asking about Intel or AMD? If you want to do high level computational maths then I would NOT recommend either of these.

    Maybe A G4 (with altivec) could be considered, or perhaps Sparc or MIPS or Alpha.

    Oh, but that's right. Nobody on Slashdot would have heard of these processors, because we are all x86 users, right? And we overclock our celeries to 600MHz!!! And we run Redhat Linux cos it's da best and we can compile open-source from rootshell.com

    Get a life.

    Does what used to be a *decent* forum for geeks have to be turned into a place of cluelessness and trolls by a bunch of teenage kiddies?

    pfft.

  30. anything but x86 by nester · · Score: 1

    Ignoring the "real time" part of the question (which, btw, doesn't make much sense), you'd be better off with just about anything other than x86. I'd say try out a 264 Alpha (e.g., Compaq's Alpha testdrive program) and, if appropriate for the type of math you're doing, try coding it for AltiVec on a G4. AltiVec and Alpha would be your best bets.

  31. Re:Thinking for difficult operations by IQ · · Score: 1

    You are right. NT is dead on Alphas. It is also dead on Intel, AMD and MIPS. Real machines deserve a real OS! Try running Linux on your Alphas; it rocks.

    --
    Adults are obsolete children. - Dr. Seuss
  32. Re:How much are we talking about? by IQ · · Score: 1

    Alphas are selling like fire! Just try to buy one right now. There are 1-2 week backlogs. Everyone needing fast float is using them both standalone and in Beowulf clusters. Run Linux on it and use Compaq's 'ccc' compiler (ported to Linux from Tru64). Wintel cannot touch this combination with a 20 foot pole. Too bad that M$ can't compete anymore using this platform.

    --
    Adults are obsolete children. - Dr. Seuss
  33. Re:Architecture makes the difference by IQ · · Score: 1

    I use Alphas in realtime applications and they ROCK! The application is not embedded, nor is the environment harsh, but it is real-time. An Intel Pentium III/800 has a SPECfp95 of 19.8. An Alpha 21264/750 has a SPECfp95 of 74! Alphas just smoke the competition in float.

    --
    Adults are obsolete children. - Dr. Seuss
  34. x86???? by Compuser · · Score: 1

    For scientific computing, you'd want
    something like RS6000, alpha or MIPS
    box with as many processors as you can
    afford.
    If you are thinking affordable you are
    not really thinking high-end scientific
    floating point number crunchers.

  35. Re:New category by cokane · · Score: 1

    The only four-ways left are Xeons or some more expensive RISC machines. You may be able to get some nice Sparc 20's off of ebay and get the processor upgrades to them...

  36. Re:Athlon Has a Superior FPU by ppanon · · Score: 1

    The AMD 770 chipset is going to be capable of dual processor SMP, with otherwise the same features as the forthcoming AMD 760 (266 DDR SDRAM support, 4xAGP, next-gen ATA 120?). The last I read, it would probably be available early third quarter for release with Mustang. However it will probably be a little while before motherboards using it are available. So, although you might see it as early as July, it will probably be later. I would guess they will probably try to have them available in time for "Back-to-School" in late August/early September.

    --
    Let people read, and let them dance; these two amusements will never do any harm to the world. - Voltaire
  37. Re:Let's not digress from the original question by wazza · · Score: 1

    Actually, you should re-read the original question:

    "... AMD's Athlon? P3 Xeon? or Dual Processors? If anyone could recommend system specs, keeping it cost effective at the same time, it would help."

    So since he wants whole system specs (and asks about SMP explicitly), arguments like OS and SMP, etc. _are_ what he wants to hear about :>

  38. Re:I hate to ask a stupid question but.... by Slamtilt · · Score: 1

    Nobody moderated it up. He has high enough karma
    that he can post at a score of 2. You can tell this
    since it says 'score:2' but doesn't have anything like
    'insightful' or 'interesting' beside it.

  39. Re:Couldn't you just analyze the program? by akintayo · · Score: 1

    i think this is the most correct.

    The problem of data and control dependencies relates to chip design, and any leeway is generally exploited at that level. Pipelining allows the chip to run multiple instructions in parallel - this is why RISC kicks ass.

    The SMP speedup is limited by additional factors. In order to execute correctly the processors must communicate, and this takes time. So for maximum benefit it is advisable to reduce communication, so 'fast' parallel programs have to be designed with this in mind. Constant communication (aka synchronization) would require the processors to wait and/or cause cache invalidations.
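
    A toy model of that trade-off, offered as a sketch under stated assumptions and not as a measurement: run time is a serial part, plus a parallel part divided evenly among the CPUs, plus a fixed synchronization cost for every CPU beyond the first. The function names and all numbers are hypothetical.

```python
def smp_time(serial_s, parallel_s, comm_s_per_extra_cpu, cpus):
    """Modelled run time on `cpus` processors (toy model, see assumptions above)."""
    return serial_s + parallel_s / cpus + comm_s_per_extra_cpu * (cpus - 1)


def smp_speedup(serial_s, parallel_s, comm_s_per_extra_cpu, cpus):
    """Speedup relative to running the same job on one processor."""
    one_cpu = smp_time(serial_s, parallel_s, comm_s_per_extra_cpu, 1)
    return one_cpu / smp_time(serial_s, parallel_s, comm_s_per_extra_cpu, cpus)


ideal = smp_speedup(0, 100, 0, 2)     # no communication at all
chatty = smp_speedup(0, 100, 10, 2)   # 10 s lost to synchronization
useless = smp_speedup(0, 100, 50, 2)  # communication eats the whole gain
```

    In this model two processors give a factor of 2 only when they never talk to each other; with enough synchronization the second processor gains nothing.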

    --
    Woe be on to them, all who rise against poor people, shall perish in a the end. Buju Banton
  40. Re:Intel's FPU does suck by Mr.+Piccolo · · Score: 1

    Well, considering how brain-dead the Intel floating-point implementation is, it's a wonder anyone else managed to implement it at all, let alone faster than the people who invented it.

    Anyway, the original poster is correct; Intel's hybrid stack/register-memory FPU architecture is a joke when compared to just about any other CPU (except X86 clones).</flamebait>

    --
    Glückwünsche, haben Sie Slashdot ermordet, indem Sie zum korporativen Druck beugten und Subskriptionen einlei
  41. Re:New category by log0n · · Score: 1

    BeOS lite has SMP support. It recognized both of my Celerons (using that little graphical CPU program, can't remember the name of it).

  42. Re:Flamebait?? by Anomie-ous+Cow-ard · · Score: 1
    and will argue and fight over them until we are Borg.

    i don't think we'll ever be Borg. How would the collective survive with "Emacs vs. Vi", "Linux vs. *BSD", "deb vs. rpm", "My favorite distro vs. Every other distro", and so on?

    Besides, what's "real time" for a complex mathematical computation? Fast enough for video games?

    --
    perl -e'$_=shift;die eval' '"$^X $0\047\$_=shift;die eval\047 \047$_\047"' at -e line 1.

  43. best CPU for FP? by dutky · · Score: 1

    For serious FP work the only reasonable choice is an Alpha: x86 just doesn't deliver very good FP performance. If you check out the Spec-95 scores (at http://www.spec.org/osg/cpu95/results/cfp95.html) for various systems you will find that, on SpecFP-95, Alphas out-perform almost all other systems by about 2:1 (with the exception of PA-RISC where the difference is only about 3:2). While x86 systems have been able to scale up the integer performance of the architecture, the FP performance has lagged considerably, probably due to the stack-oriented FP register design (although it is possible that neither Intel nor AMD has seen FP performance as a prime motivator for sales, and hence they haven't felt the need to push it in their designs).

    1. Re:best CPU for FP? by JacobO · · Score: 1

      I use a couple of PA-RISC boxes. They rock. Most stable boxes I've ever dealt with, and they handle unexpected load like a dream.

      If I were shopping for a Unix box (and money wasn't first on my mind!) I'd get HPs.

      I just wish they'd give you an ANSI C compiler (as in free).

      Sequent NUMAs are nice too (sorry IBM :) but not so stable IME.

    2. Re:best CPU for FP? by RallyDriver · · Score: 1

      Actually, for sustained throughput, HP's PA-RISC fares a lot better than Alpha on many problems, because it has much better memory / cache utilisation. A 200MHz PA-RISC will leave a 600MHz 21164 choking dust on a big FE job.

      SpecFP only measures the CPU core; it doesn't stress memory bandwidth, which is what really separates a supercomputer. NEC's SX-5 (the world's fastest vector system) only turns over around 4.4GFlops / CPU peak, but it will do so all day long and has around 100x the memory performance of an Intel BX or VX or NX motherboard. It costs a little more than 100x the price of a PC though :-)

      Of course YMMV with your code.

  44. Re:Athlon Has a Superior FPU by erikn · · Score: 1

    AMD Athlon has a superior FPU at the same clock speed

    However, I have yet to see any multi-proc Athlon boards. If your problem has chunks that can be done in parallel, SMP would be what you'd want.

    Anyone have a date (or a contradiction) on the availability of those boards perchance?

  45. Re:Broader View by ravenwing_np · · Score: 1

    However - and here's my big complaint - there's still no SMP Athlon! That really, really sucks. Considering that the Athlon is down to $1 a MHz for the mid-range speeds (e.g., 700 MHz or so), it's almost a crime that there's no SMP motherboard available. A two or four processor Athlon system costing less than $2000 could probably do the same amount of rendering as a $10,000+ Alpha system. It's a REAL shame.

    The processor must have on-chip support for SMP. If they never allocated the die space for those SMP commands, no amount of begging, praying or hacking will give you SMP. Since the Athlon has been out for a while and no one has produced an SMP machine from them, I'll say they never planned on it. Can someone else back me up or shoot me down?

  46. Re:The stupidest question I've ever heard by JBv · · Score: 1

    Not really. The discussion of CPU speed for specific tasks is a very important one for scientific applications, which usually only require a small set of programs.

    I would direct the interested people to the "Computational Chemistry List" (CCL) which, among other things, is also interested in CPU x CPU performance.

    http://www.ccl.net/chemistry/

    Browse the archive. This week some of the mailings regard Athlon x p3. This is probably not the best source for this kind of information, but may still be useful for some of you.

    They also have some links on Beowulf and such.

    JBv

  47. Alpha not that expensive by gwolf · · Score: 1

    I recently bought a complete Alpha system at The Linux Store. It
    runs really nicely. Besides, it cost me more or less the same
    as a good Intel system to be used as a server - US$2500.
    600MHz, 128MB RAM, 13GB. I could not be more satisfied
    with it.

  48. StrongARM info by halbritt · · Score: 1

    Check out Intel's Developer Site for info.

  49. My very recent experiences by Raleel · · Score: 1

    I have been doing a lot of benchmarking at work... in this case, benching P3 Coppermines against R10k and R12k Octanes, as well as UltraSPARC IIs. Here's what we have found. In all cases, the P3s perform admirably well. Let me give you a brief rundown of system specs:

      bob - P3 Coppermine w/ SDRAM: dual 700s w/ 1 gig of RAM, 18 gig SCSI3 LVD
      smith - P3 Coppermine w/ RDRAM: dual 733s w/ 1 gig of RAM, 18 gig SCSI3 LVD
      joe - SGI Octane, dual R12k processors (I think they are ~400 MHz, but I could be mistaken... I am not the SGI guy), 2 gigs of RAM, SCSI HDD
      blow - SGI O2, single R12k, 300 MHz, ~375M RAM
      ralph - dual UltraSPARC II 450 (almost positive on this), 2 gigs of RAM, SCSI HDD
      julio - single P3 (not Coppermine) 600, 128 megs of RAM

    On floating-point-intensive calculations, the Pentiums get waxed, and fairly hard (smith takes about half again as long as ralph). Joe also does better, with smith taking about 1/4 again as long. On integer calculations, things start getting quite a lot closer, with the P3s getting within 15% of the Octanes and UltraSPARC.

    When the calculations exceed the memory of the system (notable on the O2), they start hitting the disk. For fairness, we'll compare julio (a $1700 machine) and blow (around $8k). Julio at least equals it in virtually every benchmark we ran, and apparently can access its disks (ATA66) much, much faster. Blow ends up taking almost double the time of julio.

    What we did notice was that RDRAM does make some difference, but not in everything. If you do have to hit the RAM (as opposed to the cache, which one of our calculations manages to stay in), it can improve performance by up to 20%. We did not test any Celerons or Athlons. I would expect the Athlon to be very, very good on the FPU. Unfortunately, I know of no dual Athlon configurations.

    Now, I want you all to know, I am not a pro at this. I do not have all the knowledge of some of my coworkers working on this. I can say that the jobs we are running are short (less than 6 hours, as opposed to the days running on the big supercomputers), but the trends are holding across the board. I did not put up any hard numbers because I do not have them here. I will probably do so later.

    Personally, if I wanted a desktop compute server, and I had a certain $$ limit (let's say 30k), I'd take a small cluster of dual P3 Coppermines or Xeons with RDRAM and SCSI over the same $$ figure Sun or SGI. Interconnects of gigabit Ethernet, of course ;)

    Now, before you go thinking I am endorsing anything, I am not. Neither is my company. Of course, you don't know who my company is (my info is out of date ;). These are just my preliminary experiences with it. I'll inform you more as we get in more machines to test.

    --
    -- Who is the bigger fool? The fool or the fool who follows him? --
  50. 6510 CBM by NiggaPet · · Score: 1

    Nah nah you guys, first you have to set up a beowulf cluster of c64's. then you can crank some numbers

    seriously, i think the best solution is the cheaper one: two 700 or 800 mhz chips will probably outperform a single 1 ghz chip, and will probably be cheaper since you avoid the premium pricing on 1 ghz machines

  51. Re:New category by jmauro · · Score: 1

    Celerons can only come in pairs. If you want more than two processors, you're going to get only the Pentium Pros or Pentium II/III Xeons. And pay a small fortune. We'll see Athlons with 4 processors, but not for some time.

  52. Athlon by Sea++ · · Score: 1

    All Xeons are just the base processor plus more and faster(??) cache. Hence the size. Compared to the Athlon, the P!!!'s have pitiful FPUs, or shall I say fewer FPUs. 3 fully pipelined FPUs on the Athlon, and only 1.5 on the P3. I think ArsTechnica had a good architecture piece on the difference. Also had one for P3 vs G4. (summary: good hardware, crappy OS. LinuxPPC fixes that.)

    I'd say right now the Athlon 1000 is _the_ fastest raw chip. Coming down the pipeline, the fastest will be: Wildcat (Athlon 1000 with cache running at full clock speed; current ones run at 1/2 to 1/3), then Sledgehammer (AMD 64-bit proc. Gonna rule!)

    Hmmm, first useful post?? ;)

    --Sea++

    1. Re:Athlon by zozie · · Score: 1

      When I did my master's thesis, I did some very extensive calculations using Mathematica _without_ X windows running.
      I used "math" with an input script to do an overnight job and killed most other processes. I found this saving of memory a great advantage of Linux w.r.t. Windows, where this would be totally impossible.

    2. Re:Athlon by Anonymous Coward · · Score: 2

      These are just some personal experiences with number crunching in a scientific environment - oodles of 32-bit and 64-bit floating point and integer operations, crunching huge amounts of data on a daily basis.

      I am talking *nix here. The flavour is not really important, but on a number-crunching *nix machine you don't want X windows (except for Mathematica), and you definitely don't want it running your web/mail server.

      Any *nix really - it is not overly important with the exception that it must not be a "leaky" implementation, and it must have a good, optimised gcc and fortran90 implementation. Another useful language (among others) is perl (yes, perl).

      A single-CPU process on a Cray J90 will run at approximately the same rate as it will on an x86 (PIII-550), provided various operations are made on the same chunk of data (i.e. it is operating out of the P3's cache).

      A celeron will not perform as well as a PIII if there is not the same data having continued operations performed on it - this can mean that trivial programming decisions such as putting a do-while outside of your for-next instead of INSIDE can mean minutes in program runtime. What I am saying here is that CPU cache is of great importance if you are dealing with anything but small quantities of data.
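
      The loop-ordering point can be sketched roughly as follows. This is only an illustration, not the poster's code: it assumes a non-empty rectangular matrix stored row by row (as nested Python lists are; Fortran arrays are column-major, so the fast order is reversed there), and the function names are made up for this example. Both functions compute the same sum; the first walks each row in order, while the second jumps to a different row on every step, which is the kind of access pattern that tends to defeat a small cache.

```python
import time

def sum_row_major(grid):
    """Inner loop walks along one row, touching neighbouring elements in order."""
    total = 0
    for row in range(len(grid)):
        for col in range(len(grid[0])):
            total += grid[row][col]
    return total

def sum_col_major(grid):
    """Inner loop jumps to a different row on every step."""
    total = 0
    for col in range(len(grid[0])):
        for row in range(len(grid)):
            total += grid[row][col]
    return total

if __name__ == "__main__":
    n = 500  # arbitrary size chosen for the demonstration
    grid = [[float(r * n + c) for c in range(n)] for r in range(n)]
    for fn in (sum_row_major, sum_col_major):
        start = time.perf_counter()
        fn(grid)
        print(fn.__name__, time.perf_counter() - start)
```

      In an interpreted language the measured gap is usually modest; in compiled Fortran or C on arrays much larger than the cache it is typically far bigger.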

      Sun (Sparc), SGI (MIPS) and the other traditional number crunching processors work well, but a lot of this is the systems they are plugged into. A 4-CPU Ultra Enterprise 450 whips butt over something else that clocks at the same frequency - but remember the UltraSparc processors can do more than many others; and they are all-SCSI.

      The Motorola-family processors: 68K, PPC, etc are seriously useful for number crunching. We have tested clustering 200MHz PPC processor-based (603e) machines (actually Mac clones) and the results were impressive. Even single threads worked surprisingly well for some complex tasks. From people I have been speaking to, they have had similar results with G3s.

  53. Re:Do you know what real time means? by penguinboy · · Score: 1

    No, they definitely need the 4004. Might be tough to find one though ;)

  54. Re:R.A.I.P. by wavelet2000 · · Score: 1

    Not so. Solving linear systems via LU factorization can be multithreaded or parallelized, even if not in an obvious way. For example, take a look at the MPI implementation in ScaLAPACK.
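
    To make the claim concrete, here is a minimal sketch of LU factorization (Doolittle form, no pivoting) in plain Python. It is not how ScaLAPACK does it; ScaLAPACK uses block-distributed algorithms over MPI, and this toy version only assumes a small square matrix with non-zero pivots. The name lu_decompose is made up for this example. The thing to notice is the loop over i: for a fixed pivot k, each row update is independent of the others, so those updates are the part that can be shared among processors.

```python
def lu_decompose(a):
    """Return (l, u) with l unit lower triangular and u upper triangular, l*u == a."""
    n = len(a)
    u = [row[:] for row in a]  # work on a copy of the input
    l = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for k in range(n):
        if u[k][k] == 0:
            raise ZeroDivisionError("zero pivot; this sketch does no pivoting")
        # For a fixed k, every row i below the pivot is updated independently
        # of the other rows, so this loop could be split across processors.
        for i in range(k + 1, n):
            factor = u[i][k] / u[k][k]
            l[i][k] = factor
            for j in range(k, n):
                u[i][j] -= factor * u[k][j]
    return l, u
```

    The outer loop over k stays sequential, because each step needs the result of the previous elimination step; that is why the parallelism is real but not obvious.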

  55. Re:How much are we talking about? by wavelet2000 · · Score: 1

    Compaq XP1000 with alpha-21264-667 with 1 Gb RAM can be had for around 10K. Also check out
    http://www.alphalinux.org for the list of vendors, including some in MST.

    And, no, alpha line isn't going to be discontinued any time soon. In fact, for Compaq, API, and Samsung it is high-margin business still.

  56. Re:Thinking for difficult operations by wavelet2000 · · Score: 1

    This is yet another reason not to use NT. Ease of use is not something you prefer over stability when running long jobs. Performancewise, NT is not taking full advantage of alpha. Try Tru64 Unix, or Linux/BSD

  57. PPC by macdaddy · · Score: 1

    Hands down, no contest. My 300Mhz PPC 604evMach5 had one helluva good FPU. It beat the G3 up until the G3 hits speeds of around 350 or so. The G4 just rocks my world.

  58. Re:Alpha 164sx by Ineversaidthat · · Score: 1

    I think today you can get cheap alphas of this type on ebay or in the respective *.alpha newsgroups.

    I'm posting from one. There was a 164SX/mobo on ebay today for ~$300US. 533MHz, 64bits wide, and an FPU that's untouchable anywhere else for the price. Uses plain PC100 DIMMs, too. Only way to go!

  59. ccc is a free download for axp-linux by Ineversaidthat · · Score: 1

    ...and yeah, it definitely outperforms gcc!

  60. Re:The icon by Ineversaidthat · · Score: 1

    I tried expanding the icon. Not enough resolution in it. How about just replacing the nondescript blue & white with slightly less nondescript black & white with a few yellow bits thrown in at the appropriate spots?

  61. I hate to ask a stupid question but.... by schuster · · Score: 1

    Who is the crackhead moderator that moderated this to +2? Don't bother moderating it down, I'm just curious. This is nothing against swordgeek, btw, but his comment really adds nothing to the conversation. Maybe that's the reason he got moderated up, he added nothing and lord knows that there aren't enough people on /. with absolutely nothing to say. Ah well, back to work.

    --
    --- Don't ever trust a woman until she's dead- B.B. King
  62. floating point by CAIMLAS · · Score: 1
    Floating point has a lot to do with mathematical computations. In such a case, the K6-3 would be a lot higher than the Athlon. I am not sure where the P3 comes in there, or the Alpha, tho.

    -------
    CAIMLAS

    --
    ~/ssh slashdot.org ssh: connect to host slashdot.org port 22: too many beers
  63. Re:The stupidest question I've ever heard by JacobO · · Score: 1

    Very good point.

    So which CPU offers the lowest latency in optimal conditions? For which OS? Does one perform context switches faster? Do either support any QoS features?

  64. Re:New category by JacobO · · Score: 1
    Can you even get a cheap 4-way SMP board?


    I got a couple of dual boards very cheap (used) but I've never even seen a 4-way board in local stores, or on the used-bits-circuit.


    Would be nice, I have to say!

  65. Two words: data dependences by p3d0 · · Score: 1

    instruction 1 would be on processor one and instuction 2 be on on the second processor

    Nope, because of data dependences. What if instruction 1 needs the result of instruction 2? How do you get it from processor A to processor B? Anything you can think of will be slower than simply running both instructions on the same processor.

    If you structure your program in such a way that the even-numbered instructions have no dependences with the odd-numbered ones, then what you have is two programs. Any interesting nontrivial program will have data dependences, and then there is no magic bullet to achieve parallelism. Everywhere you look, there's another trade-off.

    (I know: my research group does nothing but try to get decent performance (ie. linear scalability) from multiprocessor machines. :-)
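
    The difference between dependent and independent work can be sketched like this. It is an illustration only: the function names are invented for this example, and in CPython the global interpreter lock means the threads below show the structure of the split, not an actual speedup. In the first function every step needs the previous result, so there is nothing to hand to a second processor. In the second, the partial sums share no data, so they really are "two programs".

```python
from concurrent.futures import ThreadPoolExecutor

def prefix_chain(values):
    """Each step consumes the previous step's result: a true data dependence."""
    acc = 0
    out = []
    for v in values:
        acc = acc * 2 + v  # cannot start until the previous acc is known
        out.append(acc)
    return out

def split_sum(values, workers=2):
    """Partial sums share no data, so each chunk can run on its own worker."""
    chunk = max(1, (len(values) + workers - 1) // workers)
    parts = [values[i:i + chunk] for i in range(0, len(values), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(sum, parts))
```

    Even in the independent case, the partial results still have to be combined at the end, which is one small instance of the trade-offs mentioned above.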

    This leads me to your second suggestion:

    or just devide the program into two halves to be executed on each

    Sure, you can do that for trivial programs. Now show me how to divide Win98, with its 60 bazillion lines of code, into two halves.

    --
    Patrick Doyle
    I mod down every jackass who puts his moderation policy in his sig. Oh, wait a sec....
  66. ...and the pentium has a better FPU than Intel... by p3d0 · · Score: 1

    Um, Athlon IS AMD.

    --
    Patrick Doyle
    I mod down every jackass who puts his moderation policy in his sig. Oh, wait a sec....
  67. How did this get moderated up? by p3d0 · · Score: 1

    This is a smart-ass answer to a perfectly legitimate question. The example is contrived and bears no resemblance to the real issues faced when choosing CPUs.

    This post brings up an interesting (though fictional) point, but +5?? Gimme a break.

    --
    Patrick Doyle
    I mod down every jackass who puts his moderation policy in his sig. Oh, wait a sec....
  68. Re:New category by Chao · · Score: 1

    if you want to see a real increase in speed, each individual program. nope to the latter.

    on that note, still have to try out BeOs professional's multiprocessor support :)

  69. Alpha 164sx by gpoul · · Score: 1

    About a year ago there were Alpha 164sx computers on the market which had a large computational power but not that much of 2nd level cache and not high I/O performance. The price was the same as a good personal computer.

    I think today you can get cheap alphas of this type on ebay or in the respective *.alpha newsgroups.

  70. Re:Alpha Alpha Alpha! by Monster+Zero · · Score: 1

    What I think you are excluding in your price/performance analysis is the architecture under which the SGI operates. It will give IMMENSELY more performance using all 8 CPUs than 8 separate Athlon systems, especially when you have tasks which can use the same memory spaces and such. If you check the top 500 supercomputers list, SGI is basically the top supplier of high-end supercomputers (obviously owing to the fact that they bought Cray a few years ago). If you look closely, their top-end systems with only 256 CPUs beat the shit out of systems with 1000s. As with all "Supercomputers" vs. clusters, you pay for the ability to intercommunicate b/w the CPUs at an incredibly high rate, way beyond normal PC architectures... so in this scenario, the CPU is not the bottleneck. BTW, to stay on topic, I would suggest the Athlon for your real-time computations (for reasons already stated: better FPU, faster core...) *anyone get the reference?*

  71. Re:Architecture makes the difference by meepzorb · · Score: 1
    Your best bet would be an SMP Alpha. With a better floating point unit than Intel and AMD, and an outright faster CPU, you'd get a lot more done in less time.

    I've never heard of Alphas being used in "real" real-time embedded systems. The chip runs too hot, has too high a per-unit cost, and can have reliability problems in harsh environments (again, because it runs so hot).

    Same reason x86 is not that popular in the embedded world. I usually see PPC, ARM, and 68k (yes, they still use them). In applications where raw number crunching is really a hard requirement, these computations aren't done on the main processor at all: usually a special DSP chip (say one of TI's) would be used.

    Then again, we may be talking apples and oranges here. the Slashdef of the term "real time embedded system" seems to mean something very different from the definition used by industry.

    :Michael (embedded systems/aerospace geek).

  72. I Rate this article 'Flamebait' by ffatTony · · Score: 1

    Assuming this is not an April Fools joke, this is ridiculous flamebait. I'm afraid to even read the posts.

    Slashdot owners, for goodness sake, the only way you could promote even more terrible posts is to post articles such as "VI kicks ass, Emacs sucks", "KDE blows", "the Gtk icon makes more horny.", etc.... (These are just examples, not my opinion, don't get too upset:)

    I'm ashamed. There are so many real articles, what is this crap.

  73. Re:Couldn't you just analyze the program? by CmdrPinkTaco · · Score: 1

    From what I understand from the OS class that I took last semester, there is a limit to the percentage of code that a compiler is able to automatically detect as concurrent. The limit is due to weak algorithms and not due to OSs. Currently the limit is somewhere around 30%, but I could very easily be wrong.
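
    To put the 30% figure in perspective, here is a small worked sketch using Amdahl's law, which the post does not mention by name. It assumes the quoted 30% is the fraction of run time that can be spread across processors, that the rest stays serial, and that there is no communication overhead. The function name is made up for this example.

```python
def amdahl_speedup(parallel_fraction, processors):
    """Upper bound on speedup when only part of the run time can be parallelized."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / processors)

if __name__ == "__main__":
    for cpus in (1, 2, 4, 8):
        print(cpus, round(amdahl_speedup(0.3, cpus), 3))
```

    Under these assumptions, two processors give at most about a 1.18x speedup, and no number of processors can do better than 1/0.7, or roughly 1.43x.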
    --
    Please give your mod points to others, Im at the cap. They will appreciate it more
  74. Do your homework on the ALU by EEE · · Score: 1

    In the growing discussion of which processor is better, AMD's Athlon or Intel's Coppermine, it is easy to think that clock speed is the only factor. 1GHz is nice, but until AMD uses the Thunderbird core you have nothing but a glorified K75. AMD is also known to do well on floating point calculations, but for integer math Intel rocks. What really needs to be considered is the specs on the ALU. The true speed of your machine still depends on the fetch-execute cycles, but the real delays for number crunching depend on the ALU. It is the ALU that determines how fast and precise your machine will be able to do real-time calculations. And also please use Linux. For mission-critical applications Linux is the only choice. You might also want to consider gravitating toward a 64-bit machine rather than the standard 32. If this is an option, wait for the Sledgehammer, or if you can't wait, buy a Sun (or the MAJC); the performance hits are too great for the new Intel box, Itanium, which uses software for its 32-bit emulation on 64-bit hardware.

  75. Re:Do you know what real time means? by mbyte · · Score: 1

    Ahh .. finally someone who gets it right ! ;)

    as for the processor .. Z80 should be enough, if you set that guaranteed time high enough ;)

  76. bang for the buck? by darkrot · · Score: 1

    With prices nowadays, you can throw together a cluster of Athlon-650s for about $500-$600 apiece. This is about $200 cheaper than its Pentium cu-mine brother, and edges out the cu-mine in performance just a bit. I've had this configuration up and running for over a month now, and it's a cheap, fast cluster. Forget the dual cpu celerons, they'll still cost you twice as much.

  77. Intel by BobLenon · · Score: 1

    It is my understanding that Intel would be better for real-time floating point operations. Though, I realize this might've been more true a year ago than now. Sure, the Athlon has a great bus (Alpha's) and more cache... but the Intel has the FPU. Of course, multi-processing it would be even better...

    --

    /* Lobster Stick To Magnet!*/
    1. Re:Intel by JDevers · · Score: 1

      That year ago comment is the most important aspect of your comment...the Athlon FPU SMOKES the CuMine FPU...

    2. Re:Intel by bmajik · · Score: 3

      You are 100% wrong. Intel has NEVER had a respectable FPU. Ever. AMD's Athlon FPU destroys the intel chips. The only thing you might be thinking of is how the early quake games required a Pentium CPU because the id boys hand tuned for the pentium FPU.. 486-class amd/cyrix chips obviously ran horribly here...(and were legitimately outclassed by the pentium FPU at the time)

      another thing.. what exactly is a "real time floating point operation"? RT operation seems like it should have a lot less to do with a CPU than it would a system, and more importantly, the software side of that system. RT computing has a lot more to do with enforceable upper bounds on wall clock time for a given operation than it does with 'fast fpu'. How a CPU deals with interrupts and cache misses, and under what conditions, and how this interacts with FP ops _might_ make this "topic" relevant.. but i can't believe that some people are saying "AMD" and others are saying "intel" and not expanding any more..
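
      The "upper bound on wall clock time" idea can be sketched with a few made-up numbers. Everything here is hypothetical: the latency samples, the 5 ms deadline, and the function names are assumptions for illustration, not measurements from any real system. The point is that a system which is faster on average can still fail a real-time requirement, because only the worst case counts.

```python
def mean_latency(samples_ms):
    """Average time per operation: what a 'fast FPU' argument looks at."""
    return sum(samples_ms) / len(samples_ms)

def meets_deadline(samples_ms, deadline_ms):
    """Real-time check: every single operation must finish within the bound."""
    return max(samples_ms) <= deadline_ms

# Hypothetical per-operation latencies in milliseconds.
fast_but_jittery = [1.0, 1.0, 1.0, 9.0]  # e.g. an occasional cache miss or interrupt
slow_but_steady = [4.0, 4.0, 4.0, 4.0]
deadline_ms = 5.0

if __name__ == "__main__":
    for name, samples in (("jittery", fast_but_jittery), ("steady", slow_but_steady)):
        print(name, mean_latency(samples), meets_deadline(samples, deadline_ms))
```

      Here the jittery system has the better average (3 ms against 4 ms) but misses the deadline once, so by this definition only the steady one qualifies.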

      --
      My opinions are my own, and do not necessarily represent those of my employer.
  78. Re:I don't think you understand what you are askin by samantha · · Score: 1

    Good information. But you can take your geeker than thou "what a lamer" attitude and stuff it where the sun don't shine.

    Even the most geeky of us have our spots where we aren't so well-informed. If we can't ask each other without being spoken to or about as if we are total idiots then something is freakin wrong.

    Why not just answer the technical question without trying to second guess the questioner's knowledge, skills and general psychology?

  79. Ask /. picked at random? by Big_All · · Score: 1

    Oh my god what is next!

    "Uhhh hey slashdotters, uhhh what graphics card company is best for realtime high end video.
    My friends and I have been arguing this for a while.
    I say it's Nvidia, while Johnny says it's gonna be 3Dfx soon, and my other bud Petey insists it's S3.

    Of course budget constraints are a major issue - I must be able to purchase the card with the $300.00 I've saved up from my allowance & shoveling walks all winter. So what do you guys think?"

    I'm sure there are waay better questions submitted to Ask Slashdot than this - Aren't there???

    --
    "Uhmmm this might sound a little paranoid but, I want shielded twistedpair. I figure if I wear a tinfoil hat, my data s
  80. UltraSparc IIi has 2mb cache by one_who_uses_unix · · Score: 1

    Consider the Ultra-10 from Sun. You can get a 400mhz Ultra-IIi CPU with 2mb cache that just SCREAMS for less than $2000. The UltraSparc kicks booty when dealing with FP. Another good choice would be the Alpha line of CPUs: very strong FP performance for a very reasonable price tag. The PIII is a weak also-ran when it comes to scaling with multiple CPUs vs. the UltraSparc. Sun ships systems with 64 CPUs; just try and find an Intel-based multi-processor stacked like that.

    --
    KK4SFV
  81. Re:Couldn't you just analyze the program? by jmccay · · Score: 1

    My assembly is a little rusty, but I don't think it could be done the way you're suggesting. You'd have to break up the code in a way such that the second instruction doesn't need the values of the registers from the first instruction.

    --
    At the next eco-hypocrisy-meeting, count the private jets used to get to the meeting. Should be interesting to see that
  82. Re:400-CPU real-time number cruncher by lscoughlin · · Score: 1

    No.... This is not what goes into a jet fighter, at least not any of the current production ones. The F-16 runs off what is basically an 8086.

    Hunh, whodathunk your XT could star in iron eagle XII.

    Anyway, that's one of the reasons for all of the slashdot dubbed insane US export restrictions... because some of the most powerful "modern" war machines are laced together from "old" technology.

    -T

    --
    Old truckers never die, they just get a new peterbilt
  83. Re:Oh yes it is used in fighters by lscoughlin · · Score: 1

    Ground to air, not air to air.

    remember, Sparrow and Phoenix AAMs have been in use since the early 80's... Tomahawk cruise missiles were designed in the 70's and most were manufactured in the mid to late 80's... that kinda precludes G4's doncha think...

    Anyway, i was specifically referring to the fly-by-wire subsystem of the F-16, which is in fact... an 8086.

    Have a nice day.
    -T

    --
    Old truckers never die, they just get a new peterbilt
  84. Re:Motorola for real time, not Athlon or PIII! by alprazolam · · Score: 1

    a 68k for realtime? no i dont think so. try a dsp chip. thats why this is a moronic question. you want dsp chips for real time stuff, not anything these people are talking about. cs people dont know shit even though they like to think they do

  85. Re:Consumable Processor Units by nerdling · · Score: 1

    nono.. its ahh... not that ;)

    --
    [w00t@freaky.bish]# rm .signature
  86. Mips,Alpha vs. Pentiums by mattr · · Score: 1

    Oh P.S. the higher end HP is Pentium III.. and I am not sure what's in the Visualization Center but likely Pentium III.

  87. Re:OT: Bug Report! NT 4.0 with SP5 Fails DST by witz · · Score: 1

    I have 30 NT machines running SP5. Every single one was set to GMT-5:00 with DST. Every single one rolled over perfectly. You're full of it.

  88. Re:multi-threading by odysseus_complex · · Score: 1
    Possibly, but only if the VM that you are using can handle multiple processors.

    Now, the big question: How can I tell? You can tell that a VM can utilize multiple processors if it uses native threads and the OS is capable of scheduling threads across processors. The green-threads VMs use a single OS thread so, unless it's doing something really weird, using multiple threads in Java on one of these ports will only use a single processor.

    If you installed a second processor you might see a performance boost, but that would only be because the OS is scheduling processes away from the processor running Java.

    And now, because you just happened to mention Java on a thread discussing real-time (embedded) computing, I have no choice but to mention NewMonics, Inc..

  89. multi-threading by eshaft · · Score: 1

    so, if I'm doing a lot of programming in ultra-portable (laugh), OS-independent (bigger laugh) Java, and I have a bunch of threads, should I see a performance difference in Windows or Linux or BE because of the underlying multi-threading support with multiple processors? Would I just see a boost if I installed another processor?

    --
    lf.o
  90. But doesn't having one CPU keep latency down? by slashdot-terminal · · Score: 1

    If you have just one CPU and use something like the real time linux patch on your machine coupled with a fast machine wouldn't that work?
    I mean if you can do something much faster wouldn't the delay also be much less?

    --
    Slashdot social engineering at it's finest
  91. How much are we talking about? by slashdot-terminal · · Score: 1

    And where can they be bought in the (MST) zone? I haven't seen too many around and not too many on pricewatch. Aren't they discontinuing the Alpha line?

    --
    Slashdot social engineering at it's finest
  92. How many FPU registers? by epeus · · Score: 1

    This is a key question. PowerPC chips have enough to keep a matrix, vector and result in registers. Pentiums don't.
    Altivec PPCs have even more. This makes a huge difference to lots of mathematical calculations, iff your compiler respects the register keyword or allocates them sensibly (eg for parameter passing).

  93. Re:OT: Bug Report! NT 4.0 with SP5 Fails DST by jerdenn · · Score: 1
    sure your server date is correct? I have multiple NTSP5 servers/workstations, and everything worked -like.. well, like clockwork, actually... ;-)

    -jerdenn

  94. The abacus by Ukab+the+Great · · Score: 1

    The most kick ass processor of our time. Has incredibly low power consumption, does not contain any closed proprietary technology owned by a single company, and has had several thousand years of debugging. Though it might take a lot of bead pushing to get good frame rates on Q3 Arena.

  95. Broken Question by Artie+FM · · Score: 1

    Hey slashdot I have a question for you: which CPU will serve my web pages fastest?

    You can't answer that can you... you need to know what kind of web pages I'm talking about.. what kind of web server I have, what OS, etc..

    In a similar way this "ask slashdot" question is broken. What kind of "high end math", what kind of "real-time" are we talking about?

    In general your OS and driving program matter a lot more than anything else for real time. Real time almost always means not running as fast as possible.. instead it means running at a predictable rate.

    --
    Be insightful. If you can't be insightful, be informative.
    If you can't be informative, use my name
  96. IBM Power3 by hardave · · Score: 1

    Well I guess it all depends on whether you're doing floating point or integer work. Intel-based for integer work, but for floating point I would have to go for the IBM Power3. This is the chip in their new RS/6000 44P-270. We just got 4 of these boxes at work on Thursday, and believe me, they are sweet boxes. They can scale up to a 4-way with 8 gig of RAM. I haven't had a chance to run real benchmarks, but seti ran 2-3 times faster than on the 43P-260 Power2. One work unit took about 4 to 5 hours I believe. For high-end research, this is the box to get for sure!

  97. Re:Architecture makes the difference by randombit · · Score: 1

    Why use the Athlon for integer stuff. The Intel Pentium 3 has a faster specint95 score...

    Hmmm... that's funny. Ars Technica compared a 600 Mhz Athlon and a 600 Mhz P!!! and concluded that "Clock for clock, the Athlon's integer performance is undoubtedly superior to Intel's P6 core-based CPU. But heck, AMD's K6-III was, as well."

    Typical bloody slashot AMD supporter with no regard for facts.

    I am an AMD supporter. I like the Athlon. I detest Intel for many things, including inflicting the horror that is the x86 architecture upon the world. Alphas rock, and Athlon is as close as I'm going to get to an Alpha I can play Freespace on. I'm sorry if you don't like it, but AMD is shipping 1 Ghz CPUs. Intel's is coming out... oh, yeah... Q3!!

  98. Re:Not x86 by randombit · · Score: 1

    you can make things even more efficient and not run an OS at all.

    So you're suggesting he run DOS? :) Sorry, couldn't resist...

  99. Re:Broader View by jxxx · · Score: 1
    The processor must have on chip support for SMP. If they never allocated the die space for those SMP command, no amount of begging, praying or hacking will give you SMP. Since the Athalon has been out for a while and no one has produced an SMP machine from them, I'll say they never planned on it. Can someone else back me up or shot me down?

    Say it ain't so! On the issue of SMP support, having support on the processor could certainly be a help. However, lacking it (processor ID) does not bar the possibility. Otherwise, things like 32 processor Xeon systems wouldn't fly, as IIRC, the processors only have support for 4 IDs.

    On the Athlon, I've read that they do indeed have 'support' in the manner mentioned above. references:AMD

  100. Re:the 68000 BABY!! Still the fastest by SonofRage · · Score: 1

    No, they upgraded from a 386 to a 486. I remember because there was all this talk at the time about whether or not Intel's upcoming 64-bit CPU would be used, but NASA said no way because they needed a chip that would work without a fan.

  101. Re:the 68000 BABY!! Still the fastest by SonofRage · · Score: 1

    Actually, now that I think about it, it was the Hubble Space telescope that was upgraded from a 386 to a 486

  102. Alpha! by JeremyH · · Score: 1

    I would say, in order of preference:

    1. Alpha
    2. Ultra SPARC II
    3. Athlon or G4 PPC

    I leave the PIII completely out of the list because without SSE it's a piece of shit. And SSE is probably not much use in dedicated mathematical computations (nor are MMX and 3DNow, but the Athlon's FP unit beats the PIII even without 3DNow help).

    Also, to really get good mathematical performance, you will need really efficient code, i.e. assembler or FORTRAN compiled with a well-optimized commercial compiler like Portland.

    --
    -JeremyH
  103. Real-time requirements do not follow Moore's Law by Wolfier · · Score: 1

    Brief definition of Real-Time:
    "response time < some predefined threshold"

    Does the choice between IA or AMD or PowerPC or Alpha or anything that is fast enough matter now?

    Thanks to Moore's Law, I don't think so: merely two years ago, the high end was 300 MHz. Now we have 1 GHz.

    As long as the real time requirements of most applications do not follow Moore's Law as well, chances are you can pick your favorite processor.

    The OS seems to matter tho - different scheduling algorithms, and even quantum length, largely determine the "worst case response time". That's what makes QNX a realtime OS, Windows CE not, despite what some people may want you to believe.

    So far, the requirements of the OS seem to be...

    1. do not use RAM as cache
    2. programs must all fit in physical memory
    3. a right scheduler with quantum suitable for real-time

    So I'd say as long as you have a real-time-oriented OS, it does not really matter what processor you use.

  104. AMD Vs. Intel by dianos · · Score: 1

    The latest AMD chip, the Athlon, has a much superior design than any Intel chip that has been released yet. That goes for the FPU as well. The problem, however, is in everything else but the chip; in other words, the whole package. Intel, being a much larger company, has much better manufacturing plants, has control over the motherboard makers and whatnot. All these things are slowing the Athlon's progress down. However, AMD is not giving up in this little chip war, so we should see it ahead by one step at least for a little while.
    Because there are so many differences between the architectures, you have to consider the whole package, not just the CPU, in designing a "faster system". What is this system's main functionality? 3d gaming, web server, 3d graphic design?

  105. Re:New category by sopwath · · Score: 1
    The program is called Pulse. The BeBox had two mac processors (I don't know which ones) in it, and two rows of lights on the outside of the case. Each row of lights corresponded to the same info the Pulse program displayed. Kool!

    I know OF a link to an open standard for Be that shows how to do the same with newer computers but I cannot remember the actual address. Sorry.

  106. Re:Architecture makes the difference by SWroclawski · · Score: 1

    Though cost isn't really that big a difference if you go with a small shop like Patmos International. http://www.patmos-international.com

    If you call them, you can customize a rather cheap and very powerful machine at a cost:performance ratio far beyond anything those 32 bit processors can do.

  107. Here's what I use: by chandler · · Score: 1

    I find that my dual PII 400Mhz works just fine - use it for things like factorization, etc, and it's fast (as well as cheap - $120/processor, $170 for Tyan Tiger 100 mobo). If cost is your concern, try that.

    "The romance of Silicon Valley was about money - excuse me, about changing the world, one million dollars at a time."


  108. Re:New category by kcarnold · · Score: 1

    > Hmm, interesting. It seems we've got ourselfs a new category: upgrades.

    The stupid English language says that it's spelled 'ourselves'.

    Really, look at the icon. beady thing (abacus?) => printing calculator => laptop (is it just me, or do I see <evil>clouds</evil> in the background?). Let's add => Linux :-).

  109. Re:The stupidest question I've ever heard by Punto · · Score: 1
    What does real-time have to do with this?

    I think the question is "which chip will calculate the position of Lara Croft's polygons faster?". That's the real question.. Polygons have to bounce around, and you need good fps to watch the polygons bounce fluently.

    --
    Stay tuned for some shock and awe coming right up after this messages!

  110. One word. Sledgehammer. by Bushwacker · · Score: 1

    AMD's clock rocks! Besides just being faster, they are comparatively more powerful when run at the same clock speed. Plus, AMD's Sledgehammer (1.2GHz) will completely kill Intel. For a few months, at least.

    --
    -----------------------------------------
    Perversely greped and groped by PowerPenguin
  111. No reason to bother Slashdot? by Aquitaine · · Score: 1
    ...No reason to bother Slashdot...

    It does take some of us a little longer to achieve true uber-geek status, you know; not all of us are born into a Slashdot queue. I would hate for someone to get the wrong idea about Slashdot by seeing such an informative piece (as yours was) ended by a 'You are too dumb to be here' remark. If it is truly so below you to respond, then don't. You obviously know what you're talking about, but the arrogance that comes across in your statement astounds those of us who aren't quite as smart into not wanting to ever ask anything on Slashdot, because God Forbid it should be beneath you.


    -Aq
  112. "Real Time" computation. by Doctor+Wonky · · Score: 1

    The important question is exactly what "real-time" means. Broadly, I would see this as 'a sequence of computations that must be performed on data with a fixed amount of latency'. Often, we are talking about streams of data, and want low latency. There are two common reasons for needing low latency: 1) to have the data ready in time for a synchronous communication protocol, and 2) because fast response based on the data is necessary, such as with robotic control systems.

    Depending on how much you need to do, ANY processor can give respectable real-time performance. Certain processors might lend themselves to better timing predictability & lower latency while still offering excellent floating point performance. A perfect example would be Sun's new MAJC, which is designed pretty much from the ground up for low-latency streaming computation.

    As for the Pentium / Athlon matter, it's not so much the processor, it's the PC architecture. The PC (most notably the PCI bus) isn't really meant for working on streams of data, and timing can be completely unpredictable, depending on which devices have pending interrupts. If you take the Pentium & Athlon out of the PC architecture, I would expect one could achieve much better real-time performance. But I know of very few architectures for the Pentium, and none for the Athlon, so I expect a real comparison is impossible at this time.

    Also, multiprocessing has been mentioned repeatedly. There are two ways to increase real-time performance with multiprocessors. They can be connected in parallel and series. Each has drawbacks. Parallel can complicate programming & design, and is only really effective if the computations lend themselves to it. Serial connection would almost always be possible, but at the cost of latency.

    The simple answer is that the best real-time processor is the cheapest one that meets the needs of that particular application.

    Doctor Wonky

  113. Re:Broader View by afs · · Score: 1

    The Athlon has great SMP support. There just aren't any SMP-capable chipsets yet. The AMD 760 will be the first, to be introduced midyear (June-July). It will support 266Mhz EV6 bus, DDR SDRAM, etc. The EV6 bus is the same that the Alpha uses. Each CPU gets its own connection to RAM and the Northbridge. This removes contention for the bus you see with multiway x86 systems, dramatically increasing memory throughput.

    Via has stated they will not produce a SMP-capable chipset for the Athlon in the near future; it's not the market segment they're aiming for now.

  114. Re:well, actually by afs · · Score: 1

    Actually, gcc will almost certainly fall into that category within the next year. gcc is THE compiler for MacOS X. Apple has already added Altivec support themselves, as well as a host of PowerPC optimizations. They are now in the process of giving the FSF copyright to their some 50000 lines of modifications so they can get rolled back into gcc.

    Cool. gcc produces slower code than most optimizing compilers on most platforms now, so it's good to see people with a vested interest working on this.

  115. Re:Here's what I just did by johndr · · Score: 1

    I think Cliff headlined this one wrong. Read what the guy asked, it doesn't mention realtime anywhere. He wants to know how to do cheap scientific computing. Cheers John

  116. Re:Bug in Slashdot. by The_Messenger · · Score: 1
    If you enter ID and password without cookies on, then preview, it will appear as if you are logged in, but you are not. There's no crack.

    In a similar vein, if I have a Slashdot cookie with my ID set, I can turn off cookies, hit "reply" and write my comment, and even though it says I'm AC, if I turn cookies back on before I submit the comment, 95% of the time it will read the cookie and put my ID on the comment.

    In fact, I'm doing that now. ;-)

    I usually surf with cookies off (damn Doubleclick isn't helping me change my mind about that ;-). That's how I figured that out.

    Just remember: with cookies off, you can only submit; if you preview, the ID won't be saved when you go to submit. That's what the cookies are for, silly.

    --
    I like to watch.

  117. Re:What calculation? by The_Messenger · · Score: 1
    But if you had written your program in machine language from the beginning you would have a completely different architecture than an equivalent program written in C.

    Programming languages were invented to make coding easier. First, they make data abstraction easy, especially when you start getting into the realm of OOP. Secondly, by learning a widely-used language such as C, coders don't have to know the assembler language of each platform they work on.

    Now, hand tuning known slow spots in compile-generated code is fine, but any application that needs the tune-up is usually large enough so that doing the whole thing in hand-coded assembler is just ridiculous.

    What I would *really* like to learn is tuning up Java bytecode. That platform has potential, but is too goddamn slow for any critical work. Obviously the VM implementation is half the problem, but there are bound to be some optimizations that can be made to the bytecode.

    Hopefully speed will improve with JDK1.3. Of course, being a *BSD person, I'll still be waiting for a stable port of Java 2 SDK when JDK1.3 is released for Linux in September. :-(

    --
    I like to watch.

  118. Re:Couldn't you just analyze the program? by The_Messenger · · Score: 1
    Excuse my poor English. :-) You are correct, threads are what I meant to say.

    In the project code I am currently maintaining, processes are analogous to threads. Ugh. Perhaps that is the cause of my mental cloudiness.

    --
    I like to watch.

  119. New category by Wizard+of+OS · · Score: 1

    Hmm, interesting. It seems we've got ourselfs a new category: upgrades.
    About the problem: I think a multiprocessor would perform best. A cheap board with 4 cheap Celerons outperforms a 1GHz Athlon, but your programs need to be written for multiprocessors.

    --
    If code was hard to write, it should be hard to read
    1. Re:New category by maniack · · Score: 1

      I've also seen quad pentium pros at ebay, but these chips are relatively slow: 200 MHz.

      --

      "Control the media, control the mind."-Cabal

    2. Re:New category by Locutus · · Score: 2

      Use the threads Luke.

      If the OS supports threads, there is a good chance that if your applications are designed for threads then the OS will spread the threads across processors. Having come from UNIX in the 80's on x86, to OS/2 (very threaded) in the 90's (migrating to Linux now), I have to say there is nothing like a well-threaded application and an OS that really supports this (OS/2 and BeOS come to mind today). Nothing like it. In the 90's I was emailing all the x86 clone manufacturers in hopes they would attack Intel with multiprocessor systems, but they kept playing the MSFT/Windows game and there haven't been too many survivors. Windows didn't and doesn't thread that well (NT is only OK).
      The gist of it is, if your current apps aren't multithreaded or you don't run more than just a few applications at one time, SMP won't do you much good.

      --
      "Anyone who stands out in the middle of a road looks like roadkill to me." --Linus
    3. Re:New category by pe1rxq · · Score: 2
      Does each individual program need to be written for multiprocessors? Isn't it enough if the OS supports multiple processors?

      Usually you get the best results if a program can be divided into several processes; the OS can then spread these over all processors. I believe Mosix works this way. For Beowulf, the programs have to be specially written in order to use it.

      Grtz, Jeroen

      --
      Secure messaging: http://quickmsg.vreeken.net/
    4. Re:New category by SirEdward · · Score: 2

      Does each individual program need to be written for multiprocessors? Isn't it enough if the OS supports multiple processors?...

  120. Re:Consumable Processor Units by _fuzz_ · · Score: 1
    multiple CPUs will only help if there are multiple threads of your app running simultaneously

    While multiple CPUs may not help a single threaded app directly, they do allow the OS to offload other processes to the other CPUs, giving the single threaded app more processor time on one CPU. Generally, by adding a second processor you see a 50% increase in overall system performance.
    --
    47% of all statistics are made up on the spot.
  121. Re:Consumable Processor Units by _fuzz_ · · Score: 1
    C'mon. Read the thing again - this guy is obviously talking about games. That's what the original question was looking at. As many others have pointed out high-end mathematical computations aren't exactly real time material, unless you're talking about frame rates.

    Now, what I was saying about the second processor is accurate. On a generally loaded system where you have several things going at once, you will see about a 50% increase in performance. While Joe Gamer is playing Quake, the CPU is split between his crappy winmodem, the IDE disk trying to swap things in and out of his 32 megs of memory, his vector calculations before they get offloaded to his video processor, and his 3D positioning sound. And, since he pirated the game and doesn't have the music on the CD, he's got Winamp playing mp3's, too. You've got a lot of threads and processes going on there. The OS is going to split them up and you'll see about a 50% improvement with the addition of another processor (most likely an overclocked Celeron in this case).

    Okay, that's not quite what he was asking about either, but it's more in the spirit of it.
    --
    47% of all statistics are made up on the spot.
  122. Re:PowerPC by friedo · · Score: 1
    Hello. Dumb Mac Fuck here. PowerPC chips are far better than x86 in design. It's not the chips that are more expensive, it's the Macs. But of course, PowerPC chips are made by IBM and Motorola, not Apple, so you can build your own PowerPC box from a PPC Open Platform Board any day of the week.

    For massive floating point calculations I would recommend a PowerPC running Linux or BSD or an Alpha over Intel or AMD any day. x86 simply sucks - that's all there is to it.

  123. Re:Architecture makes the difference by gatekeeper-eu · · Score: 1

    Agreed. But the question mentioned only 'math'. What kind of math - no mention of 'vector' if so the G4 would be suitable at the lower end.

  124. wrong field probably by marx · · Score: 1

    The definition you have just described is the computer systems definition. In virtual reality contexts, real time means what it sounds like, i.e. the appearance of a real time flow. I don't think there exists an established rigorous definition in the literature, but it's commonly defined as a visual update frequency of ~30 Hz, and sometimes also as a correspondence between virtual time and real time. Presumably the question referred to the second definition.

  125. Alphas ... by (void*) · · Score: 1
    I used to work on a dual processor Alpha and dual processor Pentium IIs. Floating point operations were noticeably faster on the Alpha than on the Pentium.

    But the compiler probably has something to do with it. I hate to break the news, but DEC's proprietary optimizing C compiler consistently produced faster code than the GNU C compiler for that Alpha machine: a difference of 2 hours vs. 1 hour of computation (unloaded).

  126. Which OS? Not which processor. by PlaidSprayPaint · · Score: 1

    A real time computation means that there is input right now.
    I've got to analyze it right now.
    I've got to produce a result right now.

    And I will tell you that I trust my life on a regular basis to the QNX OS and no other. I'm a skydiving penguin and use an automatic activation device for safety.

    I'm not sure if I'd trust Windows CE or even my beloved Linux kernel to deploy my chute 750ft off the ground at 120mph.

    --

    Enforce Darwinism

    Crap, that stupid

  127. Re:the 68000 BABY!! Still the fastest by hartze11 · · Score: 1

    > btw, didn't Nasa recently (last 5 years) upgrade the Space Shuttle Main engine controllers with an embedded 68000 series chip?

    Maybe, but they put a 486 on board Hubble just a few months ago!

  128. Compute intensive--anyone can do it! by rthardy · · Score: 1

    Just want to point out that anyone can test their CPU at "high end mathematical computations" (distinct from real time). Run SETI or the Mersenne prime program for two different kinds of tests.

    I'm doing ~4.7 SETI units/day on a dual PII 400 running two Linux processes, or one unit per 30-40 hours on Win98. I don't think it's all in cache.

    HTH :-)
    --
    Tom Hardy
    rthardy@email.msn.com
    rhardy@blakeschool.org

    1. Re:Compute intensive--anyone can do it! by rthardy · · Score: 1

      Easily--I just provided a couple of datapoints. Note that SETI itself doesn't support SMP. I'm running two processes under Linux, and only one under Win98.

      --
      Tom Hardy
      rthardy@email.msn.com
      rhardy@blakeschool.org

    2. Re:Compute intensive--anyone can do it! by frinkster · · Score: 1
      I get about 3.7 SETI units/day (~6:24) on a single processor G4 that is unlucky enough to be running MacOS 9.

      Distributed.net also had good things to say about the G4 and its AltiVec unit. Check out this link. Basically, it said that the G4 was much better than the PIII, but I don't remember the exact numbers. The link wasn't working when I tried to look at it just now, but it is still on their website. Or at least links to it still are.

  129. DO you know what Super computers are for? by mekkab · · Score: 1

    Yeah, but THAT is not the advantage of a supercomputer. The reason why you're spending two orders of magnitude more in dollars is for the CPU interconnects! SGI has put a lot of time and energy into making sure that when one proc wants some data that's just been crunched on another proc, it'll get to where it needs to be, FAST. You don't want to get hosed on the context switch.

    --
    In the future, I would want to not be isolated from my friends in the Space Station.
    1. Re:DO you know what Super computers are for? by szyzyg · · Score: 2

      Ahhhh, so you make the CPUs slow enough that the context switch looks fast..

      Seriously - a couple of points
      (1) - It's pounds not dollars - yep - almost 400,000 dollars worth ;-)
      (2) - The DS20 has an even faster memory bus -- I mean Waaayyyy faster - I've seen benchmarks for 8 processor Alpha based Beowulfs and they're *still* faster than the SGI hardware.....

      Of course... I've always thought of SGI as being nice graphics platforms... so God knows why we got a big cabinet with no graphics card...

  130. to ALL my brothers and sisters in latency by mekkab · · Score: 1

    big up ya' chest to all of those who replied "you have no idea what 'real time system' means, do you?"

    It's not about speed/bandwidth, it's about predictable worst-case latency. BUT it's also about the ability to have a fully interruptible OS. I mean, the 'basic' definition of a 'real-time system' is a system that can make resources available when they're needed. And what kind of system are you using this for; do you have hard or soft deadlines?

    But I'm getting ahead of myself there...

    --
    In the future, I would want to not be isolated from my friends in the Space Station.
  131. Re:Couldn't you just analyze the program? by krogoth · · Score: 1

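    OK, try that with this (sorry for any errors, I don't really use printf):

    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        int a = rand();                   /* the next statement needs this value */
        printf("The number is %i\n", a);
        return 0;
    }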

    As you can see, it doesn't work. A program that uses multiple threads could easily be separated.

    --

    They that quote Benjamin Franklin on liberty and safety deserve neither.
  132. Daft question by jeff_bond · · Score: 1

    What on earth does the choice of CPU have to do with real-time performance? Surely it's the operating system that determines the response time of a system?

    Jeff

    --
    stty erase ^H
  133. Re:the 68000 BABY!! Still the fastest by homoted · · Score: 1

    Godamnit I admit I am wrong :P

    I had a little knee-jerk reaction because I believed it was a troll. Except I ended up being the misinformed troll.

    LOL

    --

  134. Re:the 68000 BABY!! Still the fastest by homoted · · Score: 1
    btw, didn't Nasa recently (last 5 years) upgrade the Space Shuttle Main engine controllers with an embedded 68000 series chip?

    NO! They upgraded from intel 386 to intel 486.

    Please don't try to change the facts only because you happen to like the Motorola 68000!


    --

  135. the 68000 BABY!! by BiggestPOS · · Score: 1

    It's all about the Motorola 68000. Can't be touched.

    --
    What, me worry?
    1. Re:the 68000 BABY!! by fishexe · · Score: 1

      No, gotta be z80. z80 all the way!!
      -------
      For the next generation of real-time applications,
      "Zilog Inside"

      --
      "I don't care about the Constitution!" --Bill O'Reilly, November 17, 2009
  136. Missing the point by selfandother · · Score: 1

    The processor _can_ make a large difference in mathematical computations, but quite a few of you seem to be assuming that the computational program is going to be written in a high-level language, and begin speaking of threading. What you seem to be forgetting is that for raw number crunching you need to get all of the redundant awkward arrgh-stupid-compiler code out of the way and (are you sitting down?) learn assembly. ...Preferably for an architecture designed with fp-math in mind, unlike intel or amd (superfluous graphical instructions all around). High-level is fine for huge apps with lots of "features," but to find the next largest prime in your lifetime, you have to convert to binary and get the nose to the proverbial grindstone.

    --
    "C'mon, this isn't rocket surgery." - Anon.
  137. Re:Not an x86 by pe1rxq · · Score: 1
    With 'real time mathematical problems' they probably meant 'lots of frames in my windoze shoot-em-up games'. So they are not likely to choose anything but x86. But if one of these guys wants to correct me, please do!

    Grtz, Jeroen

    --
    Secure messaging: http://quickmsg.vreeken.net/
  138. Processors by dorzak · · Score: 1

    Depending on whose benchmarks you use, it can be argued that either the PIII Coppermine or the Athlon has a faster FPU. However, the Athlon has the potential for a higher bus speed. When it comes to number crunching, though, RISC processors are where it is at. G3/G4 Multiprocessor board

  139. Athlon I'd say,BUT... by sokoban · · Score: 1

    ... the Motorola 7400 smokes them all in price/performance. I remember seeing an 8 processor 7400 setup. For pure GFlops, it's about as fast as you can get on a reasonable budget. A 500 MHz 7400 is about as fast as an Athlon 1500 in floating point.

    --
    09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0 is the magic number.
  140. Re:Couldn't you just analyze the program? by maniack · · Score: 1
    What is the absolutely cheapest dual processor system that one could get? Where is this sold?

    Try an Abit BP6 with dual celerons. There are even modification chips that let you use dual FC-PGA P3s. I don't know where you can buy a system, but it would be cheapest to build your own.

    --

    "Control the media, control the mind."-Cabal

  141. Re:What calculation? by TheNightAngel · · Score: 1

    Whatever. Yeah, printf is pretty hard to optimize further than it currently is. But if you had written your program in machine language from the beginning you would have a completely different architecture than an equivalent program written in C. Maybe it wouldn't be 50x faster but it would be an order of magnitude faster at least. But it isn't worth the effort; with Moore's law, who gives a damn - wait 18 months.
    Anyway, you didn't dispute a specialized computer vs. a general one. Think of how many more data blocks you'd be through on your SETI@Home account if you had a mini-refrigerator sized analog fast fourier transform computer plugged into your machine.
    The Night Angel

  142. Bug in Slashdot. by TheNightAngel · · Score: 1

    Ahem. Can someone retag the above post so it belongs to me? =) I have cookies turned off in my browser. I previewed my post once and the preview page had me marked as being logged in (as I entered my name & password on the original feedback screen). Then when I submitted it, lo and behold I went down in history as an anonymous coward. Heheh.
    Guess that makes me one of those ingenious fools to discover a crack like this ;)
    The Night Angel

  143. Re:What calculation? by TheNightAngel · · Score: 1

    =) Oh!!! Well structured and blocked C!. Heheh, you don't work for MicroSoft then =)
    The Night Angel.

  144. Re:OT: Bug Report! NT 4.0 with SP5 Fails DST by aclaudet · · Score: 1

    I have the same setup (NT4SP5) and everything worked just fine.

  145. Re:Thinking for difficult operations by veggiefish · · Score: 1

    Alphas are good, but you must remember that they are not supported for NT any more.

    (oops, maybe I should not have told people that sometimes I still use Windows; I'll lose Karma for sure.)

    --

  146. Re:Depends on your software... by tesserae · · Score: 1
    Kinda hard to talk about it, since it doesn't appear to really be on the market yet. If their single-processor 1854 VIA board is any indicator, though, it should be pretty nice. The 133MHz FSB and memory access make a difference!

    My major concern is that the VIA chipsets seem to leave something to be desired when it comes to memory access -- clockspeed for clockspeed, the BX boards will still whomp the VIA boards (hell, they beat the 820's with DRDRAM on some of the benchmarks!). But if you're not willing to overclock your system, the 1834 should be better than the 1832, speedwise (at 133MHz vs. 100MHz, of course). We'll see soon, I hope.

    ---
    Politics is about making compromises. Religion isn't. --Michael Horton

  147. Re:Discrete Event Simulation PIII -v- SPARC by zozie · · Score: 1

    In my department the standard config is:
    dual 166 Mhz SPARC server(solaris)+WinNT boxes, which can run exceed to access the server
    However, since my research is all with Unix-based programs, and the WinNT box I used is a PII-400, I was very glad I was allowed to install Linux on it, and see quite a speed improvement ;-)

  148. some chip by oog_rocks · · Score: 1

    well, i'm sure that it's some chip, hands down of course

    --
    Don't be mean or my friend Oog will smash your head
  149. Re:Thinking for difficult operations by sethgecko · · Score: 1

    Depends on what you're crunching, but a 450MHz PPC G4 does 4 Megakeys/second on distributed.net. If you're wondering, my dual celeron 550 only does 3.08 MKeys/sec. Since the celeron uses the same P6 core as the P3, and the distributed.net client is heavily multi-threaded, this is one of the few cases where 2 processors should be faster than one much faster one. In other words, in this, and only this case, 2 celerons have the number crunching power of a 1.1 Ghz P6 processor (read Pentium II/III). Or they have 75% the number crunching power of a 450MHz G4 using the altivec unit. If your number crunching can be done in vector operations and not floating point, the G4 rocks. Oh yeah, you can run linux on a G4 and use the altivec unit www.blacklablinux.com.

    --
    Be ot or bot ne ot, taht is the nestquoi.
  150. Here's what "real time" means! by BlowChunx · · Score: 1

    It sounds to me like a bunch of sys admins are caught up in the jargon of their field... This "real time" mathematical calculation is a simple concept. I will type slowly so you can understand...

    1) Simple case: say Joe drives his car at a constant velocity of 40 mph due west. Where is Joe now? I need to know where Joe is at the precise moment he is there, not days later. If I can find out earlier, no sweat. This allows me to put up a red blinking dot on a map and track him in "real time".

    2) Research case: say you want to model the waves that radiate from someone dropping a pebble in a pool, using the equations of motion. So you pick some way of solving partial differential equations, and look at the time dependent solution. And when I say look at the solution, I mean just that. Watch the pretty pictures generated by, say, some OpenGL rendered isosurface of the waves radiating out from the dropped pebble. For kicks, you could start the simulation at the same time as you drop a real pebble into real water, and check for similarities!

    This guy needs straight flat-out FPU power, with some decent speed hard drives to write the data to, and good video performance so he can render the results to the screen. He also needs a good compiler. gcc ain't gonna cut it. Sorry.

    My thumbs-up recommendation would most likely be: a nice Sun Enterprise system with a gob of processors in it, and the SunWorks compiler set (which has a nice auto-parallelization option). But if money were no object, get your hands on a box from IBM with Power4 chips in it. The real oxymoron here is fast vs. price.

  151. This one's a real toughie. by itarget · · Score: 1

    It's hard enough to perform realtime computing, let alone finding one processor better at it than another.

    AFAIK, there aren't currently any consumer processors capable of performing this task. They've all got latency barriers that would prove a major hindrance, if not make the task outright impossible.

    This is just speculation based on what I know, though. I'd love to see some more info on the subject.


    ---
    Where can the word be found, where can the word resound? Not here, there is not enough silence.

    --

    "Where shall the word be found, where will the word resound? Not here, there is not enough silence." -T.S. Eliot
  152. Re:The stupidest question I've ever heard by fishexe · · Score: 1

    Well, the question may have been stupid but it got us into a decent discussion of what real-time means, how its considerations are different from normal work, and why AMD & intel are not the best choices for high-end math. Now the original question may sound stupid, but only because the poster didn't know these things. And if you already know, wtf is the point of asking a question?

    --
    "I don't care about the Constitution!" --Bill O'Reilly, November 17, 2009
  153. Cost effective? by oobfrist · · Score: 1

    Well, it's hard to decide what is cost effective without more details. Reinventing the wheel is never cheap. Do we really need to design a whole new system? Are we just silly? How much processing really needs to be done? I suggest something between a TRS-80 and a G4.

  154. What about DSPs by crazydsp · · Score: 1

    If you really need fast real-time floating point, why not use a DSP? Although DSPs are primarily built for signal processing, the new processors handle other operations like interrupts pretty well. Many of these DSPs can perform floating point inversions in one cycle. A lot of them are Harvard architecture machines and are primarily built for fast multiply-and-accumulates. But these processors run real-time OSes and are capable of doing multithreaded operations. Some newer processors have really fast ports that can talk to other processors and can be easily scaled to a multiprocessor system. And manufacturers like TI and Analog Devices have been making really fast DSPs lately (TI just announced its 1GHz DSP). They are not that expensive either (they are comparable to the embedded processors made by Motorola). And the best part is that the new processors use really little power. So why use Intel or AMD for real-time processing when you can use a really fast DSP which costs less?

  155. Morons! by Anonymous Coward · · Score: 2

    Hey, wake up. The guy asked a question and all he gets for a response is "Motorola 68000's are the best for embedded applications." and "I seem to remember that Intel chips are best for math calculations." Duhrr.. No. First of all, this guy is not looking for an embedded application, and he wants to know the current state of relative processor power. For mathematical operations, you have two choices. Intel is not one of them. Intel's processors still use the same archaic method for math calculations as the Pentium. Until Intel comes out with something new they are out of the race. Plus, their chips are more expensive than AMD's. The Athlon, however, uses a monster math processor it inherited from the Alpha. This thing eats, sleeps, and breathes math. Plus it's cheaper. For comparison, here are the specInt & specFP scores (industry standard integer and floating point benchmarks, i.e. the math in a processor):
    Athlon 600: specInt: 28, specFP: 22
    Pentium III 600: specInt: 24, specFP: 15.9
    Add to this that floating point calculations are the more important of the two, and the Athlon is the clear winner. Now, there is another contender: the IBM 7400 (G4). Its spec scores are:
    G4 450MHz: specInt: 21.4, specFP: 20.4
    Pretty close to the Athlon 600MHz. But the important part is the Altivec unit (Velocity Engine), which is a monster 128 bit wide SIMD math destroyer. The only thing is the software has to be optimized for it. With an Altivec enhanced RC5 decryption client, a G4 450 outperforms a 1GHz (700MHz overclocked) Athlon. With properly coded programs this thing absolutely screams. So, if you are writing your own program, and are proficient enough to include Altivec, a G4 may give you the most bang for your buck. The only way I know how to get one is to buy an Apple, though I hear IBM may be releasing reference boards for Linux systems. Check it out. As for multiprocessor systems, I wouldn't suggest them unless you know specifically that your math calculations won't be done in a series of steps, i.e. that one calculation can be performed without knowing the results of a previous calculation. Which, by the way, is rather unlikely. If you want some more information, check out these articles.

    G4 vs AMD Athlon:
    http://www.arstechnica.com/cpu/1q00/g4vsk7/g4vsk7-1.html
    Comparison of Altivec, and some other SIMD's:
    http://www.arstechnica.com/cpu/1q00/simd/simd-1.html
    Pentium III vs. Athlon:
    http://www7.tomshardware.com/cpu/99q3/990809/index.html
    http://www7.tomshardware.com/cpu/99q3/990823/index.html

    Spec scores taken from http://www.ugeek.com. If you have any more questions, you can email me at guso@geek.com

  156. Real-time and high-end math don't mix well. by Erich · · Score: 2
    Doing heavy math in real-time isn't what you want to do, probably.

    Most math problems take a variable amount of time to do. And if you don't want to always use the worst case, you can't do it in a guaranteed amount of time.

    The best solution, in my opinion, for a system that, say, collects data in real-time and does analysis on them is to have a machine (or part of the machine -- wait a minute) running on a guaranteed real-time operating system and the math stuff queued up and done later on another machine. For instance, have one machine do measurements and spit out data over a serial port, and another machine that reads from the serial port and does the fourier transforms or whatnot.

    You don't necessarily need another machine, however. Real Time Linux allows you to have guaranteed processor time / timer interrupts... all the things you need for Real Time tasks... and you run all the rest of the Linux stuff after all the real time stuff finishes. This means you could use the same machine for reading as for analysis. Have the data collection as a real-time thread, do the analysis and other stuff in normal mode. I bet other OSs have this too, but I'm only familiar with RTLinux.

    If you really need to do the heavy math in real-time, I'd test how fast the math stuff runs on your machine, and make sure that it runs in roughly 1/4 the time you need. That should leave you with enough leeway so that you don't have to worry about caching, etc. as much, but can still leave the caching on (because caching really helps). Unless it's something where you can never fail, ever, like where human lives are at stake. But then you shouldn't be using plain-vanilla PC stuff anyway. In any case, you'll have to run Real-Time software (like Real Time Linux, or no OS at all and do the interrupt stuff yourself).
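
    The 1/4 rule of thumb above can be sketched as a small check. This is only an illustration: the names `measure` and `fits_budget`, the `safety_factor` parameter, and the stand-in workload are all made up here, and timing with `time.perf_counter` on a desktop OS is not itself a hard real-time measurement.

```python
import time

def measure(task, runs=1000):
    """Time `task` repeatedly and return the per-run durations in seconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        task()
        samples.append(time.perf_counter() - start)
    return samples

def fits_budget(samples, deadline, safety_factor=4):
    """True if the slowest observed run, times the safety factor, fits the deadline."""
    return max(samples) * safety_factor <= deadline

if __name__ == "__main__":
    # Hypothetical workload standing in for "the math stuff".
    def work():
        return sum(i * i for i in range(10_000))

    timings = measure(work, runs=200)
    print("worst case: %.6f s" % max(timings))
    print("fits a 50 ms deadline:", fits_budget(timings, deadline=0.050))
```

    The check deliberately uses the slowest run rather than the average, since a cache miss or a scheduling hiccup is exactly what the leeway is meant to absorb.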

    I still find that the best solution for doing real-time stuff is some nice microcontroller code, if it doesn't involve too much heavy processing power. Stick a PIC in there, you can count up exactly how long it will take really easy.

    --

    -- Erich

    Slashdot reader since 1997

  157. Broader View by Adam+Wiggins · · Score: 2
    Here's my experience:

    PII's pretty much smoked the hell out of the K6-2. At the time, the K6-2 was mainly just a low-cost alternative. Along came the K6-3, however, and that all changed. (Unfortunately, the K6-3 seems to have slipped between the cracks, and is somewhat hard to find these days.) On an identical system I had both a PII-450 and a K6-3 400 (of which, I might add, the second cost about 1/3rd of the first). For floating point, the PII was certainly more impressive - Quake ran at a significantly higher framerate. But for most everything else, from running Netscape to compiling the kernel, the K6-3 pretty much rocked the PII!

    So the K6-3 is now my server processor. My website, my mud, and in fact any non-FPE duties I delegate to those nice-n-cheap K6-3's. (You can get a K6-3/400 for $80 now, and there are 475 and 500 mhz versions on the way.)

    If you have significant floating point operations, then the PII smokes anything in the K6 series.

    On the higher end, the PIII is not much more than the PII - just higher clockrates and some FP enhancements. Coppermine gives it a nice speedy bus throughput, so I would say that a PIII/Coppermine/SMP system would make a very nice server - not cheap, but still cheaper than the equivalent in, say, Alpha processors.

    The Athlon, on the other hand, destroys the PIII when it comes to floating point performance. Anything that relies on raw FP performance, such as ray-tracing or other 3D rendering, will show the vast superiority of the Athlon. For other tasks I believe that the Athlon and the PIII (w/ Coppermine, anyhow) are more or less equivalent.

    However - and here's my big complaint - there's still no SMP Athlon! That really, really sucks. Considering that the Athlon is down to $1 a MHz for the mid-range speeds (eg, 700MHz or so), it's almost a crime that there's no SMP motherboard available. A two or four processor Athlon system costing less than $2000 could probably do the same amount of rendering as a $10,000+ Alpha system. It's a REAL shame.

  158. (d) None of the above by The+Man · · Score: 2

    No x86 processor is good for anything. If you're really asking about realtime stuff, you want Motorola 68k-based microcontrollers. If you're really asking about high-end scientific computation, you want MIPS, SPARC, or Alpha depending on the specific properties of the work involved. If you really mean "I want to do high-end scientific computation in real time," for example, simulate a nuclear bomb explosion in real time, I don't know whether to laugh or recommend a large area such as Asia stacked eight deep with SGI Origin 2000s. Rephrase your question please.

  159. Do you know what real time means? by Kaz+Kylheku · · Score: 2

    It means that the system can respond to some event within a guaranteed time. This requires a combination of the right hardware and software.

    The two processors pitted against each other rely heavily on caching to achieve their performance. Caching makes it difficult to make real time predictions, unless you stick to the worst case analysis: i.e. ensure that the deadlines can be met even with all caching disabled.

    As for the software, you need a real time operating system. Not a workstation OS like Linux that can disable interrupts on a processor for eons of processor time, and cannot be preempted while running kernel code.

    I think you may be talking here about quasi-real time or soft real time: which means ``fast enough to draw pretty pictures on my screen at a decent refresh rate when my system is not too bogged down''.

  160. Re:I don't think you understand what you are askin by slothbait · · Score: 2

    There's nothing wrong with not being well-informed. However, there are better and worse ways of looking for information.

    His question, as stated, can not really be answered. Thus, we have to second-guess the guy in order to provide *any* answer. "Real-time, high-end mathematical computations" doesn't make much sense. About the only systems that really match that description are military grade custom hardware systems. If he is truly trying to implement such a system on commodity x86, then he doesn't understand what he is getting into. This, in turn, means that he hasn't done his homework, and is wanting the Slashdot community to do it for him. To use the Unix lexicon, he should RTFM.

    Now, if he *had* done some homework, and was instead asking about people's particular experiences with certain systems or configurations to gauge how the theory works out in practice, then I would have sympathy for his cause. But, asking about *just* the microprocessor implies no understanding of the situation. Actually, your processor has very little to do with the real-time behavior of a system. Real time characteristics are more influenced by choice of OS and memory system.

    However, I doubt that he was really asking about anything hard real time. He may be trying to build a shoutcast server, on-the-fly mp3 encoder, or somesuch. This is a fairly interesting project, and fits the description of "soft" real-time. It does not fit the description of "mathematical", however.

    Given the scenario presented doesn't much pertain to processors, it seems to come down to "Which is better: Athlon or PIII". Without background, we can only answer this in the general case. In the general case, this question has been answered *many* times over and does not need to be repeated in Ask Slashdot.

    I'll apologize now if I came off as being abrasive, but it is irksome when people ask questions that they don't even understand enough to communicate properly. My impression is that he threw "mathematical" in there just to sound more interesting, but I could be wrong.

    --Lenny

  161. Re:Consumable Processor Units by warmcat · · Score: 2

    >Generally, by adding a second processor
    >you see a 50% increase in overall
    >system performance

    This simply isn't true if you are running a single instance of a single main application, and there are only background OS tasks competing for the cycles. If your OS supports thread/CPU affinity, you will see one CPU go to 100% utilization, and the other sit at around 2 - 3 % servicing the OS tasks. If your OS does not use affinity and tries to spread the thread between CPUs each timeslice you will see both your CPU's utilization at 52% or so.

    If your machine was heavily loaded and your app was slowed because it was already competing for cycles with other processes, then what you said is true, but I don't think that's what the original question was looking at.

    -Andy

  162. Consumable Processor Units by warmcat · · Score: 2

    Briefly, multiple CPUs will only help if there are multiple threads of your app running simultaneously. Assuming that is possible for you, then you will at best see something like a 75% or so performance improvement with the two processors going at it. Two more additional processors might deliver another 50% of the single processor throughput each. Of course, if your algorithm and data fit entirely within the 32K L1 caches, you will get nearly 100% improvement with each processor, but is that likely?
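
    One common way to reason about figures like these is Amdahl's law. It is brought in here as an assumption and is not something the post cites: if a fraction p of the run time can be spread over n processors, the ideal speedup is 1 / ((1 - p) + p / n). The helper name `amdahl_speedup` is hypothetical, and the model ignores the cache and memory-bus contention that the post is really describing.

```python
def amdahl_speedup(parallel_fraction, processors):
    """Ideal speedup when only `parallel_fraction` of the work scales with CPU count."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / processors)

if __name__ == "__main__":
    for p in (0.5, 0.9, 1.0):
        for n in (2, 4):
            print("p=%.1f n=%d speedup=%.2f" % (p, n, amdahl_speedup(p, n)))
```

    Under this model, a 75% gain from a second processor corresponds to roughly 86% of the work being parallelizable.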

    The question is a little broken, though, because in this day and age the trailing edge processors (eg, Celeron-400) are so cheap that you would be better staging your machine to use, say, three drops of trailing edge processor(s) on a BX PC100 motherboard, and upgrading as Intel and AMD update their price lists. You should bear in mind that the fastest CPUs are only double the speed of the modern trailing edge ones (I mean here Celerons, not AMD K2s), yet cost five, six times the price or more.

    I am sure plenty of people will disagree, but nowadays the CPU is more or less a consumable (especially if you have other Slot 1 motherboards that can get the hand-me-downs).

    -Andy

  163. Not an x86 by howardjp · · Score: 2

    68k, PPC, 88k, Coldfire, StrongARM, i960, MIPS, SPARC, any of the above, but never an x86 for real time processing.

  164. Re:Alpha Alpha Alpha! (not $$$ SGI) by szyzyg · · Score: 2

    Heh this is an origin system.... lots of nice flashing lights though .... and Gigs of memory... the DS20 only has 1 gig - memory is pretty critical in these calculations.

  165. depends on type of computations... by Barbarian · · Score: 2

    Some computations cannot be performed in parallel..

    --

  166. well, actually by mcc · · Score: 2

    this _is_ one of those cases where the PPC should be highly recommended.
    why? altivec.
    Speed improvements are always arbitrary. Yes, there are times when a G3 350 will be twice as fast as a pentium 350. There are times the pentium 350 will be faster. Benchmarks are not something you should be listening to, and different processors will be better at different tasks.

    However the question is not "which processor is better overall"; the question is "which processor is better for real-time heavy computational math". In which case you really kinda do probably want to go with the G4. "real time" implies you are going to be taking one specialized [difficult] task and doing it over and over and over with different data, which is what Altivec is designed for (SIMD) and what it excels at. As long as you are willing to go ahead and specially code for Altivec, in this case you will get a speed jolt virtually unparalleled.

    Unfortunately, due to manufacturing problems, Motorola and IBM are for the moment having trouble making G4s that run at over 500 mhz, and there are _still_ no multiprocessing G4 mobos available as far as i am aware.

    So as soon as the third parties would actually get around to shipping a SMP G4 mobo for use with linux/bsd (apple is a bit tied up in their own problems..), that's what you'd want. As of now G4 may not be the best choice. A good choice to be sure, but i'm a bit dubious as to how well a single 500 mhz G4 would do against, say, four 800 mhz athlons.

    1. Re:well, actually by JohnZed · · Score: 2

      But use of Altivec, despite what Apple and Motorola will tell you, is very limited. And it's just now becoming experimental in LinuxPPC. You really need a compiler that knows Altivec inside and out in order to get a real advantage for numerical code, and gcc simply will not fall into that category in the foreseeable future.
      --JRZ

    2. Re:well, actually by iMoron · · Score: 2

      Unfortunately, due to manufacturing problems, Motorola and IBM are for the moment having trouble making G4s that run at over 500 mhz, and there are _still_ no multiprocessing G4 mobos available as far as i am aware.

      XLR8 is working on a multiprocessing G4 upgrade card which should be out by the end of 2000.

  167. Re:Flamebait?? by Forkenhoppen · · Score: 2

    "We are locutus of borg. Prepare to be assimilated."

    "Um... yeah, sure... but first, I have a question to ask you. You all have embedded microprocessors in your bodies, right? So my question is, which type of microprocessor is best?"

    "AM--Int--MIP--Alp---ARRRGGGGGGGGGHHHHH!!!!!"

    *cube blows up*

  168. Re:PPC 7400 by Surak · · Score: 2

    Nahh. My money's on Alpha, which is designed for high FPU performance. Remember, Alpha was primarily designed as a high-performance RISC chip for CAD workstations, which do a lot of math problems.

    But as other posters have pointed out, "realtime" math processing requires much greater performance than any chip designed for PCs and workstations. This is where we get into the supercomputer realm.

    Now for a Beowulf cluster, which still really isn't designed for "realtime" processing, but it might be good enough for a particular application, for math my money's still on Alpha... some of the fastest Beowulf clusters have been based on Alpha.

    I will always go back to this, though: it depends on the application. 99.99% of the time, you don't really need or want real-time processing. It's just too expensive and requires really, really sophisticated hardware. In many cases, when you THINK you need real-time processing, what you REALLY need is real-enough-time processing. Which brings me back to Beowulf and Alpha. :)

    PPC 7400 may be good for DSP-related stuff, but an all-around math chip it ain't. The Alpha is it.

  169. Re:Couldn't you just analyze the program? by Abigail-II · · Score: 2
    How much more effort would you have to do say in a standard C++ program to get it to fully equally use the 2 processors in doing something like calculating all of the primes between 1 and 9,000,000,000,000?

    That's the wrong question. By far the fastest methods to calculate primes from 1 to N, for some N, are algorithms based on sieves. Simple calculations, calculations that can easily be parallelized, but sieves take memory. You're accessing memory all the time, while doing trivial calculations. Large amounts of RAM, a fast and large cache, fast memory banks, and a fast disk (for swap) are more important than processor speed. Even better is a tailored algorithm dividing the work in chunks to minimize swapping.
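
    As a rough illustration of "dividing the work in chunks", here is a minimal segmented sieve of Eratosthenes. It is a sketch under assumptions, not the poster's algorithm: the function names and the default `segment_size` are made up for this example, and pure Python is far too slow for a range like 9,000,000,000,000. The point is only that memory use stays bounded by the segment size plus the primes up to sqrt(N).

```python
import math

def simple_sieve(limit):
    """Return all primes <= limit using a plain sieve of Eratosthenes."""
    if limit < 2:
        return []
    flags = bytearray([1]) * (limit + 1)
    flags[0] = flags[1] = 0
    for n in range(2, math.isqrt(limit) + 1):
        if flags[n]:
            # Cross off every multiple of n starting at n*n.
            count = len(range(n * n, limit + 1, n))
            flags[n * n :: n] = bytearray(count)
    return [n for n in range(limit + 1) if flags[n]]

def segmented_primes(limit, segment_size=32768):
    """Yield the primes <= limit, sieving one fixed-size segment at a time."""
    if limit < 2:
        return
    root = math.isqrt(limit)
    base = simple_sieve(root)          # primes needed to sieve every segment
    yield from base
    low = root + 1
    while low <= limit:
        high = min(low + segment_size - 1, limit)
        flags = bytearray([1]) * (high - low + 1)
        for p in base:
            # First multiple of p inside [low, high], never below p*p.
            start = max(p * p, ((low + p - 1) // p) * p)
            for multiple in range(start, high + 1, p):
                flags[multiple - low] = 0
        for offset, is_prime in enumerate(flags):
            if is_prime:
                yield low + offset
        low = high + 1

if __name__ == "__main__":
    print(list(segmented_primes(50, segment_size=10)))
```

    Each segment depends only on the base primes, so segments could also be handed to different processors, which fits the remark that the calculation parallelizes easily.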

    Processor speed might be interesting for some, but it's utterly pointless without context. A slow processor with a large cache, can do many things faster than a fast processor with a small or slow cache.

    -- Abigail

  170. Real time description by maraist · · Score: 2

    Ok, I've seen some good posts here, but I've worked at a company that does real time work, and I've picked up a few things.

    To summarize, I think there is no best solution. Determination of response time versus throughput requirements, floating point versus logic computation, hardware acceleratable versus pure SW execution, which OSes are allowed, custom versus off the shelf are all huge factors and completely change the rules.

    First of all, no VM.. None.. period. I believe you can accomplish this by setting swap size to zero, but that's not generally enough.

    You're probably not going to want a multi-app optimized OS like windows ( don't flame please ) or UNIX.

    You're going to want one of the embedded OS's or just hard-wired drivers. You could even get away with DOS. Where I work, we use QNX, which is a bit aged, but labels itself as a real-time OS, so I assume it's their primary focus. The best part is that you get all your UNIX functionality, plus several really cool features and network-centric operations ( even more-so than UNIX ).

    If you seriously need response time ( and we're talking micro-second response ), then you're probably going to want an embedded processor. I remember back in the days of the 486, there were embedded variants ( I can't even remember the names anymore ). You don't want interrupts to be a part of your basic operation, since you're stealing cycles in an unpredictable way. IO and polling all the way baby; Can't get much more deterministic than that.

    If you're doing much of a custom job, then you might do well with a co-processor type CPU, that gives you the added flexibility of your design. MIPS still takes this approach I believe. Plus there are plenty of high-perf off the shelf Co-Processor designs. DSP and geometry processors are readily available as separate chips ( Glint comes to mind ).

    The author seems to speak about name-brands, so I assume they're not dealing with anything so intricate.. Most likely, they're thinking of MS and real-time apps like video etc. If this is the case, then I'd have to say, the CPU with the quickest response time or the CPU with the greatest ability to handle your type of data.. Obviously floating point is going to point towards Intel ( unless you're dealing with Athlon ). But if you don't use FP, then AMD's K6 line had the shortest pipeline for the bang. A K6-3 is probably your best bet. It has nearly the caching capability of a P-II, comparable integer performance, but lower latency ( especially with branch misses ).

    If these calculations are graphics based, again, separate specialized components are going to be your best bet ( high-end video cards are now parallelized ).
    If this doesn't suit your needs, and general computation is your requirement, the more cache the better. It doesn't suit real-time in that it's not deterministic, but you will achieve more noticeable throughput. When Athlon gets its 2 and 4 Meg caches out the door, I'd vote for that.. But a maxed out Xeon is probably your best bet ( no numbers to back me up.. sorry ).
    On the topic of SMP. Deterministic time can not be guaranteed. But if we're talking about throughput ( such as live video ) and not response time ( such as a missile tracking system ), AND your application is algorithmically threadable, then 2 or more CPU's will be worth your while. For example, I get nearly dbl my MP3 encoding performance with dual Celerons ( but only when encoding multiple wave files ). If, for example, you were dealing with some non-hardware accelerated video and audio stream, then dual CPU's could probably work to your benefit. ( there's simply no excuse for not having hardware support though ).
    Assuming multi-CPU configurations, Athlon has a better setup than the Xeon. And if Athlon could ever get their L2 memory size up to 4 or 8 meg, then you'd have a high-throughput device.
    Now, when we introduce price, I'd probably have to say that Athlon is going to win out here. The Xeon just falls out of the picture, value wise.
    The Coppermine's optimal cache configuration applied to the Athlon would seriously make it the best all around ( minus the deep pipelining/latency ).
    The Alpha seems to be a good contender everywhere except price. Small simple ordered pipeline. Decent cache size, good bus and SMP features. Problem, of course, is price and application availability.

    --
    -Michael
  171. Thinking for difficult operations by slashdot-terminal · · Score: 2

    I think this would be a tough call. But I would still go with an Intel machine. For the longest time (and for the most part in what I am currently reading) AMD chips have been more closely designed for graphics (pretty pictures and quake). Intel machines are more for number crunching.
    What really strikes me as a problem is that to actually get to the point where mathematical calculations are a real problem you have to get quite far in your education. Largely CS majors don't have to (and I argue shouldn't have to) know anything above maybe say trig or so.
    Unless you are going to work designing something that is actually doing said math (writing Maple/Mathematica/Mathcad), it is almost never used. I have looked over the majority of code for most of the OSS applications that come in a modern Linux app and there isn't one shred of calculus level math. Also considering how many people have historically failed calculus (same with Latin and such) I think this is really hard to measure. The most complex thing I ever did was calculating something like upper levels of digits of pi with a simple C program and that worked fine for everything up to about 300,000 on a 386 I had handy. Plus isn't getting a quad CPU computer a rather large hit in the wallet? I would almost bet they don't even sell them anywhere in a local way. Also judging from the almost complete lack of advanced premade software to do such calculations I am almost at a loss to determine how such software gets produced. I think that almost all of it is produced by people who have multiple PhDs and such.
    Again the most complex piece of software that I ever saw sold was Mathematica (can't even buy that in any local stores).

    Is there any, say, quite easy kind of text that can teach a person, say with nice graphics, tables, figures, and example problems, extremely advanced math which would allow someone to get a bearing on what kind of purchase to make? Is there an incredibly advanced software package (for some PC type system or similar Mac/Windows/Linux machines)? Or is this all just pie in the sky stuff.
    The reason I say this is because most of the books beyond say standard calculus books (because people such as I find it dry, irritating, hard, and a general nightmare) are just dry tomes that present information in a difficult to comprehend way and provide few if any explanations or actual implementation details.

    Is there a general algorithm for say breaking up calculations into steps or pieces that can be done on a machine one step at a time? I mean maybe people would feel better if when calculating some problem that could take a while to solve that the steps (in say machine code/asm) could be displayed and interpreted?

    --
    Slashdot social engineering at it's finest
    1. Re:Thinking for difficult operations by wavelet2000 · · Score: 3

      1. Computationally intensive jobs mean lots of floating point operations. Here Athlon outperforms Pentium-III, but less so at higher MHz, as L2 cache becomes more of a bottleneck. However, Alpha is (so far) the absolute best in floating point. In my experience (=floating point intensive programs in Fortran 90), alpha-21264-667 is 3.7 times faster than P-III-600. Given that one never gets linear speedup by going to more CPUs, I predict you'd be better off getting a single CPU alpha-21264 ($10K) than a quad Xeon P-III, in performance and in price. And, of course, you want Unix/Linux with it, for you don't want the results of months of computation to go down the drain due to an OS crash :-)

      2. Where are computationally intensive jobs required? NOT for web browsing, balancing a checkbook, or even games. Hence you don't see either big iron hardware at local shops, or software beyond a simple calc, anywhere except maybe campus bookstores. Computationally intensive calculations are done in modelling structures, solving weather forecasts, nuclear research, aerodynamics, finance (options pricing), Monte-Carlo and optimization. So it is mostly done in research institutions. No surprise then that software is written by Ph.Ds, texts are dry, and the market is narrow.

      3. There still exist free or open source math programs/libraries beyond casual calc. Take a look at MuPad, Octave, LAPACK (http://www.netlib.org) etc.

      4. I am not aware of any incredibly easy book on this. The problem is that extremely advanced math requires extreme effort to comprehend (or nobody cared to boil it down). In my area (econ), Judd "Numerical Methods in economics" is pretty good.

      5. About breaking up calculations. Ultimately, any program (math or not) does exactly that. There are general principles embedded in compiler technology, and, at a higher level, Fortran or C/C++ programs split the task into subtasks. But an all-purpose ALGORITHM does not exist. During debugging you do the same - look if pieces of the program do what you want them to do. That is why debugging Mathematica programs is on the one hand easier (higher level of abstraction) and on the other more difficult (you have to trust that the subroutines you call are working all right).

      6. For distributed memory coding, refer to Lusk et al "Using MPI"

  172. Couldn't you just analyze the program? by slashdot-terminal · · Score: 2

    Ok programe are actually a series of many instructions couldn't the OS just be programmed to have say if you have 100 instructions that instruction 1 would be on processor one and instuction 2 be on on the second processor or just devide the program into two halves to be executed on each?
    How much more effort would you have to do say in a standard C++ program to get it to fully equally use the 2 processors in doing something like calculating all of the primes between 1 and 9,000,000,000,000? Are there good examples of this? What is the absolutely cheapest dual processor system that one could get? Where is this sold?

    --
    Slashdot social engineering at it's finest
    1. Re:Couldn't you just analyze the program? by The_Messenger · · Score: 2
      Ok programe are actually a series of many instructions couldn't the OS just be programmed to have say if you have 100 instructions that instruction 1 would be on processor one and instuction 2 be on on the second processor or just devide the program into two halves to be executed on each?
      I'm guessing that you've never done much real programming. Parallel processing is just that -- processors working at the same time. In order for a program to fully utilize both processors, it needs to be able to supply two or more processes. In C, C++, and Perl, it means creative use of the fork family of system calls. In Java, it means knowing how to use threads. Some types of applications suit themselves to this better than others. Daemons, especially server daemons, are particularly enhanced by multi-processor systems.

      Many non-technical people assume that a system with two 500MHz processors equals a 1000MHz machine, because 500 + 500 = 1000 (unless you're using those old Intel chips, eh? ;-). However, this is not true; processing power is not cumulative. Two 500MHz Xeons are not one 1000MHz Xeon, they're two 500MHz Xeons! Simple, ne?

      Your idea of "dividing the program into two halves and having one processor work on each" shows a definite lack of understanding about how computers and parallel processing work. Even if the instructions being processed were completely irrelevant, which is what your idea of processor utilization would require, you would gain no performance advantages.

      Parallel processing is most basically dependent on having a multi-process OS, such as Unix. Simple operating systems such as Windows 95 are mechanically unable to utilize more than one processor at once. This is perhaps the most fundamental difference between 95 and NT.

      How much more effort would you have to do say in a standard C++ program to get it to fully equally use the 2 processors in doing something like calculating all of the primes between 1 and 9,000,000,000,000?
      Something that trivial would be rather simple -- one processor computes the primes from 0 to 4.5e12, and the other processor from (4.5e12 + 1) to 9e12. However, any real application is better off being designed "from the ground up" to use multiple processes. This way, no speed will be lost on single-processor systems, and multiple processors may be utilized on SMP systems.
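
      A minimal sketch of that split, using Python's `concurrent.futures.ProcessPoolExecutor` in place of raw `fork` calls. Everything named here (`is_prime`, `count_primes`, `split_range`, `parallel_count`) is a hypothetical helper written for this illustration, trial division is used only because it is short, and the range in the usage comment is tiny compared with 9e12.

```python
import math
from concurrent.futures import ProcessPoolExecutor

def is_prime(n):
    """Trial division; fine for a demonstration, far too slow for huge ranges."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    for d in range(3, math.isqrt(n) + 1, 2):
        if n % d == 0:
            return False
    return True

def count_primes(lo, hi):
    """Count the primes in the half-open range [lo, hi)."""
    return sum(1 for n in range(lo, hi) if is_prime(n))

def split_range(lo, hi, parts):
    """Cut [lo, hi) into at most `parts` contiguous half-open chunks."""
    if hi <= lo:
        return []
    step = -(-(hi - lo) // parts)  # ceiling division
    return [(start, min(start + step, hi)) for start in range(lo, hi, step)]

def parallel_count(lo, hi, workers=2):
    """Give each worker process one chunk and add up the partial counts."""
    chunks = split_range(lo, hi, workers)
    starts = [chunk[0] for chunk in chunks]
    stops = [chunk[1] for chunk in chunks]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(count_primes, starts, stops))

# Example (run as a script, under an `if __name__ == "__main__":` guard):
#     print(parallel_count(0, 200_000, workers=2))
```

      With trial division the upper half of the range costs more than the lower half, so an even split by range is not an even split by work; using more, smaller chunks than processors is one way to balance it.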
      Are there good examples of this?

      Heh... yes, dear boy, parallel processing has been common practice for some time now. Anything designed to run on a server, from httpd to Oracle, uses multiple processes, even when it doesn't use multiple processors.

      What is the absolutely cheapest dual processor system that one could get?

      Two Celeron 350s and an Abit BP-6 mainboard. The board will run you between $115 and $135 (in USD), and the processors about $35 each. You may be able to find better prices. About the Celerons: spend the extra fifteen bucks each and get 400s or 500s. If you get the 350s, make sure you get the versions WITH 128k L2!!

      Something tells me, perhaps the fact that the UID seems awfully familiar, that this fellow was trolling. Oh well.

      --
      I like to watch.

    2. Re:Couldn't you just analyze the program? by pe1rxq · · Score: 2
      Ok programe are actually a series of many instructions couldn't the OS just be programmed to have say if you have 100 instructions that instruction 1 would be on processor one and instuction 2 be on on the second processor or just devide the program into two halves to be executed on each?

      This could be done, but then you would get problems if instruction 2 depends on data that will be created by instruction 1. Or if instruction 1 is a jump to instruction 10, in which case instruction 2 should never have been executed. You see, there are a lot of situations in which this would not be possible.
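
      The dependency problem can be shown with two toy functions. Both `chain` and `square_all` are invented for this illustration: in the first, every step needs the previous result, so handing alternate steps to different processors gains nothing; in the second, each element is independent, so the two halves could be computed separately and joined.

```python
def chain(x, steps):
    """Each iteration consumes the previous result, so the steps must run in order."""
    for _ in range(steps):
        x = (x * x + 1) % 1_000_003
    return x

def square_all(values):
    """Each element is independent of the others, so any slice can be done separately."""
    return [v * v for v in values]

if __name__ == "__main__":
    data = list(range(10))
    half = len(data) // 2
    # Two halves computed separately (they could run on two CPUs) match the whole.
    print(square_all(data[:half]) + square_all(data[half:]) == square_all(data))
    print(chain(2, 3))
```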

      How much more effort would you have to do say in a standard C++ program to get it to fully equally use the 2 processors in doing something like calculating all of the primes between 1 and 9,000,000,000,000? Are there good examples of this?

      This depends on how splittable your calculation method would be; again, if one calculation is dependent on data from another you might get one processor waiting for another. Effective parallel computing depends very much on how much and what kind of traffic you have between the processes.

      Grtz, jeroen

      --
      Secure messaging: http://quickmsg.vreeken.net/
  173. Re:Architecture makes the difference by randombit · · Score: 2

    If you're doing mathematical computations, your best bet would be a CPU good with floating point.

    Well, FPU performance is good, but OTOH I happen to like really good integer performance (probably because I code silly things like crypto that doesn't use floating point at all). For that I like Alphas or Athlons. I'll bet PPCs and G4s are good for that too, but I'm not much of a Mac person.

    Only problem in this case is cost, since the average Alpha system, IIRC, costs more than most x86 systems. That might not be true, so do your research.

    A low end Alpha will cost more than a high end Athlon (I'm generalizing here). OTOH, the "low end" Alpha will kick the Athlon's butt (much as I like Athlons, the x86 architecture limits what it can do). However, since this guy seems to really be asking about games, he probably is running Windows 9x, so he wouldn't have too much fun on an Alpha running Linux.

    For actual real-time stuff (ie not games), I'd go with something ARMish. Ah, those cool little ARMs. :)

  174. Re:the question is ill-posed by randombit · · Score: 2

    3D games: an Athlon would probably be your best choice. Decent FPU performance, good integer performance, won't cost you a bundle. Most games don't really benefit from SMP anyway.

    I'm not too much of a gamer (except Freespace2, oh, love that game...) but I think your 3D card is much more important than the CPU. For instance, I have a PII-350. I used to have an i740 video card with 4MB of RAM. Games ran slow and at low res. Then I got a Voodoo3 3000 with 16MB of memory (good drivers probably also had an effect). Games ran fast. I was happy. etc etc

    Render farms, other highly parallel, low internode communication applications: commodity x86 systems, the more the better.

    Or really big SGIs. Or both. :)

    RT control of other systems/experimental setups: I personally prefer the StrongARM series of processors for this role, since the price/performance is practically unmatched, the documentation is thorough, and programming in assembly for the SA is truly a joy compared to the hideous mess that is the x86.

    Do you know where I could find info about StrongARMs? I know DEC designed them and Intel bought it (it being the ARM design, not DEC!), but that's about it. I'm just thinking it would be really fun to buy an ARM (or more probably something that has an ARM in it, plus memory and control boards and etc) and mess around programming it (over the serial port of a PC?). It would become my robot slave! Wuahahahahahahahahah!!! :)

    Data mining, data warehousing: I don't have any personal experience with these applications, but I have heard good things about Suns

    I've also heard that in reference to file servers, etc. However, I can say with some experience (ie, the CS dept here is a Sun shop and so is where I work) that anything smaller than an Ultra2 is fscking slow. The nice big servers rock (of course), but I'm not totally convinced that an Athlon (or dual PIII or whatever) with a lot of RAM and big SCSI disks wouldn't do better (and it would certainly be cheaper).

  175. Discrete Event Simulation PIII -v- SPARC by cadelor · · Score: 2

    Hi,

    Recently I did a project involving discrete event simulation.
    A certain task took 11.827 seconds on a PIII 500MHz system running Linux, and 16.588 seconds on a SPARCII at 300MHz. Scaling linearly with clock speed, the equivalent time for a 500MHz SPARC would be 9.953 seconds (I think, just scribbled out the calculation now!).
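
    The scribbled calculation can be reproduced in a couple of lines. This is only a sketch, and it assumes run time is inversely proportional to clock speed, which ignores memory and I/O speed; the function name scale_to_clock is invented for the example.

```python
def scale_to_clock(seconds, measured_mhz, target_mhz):
    """Linear clock scaling: assumes run time is inversely proportional to MHz."""
    return seconds * measured_mhz / target_mhz


# 16.588 s measured at 300 MHz, projected to a hypothetical 500 MHz part.
sparc_at_500 = scale_to_clock(16.588, 300, 500)
print(f"{sparc_at_500:.3f} s")  # 9.953 s
```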

    This gives the SPARC the edge, but there are other factors like the OS to consider. Both machines had 256MB of memory and used gcc to compile.

    However if cost is an issue, this will give the x86 chips an advantage. Depends on who is paying really!

    Then you have to look at whether you are writing your own code in, say, MPI to work over a cluster of machines, or something to work on a highly SMP machine, like an SGI or high-end Sun box. It very much depends on what you plan on using it for: for high-end maths and physics applications, get a nice Sun or SGI box if you have the budget! If you don't have the budget, or don't need 'that' kind of high end, then a PC will do. And I'm sure there are plenty of people willing to carry out the which-processor-is-better argument here!

    Cheers
    ~Al

  176. Re:Flamebait?? by oneself · · Score: 2

    emacs, RedHat 6.2, 4.31

  177. My limited knowlege of SMP by DNS+Error · · Score: 2

    From the way I understand it, Multi processor software is affected as this:

    The software DOES need to have SMP support to run to its fullest using multiple processors; however, that does not mean that the program will not run, or will not run faster, with SMP. Where the drawbacks of SMP come in is that if a program is not made with SMP support, it could run even slower than with a single processor, because the software will not know how to properly divide the data between more than one processor.

    For example, If you are using a program that uses heavy mathematical functions, made for smp, the software will understand that it is quicker to give all of the data having to do with X together to one processor, instead of splitting it. Without smp support, the OS or whatever does it will try to split it as best it can see, where it may be not the most efficient method or maybe even the worst method.

    that's just what I understand, anyone who can correct me is more than welcome.

    --
    -DNS
  178. How long is a piece of string? by mlfallon · · Score: 2

    It depends on a number of factors. First of all, is the problem integer based or floating point based? Is the problem cache bound or not? What sort of memory architecture is the machine running? I have seen codes run faster over two boxes than on a dual-processor box because of the memory bus architecture of the PC; it was faster to use message passing and VI interface cards. Commodity processors tend not to be the best for large scale math problems, because they are geared to be good all-round chips. If price was not an issue I would look at IBM Power 3 based machines. Also, for really high-end maths on data sets that are too big for cache, Hitachi have a nice solution with the pseudo-vector facility that they use on the CPUs in their SR8000 machine. Back to the problem in hand: taking SPECint or SPECfp numbers can be very misleading. The best option would be to come up with a test case that is representative of the problem you wish to solve, with a data set of the same order you are going to work on, and run it against the processor. After that you should be looking at clock speed, not just for the CPU but also for the memory bus; at the size of cache, if this can affect the problem you want to solve; and last of all, but definitely not to be forgotten, at the IO performance to disk, because with most serious problems you will have to read in and write out large amounts of data.

  179. Re:Here's what I just did by swordgeek · · Score: 2

    Very nice. Wish I had a setup like that.

    Of course, it has NOTHING WHATSOEVER to do with realtime computing.

    --

    "People who do stupid things with hazardous materials often die." -- Jim Davidson on alt.folklore.urban
  180. Processor Benchmarks by ebola-zaire · · Score: 2

    Here are some SPEC CPU95 Benchmark results. Sorry they are not the latest greatest processors, but this is the most recent I could find.

    Processor (Floating Point (SPECfp95), Integer (SPECint95))
    Digital Alpha 750 MHz (75, 38)
    Intel Pentium III 800 MHz (32.4, 38.9)
    AMD K7 Athlon 750 MHz (33.0, 26.5)

    As you can see the Alpha really whips the P3 and Athlon on floating point, but the pricing for Alphas is also ridiculous. The motherboard and processor for an Alpha 750 will likely run two or three thousand dollars. A P3 800 will run about $750 for just the processor, and an Athlon 750 will be approximately $320. For price/performance Athlon is definitely the winner.
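
    To put rough numbers on the price/performance claim, here is a small sketch using only the figures quoted in this post. The $2000 Alpha price is an assumption (the low end of the "two or three thousand" estimate) and it includes a motherboard while the other two prices do not, so treat the comparison as approximate.

```python
# (SPECfp95, SPECint95, approximate price in USD) as quoted in the post.
# Assumption: Alpha priced at $2000, the low end of the quoted range,
# and that figure includes a motherboard.
chips = {
    "Alpha 750": (75.0, 38.0, 2000.0),
    "Pentium III 800": (32.4, 38.9, 750.0),
    "Athlon 750": (33.0, 26.5, 320.0),
}


def per_kilodollar(score, price):
    """Benchmark points bought per $1000."""
    return 1000.0 * score / price


# Rank by floating point score per dollar, best first.
ranking = sorted(
    chips,
    key=lambda name: per_kilodollar(chips[name][0], chips[name][2]),
    reverse=True,
)

for name in ranking:
    fp, integer, price = chips[name]
    print(
        f"{name:16s} SPECfp95/$1000 = {per_kilodollar(fp, price):6.1f}  "
        f"SPECint95/$1000 = {per_kilodollar(integer, price):6.1f}"
    )
```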

  181. Re:the question is ill-posed by Signail11 · · Score: 2

    developer.intel.com has many good instruction reference guides in PDF format. I don't have any exact links, but try searching under "StrongARM Instruction Reference." I'm quite an advocate of the *ARM family; it's one of the few ISAs that is orthogonal enough that everything makes sense, and yet complete enough that you don't have to resort to crazy idiomatic code fragments to get something useful done. If only Intel would give up their Pentium II's and use some of that great process technology on the StrongARMs :-).

  182. Re:What calculation? by Signail11 · · Score: 2

    "Here's a contrast to hopefully clarify it a bit: writing a program in machine code is typically 50x faster than letting a compiler generate it."

    Bullshit. On most applications, good compiler generated code is not usually more than 60-70% slower than hand-tuned assembly. I've seen some exceptions to this general rule (i.e. naive dot product vs. scheduled dot product on the x87 FPU stack), but the worst I've ever seen was about 8-10 times slower. Show me an application which gets a "50x" performance increase from writing the assembly yourself. Hell, just show me a code fragment from the kernel of the function, and I'll either show you why the code is miserably written or I'll submit a patch to GCC to optimize for that case. That's a promise and you can hold me to it.

  183. Re:What calculation? by Signail11 · · Score: 2

    Hello!?!? Have you ever implemented an algorithm for use on high performance computers (I'm not talking about dual Pentiums or Athlons here)? I sincerely don't believe so. There is no standard computational task that will have an order of magnitude improvement when coded in assembly compared with well-structured and blocked C. Please give me a specific example of such a kernel. I am honestly very curious about what algorithm is giving your compiler so much trouble. It certainly isn't a F90 intrinsic or a LAPACK kernel. Most of the VPP codes that I've used also compile nicely with a few well-placed compiler directives.

    I agree with your point about specialized hardware; that's why I didn't bring it up.

  184. Re:yo! compilers! by Signail11 · · Score: 2

    Compilers are certainly important, but I think even more important is the algorithm. It goes well beyond simple big-O complexity notation. To take the example of inverting a matrix, even though most classic algorithms are O(n^3), you have to take into consideration the cache characteristics of the algorithm used, the sparsity of the matrix, and other factors such as the specific form of the matrix and possible numerical instability. If your algorithm involves trig functions, or for that matter, division, in its innermost loop, you've probably got a big problem. I don't know of *any* processor with a division operation latency of under 8 clocks. It's probably time to reshuffle your implementation or pick a more suitable algorithm.
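
    As a small illustration of the division point, here is a sketch of one common reshuffle: replacing a division per element with one reciprocal per row followed by multiplies, as in the row-normalization step of an elimination-style matrix algorithm. The function names are invented for the example, the pivots are assumed to be nonzero, and the two versions can differ in the last bits of rounding. In Python the interpreter overhead hides the latency difference, so this shows the structure of the change rather than a measurable speedup.

```python
def scale_rows_divide(matrix):
    """Normalize each row by its first element: one division per element."""
    return [[x / row[0] for x in row] for row in matrix]


def scale_rows_reciprocal(matrix):
    """Same result up to rounding: one division per row, then multiplies."""
    out = []
    for row in matrix:
        inv = 1.0 / row[0]  # the only division for this row
        out.append([x * inv for x in row])
    return out
```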

    BTW, O(a^n) in standard notation is ill-defined, but would probably be interpreted as an exponential growth function with some arbitrary, but fixed, constant a. That places the problem squarely in the category of intractable. Although there are some very clever algorithms that can help in some cases (lattice basis reduction comes to mind).

  185. It depends on the problem by rutger21 · · Score: 2

    With real-time, high-end mathematical calculations one can think of things such as a satellite control system. If you don't complete your calculations in time, serious (expensive) accidents could happen. This leads to various mathematical models which give answers with a probability. For example, the calculation must always be completed within a second with 99% certainty.
    Another problem exists: not all mathematical problems are easily made parallel (or at least not without a significant amount of overhead). Even worse, problems exist which can only be solved sequentially. So, for example, a 4-CPU 400 MHz box will not always outperform a faster 500 MHz uniprocessor.
    Generally said, it completely depends on the problem.
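
    The 4-CPU versus uniprocessor comparison can be sketched with Amdahl's law. This is an idealized model, not a measurement: it assumes no communication overhead and equal work per clock cycle on both machines, and the helper names are made up for the example.

```python
def amdahl_speedup(parallel_fraction, cpus):
    """Amdahl's law: speedup over a single CPU of the same speed."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / cpus)


def effective_mhz(clock_mhz, cpus, parallel_fraction):
    """Clock speed of a single CPU with the same throughput (idealized)."""
    return clock_mhz * amdahl_speedup(parallel_fraction, cpus)


# Compare a 4 x 400 MHz box against one 500 MHz CPU for several workloads.
for p in (0.0, 0.25, 0.5, 0.9):
    print(f"parallel fraction {p:.2f}: {effective_mhz(400, 4, p):7.1f} MHz equivalent vs 500")
```

    Under these assumptions the 4-CPU box only pulls ahead once more than about 27% of the work can run in parallel; below that, the single 500 MHz processor wins.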

  186. You can't emulate for real-time by ca1v1n · · Score: 2

    Both the PIII and the Athlon only emulate x86 instructions. This means that there can be a lot of overhead, making the runtimes somewhat unpredictable. The aging x86 instruction set has been hacked over so many times, especially in the realm of floating point, that you really can't expect any consistency. Between 3DNow and SSE you've got two sporadically undersupported systems of acceleration. In general, if you want real-time, that means it must be reliable. That means you have to look not only at the average run-time, but the worst-case runtime. To get a reliable worst-case runtime, you have to be free of caching concerns, which will hurt your average performance. The best way around this is to go into RISC territory. This is where the Alpha shines. I also know that the G4 is not nearly as overcomplicated as the x86 chips, so Altivec, if properly supported, could do fairly well. Still, for reliable performance, you'd want something that handles things in a fairly standard way, so you don't have to worry about architecture extensions. Once again, Alpha shines. There are plenty of other things out there, like Power3 for example, but I think that cost becomes a serious concern once you get into that range.
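
    The difference between average and worst-case runtime can be made concrete by timing the same task many times and looking at the tail rather than the mean. A minimal sketch follows; measure and summarize are invented names, and a measured worst case is only a lower bound on the true worst case, not a guarantee.

```python
import statistics
import time


def measure(task, runs=200):
    """Run `task` repeatedly and return the per-run wall-clock times in seconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        task()
        samples.append(time.perf_counter() - start)
    return samples


def summarize(samples):
    """Report the mean, the 99th-percentile sample and the slowest sample."""
    ordered = sorted(samples)
    p99_index = min(len(ordered) - 1, int(0.99 * len(ordered)))
    return {
        "mean": statistics.mean(ordered),
        "p99": ordered[p99_index],
        "worst": ordered[-1],
    }
```

    For example, summarize(measure(lambda: sum(range(100000)))) will usually show a worst case noticeably above the mean on a desktop OS, since caches, interrupts and the scheduler all add jitter.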

  187. Don't need an editor. by www.sorehands.com · · Score: 2
    We are Borg, We are the Operating System, We don't need no stinking editor.

  188. R.A.I.P. by www.sorehands.com · · Score: 2
    I would think lots of inexpensive processors that are properly networked together.

    I'm not sure when the point of overhead in processing would overcome the advantage in numbers.

    Now, if it was not just redundant processors, but you offloaded some of the processing to special-purpose processors, you might get a performance boost there.

    This would be limited by the "smartness" of the compiler to divide the tasks into good chunks.

  189. Re:Depends on your software... by tesserae · · Score: 2
    Celerons are SMP capable. I don't know if Intel has completely disabled it in the new Celerons that they just announced.

    Everything I hear indicates Intel didn't bond the SMP pin to the die -- I can't find the original statement, but here's a quotation of an example. That's what I was referring to; AFAIK the Celerons up through 533MHz are fine with SMP, but the "Coppermine 128" Celerons are crippled.

    Intel appears to have done something similar with the early FC-PGA Coppermines, too -- I've not heard reliable reports of anyone managing to get good SMP out of them. The SECC-2 versions are fully SMP-enabled, though.

    I have heard that there are some stability problems with the Abit BP-6 that can take some effort to iron out.

    I've heard the same -- people tell me they're fine for gaming, but not so good for workstation use. I have no direct experience... but I do believe that you get what you pay for.

    --
    Politics is about making compromises. Religion isn't. --Michael Horton

  190. Depends on your software... by tesserae · · Score: 2
    If your software is written for SMP, or if you intend to be doing other things with the box at the same time you are running these high-end mathematical computations (especially if by "real-time" you mean that you're collecting data and analyzing it on the same machine), go with SMP. Cost-effective means Intel, since Athlon SMP motherboards aren't yet available; it also means dual, not quad, processors. Celerons are okay, but I wouldn't use them for high-end work, because of the cache limitations -- instead, go with the Coppermine PIII's, which have 256KB on-die cache, compared to the Celeron's 128KB. They will cost a bit more, but they'll be significantly faster. IMHO, the Xeon doesn't bring much more to the table -- and the cost is ridiculously higher.

    OTOH, if you can't use (or don't need) SMP, go with the fastest Athlon you can get; the FPU (as others have pointed out) is much better than the present Intel PIII FPU's.

    If I was building the machine right now, I'd probably go SMP: dual motherboards from ASUS and Tyan both have good reputations for stability, although they aren't the cheapest -- but uptime is more important than upfront cost, too. The BX or GX chipset solutions are much cheaper than the newer Intel 820 and 840 boards, because of the cost of Rambus memory -- and if you run SDRAM on an 820 (and probably the 840 also), you'll be slower than a BX solution anyway.

    Then I'd stick a couple of reasonably-fast (600 or 700MHz) Coppermine PIII's on the board, and lots of SDRAM -- enough so the OS never has to swap to the hard drive. Only you know what that amount is, and memory is still pretty cheap.

    It doesn't sound like you'll need SCSI (since drive access times and multitasking probably won't be much of an issue), so stick with EIDE for the hard drive -- the cost is much lower. Your graphics card won't be horribly expensive, either, as long as you aren't worrying about high-end screen output (if you are, it's a whole different ballgame).

    A representative system (prices are midrange online values, not the cheapest by far; buying it as a package will save you quite a bit):

    Tyan 1832 dual motherboard [make sure it's latest-revision, to handle Cumines] ($180)

    2 PIII 600E processors [OEM, without heatsink and fan] ($290 each)

    Heatsinks and fans for those processors ($30)

    IBM or Maxtor 20GB 7200RPM EIDE hard drive ($180)

    256MB PC100 SDRAM ($220)

    Mid-range graphics card ($100)

    Generic floppy drive, case, CD-ROM and keyboard ($160)

    Total price, around $1450 (without a monitor)

    This is actually quite a nice machine -- good quality parts where it matters, for what I interpret your requirements as being. If you can manage with Celerons, though, get an ABIT BP-6 motherboard and a pair of the fastest Celerons which still support SMP (don't get burned, here!), and you might be under a grand, total...

    Ask me again in a couple of months, of course, and everything will have changed. Remember that the newest, fastest PIII's don't support SMP (yet, anyway); neither do the new Celerons; Athlons will, next year, but they'll be replaced by other CPU's by then anyway; and YMMV.

    Have fun...

    --
    Politics is about making compromises. Religion isn't. --Michael Horton

    1. Re:Depends on your software... by Malc · · Score: 3

      "Tyan 1832 dual motherboard [make sure it's latest-revision, to handle Cumines] ($180) "

      I have that board and I think it is excellent. You'll need a revision F for coppermine support. It comes at a good price as it doesn't have onboard SCSI like many dual systems. When I bought mine a few months ago, the best price/performance seemed to be P2 450's, two of which were costing less than a P3 500. It was $105 for an OEM P2 450 and $75 + $15 for a Celeron 466 + converter card (at that time, P3 500's were going for $280).

      Quake 3 with my system is faster under NT than it is under 98. It's nice, and the framerate is more stable.

      "Remember that the newest, fastest PIII's don't support SMP (yet, anyway); neither do the new Celerons; "

      Celerons are SMP capable, either through a socket 370->slot 1 converter, or through some resoldering, or through a dual Celeron board such as the Abit BP-6. I don't know if Intel has completely disabled it in the new Celerons that they just announced. I have heard that there are some stability problems with the Abit BP-6 that can take some effort to iron out.

  191. Raw number crunching on distributed.net by sethgecko · · Score: 2
    Depends on what you're crunching, but a 450MHz PPC G4 does 4 Megakeys/second on distributed.net. If you're wondering, my dual Celeron 550 only does 3.08 MKeys/sec. Since the Celeron uses the same P6 core as the P3, and the distributed.net client is heavily multi-threaded, this is one of the few cases where 2 processors should be faster than one much faster one. In other words, in this, and only this case, 2 Celerons have the number crunching power of a 1.1 GHz P6 processor (read Pentium II/III). Or they have about 75% of the number crunching power of a 450MHz G4 using the AltiVec unit. If your number crunching can be done in vector operations and not floating point, the G4 rocks. Oh yeah, you can run Linux on a G4 and use the AltiVec unit: see www.blacklablinux.com.

    --
    Be ot or bot ne ot, taht is the nestquoi.
  192. Alpha Alpha Alpha! by szyzyg · · Score: 3

    The Alpha is basically twice as fast as an Athlon at the same MHz, or 3 times faster than a PIII of the same clock speed. Their floating point performance is still better than anything else out there. I use them for orbital computations and they fly.

    We recently got a 250-grand, 8-processor SGI number cruncher for someone who wants to do MHD calculations, scary big calculations. We ran a benchmark on it with some of my code and found that 1 SGI CPU was 1/3rd of the speed of one of our 500MHz DS20 CPUs. We have a dual-processor DS20, which we acquired a year ago for 20k, and it is 75% of the speed of our brand new 'supercomputer'.

  193. Re:real time, high end?? by dej05093 · · Score: 3

    I think the meaning of "real-time" in the original posting is "during data acquisition", with soft real-time constraints, as opposed to a calculation done after the experiment.
    In an industrial environment it might also be important to have an upgrade path and a processor family which will be supported for a long enough time (it is really expensive to switch from transputers to something else ...).

  194. Real-Time does not mean Real-Fast by Detritus · · Score: 3

    Real-Time is about predictability and timing guarantees, not high speed. Features that improve speed, such as cache and virtual memory, are drawbacks in a real-time system. They introduce uncertainty into the analysis of the system's timing.

    --
    Mea navis aericumbens anguillis abundat
  195. My vote by JohnZed · · Score: 3

    Well, I'd really like to vote for a dual P-III Xeon at 1 GHz with 1MB of L2 cache per chip, but Intel has been lagging behind with their newest Xeons. They used to give a 15% boost over a comparable non-Xeon, due heavily to their much larger cache, but Intel has had a hard time selling them because they cost SOOO much more than a comparable P-III standard, and the performance gap between the two models seems to be closing.
    Cache size (and speed) is a lot more important in these calculation-intensive benchmarks than it is in other uses, so it's really something you should look out for. It's also one of many small minuses that really make the dual Celeron suggestion a less-than-optimal configuration for real scientific use. Today's best high-performance compilers do a reasonably good job of exploiting special P-III instructions and optimizing to squeeze the right data into the cache. Although the Celeron will soon have SSE, it will still be configured with 128k of L2 cache, I believe (though I could be wrong. Anybody?). The real killer to the Celeron idea is that you do still have to hit your RAM pretty often, and Rambus or DDR running on a fast bus can really, really help you here.
    So, if it comes down to the Athlon or the P-III (non-Xeon), I'd still have to go with the P-III. The biggest advantage is the ability to use multiple processors, as number-crunching code can REALLY benefit from SMP. AMD has been promising SMP Athlons for ages, but they're still basically vaporware. Another factor is the availability of (extremely pricey) Rambus RAM, while DDR is just starting to be accepted. Finally, Intel puts out some pretty fast compilers, while a lot of compiler developers fail to optimize for the Athlon as much as they could.
    Within the next year, however, we should see faster RAM for AMD chips, SMP Athlon boards, better compiler support (now that people no longer think of AMD as just a low-end provider), and a full-speed L2 cache on the Athlon. Then the chip's FPU can really shine.
    This is all assuming that we're talking about scientific crunching on x86 PCs. If you can go for an Alpha, you really should. Check out www.spec.org for benchmarks. Yes, we're all wary of benchmarks, but when a chip routinely beats its competition by a factor of 2 or more in a very respected, industry-standard benchmark, you have to assume that there really is a difference. I have a lot of hope for Intel's second-generation of IA-64 chips, though. They're doing some really interesting things with compiler/architecture design that could blow away the competition.
    --JRZ

  196. One word: Alpha by turne10 · · Score: 3
    Alpha still blows both Intel and Athlon out of the water, esp. on floating point. The best benchmark for such things is the SPECmark - see John DiMarco's handy SPECmark table, as well as the SPEC site itself for numbers, but the bottom line is that even a 500 MHz A21264 is about twice as fast on floating point as a 700-750 MHz PIII or Athlon, and DEC, er, Compaq is now shipping 667 MHz A21264's.

    Note that there is a new 1U rack version of the DS10, called the DS10L (code-named "Slate"), that is very attractive for highly compute-intensive tasks. There's a picture of a rack full of these in the Linux section of Compaq's web site.

    --
    NTAGARA
  197. Not x86 by Skweetis · · Score: 3

    Newer x86 processors (Intel, AMD, Cyrix, etc.) aren't very good for real-time work. This is because they are so fast that they sit idle most of the time waiting for instructions to come from much slower RAM. They work, but it is overkill when you can get a 68K processor from Motorola for a few dollars, or an HC12 or something for a dollar, that will do the same thing with much lower power requirements. (By the way, just because I suggest Motorola chips doesn't mean I am endorsing Apple. I am trying to give you some information based on my experience in real-time computing, and an old 68000 is a good chip for this kind of thing.)

    For straight number-crunching, a fast x86 CPU is probably a decent choice. You will probably want to go SMP for maximum performance. (Are there SMP boards for a G4? That might be a good choice, as it has one of the best FPUs out there right now.) You might also want to look into an Alpha, they have very good floating point, and some really good SMP solutions exist for them.

    Another issue to keep in mind is the software side of things. You may want to get a processor that you can write, or can learn to write, assembly code for. You can write much more efficient code in assembly, and (and this isn't always a bad idea) if you do things this way, you can make things even more efficient and not run an OS at all. (Even a 12MHz 386 screams when you don't run an OS on it while you run your code.)

    The thing to keep in mind when approaching this problem is that there are a lot more solutions than Intel and AMD, and many of these are well worth investigating further. I am just trying to give you a few ideas here, you will probably want to do your own research and choose the solution that works best for you.

  198. Here's what I just did by johndr · · Score: 3

    I recently put together a cluster of eight Gateway 2000 boxes with Athlon 800MHz processors. We loaded Red Hat 6.1 and are using one of the commercial Fortran compilers to do plasma simulations. The boxen are in a very basic Beowulf-type configuration.

    Our previous boxes were 600MHz DEC Alpha stations running DEC's UNIX, OS/F or whatever it is. We find that the Athlon boxes, which are 32 bit of course as opposed to the 64 bit Alphas, are about as much faster as the clock speed would indicate, i.e. about 30%. As a result we increased our computation resources by a factor of four for less than $20K. We are very happy.

    I'm not sure SMP can be justified in this kind of case, as boxen that support it are typically way more expensive than our cheapo Gateways, and SMP generally does not increase speed by the factor you would think. However I'd be interested to hear any results to the contrary. When problems can be split between processors, performance per buck is what matters.

    The best solution really has to be tailored to the specific problem being solved; so for example 384MB of PC100 RAM was ample in our case but in a big 3D finite element case it gets to be a problem. As an example we also have an electromagnetic package that runs on NT; because of the software licensing we can't run more than one case at once, and that has to go on the huskiest system I can get the budget for, currently dual Pentium III 733s with 2GB RAMBUS memory. It is way less cost effective than our cluster system.

    Hope this helps.

    John

  199. the question is ill-posed by Signail11 · · Score: 3

    What exact question is it that you are asking? The answer depends to a great extent upon the specifics of your problem. For some possible usage scenarios, here are my suggestions (disclaimer: I have a preference for SGI machines, because those are what I primarily use from day to day)

    Standard office applications: Go for a cheap Celeron and lots of RAM. Most applications will be very responsive in any case.
    3D games: an Athlon would probably be your best choice. Decent FPU performance, good integer performance, won't cost you a bundle. Most games don't really benefit from SMP anyway.
    Render farms, other highly parallel, low internode communication applications: commodity x86 systems, the more the better.
    RT control of other systems/experimental setups: I personally prefer the StrongARM series of processors for this role, since the price/performance is practically unmatched, the documentation is thorough, and programming in assembly for the SA is truly a joy compared to the hideous mess that is the x86. Only problem is that there is no FPU (it does have an integer multiply though).
    Data mining, data warehousing: I don't have any personal experience with these applications, but I have heard good things about Suns and the RS/6000's from IBM.
    Single-threaded or low parallelism scientific computations: Definitely Alphas. They blow any other processor away on floating point intensive operations. The only real drawback is lack of ccNUMA/massively parallel shared-memory systems. IIRC, they top off at 8 or 16 processors.
    Really big simulations, computational hydrodynamics, etc.: Keeping in mind my previous disclaimer, I would still have to suggest SGI Origin 2000 systems for this type of task. The out-of-box performance on a fully populated Origin 2000 is awe-inspiring. Another option might possibly be linked AS/400s or RS/6000s or even one of the Cray T3Es for vector oriented codes. A bit pricy, but if it's not your money...

  200. It's not only the processor, it's the OS by roman_mir · · Score: 3

    If computer science CSC468 Operating Systems course taught me anything, it's that no matter how great your hardware is, the software must be optimized to the maximum or your best processor with the largest bus and huge clock speeds and caches and number of registers will be worth NOTHING. The Operating System must be smart to use the cache in the best ways, to maximize the performance of your IO devices, and to manage all your threads in a smart way so as not to create thrashing.

    In fact I believe that if you simply need a calculator that is completely devoted to you, you should build your own very task specific operating system that is optimized for the task at hand. Half a year ago when I was implementing different parts of an OS, I remember that the most difficult challenges that stood in front of my group were the system optimizations for memory accessing, multitasking (paging), security and multitasking the IO devices.

    If all I wanted was a simple system for a single user I wouldn't need all those complicated algorithms, I could simply write a memory manager, and a simple IO and interrupt support and that would be the fastest way for a single user to operate. In a sense even DOS was too sophisticated for what you are asking, DOS had some TSR support and paging. (However its memory management was awful.)

    So there you go, it's back to C and the Assembler time!

  201. Flamebait?? by www.sorehands.com · · Score: 3
    Is this a real question?

    This is one of the questions that are in the category of "Which is the best editor?" or "Which is the best Operating System?" or "How many angels can dance on the head of a pin?".

    Geeks have argued and fought over these (or at least the first two) for years, and will argue and fight over them until we are Borg.

  202. what "real time" means by jetpack · · Score: 4

    A couple of folks have alluded to this, but I'll reiterate in an attempt to make it clear: real time systems imply bounded response times. That does not directly imply a *faster* response time (although the goal is to reduce the guaranteed bound on response times).

    Hence, the phrase "real-time mathematical computations" is almost an oxymoron ... if the computation by nature is unbounded, you won't get realtime performance out of any processor.

    Basically, I'm attempting to point out that "real time" and "fast" are not synonymous.

  203. Re:I don't think you understand what you are askin by costas · · Score: 4

    Excellent points; let me add a coupla more: I am not sure what 'real time mathematical computations' means here. CPUs that can crunch complex mathematical systems in real time, are usually developed by DARPA and cost millions per unit (I am referring to jet fighter avionics packages; usually TI or Rockwell chips).

    Now, if the poster is referring to mathematical problems (but not 'real time', more like 'solved in a reasonable time'), the above post is right on the money: do not buy an SMP system --at least not an x86 SMP system (I don't have any experience with Alphas). The problem with x86 SMPs is bus speeds, i.e. communication between CPUs on the same board. Using fast network interconnects (gigabit speeds) you can usually get better performance between boxes than between CPUs for some SMP systems (particularly quads and 8-ways). Duals are not as problematic, so for compactness' sake they might be worth the $$$...

    Keep in mind though, this is not the way things *should* be, it's just the state of the art --which sucks right now in x86-land. With better motherboards coming up (not to mention better SMP support by all the different OSes --the new Linux kernels seem to have solved context switching problems that were killing SMP machines, for example), eventually SMP machines will be the way to go...


    engineers never lie; we just approximate the truth.

  204. Architecture makes the difference by Microlith · · Score: 4

    If you're doing mathematical computations, you want a CPU that is good with floating point. Your best bet would be an SMP Alpha. With a better floating point unit than Intel and AMD, and an outright faster CPU, you'd get a lot more done in less time. The only problem in this case is cost, since the average Alpha system, IIRC, costs more than most x86 systems. That might not be true, so do your research.

  205. real time, high end?? by gargle · · Score: 4

    Which processor would be better for realtime, high end mathematical computations?

    High end mathematical computations are unlikely to run in real time on any processor. Do you really mean games?

  206. Athlon Has a Superior FPU by Zevez · · Score: 4

    The AMD Athlon has a superior FPU at the same clock speed, which is useful for scientific applications. Check this page here for further details. (Notice on the same page that the Athlon is not better at everything though).

  207. I don't think you understand what you are asking.. by slothbait · · Score: 5

    Are you referring to hard real-time applications? If so, you don't want an SMP system. In fact, you don't even want a system with a cache. Why? Because hard real-time applications optimize for worst case performance. Caches do well to improve average case performance, but usually hurt worst case performance. Thus, hard real time systems usually don't use caches.
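
    To put rough numbers on the average-case versus worst-case point, here is a minimal sketch. The latencies and hit rate are made-up assumptions for illustration, not measurements of any real chip, and the model (a miss costs the cache lookup plus the memory access) is deliberately simplistic.

```python
# Hypothetical figures, chosen only to illustrate the trade-off.
CACHE_NS = 5.0     # assumed cost of a cache lookup
MEMORY_NS = 100.0  # assumed cost of going to main memory
HIT_RATE = 0.95    # assumed fraction of accesses that hit the cache

def average_access_ns(hit_rate, cache_ns, memory_ns):
    """Average cost with a cache: hits pay the lookup, misses pay lookup plus memory."""
    return hit_rate * cache_ns + (1.0 - hit_rate) * (cache_ns + memory_ns)

def worst_case_access_ns(cache_ns, memory_ns):
    """Worst case with a cache: the access misses, so it pays lookup plus memory."""
    return cache_ns + memory_ns

cached_avg = average_access_ns(HIT_RATE, CACHE_NS, MEMORY_NS)
cached_worst = worst_case_access_ns(CACHE_NS, MEMORY_NS)
uncached = MEMORY_NS  # without a cache every access costs the same

print(f"with cache:    average {cached_avg:.1f} ns, worst case {cached_worst:.1f} ns")
print(f"without cache: always  {uncached:.1f} ns")
```

    Under these assumed numbers the cached system is far faster on average, but the figure a hard real-time designer has to budget for is the worst case, and that one is slightly worse with the cache than without it.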

    In general, real time apps (even soft real time, like video or audio decode) are concerned more with low latency than high throughput. As a result, you aren't going to want an SMP system. The complex caching systems in SMPs are going to make performance even *less* predictable, which is precisely *not* what you want for real-time.

    If you just want a really fast media-cruncher, then you don't want to be running x86. If you're serious, you'll go for something like an Alpha, that will smoke any x86 in FPU. Besides, if you really want to get into real time media processing, you are going to need a great deal of bandwidth, and commodity x86 hardware isn't going to get you where you need to go.

    If what you really want is a budget box to run games on, then get an Athlon. A quick review of any games site in existence will tell you that Athlons beat Intel's offering in every regard these days. There's no reason to bother Slashdot with such common questions. Any of the DIY gamer sites will have a host of articles with benchmarks running Quake or Unreal or whatever it is kids play these days.

    I get the impression that this post is from someone who doesn't understand real time computation, and just threw that phrase in there to make their question sound more sophisticated.

    --Lenny

  208. PPC 7400 by Pope · · Score: 5

    If you're doing Signal Processing or DSP-related tasks, the 7400 smokes.
    As always, it all depends on what numbers you're crunching, and for what purpose. The vector processing in the 7400 is pretty sweet if done right, and one of the Linux PPC variants has full support now.
    Just a thought to get away from x86 ;)


    Pope

    --
    It doesn't mean much now, it's built for the future.
  209. The stupidest question I've ever heard by ShaggyZet · · Score: 5

    No, really, it is. What does real-time have to do with this? You can't solve an arbitrarily complex mathematical problem in a bounded amount of time. And that's what real-time is: bounded. Real-time has nothing to do with speed in the sense that this question is phrased. An example: A given problem must be solved in 15 seconds or less. Chip A can solve the problem in 1 second most of the time, but will take 30 seconds every once in a while. Chip B will usually take 5 seconds, but never more than 10. Guess which chip can give you a real-time guarantee? That's right, even though Chip B is usually slower, it is the one you pick.
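
    The Chip A / Chip B example can be written down as a small sketch. The timings are the ones from the example; the dictionary layout and function names are hypothetical, invented here just to show that the real-time test looks only at the worst case.

```python
# Timings in seconds, taken from the Chip A / Chip B example.
DEADLINE = 15.0

chips = {
    "A": {"typical": 1.0, "worst": 30.0},
    "B": {"typical": 5.0, "worst": 10.0},
}

def meets_deadline(chip, deadline):
    """A real-time guarantee depends only on the worst case, never the typical case."""
    return chip["worst"] <= deadline

def pick_real_time(candidates, deadline):
    """Return the names of chips whose worst-case time fits within the deadline."""
    return [name for name, chip in candidates.items() if meets_deadline(chip, deadline)]

# The chip that looks best if you only compare typical speed.
fastest_typical = min(chips, key=lambda name: chips[name]["typical"])
# The chips that can actually promise the deadline.
real_time_ok = pick_real_time(chips, DEADLINE)

print("fastest on a typical run:", fastest_typical)
print("can guarantee the deadline:", real_time_ok)
```

    Chip A wins the typical-speed comparison, but only Chip B passes the deadline check, because its worst case of 10 seconds is inside the 15 second bound while Chip A's 30 seconds is not.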