Slashdot Mirror


User: akuma(x86)

akuma(x86)'s activity in the archive.

Stories
0
Comments
407

Comments · 407

  1. Re:AMD on Sun's President Dreams of a Linux Future · · Score: 1

    AMD does a pretty decent uniprocessor CPU, but you need a backplane by someone like Seymour Cray to run a serious 64-processor system. Sun, not being stupid, have a backplane first designed by... some guy named Cray.

    Exactly - Get out of the highly expensive CPU development game and focus on software and systems. Clearly Dell can't build a 64-way system. Sun should do that.

    My comment about AMD was that they have a processor solution that can more easily scale because they now have point-to-point links and aren't as constrained as other processors that have shared bus architectures.

  2. Re:Bad, BAD news for Sun on Sun's President Dreams of a Linux Future · · Score: 3, Insightful

    As a post note, Sun made theirs by grabbing a commodity operating system, putting good hardware underneath it, and selling it for a fair price. Why can't they do that anymore?

    The costs involved in creating "good hardware" are astronomical in this day and age. Fabs to manufacture processors cost upwards of 2 billion dollars. IBM has the resources (economies of scale) to make their own CPUs and systems, but even IBM lost a ton of money in its microelectronics division. Sun does not have those resources. Their cost structure is totally out of whack with the rest of the industry. They need to get out of SPARC and get into something more cost effective like x86 - In fact the move to sell Opteron systems may be a sign of bigger things to come.

    It's hard to make money against an IBM or an HP when your cost structure requires you to spend 3x more per system on crap like SPARC development.

    My recommendation would be to abandon SPARC, move to x86 and build enterprise systems off of that. Then, fix Solaris on x86 so it actually scales and runs well. In effect - turn yourself into a software company. I keep hearing that Solaris has scalability advantages over Linux. Use that as your competitive advantage. AMD provides a good scalable CPU solution.

  3. Re:Better methods needed on Why We Need a Second Moore's Law · · Score: 1

    Again, this also means not using "tricks" like bit shifting when multiplying by powers of 2, which while giving a small boost in performance, can seriously degrade readability of code, unless it's well documented, and it's usually not.

    Good post. Programmers should read carefully.

    A good optimizing compiler will do the shift instead of multiply for you automatically. So, you're even more right -- don't even bother thinking about these small optimizations - they are entirely automated anyways.
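
    As a rough illustration of why a compiler can make this substitution safely: for integers, shifting left by k bits gives the same result as multiplying by 2**k. The snippet below is only a sketch in Python with made-up function names, not what any particular compiler emits.

```python
def times_power_of_two(x: int, k: int) -> int:
    """Readable form: multiply by 2**k."""
    return x * (2 ** k)


def shift_left(x: int, k: int) -> int:
    """The 'trick' form: shift left by k bits."""
    return x << k


# The two forms agree for every integer tried here, including negatives.
for x in range(-1000, 1001):
    for k in range(0, 8):
        assert times_power_of_two(x, k) == shift_left(x, k)
```

    Since the results are identical, the readable multiply can stay in the source and the choice of instruction can be left to the compiler.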

    Not only that, but a lot of code is dynamically rescheduled and optimized in modern CPUs! You would not believe the transformations that occur under the hood of a high performance CPU such as an Athlon or a P4. If the shift transformation is impossible to know at compile time, don't be surprised if the processor will dynamically "predict" a shift and then correct itself and do a multiply if it was wrong, working in a way that is similar to branch prediction. The programmer is none the wiser.

    Higher order algorithm design and the cache-friendly and efficient layout of data structures in memory are FAR more important. Even memory layout can be optimized "under-the-hood" by a garbage collector in a managed runtime like Java or .NET.

    Focus on the architecture and algorithm design, people.

  4. Re:Having a lot of something is no excuse to waste on Why We Need a Second Moore's Law · · Score: 1

    Developers aren't lazy. They're just working within the constraints of a problem.

    There's a cost to development that is measured in engineering time. The more optimized the code is, the more engineering time you need to put into it.

    The tradeoff is - faster time to market vs. tighter/faster code. The cost of Memory/CPU/Hardware resources decays exponentially every year. The cost of engineering time has not fallen nearly as much. In many cases the tradeoff becomes easy.

    Taking the "efficient code" philosophy to the extreme, programmers would write their own operating systems that were specifically optimized to their application. That ONLY happens in embedded spaces or perhaps some enterprise applications where performance actually has some material value that exceeds a shorter/lazier development time.

  5. Re:It might just be time for.... on Intel Plans CPU Naming Change · · Score: 1

    Really, the technical community needs to sit down and figure out a universal cross-platform benchmarking method.

    It's called SPEC. It is supported by most of the industry (Intel, AMD, IBM, Sun etc...).

    But since Intel dominates SPEC performance, people cry foul.

    I agree that there should be more cross-platform benchmarks outside of SPEC, but there doesn't seem to be any interest in creating such a benchmark.

  6. Re:Cheap! on Matchbox Sized Color Projectors? · · Score: 1

    Pick 2. you can't have all three.

    Why?

  7. Re:64 bits is old History on Intel 64-bit Announcements at IDF · · Score: 1

    Not necessary, there have been RISC PCs for quite a while. The Acorn, the Apple Power Macintosh, the Digital Personal Workstations and Multias, and all their clones were and are quite affordable machines. And the Digital (and clonemakers') ones were 64 bits, unless you insisted on running MS Windows NT instead of Debian GNU/Linux (I mention Debian 'cause it's the same on any hardware, and popular), NetBSD (same about portability, minus popularity), or Digital Unix (or OSF/1, or Tru64 -- just pick your favourite nom du jour).

    All of these alternatives were more expensive than comparable PCs. That's why the market for them never took off. Even Alpha, with the power of MS Windows behind it, could not convince developers to develop mainstream applications for it. Why? Because alpha based machines cost an order of magnitude more than a PC.

    I can't to this day, nor even a cheaper one, 'cause they're simply not available at Brazil. But I do have four Power Macintoshs in the family, all bought at rock-bottom prices either used or leftovers from Apple's changing lines.

    We were talking about 64 bit computing... Are your 4 power macs 64-bit?

    But now, we still don't have OpenFirmware in PCs, because Intel (and AMD) would rather we use their proprietary stuff with horrible ACPI instead of the Forth standard already present in all RISC vendors. We still don't have decent interoperability -- it works, but with much human pain involved -- because of secret file formats, APIs and protocols. We still don't have decent performance, memory addressing or power efficiency because Intel wants us to go to their horse-produced marchitecture, and AMD's only shot at the mass market is perpetuating part of x86's horrors into 64 bitness.

    So you're saying it's easier to build a RISC system than a PC because the firmware is in FORTH? Why then do clone makers not build RISC systems instead of going through "all of the pain" of the PC architecture?

    We have ridiculous amounts of performance. Desktop PCs have more memory bandwidth than workstation SPARCs. The highest SPEC scores on the planet belong to x86 machines.

    Like, PCI is open, but not the BIOS or the ISA -- ISA as in chip architecture, not the obsolete bus. And if it could better, easilier interoperate with the de facto standard.

    The ISA is pretty open. Transmeta made processors based on it without legal difficulty. Bochs is an open source x86 emulator that was developed without legal issues. Academics have built processor simulators based on x86.

    And not enough people bought them because MS never ported MS Windows to 64 bits specifically, nor even recompiled all its apps to RISC in general -- even MS Windows NT for the Alpha never got, say, MS Access or games, and only got MS Visual BASIC when it was almost dying already.

    What reason would Microsoft have to port applications to an architecture with such SMALL volumes?

    Economics teach us that, had we open standards acceptance which is currently hindered by the duopoly, we'd have US$ 500,- RISC computers too, but these would require less clock, less memory, less footprint, less energy, and they would produce less noise and last more. Incredible how ecologists blather about efficiency and garbage disposal, yet few people point out that Wintel makes for inefficient, practically disposable systems.

    You talk about power/energy/clock efficiency and at the same time praise how great Alpha was. Alpha had the highest clock rate, the highest power and was the most expensive of all the RISCs or CISCs of its time. RISC vs. CISC is a red herring. You're basically doing the same amount of computation which, to a first order, requires the same amount of energy. Just look at the new Mac G5 machines with their huge fans and power dissipation. These are also RISC machines. RISC machines also use up more memory since their code footprints are larger (more instructions to specify an algorithm than a comparable CISC ISA).

    Market forces have given us the computers as we know them today. Progress has been stifled by Microsoft because the market was not allowed to function correctly, but this is being remedied by Linux.

  8. Re:64 bits is old History on Intel 64-bit Announcements at IDF · · Score: 1

    Wrong. 64-bit computing is ten years old with the Alpha, including PCs running GNU/Linux. Not to mention the later UltraSPARC, PA-RISC 2 and MIPS workstations.

    And today we already have the PowerPC G5.

    This all proves Wintel is the biggest drag in Informatics.


    Did you own RISC workstations 10 years ago? Wow you must be rich! I certainly could not afford to shell out $25,000 USD for a unix workstation 10 years ago.

    Wintel isn't a drag on informatics. Wintel has brought cheap/ubiquitous computing to the masses. Linux is amazing - cutting the cost of computers down even more.

    Alpha and their ilk died because not enough people bought them. The only reason computers exist is because people are willing to buy them and the producer makes a profit. It's too bad not everyone can afford to buy a $25,000 computer, but a $500 computer that performs at maybe 80% of a high end equivalent RISC today is pretty damn good.

  9. Re:64 bit systems for whom? on Intel 64-bit Announcements at IDF · · Score: 1

    2) There isn't a performance hit, especially since the AMD chips run 32 bit code so well, and I'm sure Intel's chip will too.

    There certainly IS a performance hit.

    Doubling the data-path width of the processor impacts your max frequency. You can't run as fast as you could have with a 32-bit datapath.

    Since your memory pointers are now 64 bits, there is more pressure on your caches and DRAM bandwidth, so that these things look smaller.

  10. Re:And lest we forget... on Intel 64-bit Announcements at IDF · · Score: 1

    and a few years or decades later we'll have 128-bit personal computing

    Why would that be true? Going from 8->16->32->64, if you notice the pattern forms an EXPONENTIAL series.

    128 bits can address about 3.4 x 10^38 bytes, which is vastly more memory than anyone could plausibly build - ie - you'd never have enough memory to need 128 bits.
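
    To put numbers on that exponential series, here is a small sketch that prints how many distinct byte addresses each pointer width can name. The helper name is made up for this example.

```python
def addressable_bytes(address_bits: int) -> int:
    """Number of distinct byte addresses for a given pointer width."""
    return 2 ** address_bits


# Each doubling of the width squares the size of the address space.
for bits in (8, 16, 32, 64, 128):
    print(f"{bits:>3}-bit: {addressable_bytes(bits):.3e} bytes")
```

    Going from 32 to 64 bits squares the address space rather than doubling it, which is why the jump from 64 to 128 bits is not on the same timescale as the earlier ones.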

  11. Re:Why 64 bit? on Intel 64-bit Announcements at IDF · · Score: 1

    If we change architectures, it will be less about addressing limitations and more about the piss-poor quantity of registers available on ia32. More registers means more obtainable instruction-level parallelism.. this equals more work done on modern architectures.

    Modern architectures don't need more architectural registers. Register renaming takes care of the false dependencies and the spill/fill loads and stores can be eliminated from the critical paths of programs due to fast store forwarding buffers and even more advanced techniques such as memory renaming.
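
    A toy sketch of the register renaming idea, assuming an unlimited pool of physical registers and three-operand instructions written as (dest, src1, src2). The function and the register names p0, p1, ... are invented for illustration; real hardware uses a finite register file and a free list.

```python
from itertools import count


def rename(instructions):
    """Map architectural registers to fresh physical registers.

    Every write gets a new physical register, so a later write to the
    same architectural register no longer conflicts with earlier
    readers or writers (no false WAR/WAW dependencies).
    """
    fresh = count()   # assumption: unlimited physical registers
    mapping = {}      # architectural name -> current physical name
    renamed = []
    for dest, *srcs in instructions:
        phys_srcs = []
        for src in srcs:
            if src not in mapping:          # value live on entry
                mapping[src] = f"p{next(fresh)}"
            phys_srcs.append(mapping[src])
        mapping[dest] = f"p{next(fresh)}"   # new name for every write
        renamed.append((mapping[dest], *phys_srcs))
    return renamed


program = [
    ("eax", "ebx", "ecx"),  # eax = ebx op ecx
    ("edx", "eax", "eax"),  # reads eax (true dependency)
    ("eax", "esi", "edi"),  # reuses eax (false dependency on the above)
    ("ebx", "eax", "eax"),  # reads the second eax
]
for original, new in zip(program, rename(program)):
    print(original, "->", new)
```

    After renaming, the two writes to eax land in different physical registers, so the third instruction does not have to wait for the second one to finish reading. Only the true read-after-write dependencies remain.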

    The fastest integer machines on the planet are x86 based. 8 registers hasn't been a problem for quite some time now.

  12. Re:No it's not on Intel 64-bit Announcements at IDF · · Score: 1

    You don't churn through twice as much *useful* information as 99.9% of all integers only need 32 bits (or less), so really the higher order bits are being discarded and 32-bit processors already have 64-bit floats. The quote is misleading and suggests that the 64-bit processor is going twice as fast

    If you want to argue that route, you'd find that about 75% of all integer arithmetic only requires 8-bits of precision (small loop counters, booleans and such...)

    64 bit processors do process twice as much data (in the integer paths) whether they're "useful" or not. In fact, making the ALU wider (going to 64 from 32), puts major pressure on timing - potentially lowering the max-frequency of the processor. I've said it before and I'll say it again - 64 bit processors running 64 bit code are more likely than not to run slower than a 32 bit processor.

  13. Re:Intel wouldn't ditch Itanium... on Intel 64-bit Announcements at IDF · · Score: 1

    IPF is dead.
    Get over it already...
    It was ill-conceived both from an engineering and a business perspective.

    The HPC market is high performance and sexy, but it's certainly not where the big bucks are. Certainly not enough dollars to support multiple MPU design teams. If HPC were a volume market or major profit center, you'd see IA32 solutions with caches and memory bandwidths to match IPF implementations. Larger caches and memory bandwidth are the 1st order reason they excel in those apps. Ditto for TPC.

    IPF was designed for the enterprise. However, it's been a miserable failure there. Xeon and even Desktop CPUs are already eating IPF's lunch. The addition of 64 bit addressing only puts a further nail in the coffin. IPF was, and is, a market failure.

    Intel isn't saying it just yet for obvious reasons. They're just waiting until the infrastructure is in place to support x86-64 (open platforms - RAS features, better busses and memory technology for scalability, operating systems (MS, Linux), and of course --- applications).

    I foresee the workstation market being completely dominated by x86 - even more so than it already is. As standardization moves up the enterprise stack, x86 will encroach into the coveted IBM, HP and SUN enterprise markets. I foresee Dell winning bigtime because all of their "R&D" is outsourced to Intel (CPU + Platforms), IBM (for linux) and Microsoft. History and economics show that in the computer business, the lowest cost producer wins.

    x86 will be everywhere - from embedded, PC, workstation, server/enterprise.

    Time to go short some more SUNW stock.

  14. Re:Once bitten, twice shy? on ESR's Open Letter to McNealy: Set Java Free! · · Score: 1

    Out of interest, would you name some better IDEs..?

    Emacs/gcc/gdb :)

  15. Re:Not gonna happen on ESR's Open Letter to McNealy: Set Java Free! · · Score: 1

    Sun sells hardware. That's what drives the majority of the profits.

    They're screwed with or without java.

    They can try to maintain control of java and claim that it works best on their hardware, but that just isn't the case given that their hardware has been significantly outperformed by other platforms. Their hardware is having problems competing with x86 solutions in the low end and IBM/HP solutions in the high end.

    Why would freeing java help them? It's a "don't care" either way.

  16. Re:ASM is not the place to start. on Learning Computer Science via Assembly Language · · Score: 1

    I disagree. It is important for even an architect or algorithm designer to understand how the machine works underneath.

    First you start with simple gates. Then you build simple combinational circuits with the gates. Then you build latches. Then you explain how to build sequential circuits. Then you teach how to control the circuits with sequences of control signals. Then you teach them to store the control signals in memory - AH-HA - your first stored program!
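
    The first two steps of that progression (gates, then combinational circuits) can be sketched in a few lines. This is only an illustration with made-up function names: everything is built from a single NAND primitive, and latches and sequential circuits are not modeled.

```python
def nand(a: int, b: int) -> int:
    """The only primitive gate; inputs and output are 0 or 1."""
    return 1 - (a & b)


# Every other gate is defined purely in terms of NAND.
def not_(a: int) -> int:
    return nand(a, a)


def and_(a: int, b: int) -> int:
    return not_(nand(a, b))


def or_(a: int, b: int) -> int:
    return nand(not_(a), not_(b))


def xor_(a: int, b: int) -> int:
    return and_(or_(a, b), nand(a, b))


def full_adder(a: int, b: int, carry_in: int):
    """A combinational circuit built from the gates above."""
    partial = xor_(a, b)
    total = xor_(partial, carry_in)
    carry_out = or_(and_(a, b), and_(partial, carry_in))
    return total, carry_out


# Print the truth table of the full adder.
for a in (0, 1):
    for b in (0, 1):
        for c in (0, 1):
            print(a, b, c, "->", full_adder(a, b, c))
```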

    This is an extremely general concept. It is important for CS people to understand that languages, algorithms etc... are an abstraction of a real machine.

    You can think of C as a shorthand for assembly. C is like assembly but it automates register allocation and facilitates the naming of memory location through an abstraction layer.

    By showing how tedious it can be to grind through all of the details that go on in the gates and circuits, you teach the student the true value of abstraction. You teach them what the right abstractions are and why they work. Once you've learned all of that, THEN you can go on to the object oriented design principles which are yet another abstraction on top of non-OO procedural code. Motivating the reason for abstraction is crucial.

  17. Learn from the best on Learning Computer Science via Assembly Language · · Score: 1

    I'm all for the bottoms up approach to learning about computers.

    Here is a book that every aspiring computer scientist should read:

    Introduction to Computing Systems: From bits & gates to C & beyond
    by Yale N. Patt, Sanjay J. Patel

  18. Re:Let's kill x86! on How to Kill x86 and Thread-Level Parallelism · · Score: 1

    And of course, address translation doesn't cost anything in terms of die size, performance or power consumption

    Nope. It doesn't. It used to in the early 90s, but now we have transistors to spare. The ISA doesn't matter anymore. It's at most 2nd order effect on die size, power and performance. I design x86 processors for a living - It's a fact.

    There are a million tricks architects can play to get around poor ISAs. What are the fastest SPECint machines on the planet? Hmm...x86 machines!

    The only reason Itanium wins on SPECfp are cache and memory bandwidth which are completely orthogonal to the ISA.

    It didn't contribute to unnecessarily complicated code and operating systems

    The cost of developing and validating that code has been paid. We're enjoying the benefits of that labor. Now you want to start over? Re-validate? Re-compile? You're asking way too much. Compilers can compile to x86 now. Processors can optimize out the cruft. Productive programmers don't spend their lives programming in assembly anymore...Why should anyone care that they have an x86 ISA under the hood instead of say...Alpha?

  19. Re:64-bit rant [move along] on Intel Shifting 64-bit Plans · · Score: 1

    just look at p4 vs. athlon - the tremendous clock speeds realized by the p4's use of an extended pipeline (which is a risc-like optimization) have a tremendous downside - you lose a lot of time resetting the cache if you miss a branch. so for interactive programs, as opposed to massive number crunching (and that can be addressed cheaper using MPP and clustering), risc is something of a dog.


    Deeper pipelines are not necessarily inefficient!

    If I have a 1 GHz processor with a 10 cycle branch misprediction re-fill penalty and a 2 GHz processor with a 20 cycle branch misprediction penalty -- guess what? The penalty is the same!
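
    The arithmetic behind that, as a quick sketch (the function name is made up; 1 GHz means one cycle per nanosecond):

```python
def mispredict_penalty_ns(penalty_cycles: int, clock_ghz: float) -> float:
    """Wall-clock cost of one branch misprediction, in nanoseconds."""
    return penalty_cycles / clock_ghz


slow_clock = mispredict_penalty_ns(10, 1.0)  # 10 cycles at 1 GHz
fast_clock = mispredict_penalty_ns(20, 2.0)  # 20 cycles at 2 GHz
print(slow_clock, fast_clock)
```

    Both work out to the same number of nanoseconds, which is the point: the penalty has to be compared in time, not in cycles.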

    Now, doubling the number of pipeline stages does NOT get you double the frequency due to latch overhead and that is why people don't pipeline even deeper.
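
    A simple first-order model shows why. Assume the cycle time is the total logic delay divided evenly across the stages, plus a fixed latch overhead per stage. The 10 ns and 0.1 ns figures below are hypothetical values picked for illustration, not measurements of any real processor.

```python
def clock_ghz(total_logic_ns: float, stages: int, latch_overhead_ns: float) -> float:
    """Clock frequency under a simple evenly-split pipeline model."""
    cycle_ns = total_logic_ns / stages + latch_overhead_ns
    return 1.0 / cycle_ns


# Hypothetical numbers: 10 ns of logic, 0.1 ns of latch overhead per stage.
shallow = clock_ghz(10.0, 10, 0.1)
deep = clock_ghz(10.0, 20, 0.1)
speedup = deep / shallow
print(f"{shallow:.3f} GHz -> {deep:.3f} GHz, speedup {speedup:.2f}x")
```

    Under these assumptions doubling the stage count gives less than double the clock, and the shortfall grows as the latch overhead becomes a larger share of each cycle.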

    The K8 went to 12 stages from K7's 10 stages. Is it less efficient?

    You'll find that processors over time INCREASE the number of pipeline stages. Just look at the early RISC processors that had a meager 5 pipeline stages.

    Pipelines are getting longer for a reason - more performance. Intel just takes it to the extreme. They're taking a page from the Alpha playbook of high performance processor design. Alphas were always clocked higher than the competition.

  20. Re:Itanium is not being replaced on Intel Shifting 64-bit Plans · · Score: 1

    I think we're agreeing :)

    I was trying to say that instruction-set has very little to do with performance as the parent was attempting to suggest.

    If you put equal sized caches on a P4 or Athlon and equip them with comparable busses to DRAM, then yeah, you'll see some pretty impressive SPEC scores that would probably exceed Itanium.

  21. Re:Itanium is not being replaced on Intel Shifting 64-bit Plans · · Score: 1

    Itanium only kicks ass on SPEC because of their ridiculously large caches and high memory bandwidth.

    These 2 things have NOTHING to do with IA64. You could have just as easily put the large caches/FSB-bandwidth on a P4 and let me assure you, the P4 would be the highest performance SPEC machine on the planet.

  22. Re:I guess the home market rules... on Intel to Increase Stages in Prescott · · Score: 1

    Unfortunately, in real-world situations cache thrashing is difficult to avoid, and accurate branch prediction is a highly non-trivial affair. When a prediction turns out to be wrong, the cost of refilling a stalled pipeline increases in proportion to the pipeline length. The ever-lengthening pipelines of P4 chips means that, although its FP performance may r0x0r, the overhead of stalls makes production code run like treacle.

    The cost of refilling the pipe is measured in time and not pipe-stages. If you're running a processor with 5 stages of misprediction penalty at 1 GHz and another processor with 10 stages of misprediction penalty at 2 GHz, then they will refill their pipes in the same amount of time.

    The penalty of the P4 mispredicts relative to the penalty of Athlon mispredicts is NOT the ratio of pipe-stages because the P4 runs at a much higher clock by design.

    On another note - Scientific FP code tends to be heavily loop based which means that branch prediction is very accurate and longer pipes don't hurt as much. If your memory access pattern is predictable, prefetching can solve many problems with the cache.

  23. Re:Still not convinced on AMD's Roadmap revealed · · Score: 1

    Integrating the memory controller reduces latency by 20-30%. At 2.0GHz this makes a BIG difference (this is the main reason why a 2.0GHz Athlon64 is faster than a 2.2GHz AthlonXP), at 4 or 5GHz the difference will be huge.

    So reducing latency by 20-30% is "huge", but only having half of the bandwidth (3.2 vs 6.4) is no biggie? Many memory accesses can get prefetched so latency isn't as big a deal as you make it out to be. Of course, if you're prefetching, more bandwidth will help you out considerably.
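
    A back-of-the-envelope sketch of that tradeoff, treating one memory transfer as latency plus size divided by bandwidth. The 3.2 and 6.4 GB/s figures are the ones under discussion; the 75 ns and 100 ns latencies are hypothetical (a 25% reduction), and the model ignores overlapping or prefetched accesses.

```python
def transfer_ns(size_bytes: int, latency_ns: float, bandwidth_gb_s: float) -> float:
    """Time for one unoverlapped transfer; 1 GB/s equals 1 byte per ns."""
    return latency_ns + size_bytes / bandwidth_gb_s


CACHE_LINE = 64         # bytes, assumed line size
STREAM = 1024 * 1024    # bytes, a 1 MiB streaming access

for label, size in (("cache line", CACHE_LINE), ("1 MiB stream", STREAM)):
    low_latency = transfer_ns(size, 75.0, 3.2)        # hypothetical latency
    high_bandwidth = transfer_ns(size, 100.0, 6.4)    # hypothetical latency
    print(f"{label}: {low_latency:.0f} ns vs {high_bandwidth:.0f} ns")
```

    With these made-up latencies the lower-latency setup wins on a single cache line, while the higher-bandwidth setup wins once the access is large or streaming, which is the case where prefetching applies.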

    I guess you don't encode DVDs to DIVX or do any high performance floating point work?

  24. A summary of 64 bit costs and benefits on Are 64-bit Binaries Slower than 32-bit Binaries? · · Score: 1

    With 64 bits you need to make the critical path of the processor wider (the ALUs, the register file etc...). This makes your circuits slower. You have the lost opportunity cost of frequency -- You could have made the processor run at a higher frequency with a 32-bit datapath - For those of you with a background in computer arithmetic, think about computing the carry-bit in an adder.
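
    For readers without that background, here is a sketch of the carry problem using the simplest possible adder, a ripple-carry design, where each bit has to wait for the carry from the bit below it. Real ALUs use faster parallel-prefix adders, whose carry logic grows roughly with the logarithm of the width; the second helper gives that level count. Both function names are made up for this example.

```python
def ripple_carry_add(a: int, b: int, width: int):
    """Add two width-bit values one bit at a time, carrying upward."""
    carry = 0
    result = 0
    for i in range(width):          # each step depends on the previous carry
        x = (a >> i) & 1
        y = (b >> i) & 1
        result |= (x ^ y ^ carry) << i
        carry = (x & y) | (carry & (x ^ y))
    return result, carry            # carry is the carry-out of the top bit


def prefix_adder_levels(width: int) -> int:
    """Carry-merge levels in a parallel-prefix adder: ceil(log2(width))."""
    return (width - 1).bit_length()


print(ripple_carry_add(0xFFFFFFFF, 1, 32))  # carry ripples through all 32 bits
print(prefix_adder_levels(32), prefix_adder_levels(64))
```

    Either way the 64-bit adder has a longer carry path than the 32-bit one, so the wider datapath costs some cycle time.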

    64 bit addressing means that your pointers now take up twice as much space as your old 32 bit pointers. This exerts more pressure on your memory system and makes your L1 and L2 caches appear smaller. It also makes your memory bandwidth appear narrower since you need to ship more data across the bus between the processor and DRAM. You can't say something like - oh the 64 bit processor's bus is twice as wide because you could have made the 32 bit processor's bus twice as wide. In other words, the width of the bus is orthogonal to the instruction set supporting 64 bits. Similar arguments hold for caches.
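
    A small sketch of the "caches appear smaller" effect, assuming a 64-byte cache line and a doubly linked list node with a 4-byte payload. Both figures are assumptions for illustration, and alignment padding is ignored.

```python
CACHE_LINE_BYTES = 64  # assumed line size, for illustration only


def pointers_per_line(pointer_bytes: int) -> int:
    """How many bare pointers fit in one cache line."""
    return CACHE_LINE_BYTES // pointer_bytes


def list_node_bytes(pointer_bytes: int, payload_bytes: int = 4) -> int:
    """Unpadded size of a doubly linked list node: payload + prev + next."""
    return payload_bytes + 2 * pointer_bytes


for name, pointer_bytes in (("32-bit", 4), ("64-bit", 8)):
    print(name, pointers_per_line(pointer_bytes), "pointers per line,",
          list_node_bytes(pointer_bytes), "bytes per node")
```

    The same cache holds half as many pointers, and a pointer-heavy node grows from 12 to 20 bytes, so fewer nodes fit in L1 and L2 and more bytes cross the bus for the same traversal.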

    The benefit of x86-64 is that it provides more registers. This means that there are fewer loads and stores which are used to spill and fill stack variables. This reduces the number of instructions required. It doesn't necessarily speed up the processor because stack variables would live in the store-forwarding buffer and would bypass the cache lookup, but you do have fewer instructions which means less pressure on your instruction cache and lower instruction-fetch bandwidth requirements.

    There is a benefit for applications that naturally use 64-bit data types. This is a minor effect because many apps do not need the full 64 bit integer arithmetic. This benefit is also offset by the opportunity cost of frequency described above.

    The other big benefit is that you can now address a 64-bit virtual address space which is convenient for accessing large data structures. This is the main reason why processor architects want to move to 64 bits.

  25. Re:Scientific work on optimal pipeline depth on Intel to Increase Stages in Prescott · · Score: 1


    In the introduction of the Intel paper, it says "Focusing on single stream performance". So, basically they are focusing on artificial benchmark performance.


    Ummm no...

    At Intel we simulate real benchmarks - things like Unreal Tournament and Adobe Acrobat.

    Single stream performance refers to a single thread executing in the processor core at once. Of COURSE you can have a context switch and go to another OS thread.

    Dual stream (Hyperthreading/SMT) has different performance characteristics for pipeline depth. SMT refers to having 2 or more simultaneous threads in the pipeline - simultaneously...at the same time...

    Not speaking for Intel...