Slashdot Mirror


AMD Previews New Processor Extensions

An anonymous reader writes "It has been all over the news today: AMD announced the first of its Extensions for Software Parallelism, a series of x86 extensions to make parallel programming easier. The first are the so-called 'lightweight profiling extensions.' They would give software access to information about cache misses and retired instructions so data structures can be optimized for better performance. The specification is here (PDF). These extensions have a much wider applicability than just parallel programming — they could be used to accelerate Java, .Net, and dynamic optimizers." AMD gave no timeframe for when these proposed extensions would show up in silicon.

198 comments

  1. Is that what's holding up Barcelona? by Anonymous Coward · · Score: 0

    Anybody?

  2. And so it goes on... by ceeam · · Score: 0, Offtopic

    I wonder - amongst 16-bit "real mode", 16-bit "protected mode", 32-bit mode, 64-bit mode - how many different instruction kinds / opcodes a modern x86 CPU supports?

    1. Re:And so it goes on... by edwdig · · Score: 2, Informative

      There's very little difference between the instructions in the different modes. The memory management unit is where most of the differences are. Properly written 16 bit real mode code will still run in 16 bit protected mode. The only difference is how the segment portion of the pointer is interpreted.

      As for 16 bit vs 32 bit modes: the instructions are mostly the same. A code segment is specified as being either 16 or 32 bit. That size is the default data size used by instructions within that segment. There is a "size override" prefix, which if found immediately before an instruction, tells the CPU that the following instruction should use the opposite of the default size.
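
      The override rule described above can be sketched as a toy model. It assumes the only inputs are the code segment's default operand size and whether the 0x66 operand-size prefix is the first byte; the function name is made up for illustration, and a real decoder has many more cases to handle (byte-sized operands, other prefixes, 64-bit mode).

```python
OPERAND_SIZE_PREFIX = 0x66  # x86 operand-size override prefix byte


def effective_operand_size(default_bits: int, instruction: bytes) -> int:
    """Return 16 or 32 for a toy instruction in a 16- or 32-bit code segment.

    Simplification: only a prefix in the very first byte is considered.
    """
    if default_bits not in (16, 32):
        raise ValueError("this sketch only models 16- and 32-bit segments")
    has_prefix = len(instruction) > 0 and instruction[0] == OPERAND_SIZE_PREFIX
    if has_prefix:
        # The prefix selects the opposite of the segment's default size.
        return 16 if default_bits == 32 else 32
    return default_bits


# "mov ax, 0x1234" as encoded in a 16-bit segment (no prefix needed) ...
mov_ax_in_16bit_segment = bytes([0xB8, 0x34, 0x12])
# ... and the same operation in a 32-bit segment (prefix required).
mov_ax_in_32bit_segment = bytes([0x66, 0xB8, 0x34, 0x12])

size_a = effective_operand_size(16, mov_ax_in_16bit_segment)
size_b = effective_operand_size(32, mov_ax_in_32bit_segment)
```

      Both `size_a` and `size_b` come out as 16: the opcode byte is the same, and only the segment default plus the prefix decide the operand width.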

      I don't remember the specifics, but 64 bit mode just continues along with the same ideas. There aren't many changes from 32 bit code to 64 bit.

    2. Re:And so it goes on... by Ant P. · · Score: 2, Interesting

      It was at least 200 last time I read - and the source was an 80486 programming book. I think there's at least that many more in the different versions of SSE.

    3. Re:And so it goes on... by funkatron · · Score: 1

      Probably enough to start dropping a few. The 16 bit instructions could be disposed of without anyone noticing for a start.

      --
      "Welcome to our world. We are the wasted youth. And we are the future too." Yes, I know these are stupid lyrics.
    4. Re:And so it goes on... by mastermemorex · · Score: 1

      they could be used to accelerate Java, .Net, and dynamic optimizers

      So for CPU manufacturers C++ is dead. Thanks for being so clear.

    5. Re:And so it goes on... by TheOrquithVagrant · · Score: 1

      If only it were so. Unfortunately, it's not. There's a distressing amount of 16-bit real-mode code being executed in between power-on and your OS kernel switching into 32 or 64 bit mode even on the most modern PC.

    6. Re:And so it goes on... by BillyBlaze · · Score: 1

      In a sense, the 16 bit instructions have been dropped, if only when running in 64-bit mode. Which is actually kind of annoying, because it means some of those old Windows 3 and DOS programs won't run without emulation.

    7. Re:And so it goes on... by Anonymous Coward · · Score: 0

      So for CPU manufacturers C++ is dead. Thanks for being so clear.

      Sounds like C++ is already as optimized as it's going to get, and only Java/.NET etc. will benefit more from this.
    8. Re:And so it goes on... by AnyoneEB · · Score: 1

      Java and .Net are JIT compiled. C++ is a normal compiled language. I assume the extensions are helpful to JIT compilers because they would allow the compilers to recompile the code with different optimizations based on the data they get.
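
      A minimal sketch of that idea, assuming a toy "JIT" that simply counts calls and swaps in a faster implementation once a function is hot. The class and function names are invented for illustration; a real JIT would regenerate machine code from richer profile data (for example cache-miss or branch statistics) rather than pick between two Python functions.

```python
class AdaptiveFunction:
    """Toy stand-in for a JIT: run a generic version, switch when hot."""

    def __init__(self, generic, optimized, hot_threshold=3):
        self.generic = generic
        self.optimized = optimized
        self.hot_threshold = hot_threshold
        self.calls = 0            # the only "profile data" in this sketch
        self.recompiled = False

    def __call__(self, *args):
        self.calls += 1
        if not self.recompiled and self.calls > self.hot_threshold:
            # A real JIT would recompile with new optimizations here.
            self.recompiled = True
        impl = self.optimized if self.recompiled else self.generic
        return impl(*args)


def sum_to_generic(n):
    """Straightforward loop: the unoptimized first version."""
    total = 0
    for i in range(n + 1):
        total += i
    return total


def sum_to_optimized(n):
    """Closed form with the same result for n >= 0."""
    return n * (n + 1) // 2


sum_to = AdaptiveFunction(sum_to_generic, sum_to_optimized)
results = [sum_to(10) for _ in range(5)]
```

      The caller never sees the switch: every call returns the same value, and only the implementation behind it changes once the call count passes the threshold.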

      --
      Centralization breaks the internet.
  3. Just performance counters? by Erich · · Score: 2, Informative

    Looks like there isn't a whole lot there that you couldn't get using existing performance counters and a tool like oprofile....

    --

    -- Erich

    Slashdot reader since 1997

    1. Re:Just performance counters? by pipatron · · Score: 1

      But this could probably do it dynamically, in realtime, which might be nice. Dunno, didn't RTFA of course.

      --
      c++; /* this makes c bigger but returns the old value */
    2. Re:Just performance counters? by imgod2u · · Score: 4, Informative

      Looking at the PDF, it supposedly gathers profile data in the background (in local caches on the chip itself) and dumps periodically depending on the OS/application settings. This allows it to profile on-the-fly with very little impact on application performance.

      The application can then gather the information, which is stored in its address space, and do with it what it will (optimize on-the-fly).

      Of particular interest is that the OS can allow the profile information to be dumped to the address space of other threads/processes as well as the one that the data is collected on. The OS controls the switching of the cached profile information during a context switch.
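
      The flow described above (hardware fills a buffer in the background, the application drains it later) can be sketched as a simulation. This is not the interface from AMD's specification: the class, the method names, the record layout, and the "keep every Nth event" sampling policy are all assumptions made for illustration.

```python
from collections import Counter, deque


class ProfileBuffer:
    """Hypothetical ring buffer standing in for hardware-filled profile storage."""

    def __init__(self, capacity, sample_interval):
        self.records = deque(maxlen=capacity)  # oldest records get overwritten
        self.sample_interval = sample_interval
        self._countdown = sample_interval

    def on_event(self, kind, address):
        """Simulated hardware side: called for every event, keeps every Nth."""
        self._countdown -= 1
        if self._countdown == 0:
            self.records.append((kind, address))
            self._countdown = self.sample_interval

    def drain(self):
        """Application side: take all pending records and empty the buffer."""
        drained = list(self.records)
        self.records.clear()
        return drained


buf = ProfileBuffer(capacity=8, sample_interval=4)
for i in range(20):
    # Pretend address 0x1000 misses the cache far more often than 0x2000.
    buf.on_event("cache_miss", 0x1000 if i % 5 else 0x2000)

# The application tallies which addresses show up most in the samples.
hot = Counter(address for _, address in buf.drain())
```

      Sampling keeps the overhead low, and the tally in `hot` is the kind of input an optimizer could use to decide which data structure to reorganize.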

      This is both cool (in that a secondary core/thread can help optimize the first) and scary (one thread getting access to another's instruction address information). I predict there will be exactly 42 Windows patches released 3.734 days after the service pack that allows Windows to take advantage of this feature because of security reasons.

    3. Re:Just performance counters? by pjhenley · · Score: 1

      We used to talk about doing this with performance counters on the Intel chips when I was doing OS research. The word was that the performance counters were not good for production code because you didn't know if they were going to be there in the next iteration of the Intel processors. So we used them for research purposes, but not as any sort of OS feature. It sounds like AMD is going to ensure that these instructions are permanent.

    4. Re:Just performance counters? by Nefarious Wheel · · Score: 1

      I wonder if this isn't part of the series of changes announced at MS TechEd, where it was said the Ring 0 (Kernel) instructions would be emulated to provide a bit of a speed-up for the VS Hypervisor. It was said that both Intel and AMD were preparing designs to support virtualisation in silicon. That would put it out somewhere near the end of 2007 I think.

      --
      Do not mock my vision of impractical footwear
    5. Re:Just performance counters? by Anonymous Coward · · Score: 0

      As far as I understand it, the problem with performance counters isn't that the facility itself could go away - it is that the specific events being counted depend on the details of the processor implementation, and that a future processor with a new and improved design might not be able to report the exact same events. How precisely does the new AMD specification define the events being counted, anyway?

  4. I wish AMD and Intel teamed up for once by rolfwind · · Score: 2, Funny

    and did away with the aging x86 instruction set and came up with something new.

    Yeah, I know, Intel tried with Itanium.

    1. Re:I wish AMD and Intel teamed up for once by Chris Burke · · Score: 2, Insightful

      Yeah, I know, Intel tried with Itanium.

      And you want them to try *again*? As far as I'm concerned the most amazing achievement of IA64 was that they got to start over from scratch, and ended up with an ISA with a manual even bigger than the IA32 manual! Going to prove that the only thing worse than an ISA developed through 20 years of engineering hackery is one developed by committee.

      --

      The enemies of Democracy are
    2. Re:I wish AMD and Intel teamed up for once by realmolo · · Score: 3, Insightful

      Yup. They tried it with Itanium, and it didn't work.

      The thing is, at this stage in processor design, the actual instruction set isn't all that important.

      But *compilers* are more important than ever, and writing a good compiler is hard work. x86 compilers have been tweaked and improved for nearly 30 years. A new instruction set could NEVER achieve that kind of optimization.

      Interestingly, the Itanium and the EPIC architecture were designed to move all the hard work of "parallel processing" to the compiler. Unfortunately, they could never get the compiler to work all that well on most kinds of code. The compiler could never really "extract" the parallelism that Itanium CPUs needed to work at full speed.

      Which is *exactly* the problem we have now with our multi-core CPUs. Compilers don't know how to extract parallelism very well. It's an *incredibly* difficult problem that Intel has already thrown untold billions of dollars at. Essentially, even though Itanium/EPIC never caught on, we're having to deal with all the same problems it had, anyway.

    3. Re:I wish AMD and Intel teamed up for once by gilesjuk · · Score: 1

      Indeed, devices at the lowest level don't always look that pretty. As Linus said, with Itanium Intel threw away all the good bits.

    4. Re:I wish AMD and Intel teamed up for once by Anonymous Coward · · Score: 1, Interesting

      I read somewhere that modern x86 processors don't really process x86 opcodes anymore--there's a "translator" that takes the CISC x86 code and converts it into some kind of RISC code. If true, maybe they should enable a way for the processor to use that RISC code without the conversion.

    5. Re:I wish AMD and Intel teamed up for once by Slashcrap · · Score: 2, Interesting

      and did away with the aging x86 instruction set and came up with something new.

      Yeah, I know, Intel tried with Itanium.


      They already did. I believe the 486 was the last CPU to run x86 instructions natively. Everything since the Pentium has decoded them to a RISC-like ISA which can be changed every generation if desired. The only drawback is that a relatively small area of the chip needs to be dedicated to decoding x86 instructions to whatever the internal ISA is.

      And guess what? One of the things that people dislike about x86 is the variable length instructions. Turns out that it actually leads to more compact code. And the speed gains from reduced cache usage more than make up for the effort and chip real estate expended on those decoders.
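
      A small worked example of the code-size point, using standard IA-32 encodings for a trivial function that returns 0. The fixed-width ISA here is hypothetical (4 bytes per instruction) and is assumed to need the same number of instructions, which a real RISC ISA may not.

```python
# Standard IA-32 encodings for a trivial function that returns 0.
X86_FUNCTION = [
    ("push ebp",     bytes([0x55])),
    ("mov ebp, esp", bytes([0x89, 0xE5])),
    ("xor eax, eax", bytes([0x31, 0xC0])),
    ("pop ebp",      bytes([0x5D])),
    ("ret",          bytes([0xC3])),
]

FIXED_WIDTH = 4  # bytes per instruction on a hypothetical fixed-width ISA

# Variable-length encoding: common instructions take one or two bytes.
x86_size = sum(len(encoding) for _, encoding in X86_FUNCTION)

# Fixed-width encoding: every instruction costs the same, however simple.
fixed_size = FIXED_WIDTH * len(X86_FUNCTION)
```

      Here `x86_size` is 7 bytes against `fixed_size` of 20, so the same instruction cache holds noticeably more of the variable-length code.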

      So let's stick with x86 for now, since the gains you foresee are either non-existent or tiny and are never, ever going to outweigh the drawbacks.

    6. Re:I wish AMD and Intel teamed up for once by Vellmont · · Score: 2, Informative


      and did away with the aging x86 instruction set and came up with something new.

      They did, at least with the FP (floating point) instructions. FP instructions were based on this awful stack architecture, and it's gone away with all the SSE and 64 bit extensions.

      The x86 instruction set has evolved greatly over time, and will continue to evolve. Why replace it entirely from scratch? Who's to say that an entirely new instruction set won't have a whole new host of problems?

      --
      AccountKiller
    7. Re:I wish AMD and Intel teamed up for once by LWATCDR · · Score: 4, Insightful

      Well we had the 68000 family which had a much better instruction set than the X86.
      We have the Power and PowerPC which had a much better instruction set than the X86.
      We have the ARM which is a much better instruction set than the X86.
      We have the MIPS which is pretty nice.
      And we had the Alpha and still do for a little while longer.
      The problem with all of them is that they didn't run X86 code. Intel and AMD both made so much money from selling billions of CPUs that they could plow a lot of money into making the X86 the fastest pig with lipstick that the world has ever seen.
      What made the IA-64 such a disaster was that it was slow running X86 code.

      --
      See my blog http://ilovecookes.blogspot.com/ for light hearted technical information.
    8. Re:I wish AMD and Intel teamed up for once by nbert · · Score: 1

      Not that it would make much of a difference - in the end most of the instruction set won't be used by programmers and especially compilers (CISC vs. RISC anyone?). But to get back to the topic: The overhead caused by upwards compatibility isn't that big after all. Problems a normal user experiences are not caused by bad hardware design nowadays.

    9. Re:I wish AMD and Intel teamed up for once by Anonymous Coward · · Score: 0

      No. The software community *doesn't* wish that someone created a ground up new ISA. They're created all the time by upstart embedded companies and ivory tower professors. But the fact is, everyone knows, understands, and has learned to love x86. It's an entrenched standard with trillions of dollars of infrastructure built around it. AMD and Intel teaming up to make a new ISA would result in AMD and Intel both making an interesting product with no infrastructure support doomed to limited niche markets. See Itanium, Alpha, PA-RISC, even PowerPC.

      Coward out.

    10. Re:I wish AMD and Intel teamed up for once by LordPhantom · · Score: 1

      It's like the saying goes: None of us is as dumb as all of us....

    11. Re:I wish AMD and Intel teamed up for once by Dusty00 · · Score: 1

      And Intel's failure was due to a lack of backwards compatibility. Coming up with "radically new and advanced" architecture does little good in the tech world because no matter how much better it is going forward it has to still work with the technologies that got us here.
      If someone invents something better than HTML it won't matter how much better it is, the world isn't going to scrap the content on the internet for the sake of the new technology.

    12. Re:I wish AMD and Intel teamed up for once by Ant P. · · Score: 1

      The thing is, what would they replace it with that they can sell? The only choices are emulation or translating code on the fly, both of which have sunk already.

    13. Re:I wish AMD and Intel teamed up for once by Chris Burke · · Score: 2, Informative

      I believe the 486 was the last CPU to run x86 instructions natively.

      Close, it was the original Pentium. The Pentium Pro -- which, despite a name that made it sound like a minor improvement to the Pentium for business/servers, was actually a completely new architecture -- is where they introduced the CISC->RISC conversion. This was in part to make it feasible to have out-of-order execution which many said CISC processors would never have. Turns out they were both right and wrong.

      So let's stick with x86 for now, since the gains you foresee are either non-existent or tiny and are never, ever going to outweigh the drawbacks.

      As much as I hate x86 from an aesthetic point of view, I must agree with you here.

      --

      The enemies of Democracy are
    14. Re:I wish AMD and Intel teamed up for once by dunkelfalke · · Score: 1

      yep, that is right.
      especially interesting is the transmeta crusoe cpu which can load different instruction sets and translate them into its native code.

      but the thing is, as far as i remember, back at those days when transmeta crusoe was just near the release, linus said something like "i compiled the linux kernel to the native crusoe vliw instructions and it was actually slower than the x86 code"

      --
      "It's such a fine line between stupid and clever" -- David St. Hubbins, Spinal Tap
    15. Re:I wish AMD and Intel teamed up for once by Chris Burke · · Score: 2, Insightful

      If true, maybe they should enable a way for the processor to use that RISC code without the conversion.

      I don't think that's a good idea. The internal micro-ops are machine-dependent, and they will change as the microarchitecture changes. By designing the micro-ops specific to the architecture, they can try to make the x86 instruction translate into an optimal sequence of micro-ops. As hardware functionality changes, existing x86 instructions can have the underlying ops changed to suit without you having to re-code or even re-compile your program.

      For example: Barcelona is introducing 128-bit wide floating point units for SSE instructions. The previous ones were 64 bits wide, so it would take two operations (and most likely two separate micro-ops) to perform a 128 bit SSE add instruction. Whereas now it will only take one op, and the same x86 instruction can take advantage of that fact without having to know what architecture it is running on. Another example is divides, which on a machine with a hardware divide unit would be only a few instructions, but on a different machine would require a lengthy microcode routine. Your code doesn't have to know; it just runs faster on the machine with the hardware DIV.
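
      The SSE example above can be sketched as a simulation: one architectural instruction, two hypothetical implementations behind it. The function names and the micro-op counts are illustrative assumptions, not a description of any real core's internals.

```python
def add_lanes(a, b):
    """One 'micro-op': element-wise add over however many lanes it is given."""
    return [x + y for x, y in zip(a, b)]


def packed_add_narrow(a, b):
    """Hypothetical core with half-width units: two micro-ops per instruction."""
    half = len(a) // 2
    low = add_lanes(a[:half], b[:half])
    high = add_lanes(a[half:], b[half:])
    return low + high, 2  # (architectural result, micro-ops issued)


def packed_add_wide(a, b):
    """Hypothetical core with full-width units: one micro-op per instruction."""
    return add_lanes(a, b), 1


# Four single-precision lanes, as in a 128-bit packed add.
a = [1.0, 2.0, 3.0, 4.0]
b = [10.0, 20.0, 30.0, 40.0]

narrow_result, narrow_ops = packed_add_narrow(a, b)
wide_result, wide_ops = packed_add_wide(a, b)
```

      The program sees the same result either way; only the number of internal operations differs, which is exactly the freedom the designers would lose if the micro-ops were exposed.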

      Not that you can't optimize your x86 code for particular architectures, or that there aren't x86 codes that run on one machine and not another -- though you can check whether the machine can run the code, and you aren't having the entire instruction set change out from underneath you. I'm just saying that only exposing one side of the CISC->RISC conversion gives the chip designers a lot of leeway and you probably don't want to give that up.

      --

      The enemies of Democracy are
    16. Re:I wish AMD and Intel teamed up for once by Anonymous Coward · · Score: 2, Interesting

      IBM's PPC compiler kicked the shit out of every x86 compiler. (Apples and oranges, but the quality was much better). Same for ARM's compiler and Sun's (SPARC) compiler. Fact is, x86 is the ugly girl at the party, but it gets more attention from GCC, MS, Intel, etc. Native compilers on other architectures beat the shit out of it.

    17. Re:I wish AMD and Intel teamed up for once by ZachPruckowski · · Score: 1

      I don't know why you aren't modded +5 (at the moment anyway), but you're precisely correct.

      The number one requirement for a new instruction set is that it runs Windows and most Win32 programs at speeds comparable to existing processors. Given the size and scope of Windows, Microsoft probably can't easily port Windows and Win32 and Visual Studio's compiler over to another instruction set.

      This means that we either need hardware or software emulation of x86 (and possibly x86-64) on whatever new instruction set comes along. So it either has to support x86 and most x86 extensions (SSE, etc), in which case it's an oversized x86 extension, or it has to be so much better than x86 that a processor can run x86 programs at about 80% speed. In either case, you'll still have a heck of a time getting non-OSS software ported to the new instruction set (as x86 will be "fast enough")

    18. Re:I wish AMD and Intel teamed up for once by Criffer · · Score: 5, Insightful

      Not again.

      Why is this nonsense still perpetuated? The instruction set is irrelevant - it's just an interface to tell the processor what to do. Internally, Barcelona is a very nice RISC core capable of doing so many things at once it's insane. The only thing that performs better is a GPU, and that's only because they're thrown at embarrassingly parallel problems. The fastest general purpose CPUs come from Intel and AMD, and it has nothing to do with instruction set.

      AMD64, and the new Core2 and Barcelona chips are very nice chips. 16 64-bit registers, 16 128-bit registers, complete IEEE-754 floating point support, integer and floating-point SIMD instructions, out-of-order execution, streaming stores and hardware prefetch. Add to that multiple cores with very fast busses, massive caches - with multichip cache coherency - and the ability to run any code compiled in the last 25 years. What's not to like?

    19. Re:I wish AMD and Intel teamed up for once by Anonymous Coward · · Score: 0

      Come on dude, ISA is 20 year old technology.

    20. Re:I wish AMD and Intel teamed up for once by wonkavader · · Score: 2, Insightful

      No, the problem with the IA-64 was not that it was slow running x86 code. The problem was that it was slow running x86 code and not that great at running non-x86 code. Spectacular performance on non-x86 would have made it a much greater success, but it was lackluster from the start. After so long spent on designing a new chip, you'd expect some real results -- it was not much better than the alternatives. "Why bother?", the world said, and says even now.

    21. Re:I wish AMD and Intel teamed up for once by jguthrie · · Score: 3, Insightful

      Okay, I'll feed the troll. Tell me where I can buy an ATX (or smaller) PPC motherboard and CPU new for, oh, say $200, and I'll look at PPC again. The reason that x86 gets all the software is because it's the cheapest, it's the cheapest because all the motherboard manufacturers make motherboards for it, and all the motherboard manufacturers make motherboards for it because it gets all the software.

    22. Re:I wish AMD and Intel teamed up for once by Crazy Taco · · Score: 1

      If someone invents something better than HTML it won't matter how much better it is, the world isn't going to scrap the content on the internet for the sake of the new technology.

      HTML 5.0 even being considered is a case in point, considering XHTML is far better from a Computer Science standpoint and has far more future potential.

      Coming up with "radically new and advanced" architecture does little good in the tech world because no matter how much better it is going forward it has to still work with the technologies that got us here.

      And even if we didn't have to keep the old tech, or even if the change to the new tech was really, really slight and easy, we still wouldn't make it because we would still have many, many idiots around who will refuse to learn something new or even consider a new technology. In fact, they will raise a stink for years until someone relents. It doesn't matter how much better the new technology is, how bad things were before, or how poor or illogical the arguments of those wanting to keep the old technology are: they will badger everyone until it is resurrected. Again, HTML 5 is a case in point.

      --
      Beware of bugs in the above code; I have only proved it correct, not tried it.
    23. Re:I wish AMD and Intel teamed up for once by Chirs · · Score: 2, Insightful

      The instruction set *is* relevant to low-level designers. Working with the PowerPC instruction set is much nicer than x86...for me at least.

      As for "the fastest general purpose CPUs come from Intel and AMD", have you ever looked at a Power5? It's stupid fast. Stupid expensive, too.

    24. Re:I wish AMD and Intel teamed up for once by Anonymous Coward · · Score: 0

      We have the ARM which is a much better instruction set than the X86.

      Sure, if you want to stick to in-order execution. As soon as you try going out of order, the implicit dependencies caused by the conditional-execution bits make it incredibly painful.

    25. Re:I wish AMD and Intel teamed up for once by dfghjk · · Score: 1

      "As Linus said, with Itanium Intel threw away all the good bits."

      It's a good thing Linus leveraged his considerable processor architecture experience while at Transmeta. Where would they be now had he not provided useful advice like that?

    26. Re:I wish AMD and Intel teamed up for once by Chris Burke · · Score: 1

      They'd have been even worse off even sooner than what actually happened. Any other questions?

      --

      The enemies of Democracy are
    27. Re:I wish AMD and Intel teamed up for once by glitch23 · · Score: 1

      What made the IA-64 such a disaster was that it was slow running X86 code.

      IA-64 did x86 in hardware only because the instruction set did not support x86. So not only was it not supported at all in software but the support that was there was slow. That was its downfall. By retaining x86 compatibility with its 64 bit CPUs, AMD was able to jump into the 64-bit world with a better reception.

      --
      this nation, under God, shall have a new birth of freedom. -- Lincoln, Gettysburg Address
    28. Re:I wish AMD and Intel teamed up for once by Anonymous Coward · · Score: 0

      How the fuck do you get off saying that bloody 16 integer registers and 16 double registers makes for a "very nice" chip? PowerPC chips have had 32 of each for a long, long time. Itanium at least upped that number, but we saw where that went?

    29. Re:I wish AMD and Intel teamed up for once by Chris Burke · · Score: 4, Interesting

      Why is this nonsense still perpetuated? The instruction set is irrelevant - it's just an interface to tell the processor what to do.

      Sure, now it is, since the decoding of CISC instructions into micro-ops has largely decoupled ISA from the microarchitecture, allowing many of those neat-o performance features you mention like out-of-order execution. However in the past this wasn't the case and a lot of x86's odd behaviors that seemed like good ideas when they were made were serious performance limiters. Like a global eflags register that is only partially written by various instructions (and they always write even if the result isn't needed).

      Even today, I would say that all those RISC ISAs are better than x86, simply from the standpoint that they are cleaner, easier to decode, have fewer tricky modes to deal with, fewer odd dependencies, and all the other things that make building an actual x86 chip a pain in the arse. No, in the end it makes no difference in performance. Yet, if you had it to do all over again, building the One ISA to Rule Them All without concern for software compatibility, and you decided to make something that was more like x86 than Alpha, I'd slap the taste out of your mouth.

      But we do have to be concerned with software compatibility, and that I think was the GP's main point. All of those other ISAs failed to dominate -- even when there were actual performance implications! -- simply because they were not x86 and hence didn't run the majority of software. IA64 failed not because it was itself all that bad, but because it couldn't run x86 software well. So when AMD came out with 64-bit backward-compatible x86, everyone stopped caring about IA64. Because it wasn't x86, and AMD64 was.

      So ultimately I agree with you both, and I don't think the GP was nonsense at all. It's a very valid point -- backward compatibility is king, so x86 wins by default no matter what. Your point -- that x86 isn't actually hurting us anymore -- is just the silver lining on that cloud.

      --

      The enemies of Democracy are
    30. Re:I wish AMD and Intel teamed up for once by Anonymous Coward · · Score: 0

      What's not to like?

      The fact that it is not available in the marketplace now, and the very overclockable Q6600 is out there with a new low price of A$330???

    31. Re:I wish AMD and Intel teamed up for once by Verte · · Score: 2, Insightful

      AMEN! The lack of general purpose registers is a serious drawback to x86. The MMU is the same- well, it's not that it isn't feature packed, but it's so slow that we need a TLB, and the TLB can't handle threads, so all non-globals need to be flushed when context switching. Yuck.

      All the other features the GP mentioned, except for the last one if you mean COMPILED code, are also available on most RISC chips :P and the performance data really spoke for itself [Alphas had four times the floating point performance of the Pentium II clock for clock].

      --
      We at slashdot are scientists, specialists and kernel hackers. Your FUD will be found out.
    32. Re:I wish AMD and Intel teamed up for once by servognome · · Score: 2, Insightful

      and did away with the aging x86 instruction set and came up with something new.
      I wish they'd do away with English and come up with something new - a language based on consistent & logical rules.
      I don't know how anything gets done using a set of words cobbled together over hundreds of years with all sorts of special rules and idioms.
      --
      D6 63 0D 70 89 81 BB 8E 7B 7C 5F 5D 54 EA AB 73
    33. Re:I wish AMD and Intel teamed up for once by rolfwind · · Score: 1

      Yes, it's called German:) (Actually, English stems from it.)

    34. Re:I wish AMD and Intel teamed up for once by x2A · · Score: 3, Informative

      So what we need really is a "native" x86 compiler, say, from Intel, that would maybe outperform the multi-platform GCC compiler... an Intel C/C++ Compiler, or 'ICC' we could call it... maybe...

      Oh who am I kidding, that could never happen.

      --
      The revolution will not be televised... but it will have a page on Wikipedia
    35. Re:I wish AMD and Intel teamed up for once by wirelessbuzzers · · Score: 2, Informative

      Why is this nonsense still perpetuated? The instruction set is irrelevant - it's just an interface to tell the processor what to do...

      What's not to like? To start with, the complexity makes it a total pain in the ass to write kernels, compilers, runtime systems, analyses, debuggers and verifiers for x86. On top of that, it costs lots of engineering time, silicon and power to implement all those microcode crackers and fancy superscalar optimizations; this is why x86 can't hold a candle to ARM in the embedded world.

      But maybe you meant missing instructions? No load-linked/store conditional or bus snooping. No double (or even 1.5) compare-and-swap. No hardware transactional memory support. Those three make it pretty hard to write fast concurrent code. And streaming operations are improving, but could be much better; there's a reasonable chance that cache coherency will soon be too expensive for practical use.
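
      For context on the primitives being discussed: single-word compare-and-swap, which x86 does provide (`lock cmpxchg`), is typically used in a retry loop like the sketch below. Python has no user-level CAS instruction, so the atomic step is simulated with a lock, and the `AtomicCell` class is invented for illustration. The two-location variants wished for above would extend the same pattern to more than one word.

```python
import threading


class AtomicCell:
    """Simulated word of memory with a compare-and-swap operation."""

    def __init__(self, value=0):
        self._value = value
        self._lock = threading.Lock()  # stands in for hardware atomicity

    def load(self):
        return self._value

    def compare_and_swap(self, expected, new):
        """Store `new` only if the cell still holds `expected`; report success."""
        with self._lock:
            if self._value != expected:
                return False
            self._value = new
            return True


def atomic_increment(cell):
    """Classic retry loop: re-read and try again if another thread got in first."""
    while True:
        old = cell.load()
        if cell.compare_and_swap(old, old + 1):
            return old + 1


def worker(cell, times):
    for _ in range(times):
        atomic_increment(cell)


counter = AtomicCell()
threads = [threading.Thread(target=worker, args=(counter, 1000)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

final_count = counter.load()
```

      No increment is lost, because a stale read simply makes the swap fail and the loop retries with the fresh value.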

      Maybe you're interested in single-threaded, native code performance; this is, after all, what x86 traditionally shines at. Here you'll find the lack of 3-register instructions to be a performance problem, even if the chip reduces this burden. There's no shuffle (like Altivec, although something like that is coming in Penryn, I think?), finite-field or bit twiddling operations, or conditional operations (a la ARM).

      So yeah. There are a lot of things that the x86 instruction set could do better. I don't expect it to do them all, but there are certainly a lot of reasons to change it.
      --
      I hereby place the above post in the public domain.
    36. Re:I wish AMD and Intel teamed up for once by x2A · · Score: 2

      "the TLB can't handle threads, so all non-globals need to be flushed when context switching"

      Isn't this not true on modern processors, at least up to a point? With some space per TLB entry set aside for a task ID, when you switch to a different process it won't use TLB entries with a different task ID. Of course the OS has to support this (tell the processor, when it's task switching, which memory space it's switching to), and I'm not sure how big the space in the TLB is for this. It may be only a few bits, so you might still have to flush the TLB between different processes but not have to flush the kernel's TLB entries: i.e., the kernel has pages marked with a zero, processes have pages marked with a one, you only ever flush TLB entries marked with a one, and you tell the processor when you want the zero or one entries. I can't remember where I first read this; it was a few years ago now, I think. I'm sure it would have come a long way since, though. It seemed like a very sensible idea.
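
      A minimal sketch of the tagged-TLB idea, assuming a toy model where each entry is keyed by an address-space tag plus the virtual page number. The class and method names are invented, and real hardware has a limited number of tags and entries, which this model ignores.

```python
class TaggedTLB:
    """Toy TLB whose entries carry an address-space tag."""

    def __init__(self):
        self.entries = {}      # (tag, virtual_page) -> physical_page
        self.current_tag = 0

    def switch_to(self, tag):
        """Context switch: just change the active tag, no flush needed."""
        self.current_tag = tag

    def insert(self, virtual_page, physical_page):
        """Cache a translation for the currently running address space."""
        self.entries[(self.current_tag, virtual_page)] = physical_page

    def lookup(self, virtual_page):
        """Return the cached translation, or None on a TLB miss."""
        return self.entries.get((self.current_tag, virtual_page))


tlb = TaggedTLB()
tlb.switch_to(1)
tlb.insert(0x10, 0xA0)      # process 1 maps virtual page 0x10
tlb.switch_to(2)
miss = tlb.lookup(0x10)     # process 2 must not see process 1's entry
tlb.insert(0x10, 0xB0)      # process 2 has its own mapping for the same page
tlb.switch_to(1)
hit = tlb.lookup(0x10)      # process 1's entry survived both switches
```

      Because the tag is part of the key, two processes can cache different translations for the same virtual page, and switching between them costs nothing but a tag change.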

      --
      The revolution will not be televised... but it will have a page on Wikipedia
    37. Re:I wish AMD and Intel teamed up for once by salimma · · Score: 1

      Esperanto? It is very regular, uses Latinate roots for ease of learning, and might -- just might -- be adopted by the EU as an official language, if/when they get tired of sending money to Britain to pay for translators.

      --
      Michel
      Fedora Project Contribut
    38. Re:I wish AMD and Intel teamed up for once by LWATCDR · · Score: 1

      I don't believe that ISA doesn't matter, if for no other reason than that the X86 has a real shortage of GP registers. To gain the extra registers you must run in 64 bit mode, so you must live with 64 bit addressing even if you really don't need it. As you said, the X86 is fast, which is also what I said. The ISA is very messy and a real pain to write code for. There will always be some people that must write assembly. Yes, the x86 is really fast even without a good ISA. It has also been updated over the years to keep up with current software needs. Hence it is the world's fastest pig with lipstick.

      --
      See my blog http://ilovecookes.blogspot.com/ for light hearted technical information.
    39. Re:I wish AMD and Intel teamed up for once by be-fan · · Score: 1

      But *compilers* are more important than ever, and writing a good compiler is hard work. x86 compilers have been tweaked and improved for nearly 30 years.

      Compilers have gotten better, but mostly at CPU-independent optimization. Compilers for x86 aren't better than compilers for other architectures, it's just that x86 CPUs are extraordinarily insensitive to mediocre code generation. The reason is two-fold. First, they kind of have to be, because x86 doesn't really have enough registers to make fancy scheduling profitable. Second, x86 compilers have to target a wide variety of CPU implementations, so code ultimately ends up being compiled for something like a P6, and CPUs have to get good performance out of that existing code.

      --
      A deep unwavering belief is a sure sign you're missing something...
    40. Re:I wish AMD and Intel teamed up for once by truesaer · · Score: 1
      Even today, I would say that all those RISC ISAs are better than x86, simply from the standpoint that they are cleaner, easier to decode, have fewer tricky modes to deal with, fewer odd dependencies, and all the other things that make building an actual x86 chip a pain in the arse.


      The people who really suffer from this are Intel and AMD. They're the ones that have to design the nasty decoders for x86. They obviously find the advantages of decades of expertise in x86 ISA throughout the industry is worth the effort. It's good for everyone. And in reality a lot of the complexity of x86 decoding has been moved into the microcode engine so that the actual hardware decoders are pretty efficient. If you use any of the "cruft" in x86 it's going to be microcoded and your program is going to run really slowly. And then you'll be motivated to stop using those instructions.


      Also, people shouldn't forget some of the advantages of x86, like variable instruction lengths. PowerPC and ARM may be easier to decode but they take up a ton more space and that causes a significant decrease in cache and memory efficiency. For example, I think the average x86 instruction is only 2 bytes (many are only 1 byte; if your program uses mostly 1 byte instructions you can get a LOT of performance this way). PowerPC is fixed at 4 bytes.

    41. Re:I wish AMD and Intel teamed up for once by UncleFluffy · · Score: 1

      if/when they get tired of sending money to Britain to pay for translators.

      if/when they get tired of returning a small amount of Britain's contribution back to it to pay for translators.

      There, fixed that for you...

      --

      What would Lemmy do?

    42. Re:I wish AMD and Intel teamed up for once by Verte · · Score: 1
      Sure- the x86 doesn't flush global [kernel] entries by default. As for the rest, I think TheRaven64 said it a lot better than me:

      x86 does not have a tagged TLB. This means that every context switch needs a full TLB flush, which results in a lot of TLB (and cache) churn. On something like SPARC, you just set the process ID register, and TLB entries belonging to other processes become invisible.
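
      To make the tagged/untagged difference concrete, here is a toy model (not how any real MMU is programmed). The class name ToyTLB and its methods are invented for illustration, and global (kernel) entries are left out entirely. With tags, a context switch only changes the current ID; without them, every switch throws the cached translations away.

```python
class ToyTLB:
    """Toy TLB model; global (kernel) entries are ignored for simplicity."""

    def __init__(self, tagged):
        self.tagged = tagged      # True: entries carry an address-space ID
        self.entries = {}         # (asid, virtual page) -> physical frame
        self.current_asid = 0
        self.flushes = 0          # number of full flushes performed

    def _key(self, vpage):
        # Untagged entries all share one tag, so they cannot be told apart.
        return (self.current_asid if self.tagged else 0, vpage)

    def switch_to(self, asid):
        """Context switch to the address space identified by asid."""
        if asid == self.current_asid:
            return
        if not self.tagged:
            # No tags: stale translations must be thrown away.
            self.entries.clear()
            self.flushes += 1
        self.current_asid = asid

    def insert(self, vpage, frame):
        """Cache a translation for the current address space."""
        self.entries[self._key(vpage)] = frame

    def lookup(self, vpage):
        """Return the cached frame, or None on a TLB miss."""
        return self.entries.get(self._key(vpage))
```

      Switching from process 1 to process 2 and back leaves process 1's entries intact in the tagged model, while the untagged model comes back to an empty TLB and has to re-walk the page tables.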
      --
      We at slashdot are scientists, specialists and kernel hackers. Your FUD will be found out.
    43. Re:I wish AMD and Intel teamed up for once by Verte · · Score: 1

      The third choice, of course, is portable software. I don't think the time is QUITE right for that kind of jump, but you can be reasonably sure that if the Itanium or Alpha were released next year, they would sell a lot more units than they did before we really had [software] choice.

      --
      We at slashdot are scientists, specialists and kernel hackers. Your FUD will be found out.
    44. Re:I wish AMD and Intel teamed up for once by Carewolf · · Score: 1

      And one that doesn't artificially limit the performance of other (let's say non-Intel) x86 CPU's.

      I am not kidding, that would never happen.

    45. Re:I wish AMD and Intel teamed up for once by zrq · · Score: 1

      But we do have to be concerned with software compatibility, and that I think was the GP's main point. All of those other ISAs failed to dominate -- even when there were actual performance implications! -- simply because they were not x86 and hence didn't run the majority of software.

      Yes, software compatibility has been an issue, up to now. But could this change as open source systems gain more market share?

      In the past, a new architecture had to wait until Microsoft released a version of Windows for it and a significant number of other commercial software packages were ported before people would start to use it.

      With open source, as soon as a new architecture is available someone, somewhere, immediately tries to get Linux running on it. Once someone gets the core C compilers and OS running then most, if not all, of the other open source applications become available on the new architecture.

      If a chip manufacturer created a new architecture that had a significant performance gain over the current systems, and helped three or four of the major Linux distros to create a port that demonstrated a significant performance gain for server side applications (web servers or database servers etc), would this be enough to entice data centers to adopt the new architecture?

      There are other issues. It would need more than one chip manufacturer and more than one motherboard manufacturer involved before people would take it seriously. None of the major data centers would want to adopt a platform that tied them to a single supplier.
      Unit price would be another. Low volume production would mean that the new systems would be more expensive.

      I know this isn't going to happen any time soon, but does the adoption of open source software remove one of the barriers?

    46. Re:I wish AMD and Intel teamed up for once by emilper · · Score: 1

      Having read quite a lot of EU regulations, I can say with some degree of certainty that whatever the EU translated to English after 2000 was probably outsourced to Japan or Turkey, but not to a country where "indo-european" languages are spoken. Older regulations of the ECs are fine, but some of the relatively new ones are quite bad ... word order is usually f***** up and there was one where there was no verb in a paragraph of 5 lines ...

    47. Re:I wish AMD and Intel teamed up for once by Anonymous Coward · · Score: 0

      Are you joking or just stupid?

    48. Re:I wish AMD and Intel teamed up for once by hitmark · · Score: 1

      and this is the same corp that came up with ACPI and EFI, iirc.
      not good...

      hell, if i didn't know better, i would suspect that intel was government owned. why? because they seem to overengineer to a degree that only nasa tops.

      --
      comment first, facts later. http://chem.tufts.edu/AnswersInScience/RelativityofWrong.htm
    49. Re:I wish AMD and Intel teamed up for once by Anonymous Coward · · Score: 0

      Barcelona is a very nice RISC core capable of doing so many things at once it's insane.

      No. It has four cores, so theoretically it could do exactly *four* things at once. If that's insane, you'd better go buy some stock in the number 7.

    50. Re:I wish AMD and Intel teamed up for once by Ant+P. · · Score: 1

      Now that you mention it, that worked for Apple...

    51. Re:I wish AMD and Intel teamed up for once by BlackSnake112 · · Score: 1

      Itanium was too early. People were running, what, 8 bit and 16 bit programs at that time. Then Intel rolls out a 64 bit processor. If Intel had released the Itanium when 64 bit was becoming more mainstream it would have done better. Itanium is big in the data center. HP has a whole line of Itanium based servers.

    52. Re:I wish AMD and Intel teamed up for once by Chris+Burke · · Score: 2, Interesting

      The people who really suffer from this are Intel and AMD. They're the ones that have to design the nasty decoders for x86. They obviously find the advantages of decades of expertise in x86 ISA throughout the industry is worth the effort.

      This is true, they're the ones who have to make it actually work. I think who it -really- hurts is anyone who isn't Intel or AMD trying to make an x86 chip. Unfortunately there's a lot of x86 behavior that isn't actually documented -anywhere- except inside the heads of Intel and AMD engineers and the HDL they write. Whereas a couple grad students could code up a fully Alpha-compatible cpu in a few weeks (it wouldn't be fast, but it would work). It creates a higher barrier to entry into the x86 market, and to me that's unfortunate. AMD and Intel obviously have a handle on the ISA.

      And in reality a lot of the complexity of x86 decoding has been moved into the microcode engine so that the actual hardware decoders are pretty efficient.

      Well, in so much as getting every operation that has to occur when you do something like a protected mode code segment load, sure the microcode deals with that. But the really hard part of x86 decode is dealing with variable-length instructions. To have a super-scalar architecture you need to be able to decode more than one instruction in a cycle, which means you need to know where the second instruction starts. A few ways of dealing with this include speculatively decoding the 2nd instruction assuming it starts at various points using parallel decoders (doesn't scale well at all), or saving marker bits in the instruction cache that tell you where instructions start (unavailable the first time you see an instruction so you have to use a slower method). Also fun to deal with is when an instruction crosses a cache-line boundary. And the essentially arbitrary number of prefixes that can be used.
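
      A rough sketch of why the boundaries are the hard part, using a made-up encoding rather than real x86: the function names and the "low two bits of the first byte give the length" rule are assumptions for illustration only. The start of instruction N is unknown until every instruction before it has been at least partly decoded, and the marker bits are simply that walk's result saved for next time.

```python
def toy_length(first_byte):
    """Hypothetical encoding: low two bits of the first byte hold length - 1."""
    return (first_byte & 0x3) + 1


def find_boundaries(code, length_of=toy_length):
    """Walk the byte stream sequentially and return each instruction's start."""
    starts = []
    pc = 0
    while pc < len(code):
        starts.append(pc)
        # The next start depends on decoding this instruction first.
        pc += length_of(code[pc])
    return starts


def marker_bits(code, length_of=toy_length):
    """One flag per byte, set where an instruction starts (a predecode cache)."""
    starts = set(find_boundaries(code, length_of))
    return [offset in starts for offset in range(len(code))]
```

      With a fixed-width ISA the same question is answered by multiplying the instruction number by the width, with no walk at all. Note also that a speculative decoder starting at a byte in the middle of an instruction would read garbage as a length, which is why the parallel-guess approach wastes so much work.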

      Also, people shouldn't forget some of the advantages of x86, like variable instruction lengths. PowerPC and ARM may be easier to decode but they take up a ton more space and that causes a significant decrease in cache and memory efficiency. For example, I think the average x86 instruction is only 2 bytes (many are only 1 byte; if your program uses mostly 1 byte instructions you can get a LOT of performance this way). PowerPC is fixed at 4 bytes.

      You aren't going to get very far using just 1 byte instructions. The average x86 instruction going by the spec may be 2 bytes, but the average x86 instruction in actual code is going to be more. If you're doing FP then you would be using SSE instructions which mostly use 3 bytes *just* for the opcode, not including register or memory arguments which could use 1-3 more bytes and potentially more if you're using any prefixes. In general I think this advantage of x86 isn't very significant. I think it would be interesting to measure the average instruction size used in actual code. Personally, I'd take fixed-width instructions any day.

      --

      The enemies of Democracy are
    53. Re:I wish AMD and Intel teamed up for once by jgrahn · · Score: 1

      Given the size and scope of Windows, Microsoft probably can't easily port Windows and Win32 and Visual Studio's compiler over to another instruction set.

      Whatever the cause is, it isn't size and scope. Practically any piece of free software compiles on a dozen architectures. For example, Debian GNU/Linux ships around thirteen gigabytes of software for each of eleven architectures ...

    54. Re:I wish AMD and Intel teamed up for once by marcosdumay · · Score: 1

      "PowerPC and ARM may be easier to decode but they take up a ton more space and that causes a significant decrease in cache and memory efficiency."

      Funny you mention cache efficiency, since it is the major drawback of the x86 opcode. Let me ask you, how many general purpose registers does one of those new architecture processors have? 64? 128? And how many do they export to the compiler? 4.

      That is it, 4. How do you expect the compiler to optimize memory access with 4 registers? All the others are used for inside-chip optimizations, but the processor has neither the time nor the amount of information the compiler can use to optimize your program.

      The fact that x86 computers are that fast is simply awesome. But they needed a huge investment to get that way, and still spend much more energy than a pure RISC chip for the same performance.

    55. Re:I wish AMD and Intel teamed up for once by x2A · · Score: 1

      "x86 does not have a tagged TLB"

      Traditionally, no, but newer processors have introduced it, e.g., along with virtualisation extensions (found one reference here: http://amd.vendors.slashdot.org/article.pl?sid=06/05/15/1750200 , search on page for 'tagged')

      --
      The revolution will not be televised... but it will have a page on Wikipedia
    56. Re:I wish AMD and Intel teamed up for once by Verte · · Score: 1

      Oooh- interesting. I had a look at the system programmers manual and it is a little light on details- being part of the SVM extensions, it might not be possible to use them from ring 0, which is a shame because it can't be used for general context switching, and even if it were, I don't think we will see the major operating systems use them.

      --
      We at slashdot are scientists, specialists and kernel hackers. Your FUD will be found out.
    57. Re:I wish AMD and Intel teamed up for once by x2A · · Score: 1

      Xen has implemented use of tagged TLBs to save TLB flushing between switching virtual machines. The technology (on x86) is reaching (or has reached) two years old now, so there may well be other users of it by now.

      --
      The revolution will not be televised... but it will have a page on Wikipedia
    58. Re:I wish AMD and Intel teamed up for once by Verte · · Score: 1

      Xen was really the driving force requiring hardware virtualisation technology, so it's no surprise they used such a feature. Yet it's funny to think that it takes less effort to switch operating systems than it does to switch tasks. Really that's only half true- it wouldn't be difficult to get a hypervisor to provide calls to manipulate ASIDs, it's just that you can't get that functionality from the kernel itself, which seems a bit silly. I guess what I'm trying to say is, if any kernel wanted to use the tagged TLB, they would have to start their own hypervisor, unless they had the good fortune of running on a hypervisor that had calls to manipulate the ASIDs from kernel space. As you can see, it mightn't be trivial to implement on most kernels, and it requires assumptions about the VM [or lack thereof] that the operating system is running on.

      --
      We at slashdot are scientists, specialists and kernel hackers. Your FUD will be found out.
    59. Re:I wish AMD and Intel teamed up for once by x2A · · Score: 1

      My guess is that the ASID space may only be small (eg, a single byte); big enough to give each virtual machine its ID, but not big enough to give each process its own... what you'd need is some simple heuristics to work out which processes are running most often, and give them their own ASID, and all others would have to share and flush the TLB between switches between them. I guess if you gave processors which support this tagged TLB to kernel developers, we might see it come into mainline, otherwise people don't feel it's worth it? *shrug*

      --
      The revolution will not be televised... but it will have a page on Wikipedia
    60. Re:I wish AMD and Intel teamed up for once by Anonymous Coward · · Score: 0

      Considering that ICC is STILL the overall leader of the pack when it comes to C and C++ compilers for ALL x86/x86-64, with Pathscale still a bit behind, that argument turned out to be just so much bullshit from AMD.

    61. Re:I wish AMD and Intel teamed up for once by salimma · · Score: 1

      if/when they get tired of returning a small amount of Britain's contribution back to it to pay for translators.


      not anymore:

      The UK has sacrificed part of its rebate, but the UK government said it had to do this in order to pay its fair share of the costs of enlargement. And even though it has given up more of the rebate than it originally wanted to, it has engineered a situation where the UK, France and Italy will be making roughly equivalent net contributions to the EU budget from 2007 onwards. In the past, the UK's net contribution has been much higher.


      And anyway, nobody is forcing Britain to stay in the EU. They could go back to the EEA and still have access to the Common Market anyway.
      --
      Michel
      Fedora Project Contribut
  5. Nice, but let's get Barcelona out the door, OK? by Anonymous Coward · · Score: 1, Interesting

    These extensions could be useful, but speaking as someone from the target audience... I just don't care right now. No amount of minor improvement (as might be gained through these) is as important to me as seeing a viable alternative to Intel. Not because I'm an AMD fanboy, but because competition brings the prices down, and accelerates the release of faster chips. From what I hear now, we'll finally see Barcelona chips out on September 10th at -maybe- up to 2.3 GHz if you're one of the cherished few, but most retail ones will be 1.9 GHz. I haven't seen the (valid) numbers, so I can't say for sure, but I'm worried about how competitive this will be.

    I realize that the software people and hardware people both have their projects to work on, and they work largely independently in terms of a time-frame, but I figure this news might be timed to say, "Hey! Look at us! We're doing stuff!", but it only serves to frustrate me that there still aren't any real numbers on Barcelona, and, on the whole, that AMD seems to have dropped the ball. /Grumble

    1. Re:Nice, but let's get Barcelona out the door, OK? by Pojut · · Score: 1, Interesting

      What I would like to know is how is it that AMD got its ass handed to itself so viciously by Intel with the Core 2, and yet STILL isn't even remotely close to having something that can compete?

      AMD was "winning" for quite a long time...what happened that has made it impossible for them to come up with something even mildly exciting?

    2. Re:Nice, but let's get Barcelona out the door, OK? by Anonymous Coward · · Score: 1, Insightful

      A new microarchitecture is not something you bang together in a weekend. And from the looks of it, no amount of incremental tweaks to the K8 microarchitecture would be enough to catch up to Core. Barcelona was most likely in development well before Core hit the streets.

      You may remember that K8 gave Netburst a similar drubbing, and yet Intel continued on with Netburst for everything but its laptop products for some time. Core has now been on the market for just over a year.

    3. Re:Nice, but let's get Barcelona out the door, OK? by HandsOnFire · · Score: 3, Insightful

      What happened is that the P4 architecture was more of a marketing scheme to push MHz, but not performance. AMD came out with an architecture directed at high performance. Intel came out with the Core 2 products which also focused on performance instead of clock speed. Intel has a lead in the manufacturing process side with respect to node size. This helps them to produce a lot at a lower cost. And if you look at Intel's and AMD's financials, you'll see how much each has to spend on R&D. Intel has a lot more money to put down on more designs and more engineers than AMD does.

    4. Re:Nice, but let's get Barcelona out the door, OK? by Surt · · Score: 2

      Major architectural changes (historically) have been years between. AMD had the lead arch, and intel took years to respond with core. Now intel has the lead, and AMD won't compete until their new arch. The problem is compounded for AMD by intel deciding to make a major push to speed up their arch cycle time. AMD's new arch will have to do battle with intel's refined core2 shortly after release, and intel's next arch is due as soon as next year, so their window is tight. AMD is of course also trying to accelerate their cycle, but intel has a lot more money to spend on this battle.

      --
      "Who is the Journal of Quantum Physics going to believe?" --Stephen Hawking
    5. Re:Nice, but let's get Barcelona out the door, OK? by GiMP · · Score: 1

      What I would like to know is how is it that AMD got its ass handed to itself so viciously by Intel with the Core 2, and yet STILL isn't even remotely close to having something that can compete?


      As I see it... The memory bandwidth limitations on Intel's FSB are so restricting that for many applications it matters little how many cores or threads their CPUs can push. The reality is that Intel's chips cannot push memory around fast enough for those processors to be worthwhile. Rather than a dual quad-core system with Intel processors, get a quad dual-core system with AMD processors. You still get 8 cores, but you also get a whole lot more memory bandwidth.
    6. Re:Nice, but let's get Barcelona out the door, OK? by imgod2u · · Score: 1

      See, with single-die, multi-core solutions, the FSB becomes much less of a limitation. A smart caching system pretty much does away with most of the problems with the exception of streaming programs (like pixel processing).

      Looking at the Core 2's memory bandwidth compared to that of an X-2, it doesn't seem like effective latency/bandwidth is all that lacking.

      As you scale to 8+ chips with separate cache pools, the difference will show, however.

      Also, keep in mind that in a NUMA architecture, you don't have one big chunk of memory to do what you will with. You have pockets of memory and if the OS/application isn't smart enough to partition its data chunks (or if two threads share a single, fragmented data chunk and there's no replication), then you're not effectively getting more bandwidth. In fact, it will slow you down as you'd have to go over the core-to-core interconnect (high-latency Hypertransport link in the case of the AMD64's) to get to the memory you want.

    7. Re:Nice, but let's get Barcelona out the door, OK? by Enderandrew · · Score: 1

      It is pretty simple. On the same fab, the AMD architecture was proving superior for a couple years. However, Intel has a superior manufacturing process right now. A 45nm process certainly beats the 90nm AMD was doing for a while, and the 65nm process they are using now.

      The question I used to have is when both were using the same manufacturing process, why AMD was kicking the teeth in on the P4 line and why it took so long for Intel to catch up. It goes back and forth.

      --
      http://blindscribblings.com - Tasty pop-culture in conceptual fashion.
  6. Silicon Problems. by Anonymous Coward · · Score: 0, Interesting

    They can't get the chips to clock up nicely as a whole; an individual chip or a few dozen individuals can, but most of them are binning in the sub-2GHz category, and that's simply atrocious; no matter how much "better" they are than Intel's quad cores, Intel's are already pushing 3GHz (and benchmarking roughly 50% better, meaning both architectures are performing pretty similarly and roughly the same clock-for-clock).

    The first stab at Barcelona we're getting is going to pathetically under-perform compared to the competition.

    1. Re:Silicon Problems. by Cyclon · · Score: 1

      They can't get the chips to clock up nicely as a whole; an individual chip or a few dozen individuals can, but most of them are binning in the sub-2GHz category

      Do you have a source for that, or is it just internet speculation?

    2. Re:Silicon Problems. by Anonymous Coward · · Score: 2, Funny

      94.3% of all statistics are made up on the spot.

    3. Re:Silicon Problems. by jombeewoof · · Score: 1

      You can make up statistics to prove anything. 16% of all people know that.

      --
      Linux Zealots: Smarter than Mac Zealots, but still zealots.
    4. Re:Silicon Problems. by BillyBlaze · · Score: 1

      Only 3% of Slashdot users haven't heard that joke, and only 2% of those who have still think it's funny for the (on average) 36.4th time.

  7. Will Intel Adopt These Instructions? by Apple+Acolyte · · Score: 1

    Has there in the past been an example of AMD adding new instructions and then Intel following along and adopting them? I know it works in the converse, but somehow I doubt Intel wants AMD taking the lead in extending its own ISA.

    --
    Part of the hardcore faithful who believed in Apple long before it was cool again to do so
    1. Re:Will Intel Adopt These Instructions? by The+Real+Nem · · Score: 3, Informative
    2. Re:Will Intel Adopt These Instructions? by Anonymous Coward · · Score: 0

      How about the AMD64 ISA.

    3. Re:Will Intel Adopt These Instructions? by Apple+Acolyte · · Score: 1

      Oops, kind of forgot about that case. Sorry for the stupid question.

      --
      Part of the hardcore faithful who believed in Apple long before it was cool again to do so
    4. Re:Will Intel Adopt These Instructions? by Anarke_Incarnate · · Score: 1

      Technically correct and wrong at the same time. EM64T has a kludge in the way that memory is addressed. The EM64T chips cannot access memory above 4GB without using pointers.

    5. Re:Will Intel Adopt These Instructions? by edwdig · · Score: 1

      Technically correct and wrong at the same time. EM64T has a kludge in the way that memory is addressed. The EM64T chips cannot access memory above 4GB without using pointers.

      You can't access any memory without pointers.

      You're probably thinking of Physical Address Extension (PAE), which lets you swap out parts of the page tables to point to memory above 4 GB. That's existed since the Pentium Pro or so. EM64T is just the damage control name Intel's marketing department came up with for their implementation of AMD64.

    6. Re:Will Intel Adopt These Instructions? by SEE · · Score: 1

      x86-64 (AMD64) is the classic case.

      Prior to that, the closest thing was when NexGen (just before AMD bought them) developed an MMX-like extension for the Nx686 (released by AMD as the K6) and cut a deal for Cyrix to use them, which is what provoked Intel into creating MMX with cross-licensing to AMD and Cyrix.

    7. Re:Will Intel Adopt These Instructions? by Anarke_Incarnate · · Score: 1
      No, this is not regarding PAE. PAE should be irrelevant with 64bit extensions since it should be able to address over 32bits worth of RAM. PAE was for older generation processors. The issue is that the EM64T spec does not change the addressable amounts of RAM. I wish I had the link from Red Hat about the kernel hacks that were needed to make it work.

      By the way, you do not need pointers to address memory, and what I had stated was that in order to address higher than 4GB of RAM, the EM64T chips have a kludge that remaps the higher memory to lower memory via pointers.

    8. Re:Will Intel Adopt These Instructions? by Necroman · · Score: 1

      Intel and AMD have some nice agreements between one another where they are allowed to share information about x86 processor extensions and the like. This means if one company designs a cool new extension, the other can pick it up with little hassle.

      (Or at least that's how I remember it working)

      --
      Its not what it is, its something else.
    9. Re:Will Intel Adopt These Instructions? by andreyw · · Score: 1

      I don't know what you're smoking, but I want some of it.

      Let's start with some basic facts, that you can verify for yourself by hitting the long mode specs in AMD and Intel manuals:
      1) You need PAE enabled (in CR4). Long mode uses a 4-level paging table scheme (PML4 - PDPT - PD - PT, although you can get away with only using the first three levels if you are fine with a 2MB granularity).
      2) The linear address space is 64 bits.
      3) The physical address space, ATM AFAIK, is 52 bits, with the other bits reserved for now. (Going beyond 52 bits will likely need a PML5.)
      4) All registers are extended to 64-bit length, and there are 8 new general purpose registers.
      5) I am going to re-iterate - your address space is 64-bits. Your addressable memory is 2^64 - 1. Unlike PAE, where your linear address space is still 32 bits, you do not need an aperture within your linear 4GB to access physical addresses > 4GB.

      I have no clue what the hell you talk about when you talk about "pointers", which are a software language concept. On EM64T/AMD64 you can perform direct and indirect MOVs to and from your entire linear (i.e. virtual) address space - and thus, through the "wonder" of paging (which you need enabled to enter long mode in the first place) - to and from your entire physical address space.
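
      For anyone who wants to see the 4-level scheme (PML4 - PDPT - PD - PT) in numbers, here is a small sketch of how a linear address is carved into table indices. The function name and dictionary keys are made up for illustration; the bit positions are the usual long-mode layout (9 bits per level, 12-bit offset for 4KB pages, 21-bit offset when the walk stops at the PD for a 2MB page). Note that the four levels only translate the low 48 bits of the address.

```python
def split_linear_address(addr, large_page=False):
    """Split a linear address into long-mode paging indices.

    large_page=True models a 2 MB page: the walk stops at the page
    directory, so there is no PT index and the offset is 21 bits wide.
    """
    fields = {
        "pml4": (addr >> 39) & 0x1FF,   # bits 47..39
        "pdpt": (addr >> 30) & 0x1FF,   # bits 38..30
        "pd": (addr >> 21) & 0x1FF,     # bits 29..21
    }
    if large_page:
        fields["offset"] = addr & 0x1FFFFF      # bits 20..0
    else:
        fields["pt"] = (addr >> 12) & 0x1FF     # bits 20..12
        fields["offset"] = addr & 0xFFF         # bits 11..0
    return fields
```

      Each index selects one of 512 eight-byte entries in a 4KB table, which is where the 9 bits per level come from.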

      If you want a tiny piece of advice - instead of half-understanding mailing-list threads and articles written by people who know what they're talking about TO people who know what they're talking about - just hit the specs. They're free. Shit dude, if you actually bothered to try some 64-bit programming (even at the user, much less system, level) you would see that what you just wrote is just plain wrong.

      Since this is Slashdot, I'll even give you links to the specs -
      1) http://www.intel.com/products/processor/manuals/index.htm
      2) http://www.amd.com/us-en/Processors/DevelopWithAMD/0,,30_2252_11467_11513,00.html

    10. Re:Will Intel Adopt These Instructions? by forkazoo · · Score: 1

      Technically correct and wrong at the same time. EM64T has a kludge in the way that memory is addressed. The EM64T chips cannot access memory above 4GB without using pointers.


      Could you clarify that at all? I'm not the end-all, be-all expert on these things, but I do know enough to be sure that what you wrote is so not-correct as to not even be wrong...

      Pointers really only matter from a relatively high-level software perspective. From a low level hardware perspective, you can either say that pointers don't exist, or that all memory is accessed via pointers. The distinction of pointer vs. some other conceptual model for accessing data (such as Java) just doesn't exist at that level. Consequently, talking about needing to use pointers to access memory above 4 GB is like trying to decide if a Senator from Alaska rambling about the Internet smells more like cyan or yellow. You can certainly say he doesn't smell very yellow, and be somewhat correct, but the statement carries no information.
    11. Re:Will Intel Adopt These Instructions? by Anarke_Incarnate · · Score: 1

      I simplified the fucking explanation due to being tired, sue me.

      https://www.redhat.com/docs/manuals/enterprise/RHEL-3-Manual/release-notes/as-amd64/RELEASE-NOTES-U2-x86_64-en.html

      From the reference itself
      " Software IOTLB -- Intel® EM64T does not support an IOMMU in hardware while AMD64 processors do. This means that physical addresses above 4GB (32 bits) cannot reliably be the source or destination of DMA operations. Therefore, the Red Hat Enterprise Linux 3 Update 2 kernel "bounces" all DMA operations to or from physical addresses above 4GB to buffers that the kernel pre-allocated below 4GB at boot time. This is likely to result in lower performance for IO-intensive workloads for Intel® EM64T as compared to AMD64 processors.

        Lack of 3DNow!(TM) instructions: -- Intel® EM64T does not recognize the prefetch and prefetchw instructions while AMD64 processors do. The Red Hat Enterprise Linux 3 Update 2 kernel excludes these instructions in both C and assembly language code and therefore will suffer a small amount of performance degradation."

    12. Re:Will Intel Adopt These Instructions? by edwdig · · Score: 1

      Well, you missed the point of it. This is referring to DMA. DMA is a means of fast transfer of data between main memory and an expansion device. Basically, this means that, for example, to send a packet of data to your network card, the data must exist within the first 4 GB of memory.

      It simply means that when the kernel allocates buffers for data transfer to/from hardware, it has to be a little careful about where it does it. This doesn't have any impact whatsoever on userspace code.

      Also, at least in the early days of 32 bit code, there were limits such as DMA only supporting the first 24 bits of address space. This isn't really anything new, nor is it anything to be concerned about unless you're a kernel developer.
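The "bouncing" described in the Red Hat release notes and in the comment above can be modelled in a few lines. This is only a toy sketch, not kernel code: `DMA_LIMIT`, `BouncePool` and `dma_send` are hypothetical names invented for illustration, and a dict stands in for physical memory. It assumes a device that can only address the first 4 GB, so any buffer above that line is first copied into a pre-allocated low buffer.

```python
# Toy model of DMA bounce buffering. All names here are hypothetical;
# a real kernel does this in its DMA mapping layer, not like this.
DMA_LIMIT = 4 * 1024**3  # assumed: device can only address the first 4 GB


class BouncePool:
    """Buffers pre-allocated below the DMA limit (as if at boot time)."""

    def __init__(self, base_addr, slot_size, slots):
        assert base_addr + slot_size * slots <= DMA_LIMIT
        self.slot_size = slot_size
        self.free = [base_addr + i * slot_size for i in range(slots)]

    def acquire(self):
        return self.free.pop()

    def release(self, addr):
        self.free.append(addr)


def dma_send(memory, src_addr, length, pool, device_log):
    """Hand `length` bytes at `src_addr` to the device.

    `memory` maps a physical address to a bytes object.
    Returns True if the data had to be bounced through a low buffer.
    """
    if src_addr + length <= DMA_LIMIT:
        # The device can reach the buffer directly: no extra copy.
        device_log.append((src_addr, memory[src_addr][:length]))
        return False

    assert length <= pool.slot_size
    low = pool.acquire()
    memory[low] = memory[src_addr][:length]  # the extra copy that costs time
    device_log.append((low, memory[low]))
    pool.release(low)
    return True
```

The extra copy in the second branch is where the lower I/O performance comes from; userspace code never sees any of it.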

    13. Re:Will Intel Adopt These Instructions? by x2A · · Score: 1

      Aside from what others mentioned, "3D-Now!" was added by AMD on the K6/2, as a kind of floating point MMX type thing I seem to recall. Intel later incorporated* these instruction sets, and AMD incorporated* Intel's SSE instruction sets.

      (*mostly)

      --
      The revolution will not be televised... but it will have a page on Wikipedia
    14. Re:Will Intel Adopt These Instructions? by andreyw · · Score: 1

      No, you do not understand the crux of the problem. This has nothing to do with memory access from user-space or kernel-space code. This has nothing to do with the CPU instruction set architecture. This has everything to do with direct memory access by I/O devices, and is a result of lack of chipset support for this, whereas the entire north bridge is integrated in the AMD chips, making any support issues moot. This cripples I/O bandwidth.

      You didn't simplify the explanation - you did not understand it, and you STILL do not. You remind me of the CEO of a startup IT company I work at, who likes ranting about stuff he clearly has no grasp of... but damn - he's got opinions and he wants you to hear (and not laugh at) them.

    15. Re:Will Intel Adopt These Instructions? by Anonymous Coward · · Score: 0

      Dude, you think you are just a tad pretentious? Be thankful that guy doesn't shitcan your ass for being a douchebag. A simple "Hey, not what you thought it was" would suffice, but instead you come across as a dickhead.

    16. Re:Will Intel Adopt These Instructions? by Anonymous Coward · · Score: 0

      How many times can you explain politely to someone that they're wrong?

    17. Re:Will Intel Adopt These Instructions? by thegnu · · Score: 1
      Apple Acolyte wrote:

      Oops, kind of forgot about that case. Sorry for the stupid question.
      --
      PowerPC zealot since 1994

      Apparently you're a mac person, so it's understandable. :-)
      --
      Please stop stalking me, bro.
    18. Re:Will Intel Adopt These Instructions? by tji · · Score: 1

      They don't necessarily need to. But, if they are useful, Intel could either adopt them or make their equivalent.

      Adopt: x86-64 (AMD Created, Intel adopted it when the Itanium sunk)

      Co-existing features: SIMD: MMX/SSE and 3DNow! (SSE eventually won out, but they co-existed for a long time).
              Virtualization: Intel VT and AMD-V co-exist today, and both are used by virtualization projects like Xen.

  8. good by thatskinnyguy · · Score: 0, Flamebait

    "... they could be used to accelerate Java, .Net, and dynamic optimizers." About 80% of all Java-based apps I've run across could use all the help they can get in the speed department. Robust platform... Just the speed isn't quite there.
    --
    The game.
  9. just about time by yoprst · · Score: 1

    I never quite understood why chip manufacturers kept adding cores long after memory bandwidth had become a problem. Why not add specialized execution units and make instruction set a bit fatter? It's not like arithmetic and logic operations are all that you can do with an int or a few ints. Same for floats (but even more operations).

    1. Re:just about time by Anonymous Coward · · Score: 0

      x86 is full of specialized instructions.

    2. Re:just about time by Courageous · · Score: 1

      I never quite understood why chip manufacturers kept adding cores long after memory bandwidth had become a problem.

      They've been increasing bandwidth while adding cores, and those cores also happen to have things like L1 and L2 caches, and so forth.

      C//

  10. nice by Pokamonster · · Score: 1

    it's a good start, but it isn't much. parallel programming will still be a bitch

  11. 2008: x86_64 retired. by Anonymous Coward · · Score: 0

    2008: x86_64 retired because of bad performance, there are many prefix's bytes of the instructions of the CISC ISA x86_64.

    x86-64 IS DEAD!!!

    Let's go ppc64!!!

    Let's go IBM!!! Let's go AMD-IBM!!!

    1. Re:2008: x86_64 retired. by Anonymous Coward · · Score: 0

      IBM: please, retire you archaic x86-64.
      AMD: sure?
      IBM: yes, you can market cheap ppc64 four-cores 1.8 GHz, and i can market mainframes, but the condition is the retired x86-64 from the root of evil of intel 'i386'.
      AMD: good business!!! I will accept!!!

    2. Re:2008: x86_64 retired. by be-fan · · Score: 1

      Entertainingly, PPC code is much larger than AMD64 code, prefix bytes or no.

      --
      A deep unwavering belief is a sure sign you're missing something...
  12. I think this is great by P3NIS_CLEAVER · · Score: 2, Funny

    I for one
    think this
    is good
    news.

    --
    Please sign petition to restore sanity to our banking system!!!

    http://financialpetition.org/
    1. Re:I think this is great by Anonymous Coward · · Score: 0

      too bad I only
      have 2 cores

    2. Re:I think this is great by Mwongozi · · Score: 1

      > Brevity is the soul of wit.

      No it isn't.

  13. Is it really that hard? by Anonymous Coward · · Score: 0

    I see all fuss about programming. easy. don't what the is parallel It's
    I see all fuss about programming. easy. don't what the is parallel It's

    1. Re:Is it really that hard? by Short+Circuit · · Score: 2, Funny

      I see all fuss about programming. easy. don't what the is parallel It's
      I see all fuss about programming. easy. don't what the is parallel It's

      I hereby propose that execution is in order for out of order speech.
    2. Re:Is it really that hard? by UID30 · · Score: 1

      I think I'm missing some op-codes.

      --
      "Glory is fleeting, but obscurity is forever." - Napoleon Bonaparte
  14. all over the news? really? by Anonymous Coward · · Score: 0

    "It has been all over the news today:". Really? The only AMD news I've been seeing all day has been "Barcelona not shipping on schedule, and parts won't be as fast as promised". Ooops. Well, those Core2's are still cheap. and faster.

  15. You can get the x86/EMT64 documentation from intel by Gazzonyx · · Score: 2, Informative
    If you root around Intel's site a bit, you can get the developer manuals for asm on their chips; I think there's like 5 of them @ 300 pages+ each. It's all the documentation, I think only 1 book is the actual language specs. Anyways, if you ask them nicely via email, they'll send the manuals to you for free. I got mine in under a week from when I emailed them. They even pay shipping.


    Also, I know from asm on SPARC that many op codes are really just variations of other ops (and/or pseudo ops). For instance, (I'm not sure of the x86 equivalent) .mul is a pseudo op for sll (shift left logical), IIRC. And almost every op has a data type specific variation (byte, half, word, double, etc), on top of that.

    --

    If I mod you up, it doesn't necessarily mean I agree with what you've said, sorry.

  16. Nothing special for Java or .NET by dnoyeb · · Score: 0, Troll

    That must have been speculation or a SWAG from the poster to suggest it could be used to accelerate Java and/or .NET. There is nothing special about java or net that would allow this optimization. Both run on top of the OS and not on top of the hardware. So if the OS provided similar information about its routines, then that could be used. As it stands, the only thing to accelerate Java or .NET (both of which are c/c++ programs) is something that would accelerate any c/c++ program running on top of an OS.

    1. Re:Nothing special for Java or .NET by Wesley+Felter · · Score: 4, Insightful

      Performance counters could be used by JITs to generate more optimized code. I wonder which programming languages use JITs...

    2. Re:Nothing special for Java or .NET by Anonymous Coward · · Score: 0

      That assertion is in the original press release. But you're correct, there's no reason you couldn't use light-weight profiling in any other language. What makes it appealing for interpreted / JITed languages, though, is that the original program doesn't need to be aware of this - it's up to the JVM or .NET Framework to make your application fast for you. It'd probably be infeasible for a C compiler to spit out binaries that dynamically optimize themselves without you, the programmer, being aware of it.

    3. Re:Nothing special for Java or .NET by jhol13 · · Score: 1

      There already are systems which do exactly that (optimise dynamically C programs), see http://arstechnica.com/reviews/1q00/dynamo/dynamo-1.html.

      Of course HotSpot-like JIT'd languages are "easiest" target and most likely gives the biggest performance improvement. After all, HotSpot does partially (in SW) what the proposal does (in HW).

    4. Re:Nothing special for Java or .NET by TheNetAvenger · · Score: 1

      That must have been speculation or a SWAG from the poster to suggest it could be used to accelerate Java and/or .NET. There is nothing special about java or net that would allow this optimization.

      Ok, sorry, wrong, and yes, wrong again...

      The notes about .NET and JAVA come specifically from AMD themselves.

      The reason it would benefit these environments is because they are processed on the fly and the environment could make the 'adjustments' to the code at runtime instead of it being 'locked' as natively compiled code is.

      This is level 101 understanding and logic here, not sure how you are missing this.

    5. Re:Nothing special for Java or .NET by TheNetAvenger · · Score: 1

      There already are systems which do exactly that (optimise dynamically C programs)

      I don't disagree with the notion that any natively compiled language could be scaled to take advantage of this, a good solution would be an OS level scheduling mechanism for natively compiled applications that could make the decisions based on the information the AMD instructions would be offering.

      However, the reference you cite is more about basic instruction changing and not the dynamics of testing to see what threads are busy, which ones need more time, and where they can be shifted to run at runtime based on these needs. The Transmeta solution is a lot like a real-time in chip concept of many of the old translation tricks used in various products at a software level like the FX!32 used on WindowsNT Alpha version.

    6. Re:Nothing special for Java or .NET by cbhacking · · Score: 1

      You're missing the point here. They aren't talking about accelerating the frameworks (the JRE, the Common Language Runtime, any other program that was compiled to native code at or before install time), they are about accelerating the applications that are run using those frameworks. The reason is that both frameworks use JIT-compiled code (I believe old JVMs translated instructions individually, but these days I'm pretty sure the whole .class gets compiled to native code just before execution).

      The advantage of these extensions is that they can make it possible to optimize the hell out of the JIT, producing code at run-time that takes full advantage of the available capabilities of the processor(s). Theoretically, this could make these JIT-compiled programs faster to run (though they will always incur a startup penalty) than non-JIT native code because the JIT compiler knows things about the environment in which it will be executing that would be specific to that machine at that time.

      --
      There's no place I could be, since I've found Serenity...
    7. Re:Nothing special for Java or .NET by x2A · · Score: 1

      "There is nothing special about java or net that would allow this optimization"

      Sure there are. A profiler could quickly pick up on a function that's getting called many times from within a loop, and decide it could speed it up more by inlining it. Or, a bit of inline code that isn't being used often could be moved out of line, so the rest of the loop fits into a single cache line.

      --
      The revolution will not be televised... but it will have a page on Wikipedia
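The "pick up on a function that's getting called many times" idea above can be sketched in software. This is only an illustration under stated assumptions: it counts calls with Python's `sys.setprofile` hook rather than with hardware counters (which is what the AMD proposal is about), and `profile_calls`, `inline_candidates` and the sample workload are hypothetical names invented here.

```python
import sys
from collections import Counter


def profile_calls(func, *args):
    """Run func(*args) and count how often each Python function is entered."""
    counts = Counter()

    def tracer(frame, event, arg):
        # "call" fires when a Python function is entered.
        if event == "call":
            counts[frame.f_code.co_name] += 1

    sys.setprofile(tracer)
    try:
        result = func(*args)
    finally:
        sys.setprofile(None)
    return result, counts


def inline_candidates(counts, threshold):
    """Names called at least `threshold` times: what a JIT might inline."""
    return sorted(name for name, n in counts.items() if n >= threshold)


# Hypothetical workload: one hot helper in a loop, one cold helper.
def square(x):
    return x * x


def rarely(x):
    return -x


def workload(n):
    total = 0
    for i in range(n):
        total += square(i)
    return total - rarely(1)
```

Calling `profile_calls(workload, 1000)` and passing the counts to `inline_candidates(counts, 100)` singles out `square`. The software hook itself slows the program down and disturbs caching, which is the overhead a hardware profiling mechanism is meant to avoid.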
    8. Re:Nothing special for Java or .NET by jhol13 · · Score: 1

      You are right, the reference I gave is old.

      I was not trying to disagree completely with the GP as clearly JIT'd languages are biggest winners. Just noted that "even" C can be improved dynamically, without compiler help. I would not call it a scheduling mechanism, rather a code morphing mechanism.

      Besides the proposed extensions (the PDF) did not so much help in thread scheduling but rather on cache coherency and finding hot spots. Garbage collection can, if it would use the cache information, improve performance perhaps a lot just by moving the objects around. HotSpot VM can use the, eh, hot spot information without counting in SW.

  17. Re:Hardware Accelleration == Bad Trend by ZakuSage · · Score: 1

    You can poo-poo Java all you want, but the reality is that it's made programming a lot easier for the "rest of us", especially in a world where cross platform compatibility is key.

  18. Re:Hardware Accelleration == Bad Trend by Chris+Burke · · Score: 1

    Yet another waste of silicon to 'accellerate' badly written software.

    AND well-written software. What, you think you could write code that's just as fast without all the "hardware acceleration" being done for you, without using any instruction set extensions that have been added over the years? You are on crack.

    Instead of devoting transistors to speed up the latest toy programming languages ('managed' code), why can't we just train programmers better?

    And better profiling tools are contrary to this goal how exactly? And at what point do you tell your better-trained programmers that using those hardware acceleration features will make their code go faster?

    Ahh..of course, because of java..don't bother learning HOW to optimize, let java do it FOR you...

    Or let your C compiler do it for you. Whichever. There's a matter of degree, to be sure, but even still you're most likely wasting your time "optimizing" individual lines of C code since the compiler can probably do a better job and that's been the case for quite a while. The thing that will get you the most bang for buck is the same in C as it is in Java -- optimize your algorithms. Java can't do that for you, and neither can your C compiler.

    --

    The enemies of Democracy are
  19. Re:Hardware Accelleration == Bad Trend by Anonymous Coward · · Score: 0

    Ah, yes, the 'rest of us', meaning to me, the mediocre programmers, or 'Code Monkeys'. Please, by all means continue to churn out steaming mounds of code.

    It's cross-platform alright, but crap is crap, on any platform, and in any language.

  20. Re:Hardware Accelleration == Bad Trend by kiddygrinder · · Score: 2, Funny

    Java never made anything easier for anyone and you know it.

    --
    This is a joke. I am joking. Joke joke joke.
  21. Side Channels by DanLake · · Score: 1

    "They would give software access to information about cache misses..." Yeah that ought to help significantly with side-channel attacks against crypto software.

    1. Re:Side Channels by gnasher719 · · Score: 1

      >> "They would give software access to information about cache misses..." Yeah that ought to help significantly with side-channel attacks against crypto software.

      I think you didn't read the spec. All that information is only available to the thread that is profiled; everything is context-switched so it can't leak out to other threads and definitely not to other processes.

  22. Re:Hardware Accelleration == Bad Trend by glitch23 · · Score: 1

    It isn't Intel's job to train programmers to do things right. That is the responsiblity of the education system. Nothing stops the education system from still teaching proper programming and design skills.

    --
    this nation, under God, shall have a new birth of freedom. -- Lincoln, Gettysburg Address
  23. Re:Hardware Accelleration == Bad Trend by shankarunni · · Score: 1

    Yet another waste of silicon to 'accellerate' badly written software.

    Instead of devoting transistors to speed up the latest toy programming languages ('managed' code), why can't we just train programmers better?

    Ahh..of course, because of java..don't bother learning HOW to optimize, let java do it FOR you...

    I'm tempted to slam this as an uneducated rant, but since there's a little teeny kernel of truth in it, I'll let it slide.

    The issue is not "badly written code". It's being able to run the same compiled code on a wide variety of hardware without recompiling it for every chip variant.

    The huge drawback with all the RISC architectures (at least initially) was that each version of each chip had different numbers of functional units, different latencies for the functional units, different latencies to cache and memory, etc.

    If you ever dealt with the MIPS or Sun compilers, they have a huge number of flags for hyper-optimizations on a variety of implementations of those architectures. The problem is that when you optimize it for one variant, it often makes it worse on other variants (because instructions that didn't collide in the instruction pipeline now do, as just one example..)

    Now all of the modern architectures play the same games. Power/PowerPC, SPARC, Itanium, all of them. They all have multiple pipelines and execution units, massively parallel instruction issue, etc. Just like the X86.

    And it's not because the programmers are idiots, but because that's the only way you could ever ship one binary that would run "optimally" on every implementation of that architecture.

    PS. Java and C++ only make this worse because they are so dependent on such out-of-order massively-parallel execution (since they are so darn difficult to statically optimize).

    The supreme irony of this is that for a while there, Java on X86 (Sun's implementation, no less!) ran rings around Java on SPARC (great strategy for pulling in customers for SPARC !). It's only with recent SPARC implementations (Niagara/Niagara 2) that play the same way as the X86's, that SPARC has finally caught up with and passed X86 again..

  24. BIOS vs. EFI? by tepples · · Score: 1

    If only it were so. Unfortunately, it's not. There's a distressing amount of 16-bit real-mode code being executed in between power-on and your OS kernel switching into 32 or 64 bit mode even on the most modern PC.

    Is this true only of machines that use BIOS, or is it also true of machines that use only EFI?
    1. Re:BIOS vs. EFI? by WhatAmIDoingHere · · Score: 1

      In windows, thanks to people who love making things easy for the end user and who also are always looking to the future.. a lot of install programs for games and apps are 16bit.

      Another fun thing is a lot of games broke in Vista because the game had the "MY DOCUMENTS" folder location hard coded.

      Future looking programmers..

      --
      Not a Twitter sockpuppet... but I wish I was.
  25. Re:You can get the x86/EMT64 documentation from in by Anonymous Coward · · Score: 0

    Intel isn't alone by providing detailed documentation. AMD gives instruction set and detailed optimization tips too.

    It would be cool if GPU manufacturers were as helpful as CPU manufacturers are!

  26. Map and reduce? by tepples · · Score: 4, Interesting

    Compilers don't know how to extract parallelism very well. It's an *incredibly* difficult problem

    It's not that compilers can't extract parallelism. It's that the C and C++ language standards lack a way to express parallelism. Often, you want to compute a function for each element in an array, resulting in a new array. In some languages, this is called map(). In Python, this is [expression_involving(el) for el in some_list]. An ideal language would provide a way to express that a function has no side effects, allowing map() to farm out different slices of the array to different CPUs. However, iterators in C++ and many other popular languages assume that the computation may have side effects, and provide no way inside the standard language to ask the compiler to break the computation into slices.
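A rough sketch of the "farm out different slices" idea, using Python's standard `concurrent.futures` module. Python has no way to declare a function free of side effects, so this assumes the caller promises it; `parallel_map` and `_apply_slice` are hypothetical helper names, not part of any library.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def _apply_slice(func, chunk):
    """Ordinary sequential map over one slice of the input."""
    return [func(x) for x in chunk]


def parallel_map(func, items, workers=4, executor_cls=ProcessPoolExecutor):
    """Map `func` over `items`, one contiguous slice per worker.

    Only safe if `func` has no side effects: slices run in any order,
    and with processes each worker has its own copy of global state.
    """
    items = list(items)
    if not items:
        return []
    size = -(-len(items) // workers)  # ceiling division
    chunks = [items[i:i + size] for i in range(0, len(items), size)]
    with executor_cls(max_workers=workers) as pool:
        # Executor.map returns results in input order, so the slices
        # can simply be concatenated again.
        parts = pool.map(_apply_slice, [func] * len(chunks), chunks)
        return [y for part in parts for y in part]


def square(x):
    return x * x
```

With the default process pool, `func` has to be a picklable top-level function, and on platforms that spawn workers the call belongs under an `if __name__ == "__main__":` guard. Passing `executor_cls=ThreadPoolExecutor` avoids both constraints but, in CPython, gives no speed-up for CPU-bound work.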
    1. Re:Map and reduce? by Anonymous Coward · · Score: 0

      Ironically, C++ has just this in the form of one standard algorithm at least. Hey, I know, it's pathetic, but technically there is at least *one* that I know of. From the article here:
      http://www.ddj.com/cpp/184403769

      "-for_each applies the operation in a definite order, namely starting at the beginning and proceeding to the end of the input range; no such guarantee is given for transform.
      -The operation supplied to transform must not have any side effects; no such restriction is imposed on the operation supplied to for_each."

      Of course, theory and practice are different things, so I guess it doesn't. Heh. And don't get me started on valarray...

    2. Re:Map and reduce? by Josef+Meixner · · Score: 2, Interesting

      An ideal language would provide a way to express that a function has no side effects, allowing map() to farm out different slices of the array to different CPUs.

      And would be terrible for performance. Why on earth does everybody assume that fine grained parallelism will ever work? You need a very highly specialized processor to make it work and those have failed a decade ago as the "standard CPUs" just blew them away. Remember the Connection Machine, that was a box with exactly that fine grain of parallelization? It was programmed in C and Fortran with specialized extensions to express parallelism. Incidentally, they live on in the way you program GPUs, and SSE is another example of even finer grained parallelism.

      Fine grained parallelism only works on very small and specific tasks. In general you want high level parallelism with very little communication and very little dependency on each other. As that is another extreme you have to find a compromise, but to assume the compiler can magically extract a real speed up from a bunch of simple for-loops is just completely unrealistic.

      You will have to learn to handle the parallelism. It takes different algorithms and a different way to structure programs. Also you will have to accept that there are things which will not work in parallel. You can parallelize them, but the speed up is just not there to make it useful.

      Parallel programming is hard and blaming it on programming languages and claiming another one will solve all problems is just the usual silver bullet. Those languages have been around forever; functional programming languages can be parallelized automatically. So if they make it so much easier, why aren't they used? Could it be that you have to pay for the easy parallelization with something?

    3. Re:Map and reduce? by Just+Some+Guy · · Score: 1

      An ideal language would provide a way to express that a function has no side effects, allowing map() to farm out different slices of the array to different CPUs.

      I wrote something like that for Python. The idea is that you'd use a "decorator" to indicate that a method is parallelizable (doesn't have any side effects) and roughly how many processes to spread it across (because you don't want to hit your database with 10,000 simultaneous queries just because your client could theoretically do so, for instance). For example:

      @parallelizable(10, perproc=4)
      def timestwo(x, y): return (x + y) * 2

      print map(timestwo, [1, 2, 3, 4], [7, 8, 9, 10])

      would tell the multiprocessing map() that timestwo() can be run up to 4 times per CPU, up to a total limit of 10 times. The per-CPU limit is because some tasks spend a lot of time waiting on external data (DB calls, file reads, etc) and it's OK to load out the system with those mostly-idle processes. The hard limit is because there's likely to be a maximum you still don't want to exceed.

      BTW, this was meant mainly as a proof of concept and not something you'd just randomly use all over the place. Please consider the idea behind it and not just my particular implementation that I hammered out one afternoon.

      --
      Dewey, what part of this looks like authorities should be involved?
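The commenter's implementation isn't shown, so here is one guess at how such a decorator could work in current Python. Everything in it is an assumption: the `parallel_limit` attribute and the `limited_map` helper are hypothetical names, and a thread pool stands in for the processes the comment describes, which keeps the sketch short and suits the mostly-waiting workloads it mentions.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def parallelizable(maxprocs, perproc=1):
    """Mark a side-effect-free function as safe to run concurrently.

    maxprocs: hard ceiling on simultaneous calls.
    perproc:  allowed simultaneous calls per CPU.
    """
    def decorate(func):
        cpus = os.cpu_count() or 1
        func.parallel_limit = min(maxprocs, perproc * cpus)
        return func
    return decorate


def limited_map(func, *iterables):
    """Like map(), but concurrent up to the limit the decorator recorded."""
    limit = getattr(func, "parallel_limit", 1)  # undecorated: one at a time
    with ThreadPoolExecutor(max_workers=limit) as pool:
        return list(pool.map(func, *iterables))


@parallelizable(10, perproc=4)
def timestwo(x, y):
    return (x + y) * 2


print(limited_map(timestwo, [1, 2, 3, 4], [7, 8, 9, 10]))
# prints [16, 20, 24, 28]
```

The decorator only records a limit on the function; the map helper decides how to use it, so an undecorated function simply runs one call at a time.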
  27. About 57,839 opcodes by fedorowp · · Score: 1

    The number depends on how you look at it. I made a table that lists every x86 instruction excluding prefixes a while ago and it came out to 57,839 instruction/parameter combinations. That doesn't factor in the specific values passed to the opcode, or in the registers, or the differences in behavior of the chip depending on mode, how memory protection is setup, out of order execution, or instruction prefixes.

    The large number of combinations certainly makes validation a tremendous challenge.

  28. Re:Hardware Accelleration == Bad Trend by Anonymous Coward · · Score: 0

    Do you even know Java? Or do you know anything OTHER than Java?

  29. Re:Hardware Accelleration == Bad Trend by HandsOnFire · · Score: 1

    Running code that is directed at one architecture or another was an issue for RISC. If you look at the x86 CISC machines, you'll have a lot less variance. When it comes to RISC vs. CISC, it's not so important to optimize for a specific architecture on CISC simply because the CPU handles a lot of things instead of the programmer/compiler code. The variance between running a program on CISC architectures is much smaller than doing the same for RISC architectures.

  30. Re:You can get the x86/EMT64 documentation from in by Gazzonyx · · Score: 1

    Yeah, but I couldn't find a way to get AMD to mail me a hard copy of their documentation (at least, not for free). If they do so, please correct me, as I haven't looked in quite a few months.

    --

    If I mod you up, it doesn't necessarily mean I agree with what you've said, sorry.

  31. Re:Hardware Accelleration == Bad Trend by x2A · · Score: 1

    Profiling is useful for code produced by any language, and being able to profile without adding code, e.g. at the beginning of functions, means you get to see how the actual software runs, without doing things that affect caching etc. (for example, profiling code might push certain instructions onto a different cache line, skewing the results).

    --
    The revolution will not be televised... but it will have a page on Wikipedia
  32. Logical reasons to buy AMD by Enderandrew · · Score: 3, Insightful

    Funny. I've seen a $59 Brisbane core (1.9 out of the box) overclocked to 2.9 GHz with just air cooling, so I'm not sure why everyone insists AMD can't hit the 3GHz barrier, especially when AMD keeps displaying 3GHz Barcelonas.

    There are three reasons to buy AMD right now.

    1. Price, price and price. AMD knows Intel has the better fab, but AMD is selling super cheap. You can get a dual-core processor for half what Intel charges, and for the average user, it is more than enough. I'm running Oblivion at 30 FPS with a $59 processor, and I've barely overclocked it. The cheapest Intel dual-core proc was $120 when I bought my $59 proc. Most people have no idea that their proc these days often underclocks itself, and you rarely touch the full potential of your proc. Intel is faster, and no one doubts that today, but if you never see the speed benefit, why spend the extra dollars? On a performance per dollar basis, AMD wins hands down.

    2. There is a mountain of evidence against Intel for anti-trust violations, and I try not to financially support evil. The EU is also coming down on Intel for anti-trust violations.

    3. Even if the anti-trust suits both come through, AMD is near bankruptcy, and I prefer choice in the marketplace. I am terrified of the day when Intel has no competition pushing them and they can just sell what they want and whatever price they want.

    --
    http://blindscribblings.com - Tasty pop-culture in conceptual fashion.
    1. Re:Logical reasons to buy AMD by Wavicle · · Score: 2, Interesting

      Funny. I've seen a $59 Brisbane core (1.9 out of the box) overclocked to 2.9 GHz with just air cooling, so I'm not sure why everyone insists AMD can't hit the 3GHz barrier, especially when AMD keeps displaying 3GHz Barcelonas.

      Gosh, maybe you should go tell AMD that they aren't having any trouble with leakage, the yield of their 65nm parts is optimal and they can start volume production right now! The time AMD has spent not shipping Barcelona has been costing them dearly. Did you see the loss they posted last quarter? Did you notice their market cap right now is just a tad over what they paid for ATI?

      AMD knows Intel has the better fab, but AMD is selling super cheap. You can get a dual-core processor for half what Intel charges

      Yeah, you can get a share of AMD for about half of what Intel's cost as well.

      On a performance per dollar basis, AMD wins hands down.

      So rush out and buy an AMD now, before their super-low margins bankrupt them altogether!

      There is a mountain of evidence against Intel for anti-trust violations, and I try not to financially support evil. The EU is also coming down on Intel for anti-trust violations.

      You know if Intel did what AMD has done back when AMD had the faster product - cut their margins down to almost nothing to undersell AMD and gain market share - you would be screaming about the evil monopolist Intel. Somehow it is exactly the opposite of evil when AMD does it.

      Even if the anti-trust suits both come through, AMD is near bankruptcy, and I prefer choice in the marketplace. I am terrified of the day when Intel has no competition pushing them and they can just sell what they want and whatever price they want.

      Oh please. Regulators would never allow Intel to buy AMD's IP and there are plenty of companies out there willing to jump in and try their hand at the x86 game. If Intel starts driving up prices, that just makes jumping in appear much more appealing.

      --
      Education is a better safeguard of liberty than a standing army.
      Edward Everett (1794 - 1865)
    2. Re:Logical reasons to buy AMD by Enderandrew · · Score: 1

      Are you trolling? Sure seems like it to me.

      Intel demanded that people not carry or display AMD products, or they'd refuse to ship product they already purchased. That is pretty clearly evil.

      Intel doesn't have to buy AMD's IP. If AMD goes belly up, then Intel will have an unchallenged monopoly, and no one has suggested trying to compete with them.

      Barcelona is late, and Intel does have a better manufacturing process. No one is contesting either of these points, but cheap AMD processors are reaching the 3 GHz barrier, so only trolls insist that AMD will never hit that mark. And when a product produces similar results for HALF the price, that seems to be a good reason to buy that product. Given that both processors run under-clocked most of the time, there may be no real-world performance difference between the two.

      However, you can keep trolling if you want.

      --
      http://blindscribblings.com - Tasty pop-culture in conceptual fashion.
    3. Re:Logical reasons to buy AMD by ben+there... · · Score: 0

      And when a product produces similar results for HALF the price, that seems to be a good reason to buy that product.

      Let's look at products from each that produce similar results:
      AMD X2 5600+ is about equivalent to Core 2 Duo 6420.

      AMD X2 5600+ costs $150.

      Intel C2D 6420 costs $186.

      Looks like AMD costs 80% of Intel for the same performance. Not very close to half, though I'll give you that it is cheaper. Throw in the cheaper Core 2 Duos like the $125 Allendales, and overclock both, and I wouldn't even give you that.
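
      The ratio is easy to check. A minimal sketch in Python, using only the two prices quoted in this comment (the function and variable names are just for illustration):

```python
def price_ratio_percent(cheaper: float, pricier: float) -> float:
    """Return the cheaper price as a percentage of the pricier one."""
    return 100.0 * cheaper / pricier


amd_x2_5600 = 150.0  # USD, as quoted above
intel_e6420 = 186.0  # USD, as quoted above

ratio = price_ratio_percent(amd_x2_5600, intel_e6420)
print(f"AMD costs {ratio:.1f}% of the Intel price")
```

      That works out to roughly four fifths of the Intel price, which is a discount but nowhere near half.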

    4. Re:Logical reasons to buy AMD by Enderandrew · · Score: 1

      When I bought my last processor, a few months back before all the AMD price cuts, the 3600+ was $59 and the cheapest Core 2 Duo was $120.

      --
      http://blindscribblings.com - Tasty pop-culture in conceptual fashion.
    5. Re:Logical reasons to buy AMD by ben+there... · · Score: 0

      Yes, I'm sure you're right, but it still says nothing about the relative performance. Prior to AMD's price cuts, and before the C2D E4300 was released, was when I bought my E6600. At that time, I believe the E6300 was even beating the 4600+. Their prices were very close. We can talk about now. We can talk about when you bought yours, or when I bought mine. Regardless, there hasn't been a time when you could get an AMD processor for half the price and equal performance of Intel's cheapest C2D.

    6. Re:Logical reasons to buy AMD by xrobertcmx · · Score: 1

      The E6700 is just a touch faster than the X2 6000+. Current Newegg prices: E6700 $319, X2 6000+ $169. Not exactly 50%, but I won't argue the point too much.

    7. Re:Logical reasons to buy AMD by Wavicle · · Score: 1

      Are you trolling? Sure seems like it to me.

      No, I'm stating some inconvenient truths.

      Intel demanded that people not carry or display AMD products, or they'd refuse to ship product they already purchased. That is pretty clearly evil.

      It's an alleged evil. Since the only major Intel-only brand in the US was Dell, I don't find it a particularly compelling case of evil. In fact it is a pretty short walk from saying Intel was manipulating Dell to saying Dell was manipulating Intel (hey Intel, I hear AMD has some pretty nice procs, sure is hard to turn those down, unless maybe we got a big discount on our next batch of cpus).

      Intel doesn't have to buy AMD's IP. If AMD goes belly up, then Intel will have an unchallenged monopoly, and no one has suggested trying to compete with them.

      AMD and Intel are fighting it out right now. The margins for each company are so low and the cost of entry so high that nobody else WANTS to jump into that alley fight. If AMD goes belly up, the cost of entry (buying AMD IP) gets a whole lot cheaper. If Intel jumps its gross margins, the potential returns of entering the market get a whole lot larger. "Unchallenged Monopoly" is just sabre rattling. It isn't going to happen.

      Barcelona is late, and Intel does have a better manufacturing process. No one is contesting either of these points, but cheap AMD processors are reaching the 3 GHz barrier, so only trolls insist that AMD will never hit that mark.

      Of course that is a strawman, as I never said nor implied AMD will never hit that mark. It's fairly certain in the market that AMD is having yield problems due to leakage on their 65nm parts. They are delayed while they get that under control. It also explains why initial Barcelona ships will be binned at 2GHz and only to key customers. AMD will have at most 8 weeks to get Barcelona penetration going before Intel starts shipping its 3.2GHz 45nm "Penryn."

      --
      Education is a better safeguard of liberty than a standing army.
      Edward Everett (1794 - 1865)
    8. Re:Logical reasons to buy AMD by Enderandrew · · Score: 1

      Funny, then, that a bunch of manufacturers produced evidence and agreed to testify against Intel. And the evidence was so overwhelming that the EU launched its own anti-trust suit, unprovoked by AMD. But clearly, according to you, Intel isn't doing anything wrong.

      And you're suggesting it is a good thing for the market to go down to one company so prices can rise, and hopefully a new competitor emerges? Quite frankly, AMD has had decades to try and establish themselves. If they go belly-up, then how is someone new going to fix that? Once every major maker is only carrying one chip, it will be near impossible for the market to become competitive again.

      Someone who defends an evil company and roots for a monopoly sure seems like a troll to me.

      --
      http://blindscribblings.com - Tasty pop-culture in conceptual fashion.
  33. Re:Hardware Accelleration == Bad Trend by Enderandrew · · Score: 1

    Java is a great concept with piss-poor execution.

    Oddly enough, the same code can often be compiled cross-architecture and cross-platform quite easily with GCC, which produces a nice, fast executable native to each platform and architecture that uses a fraction of the start-up time and resources of Java.

    I'm a crappy programmer, and even that is transparent to me.

    --
    http://blindscribblings.com - Tasty pop-culture in conceptual fashion.
  34. Jombeewoof, get off the Internet. by Anonymous Coward · · Score: 0, Informative

    Jombeewoof is a bastard who thinks the world owes him a living. http://slashdot.org/comments.pl?sid=267807&cid=20207637 Jombeewoof tried to destroy an Internet Service Provider in Massachusetts by expecting large bandwidth without paying anything. Education alone doesn't pay the bills. Jombeewoof is not worth your mod points and is a MySpace loser. Jombeewoof, give up, get off the Internet. The TrollGoons won't leave you alone.

    1. Re:Jombeewoof, get off the Internet. by jombeewoof · · Score: 2, Funny

      It looks like I have a fan.
      good times. I guess I'll have to start wearing pants now though.

      --
      Linux Zealots: Smarter than Mac Zealots, but still zealots.
  35. Actually, AMD already did that... by Chaset · · Score: 1

    I was reading the Great Microprocessors list and it says AMD already did that back in the K5 days. It had a mode where it can natively execute the RISC-like instructions. Nobody used it, so I don't know whether current gen AMD chips support it.

    --
    -- "This world is a comedy to those who think, a tragedy to those who feel."
  36. reminds me of the PS2's PA by acidrain · · Score: 1

    Looks like there isn't a whole lot there that you couldn't get using existing performance counters and a tool like oprofile....

    Sony had a $10k PS2 called the PA that recorded exactly what happened to every cycle on the cpu, gpu etc. without changing the way the game ran. It was the most incredible thing, like you had been sitting in the dark for years and then suddenly someone turned on the lights.

    Is it cache misses, dma contention, background threads, branch stalls or actual work? Optimizing on the PC just feels like groping around in the dark again.

    --
    thegirlorthecar.com - a dating game for guys

    --
    -- http://thegirlorthecar.com funny dating game for guys
    1. Re:reminds me of the PS2's PA by nuzak · · Score: 1

      > Sony had a $10k PS2 called the PA that recorded exactly what happened to every cycle on the cpu, gpu etc.

      Ten grand is pretty small change for a game developer. Was it ever commonly used as a dev kit, or was it a tech demo?

      --
      Done with slashdot, done with nerds, getting a life.
  37. ARM CPUS outnumber x86 by a huge factor -probably by thaig · · Score: 1

    Whole families have one or two computers but every member has their own phone. ARM has triumphed numerically. It doesn't try to compete with x86 but a future could exist in which many people have an extremely powerful ARM-based phone and rely on the internet a lot instead of having a PC.

    --
    This is all just my personal opinion.
  38. Re:Hardware Accelleration == Bad Trend by 12357bd · · Score: 1

    There's a matter of degree, to be sure, but even still you're most likely wasting your time "optimizing" individual lines of C code since the compiler can probably do a better job and that's been the case for quite a while.

    That would be terrible. If people give up on optimizing code (and on understanding why it works), the net result will always be a noticeable decrease in programming quality (a very common situation).

    I know that you are aiming at premature optimization, and you are right on that one, but the notion that 'optimize code' == 'wasting time' is a perfect excuse not to learn how and why things work.

    You have to know how to optimize the code to decide when it is detrimental to use that optimization, and then, when you know how to do it, you realize it's a matter of degree, as you rightly said.

    --
    What's in a sig?
  39. In other news, Intel ... by syn1kk · · Score: 1

    releases ANOTHER newer faster processor two weeks later ... effectively kicking AMD in the groin AGAIN.

  40. Re:Hardware Accelleration == Bad Trend by PKI+Champion · · Score: 1

    You're crazy if you think the education system teaches programmers how to write good code. They can't even teach math and English well. Good programmers are mentored by other programmers.

  41. Re:ARM CPUS outnumber x86 by a huge factor -probab by LWATCDR · · Score: 1

    Yes, but then the 8051 probably outnumbers both the x86 and the ARM. MIPS, ARM, Power, and even the 68k still exist in the embedded market. For example, Power is in all three of the new game consoles. ARMs are in a lot of WAPs. I keep wondering if we will see a CPU the size of the latest AMD but containing 16 or more ARM cores. Sort of a T1 competitor.

    --
    See my blog http://ilovecookes.blogspot.com/ for light hearted technical information.
  42. woops by thegnu · · Score: 1

    i don't mean that you'd be an idiot for being a mac person, but that x86 cpu particulars would slip your mind. :D

    --
    Please stop stalking me, bro.
    1. Re:woops by Apple+Acolyte · · Score: 1

      Thank you very much for the clarification. ;-)

      --
      Part of the hardcore faithful who believed in Apple long before it was cool again to do so
  43. Debug Mode by Slaimus · · Score: 1

    Isn't this just exposing/documenting the CPU's internal debug features so that developers can use them?

    If you look at the die shots of recent CPUs, you will see a big chunk of transistors marked DEBUG.

  44. why is amd not evil too? by cinnamon+colbert · · Score: 1

    Like most /.ers, you have these weird categories of evil and non-evil companies.
    ALL large companies are the same - the more successful, the more evil.
    Why is this so?
    While everyone professes to like the free market, businessmen hate the free market and love monopoly - in a free market you have to work harder for less; who in their right mind would actually like that?

    So, the first thing a company does when it becomes big and successful is to use its power to dampen market forces in any way it can.

    Now sometimes, when a company is really, really rich and successful, like Google or the old AT&T, they are so successful that they can hide their evilness behind total monopoly power. But as soon as their market position slips, they become evil.
    Mark my words, you heard it here first: as soon as Google's profit starts to fall, and it is no longer a Wall Street darling, they will be right in there with MS and GM and whoever.

    1. Re:why is amd not evil too? by Anonymous Coward · · Score: 0

      This isn't about evil/not evil. This is about getting away with being evil. AMD can't afford to screw customers right now. Intel could, but they don't do that. Yet. When AMD goes out of business you'll see what I mean.

    2. Re:why is amd not evil too? by Enderandrew · · Score: 1

      Clearly I have weird definitions.

      Intel committed anti-trust violations and is trying to force a monopoly. They broke the law, and they are trying to make the market bad for consumers. Why would anyone in their right mind accuse them of being evil?

      You insist that every company tries to damage the market. Do you have any proof for that claim, or is it a convenient excuse so you feel better about financially supporting an evil company?

      --
      http://blindscribblings.com - Tasty pop-culture in conceptual fashion.
  45. Re:ARM CPUS outnumber x86 by a huge factor -probab by thaig · · Score: 1

    There is a multi-core ARM CPU under development. The idea is that multiple cores are the best way to keep increasing performance without increasing power consumption.

    I don't think that it's anything astoundingly interesting by desktop standards but it will allow embedded devices to keep advancing. As usual, before your phone can handle it properly, there is probably going to be some software that needs a redesign if it's going to show a speed improvement.

    --
    This is all just my personal opinion.
  46. Re:Hardware Accelleration == Bad Trend by cthulhu11 · · Score: 1

    If you ever dealt with the MIPS or Sun compilers, they have a huge number of flags for hyper-optimizations on a variety of implementations of those architectures. Sun's compiler has a huge number of flags for hyper-optimizations on a variety of implementations of x86 too. Near as I can tell, though, their impact on the vast majority of code is minimal. AMD and Intel can throw in all the new instructions they want, but they won't be meaningful for years -- if ever -- because code has to run on existing processors that don't implement those instructions.
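
    One common way around that compatibility problem is to ship both code paths and pick one at run time. A rough sketch of the idea in Python, assuming a made-up feature flag called "new_ext" (not a real CPU flag) and stand-in functions rather than actual accelerated code:

```python
from typing import Callable, Iterable, Set


def sum_baseline(values: Iterable[int]) -> int:
    """Portable path: runs on every processor."""
    return sum(values)


def sum_extended(values: Iterable[int]) -> int:
    """Stand-in for a path that would use the new instructions.

    Here it just computes the same result, since this is only a sketch.
    """
    total = 0
    for v in values:
        total += v
    return total


def select_sum(cpu_flags: Set[str]) -> Callable[[Iterable[int]], int]:
    """Pick an implementation based on the (hypothetical) feature flags."""
    if "new_ext" in cpu_flags:
        return sum_extended
    return sum_baseline


# An older CPU without the extension falls back to the portable path.
old_cpu = {"sse", "sse2"}
new_cpu = {"sse", "sse2", "new_ext"}
print(select_sum(old_cpu).__name__)
print(select_sum(new_cpu).__name__)
```

    The cost is that every optimized routine has to be written and tested twice, which is part of why new instructions take years to matter.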

  47. Re:Hardware Accelleration == Bad Trend by Anonymous Coward · · Score: 0

    Hey, I know this. Java makes storing random things inside a hashtable easier than C. What do I win?

    (what it doesn't make easier as well as the set of languages that are better suited for this particular example is left as an exercise)

  48. Then again, schools are partly to blame by tepples · · Score: 1

    You will have to learn to handle the parallelism. It takes different algorithms and a different way to structure programs.

    Why are these parallel algorithms not taught in university computer science classes from day 1?

    Those languages have been around for ever, functional programming languages can be parallelized automatically. So if they make it so much easier, why aren't they not used?

    Educational inertia probably makes up a large part of it.
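
    The reason functional code is said to parallelize so easily is that a pure function has no shared state, so the calls can run in any order. A minimal sketch in Python (the helper names are mine, and a thread pool is used only to keep the example simple; in standard CPython you would need a process pool to see a real speedup on CPU-bound work):

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def square(x: int) -> int:
    """Pure function: the result depends only on x, with no side effects."""
    return x * x


def serial_map(func: Callable[[int], int], items: Sequence[int]) -> List[int]:
    """Apply func to each item one after another."""
    return [func(item) for item in items]


def parallel_map(func: Callable[[int], int], items: Sequence[int],
                 workers: int = 4) -> List[int]:
    """Apply func to each item using a pool of worker threads.

    Executor.map returns results in input order, so the output matches
    the serial version as long as func is pure.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(func, items))


data = list(range(10))
print(serial_map(square, data) == parallel_map(square, data))
```

    Swap in a function that mutates a shared counter and the two versions can start to disagree, which is the part that has to be taught.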
    1. Re:Then again, schools are partly to blame by Josef+Meixner · · Score: 1

      Why are these parallel algorithms not taught in university computer science classes from day 1?

      Which algorithms were you taught? I only learnt about some sorting algorithms and tree algorithms; the rest was education on how to decompose a problem to get to an algorithm, the mathematical basis of numerics, statistics, formal reasoning, automata theory and so on. I don't see why in CS you would be explicitly taught algorithms. And I did indeed learn to program parallel machines, as that was my area of interest. Granted, I was at a university which had built experimental parallel machines for some time, so there was a lot of research going on in that area and professors who knew what they were doing.

      Educational inertia probably makes up a large part of it.

      At my university we started with Scheme, a programming language where you can do functional programming, and we were taught in our first year how to do that. Still, most students had a lot of trouble grasping the concepts. Functional programming is inherently hard in my opinion, and I am unsure whether the additional effort to program in such a language is less than the cost of using a procedural language with explicit parallelization.

  49. Re:Hardware Accelleration == Bad Trend by Anonymous Coward · · Score: 0

    [...] in a world where cross platform compatibility is key. Try coding for J2ME and a native mobile platform someday.
  50. You need a bigger team by marcosdumay · · Score: 1

    Do you mind if I call Microsoft into that committee? They are the ones keeping x86 alive.

  51. Re:ARM CPUS outnumber x86 by a huge factor -probab by LWATCDR · · Score: 1

    The ARM core isn't slow by any stretch. I would bet that a good dual- or quad-core ARM would run all the software the average desktop needs. It would probably work just fine for most business systems. Since the ARM core is so small compared to, say, a Core 2 Duo or Athlon X2, I would bet that you could put 16 or more on a single die and then use HyperTransport for memory IO. You would need to add something like SSE and maybe an FPU, but the end result could be very interesting for servers.

    --
    See my blog http://ilovecookes.blogspot.com/ for light hearted technical information.
  52. Re:Hardware Accelleration == Bad Trend by glitch23 · · Score: 1

    You're crazy if you think the education system teaches programmers how to write good code. They can't even teach math and English well. Good programmers are mentored by other programmers.

    You must be crazy if that's what you got out of my message. I didn't say the education system currently teaches programmers how to write good code. I said nothing stops them from doing so, whether they know how to is a different issue.

    --
    this nation, under God, shall have a new birth of freedom. -- Lincoln, Gettysburg Address
  53. Re:Hardware Accelleration == Bad Trend by kiddygrinder · · Score: 1

    I know Python, C, C++, VBScript and JavaScript. I did try to learn Java once; that didn't last long, though. Relax man, it's a joke.

    --
    This is a joke. I am joking. Joke joke joke.
  54. Re:Hardware Accelleration == Bad Trend by PKI+Champion · · Score: 1

    I guess the education system failed me on reading comprehension....LOL