Slashdot Mirror


SuSE Submits Enhancements for AMD Hammer

ackthpt writes "SuSE has this press release as they are submitting enhancements to the Linux kernal particular to the AMD's x86-64 processor instruction set. Anticipated for 2.6 kernel, some enhancements may appear in 2.4, as development is only beginning on 2.5. AMD's take on the announcement as well." nik notes that SuSE joins NetBSD in having ports to Hammer. Usenix members can see the paper Wasabi's Frank van der Linden wrote about the porting effort.

57 comments

  1. p0st by Anonymous Coward · · Score: -1, Offtopic

    p0st the first!

  2. TROLLDOT IS AVAILABLE by RoboTroll · · Score: -1
    Hey Robo, TROLLDOT.COM is available!

    From the annals of the Troll Library.

  3. Good stuff! by SquierStrat · · Score: 2

    Hammer is definitely gonna be an interesting and very cool set of chips! Glad to see someone is working on enhancing linux for it. Especially since the big bad wolf in Redmond hasn't yet even done a beta of 64-bit XP for the Hammers.

    --
    Derek Greene
    1. Re:Good stuff! by mirabilos · · Score: 1

      No wonder they don't, because Wintel is paying
      off, and their secret contract involves not
      supporting AMD (at least the new way) ;)

      But I might be wrong, at least isn't Intel said
      to include x86-64 compatibility stuff into the
      next Pentium IV releases?

      This would be a really, really cool way to get
      rid of M$ in a large market share, because _if_
      IA-64 doesn't pay off, but x86-64 does (and it
      will, because of its ease to convert from and
      to x86-32), Intel will activate this, both chips
      sell (AMDs more I should guess), but M$ OS run
      only in 32-bit mode ;)

      --
      My Karma isn't excellent, damn it! (And /. still does not get UTF-8 right in 2012. Wow.)
    2. Re:Good stuff! by Anonymous Coward · · Score: 0
      This would be a really, really cool way to get
      rid of M$ in a large market share, because _if_
      IA-64 doesn't pay off, but x86-64 does (and it
      will, because of its ease to convert from and
      to x86-32), Intel will activate this, both chips
      sell (AMDs more I should guess), but M$ OS run
      only in 32-bit mode

      Wow! I bet Intel and MS never thought of this. You must be so much cleverer than they are - right? Dipshit!

    3. Re:Good stuff! by Anonymous Coward · · Score: 0

      *Cough*. There's lots of #IFDEF AMD64:s in the source...

  4. Whow by Anonymous Coward · · Score: -1, Offtopic

    no way... first?
    I guess eye-balling this page 24/7 DOES pay off ;)

  5. great, but what about GCC? by Khopesh · · Score: 5, Interesting

    This is truly a great move in the right direction, but we also need to see something like GCC support and optimization for this new architecture. AMD, please: you are the expert on your chips. As Intel made its own free compiler, so too can you. Ideally, release your compiler under the MIT License, LGPL, GPL, or something similar; releasing an optimization for GCC would blow my mind.

    --
    Use my userscript to add story images to Slashdot. There's no going back.
    1. Re:great, but what about GCC? by Ace+Rimmer · · Score: 5, Interesting

      There are some people at SuSE working on gcc Hammer optimizations; this is part of the contract between AMD and SuSE.

      --

      :wq

    2. Re:great, but what about GCC? by VAXman · · Score: 2

      On the other hand, that is a pretty big nail in Hammer's coffin. Intel's compiler is 2x the speed of GCC, and since it doesn't support x86-64, there goes any performance advantage (and then some) which Hammer was supposed to have.

    3. Re:great, but what about GCC? by DrSkwid · · Score: 3, Informative

      Not necessarily. Recall from this Slashdot story about this article that the Intel compiler also showed similar results over GCC when targeting the Athlon.

      GCC's mission statement is not about the running time of executable code. We've recently been having a thread about it on the plan9 mailing list (or comp.os.plan9), although ours started as a flame from Thomas Bushnell that plan9's 8c was nothing more than a "cute toy". 8c is more concerned with compilation speed than execution time, and on compilation speed it beats GCC hands down; if you want raw execution speed, look elsewhere.

      It could well be that Intel's compiler will show similar performance gains over GCC on the Hammer.

      I wonder if every problem will start to look like a nail when the hammer claws its way out of the AMD tool box.

      --
      There are places where the networks are not touching,and there are places where they are-Boeing's Lori Gunter
    4. Re:great, but what about GCC? by VAXman · · Score: 3, Informative

      Oh, I have no doubt that Intel's compiler will produce great 32 bit code on Hammer. Hammer is just a proliferation of K7 with 64 bit extensions, and AMD knows how to optimize their hardware for that compiler (they use it when submitting their SPEC scores).

      But Intel's C compiler won't generate 64 bit code, which means that AMD has to rely on GCC for 64 bit applications. So any performance advantage of 64 bit is more than nullified because there's not a decent compiler for it.

  6. HEMOS, LEARN TO FUCKING SPELL!!! by Anonymous Coward · · Score: -1, Redundant

    DAMN IT, LEARN TO SPELL!

  7. It's being done!! :) by Daath · · Score: 3, Interesting

    FreeBSD is working on an x86-64 GCC! Actually AMD itself has sponsored this! Take a look at the link!

    --
    Any technology distinguishable from magic, is insufficiently advanced.
    1. Re:It's being done!! :) by prisonernumber7 · · Score: 1

      Mod the parent up. That link is really interesting and the OSS community as a whole would benefit from that.

      --
      && aemula C. ab stirpe interiit
    2. Re:It's being done!! :) by vidarh · · Score: 4, Insightful

      I believe you've misread. FreeBSD people are working on adapting an x86-64 GCC port that was done by SuSE. AMD does state on the x86-64 website that they are supporting porting work for Linux, FreeBSD and NetBSD, however.

  8. Linux Kernal? by tuxzone · · Score: 2, Funny

    To straighten things out:
    Commodore machines have a kernal (Keyboard Entry Read, Network, And Link), linux has a kernel.

    To make life more complicated: if you want to run a Unix like OS on a machine with a kernal (like the c64) it is not going to be linux but lunix (http://lng.sourceforge.net/).

    1. Re:Linux Kernal? by Anonymous Coward · · Score: 0

      To straighten things out:
      C= 64 is NOT 64-bits architecture

    2. Re:Linux Kernal? by Skuld-Chan · · Score: 2

      I remember reading that in the trivia section of Commodore Hacking magazine - I always thought it was a classic case of Commodore naming something, then wrapping an acronym around it.

  9. already being done... by scorcherer · · Score: 5, Informative

    Take a look at GCC main page and you'll see a note on the x86-64 port contributed by SuSE.

    --
    The Cap is nigh. Time to get a fresh new account.

  10. PEACE NOW by cmdr_shithead · · Score: -1

    stop the war

  11. But where/when can we get a Hammer? by Quixote · · Score: 2

    Are Hammers available right now? If so, where can I get one? Strictly for research purposes, of course...... ;)

    1. Re:But where/when can we get a Hammer? by Phosphor3k · · Score: 3, Informative

      They are tentatively scheduled to be released at the end of 2002. I would wager that they won't be available in force to the common man until sometime in the 1st quarter of 2003.

    2. Re:But where/when can we get a Hammer? by Anonymous Coward · · Score: 3, Funny
      Are Hammers available right now? If so, where can I get one?

      Yes! Your nearest hardware store should have a good selection!

  12. i sniff a server market takeover .. by Anonymous Coward · · Score: 1, Interesting

    because intel put their itanium 64bit egg in the windows xp64 basket.

    1. Re:i sniff a server market takeover .. by lessthan0 · · Score: 1

      I agree, and possibly a bigger slice of the desktop market as well.

      The Itanic smells a lot like IBM's ill-fated move to the Microchannel bus. On the other hand, if Itanic delivers on the promise of vastly superior performance (doubtful) AND if they make it easy to port IA-32 programs, then it will have a chance.

      It seems likely that Intel will backtrack and create a hybrid 32-64 processor like AMD.

    2. Re:i sniff a server market takeover .. by Anonymous Coward · · Score: 0

      that's not all you sniff, ass monkey.

  13. Newbie 64-bit question by scorcherer · · Score: 2
    Would it be possible to process two 32-bit operations at once in a 64-bit system? I imagine this is possible considering the information content.

    For a decimal example, multiply 123,456 by 2 to get 246,912. Imagine your old number system was limited to max. 999. With the new system (max. 999,999) you've effectively multiplied 123*2 = 246 and 456 * 2 = 912 by a single instruction. Of course you'll have to separate the resulting numbers at the end, but you might get improvements if you do multiple instructions in succession.
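
The decimal example above carries over to binary fairly directly. A rough sketch in C, assuming two 32-bit values packed into one 64-bit word; the helper names (pack2, mul2, hi_lane, lo_lane) are made up for illustration. It only gives the right answer while the low value times k still fits in 32 bits, just as 456 * 2 has to stay below 1,000 in the decimal version.

```c
#include <stdint.h>

/* Pack two 32-bit values into one 64-bit word, hi in the upper half. */
static uint64_t pack2(uint32_t hi, uint32_t lo) {
    return ((uint64_t)hi << 32) | lo;
}

/* Multiply both halves by k with a single 64-bit multiply.
 * Only valid while lo * k < 2^32; otherwise the low half spills
 * into the high half and corrupts it. */
static uint64_t mul2(uint64_t packed, uint32_t k) {
    return packed * k;
}

/* Separate the two results again. */
static uint32_t hi_lane(uint64_t p) { return (uint32_t)(p >> 32); }
static uint32_t lo_lane(uint64_t p) { return (uint32_t)p; }
```

With pack2(123, 456) and k = 2 the halves come back as 246 and 912. With a low half of 0x80000000 and k = 2 the low half overflows and the high half ends up one too large.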

    --
    The Cap is nigh. Time to get a fresh new account.

    1. Re:Newbie 64-bit question by Flavio · · Score: 3, Informative

      This is called a SIMD (single instruction, multiple data) operation. It's what MMX is all about.

      It's usually not worth doing this if there's no SIMD hardware support, because the time wasted loading your values and then separating them isn't compensated by the gain in speed. Of course there are special cases (like when dealing with bit strings) where this is used by definition (and will be an improvement).

    2. Re:Newbie 64-bit question by VAXman · · Score: 3, Informative

      No, you can't, unless you can guarantee that the result from the lower half of the operand will not affect any bits in the upper half. For multiplication this will happen all the time but for addition it will happen whenever the lower operand carries over.

      Besides, 64 bit operations are higher latency than 32 bit operations, and the cost of all of the shifting and masking to separate the results would be very high. It would be much faster to just do two separate 32 bit operations.

      SIMD is a different story since the hardware assembles and reassembles the operands, and executes them on separate execution units.
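
The carry problem for addition can be shown in a few lines of C. This is only a sketch, assuming two 32-bit lanes in a 64-bit word; the names add2_naive, add2_masked and LANE_TOP are invented here. The masked variant is a commonly used workaround rather than anything proposed in this thread, and its extra masking is exactly the kind of overhead described above.

```c
#include <stdint.h>

#define LANE_TOP 0x8000000080000000ULL  /* top bit of each 32-bit lane */

/* Naive packed add: a carry out of the low lane lands in the high lane. */
static uint64_t add2_naive(uint64_t a, uint64_t b) {
    return a + b;
}

/* Lane-safe add: add the low 31 bits of each lane (which cannot carry
 * across the lane boundary), then patch in the top bit of each lane
 * with XOR so that each lane wraps modulo 2^32 on its own. */
static uint64_t add2_masked(uint64_t a, uint64_t b) {
    uint64_t low = (a & ~LANE_TOP) + (b & ~LANE_TOP);
    return low ^ ((a ^ b) & LANE_TOP);
}
```

Adding 1 to a low lane of 0xFFFFFFFF bumps the high lane with the naive version, while the masked version leaves the high lane alone at the cost of three extra logic operations.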

    3. Re:Newbie 64-bit question by athlon02 · · Score: 1

      Granted the GPRs are being extended from 32bits to 64bits, but more than that it's x86-64 because it has 64 bit *ADDRESSING*.

    4. Re:Newbie 64-bit question by pslam · · Score: 1
      Besides, 64 bit operations are higher latency than 32 bit operations, and the cost of all of the shifting and masking to separate the results would be very high. It would be much faster to just do two separate 32 bit operations.

      Is this actually true of the x86-64 instruction set? It would strike me as a very poor design if simple operations (add/sub/bitwise) took more than a single cycle, otherwise having 64 bit words would be rather pointless as you could do 64 bit operations just as fast in 32 bit. The only advantage would be larger register space.

      I can't actually find any documentation of instruction timings on AMD's site or x86-64.org. I would guess that most instructions take the same time in 64 bit as 32 bit. The exceptions would be things like multiply/divide etc.

  14. kernal by Anonymous Coward · · Score: -1, Flamebait

    for fucks sakes, learn to fucking spell would ya. and you want people to pay for spelling mistakes a fucking kid would not even make. jesus

  15. translation - please mod up by Anonymous Coward · · Score: -1, Troll

    5U53 $U8Mi+$ 3NHAnceM3Nts PH0R @md h4mm3r
    pO5+eD 8Y hEM0$ 0N $4TUrd@Y maRCh 02, @09:28@m
    FroM the GE+T1N9-+H3-f34TUR3S-IN dEPT.
    4Ck+hP+ WR1T3$ "5u$3 h4$ TH15 pR35$ r3LE@53 4S TH3y Ar3 5UbMiTt1n9 EnH@Nc3M3nt$ +o tEH l1nUx k3Rn4L P4r+IcUl4r to tH3 @mD'5 x86-64 pR0ce550R 1n5+rUc+I0N Set. 4NTICIP4t3D PH0R 2.6 k3rn3l, 5oMe eNH4nCEMeN+5 m@Y 4pPE4R 1n 2.4, @5 D3V3LOpm3n+ 15 oNlY 8391NnIn9 ON 2.5. amD'5 +4K3 ON +3H 4NN0unCeMenT @$ welL.". N1K N0T35 thAT $u5e J01N ne+8sd 1N h4V1N9 p0R+5 +o H4MMEr. u5EniX M3MBer5 C4N $33 Teh P4P3R w4$@81'5 Phr4nk v4N D3R l1nD3N WRotE @BOU+ Teh por+In9 3pHPH0RT.

  16. Facts about *BSD by Anonymous Coward · · Score: -1, Flamebait
    Fact: *BSD is dying

    Yet another crippling bombshell hit the beleaguered *BSD community when recently IDC confirmed that *BSD accounts for less than a fraction of 1 percent of all servers. Coming on the heels of the latest Netcraft survey which plainly states that *BSD has lost more market share, this news serves to reinforce what we've known all along. *BSD is collapsing in complete disarray, as further exemplified by failing dead last in the recent Sys Admin comprehensive networking test.

    You don't need to be a Kreskin to predict *BSD's future. The handwriting is on the wall: *BSD faces a bleak future. In fact there won't be any future at all for *BSD because *BSD is dying. Things are looking very bad for *BSD. As many of us are already aware, *BSD continues to lose market share. Red ink flows like a river of blood. FreeBSD is the most endangered of them all, having lost 93% of its core developers.

    Let's keep to the facts and look at the numbers.

    OpenBSD leader Theo states that there are 7000 users of OpenBSD. How many users of NetBSD are there? Let's see. The number of OpenBSD versus NetBSD posts on Usenet is roughly in ratio of 5 to 1. Therefore there are about 7000/5 = 1400 NetBSD users. BSD/OS posts on Usenet are about half of the volume of NetBSD posts. Therefore there are about 700 users of BSD/OS. A recent article put FreeBSD at about 80 percent of the *BSD market. Therefore there are (7000+1400+700)*4 = 36400 FreeBSD users. This is consistent with the number of FreeBSD Usenet posts.

    Due to the troubles of Walnut Creek, abysmal sales and so on, FreeBSD went out of business and was taken over by BSDI who sell another troubled OS. Now BSDI is also dead, its corpse turned over to yet another charnel house.

    Recently, Slashdot confirmed that WindRiver bucked FreeBSD out on its ass for a carton of Winstons and a six-pack of Pabst Blue Ribbon. This only serves to confirm the fact that FreeBSD is unwanted, doomed to be passed around like a cross-eyed harelip orphan from one foster parent to another.

    All major surveys show that *BSD has steadily declined in market share. *BSD is very sick and its long term survival prospects are very dim. If *BSD is to survive at all it will be among OS hobbyist dabblers. *BSD continues to decay. Nothing short of a miracle could save it at this point in time. For all practical purposes, *BSD is dead.

    Fact: *BSD is dead

  17. Ph@c+: *8sd 1$ DY1nG by Anonymous Coward · · Score: -1, Offtopic

    Ph@c+: *8sd 1$ DY1nG
    y3+ @nO+H3R CR1pPliNG b0mB$HELl Hi+ Th3 8ElE49UEreD *8$d C0MMuNiTY wh3N reCEN+lY IDC c0NF1rM3D That *85d 4ccOUNT5 pH0R L355 +H@N A PHr@C+10N Of 1 pERC3n+ OPH aLL 53RV3rS. cOMINg on th3 h3ElS Of t3H L4+e5+ NETCr@Ft 5URV3y wh1cH pL4InlY $Tat3$ Th@+ *85D H4$ lO5+ M0R3 M4Rk3+ $hAr3, +hI5 n3W$ $erV3$ +0 r3INFORCE WH4t w3'VE Kn0wN 4ll AlON9. *85d 1s C0ll4P$INg IN C0mPlE+3 D1$4rR4y, @5 Fur+H3R ExEmPL1F1ED bY F41l1ng D34d lA5+ in +h3 R3c3N+ 5y5 4DM1N cOMpreh3Ns1V3 netW0rk1Ng tE5t.

    yoU D0n't N33D +O b3 @ kR3$KIn t0 PR3dICt *8SD'S FU+UrE. Th3 H4nd Wr1t1Ng I$ on +3h W4Ll: *85d F4CE$ 4 BLE4K fu+uRE. 1N pH4CT TheRe w0n't 8E anY Fu+UrE @+ 4LL fOr *B5d BEC@u$E *8$d 1$ DYIN9. +h1n9$ 4re l00K1n9 veRy 84d PH0R *85D. 45 M4ny 0pH U5 4R3 4LR34Dy 4W@r3, *b$D cONtiNUe5 TO L0$3 M4rket SH4R3. rED iNk flow5 L1k3 @ R1Ver oF 8LooD. fr3E85D i$ the MO$+ eNd4N93r3d oPh +h3M all, haviN9 L05t 93% OpH i+S C0RE d3v3loper5.

    L3+'5 ke3P T0 +Eh F4C+5 @Nd L00k @+ THE NUMb3RS.

    0p3N8$D L34dER +hEo 5+@Te5 +H4+ +hER3 @R3 7000 u53r5 0F OP3N85D. H0w M4Ny U5er5 0f n3t85d 4R3 Th3Re? LE+'5 5E3. +hE nUm83r 0PH 0pENb5D v3r5u$ nE+85d P05+5 On US3NE+ i5 RouGHly 1N R4+i0 OF 5 TO 1. ThEr3f0RE +h3R3 @re @80u+ 7000/5 = 1400 NE+85d U53r$. 8$d/o5 PoS+5 on u53net ArE @8ou+ h@Lph OPH tH3 v0lUMe 0F N3+B5D P05+5. th3rEPh0r3 +hER3 aRE @8ou+ 700 uSeR$ oF 85D/o5. 4 r3C3N+ @RtIcl3 pU+ FrE3B$d 4T @BouT 80 PeRcEn+ 0F T3H *8sd M@rkEt. TH3R3f0re +H3RE 4Re (7000+1400+700)*4 = 36400 pHR3385d u5er5. +H15 15 C0n51$+ent wi+H +H3 NumB3r 0ph phRee85D u5eN3+ Po5t5.

    due +0 +he +rOubles OPh W@lnu+ CrEeK, 4by5m@l 54LES @nD 5o On, Phr338$D wENt oUt opH BU51n35$ 4Nd W45 T4K3n 0VeR 8Y b$D1 wh0 5eLL @n0+h3r TR0UbLeD 05. NOW B5d1 15 4l50 d34D, 1+5 C0RP$3 tuRN3D 0Ver +O Y3t aNO+hER CH4RN3L hOU$3.

    R3c3N+LY, 5L4$hDOT [9o4+$e.cX] cONfirm3d Th4+ W1NDRIvER 8uCK3D pHR33B$D 0u+ 0N iT$ @55 PH0r a C@rTON 0F WiN5+0ns 4nD 4 51x-p4cK OPH P4b$t bLU3 R1bB0N. +H15 ONly $ErV3$ +0 CONfIrM the F@CT th4+ fR3e85D 1$ UnW4Nt3D, dOomEd To 8E p4$53D 4r0UnD lik3 4 Cr0S5-3YEd H4r3LIp 0RPh@N FRoM one phO5T3R p4r3nt to @N0tHER.

    @ll m4J0R 5URVEYs $HOW +HAT *85D h4$ 5tE4diLy D3Cl1N3D 1n M4rkeT 5H4R3. *B5D 15 V3RY $1CK 4nd Its Lon9 +ERm 5urviV4l PROSP3cT5 4R3 V3Ry diM. 1ph *B5d 1$ +o 5uRVive @t 4Ll IT w1lL 83 4MOnG O$ hob8y15+ d48BlEr5. *b$D C0ntinU3$ +O dec4y. n0Th1Ng $h0r+ OF 4 M1rAcL3 coUlD 54vE iT 4+ th1$ p01NT 1n +Im3. PH0r 4Ll pr4c+Ic@l purPo$3$, *85D 1S DE4d.

    F4C+: *B5D i$ De@D

  18. SASSY! by Anonymous Coward · · Score: -1, Offtopic

    A Widow, after the death of her husband, incontinent, and without any Difficulty, shall have her marriage and her inheritance, and shall give nothing for her dower, her marriage, or her inheritance, which her husband and she held the day of the death of her husband, and she shall tarry in the chief house of her husband by forty days after the death of her husband, within which days her dower shall be assigned her (if it were not assigned her before) or that the house be a castle; and if she depart from the castle, then a competent house shall be forthwith provided for her, in the which she may honestly dwell, until her dower be to her assigned, as it is aforesaid; and she shall have in the meantime her reasonable estovers of the common; and for her dower shall be assigned unto her the third part of all the lands of her husband, which were his during coverture, except she were endowed of less at the Church-door. No widow shall be distrained to marry herself: nevertheless she shall find surety, that she shall not marry without our licence and assent (if she hold of us) nor without the assent of the Lord, if she hold of another.

  19. 5a$$y! by Anonymous Coward · · Score: -1, Offtopic

    a WID0w, 4ph+3R thE De@+h OF H3r HU$84nd, iNCoN+1N3nt, aNd wI+H0UT 4nY D1FpHICul+Y, 5H@Ll h@v3 HeR M4Rr149e 4Nd HER 1NH3r1T4Nce, 4Nd 5H4LL G1vE N0+H1N9 FOr H3r dow3R, h3r M@rrI@93, oR her 1nHEr1t@Nc3, WHich h3R hu$B4nD @nD 5h3 h3ld tHe DAY 0pH +eh d3@+H 0f h3R hU5B@Nd, 4nd 5H3 SHaLL +4rRy IN +he ch13PH hous3 Oph her HU$8@nD bY fOr+y d4y5 apH+eR Teh D34+h 0f her hu584ND, wi+Hin wH1ch d@y5 HER D0weR 5H@ll BE 4551gn3d h3R (iF i+ w3r3 N0+ @551GneD H3R BEPH0r3) or +H@+ Teh h0u$3 83 @ c4$+le; @nD 1pH 5h3 D3P4rt frOM +h3 c@$+l3, Th3n @ cOmPET3N+ hoU53 $h4ll 8E For+hW1+h PrOV1dED PHOR Her, IN +h3 wh1Ch $H3 m4Y hOne5+LY dWeLl, unt1l h3r dOWer 8e to hER A5$1GNeD, 45 It I5 @forE$41D; 4Nd $h3 5h4Ll h@Ve 1n T3H me4N+IMe H3R r345On4Ble 3$+Ov3RS Of +3H c0MM0N; 4nD FOr HER D0 wEr $H@LL Be 4S5i9N3D Un+o H3r t3h +H1RD p@r+ 0f 4Ll teh l@ND$ of Her HUSb4nd, Wh1cH W3RE HiS duRiN9 COv3RtuR3, ExC3Pt $hE WeR3 3ND0W3d OPH Le55 4t tHE cHuRCh-DooR. nO W1DoW 5h4Ll BE d1$tR41N3D +0 M4RRy H3r5elf: NEver+H3L3$5 5he 5h@lL phinD SUR3+y, th4t $He 5H@ll not M4rRY W1THOuT Our Lic3nce 4Nd 4$53Nt (1ph $H3 h0Ld OPH U5) n0R wITH0U+ +hE 4$53nT of thE lord, 1f 5he HolD OF 4N0+her

  20. misleading headline by nomadic · · Score: 1

    No, SuSE is submitting enhancements for Linux for the AMD Hammer. Made me think they were actually making suggestions to the chip design for a second.

    1. Re:misleading headline by bears · · Score: 4, Informative

      According to one of the developers from SuSE who worked on this (and demoed SuSE running under one of the x86-64 simulators at a recent OxLUG talk ), SuSE and other porters did indeed make suggestions to AMD as to details of the architecture which were taken up by AMD.

  21. Imagine by Anonymous Coward · · Score: 0

    Imagine Dr Torvalds claiming Suse's patches stink and need to be re-worked from ground-up.

    Imagine.

  22. Interesting read by Matrim9 · · Score: 4, Informative

    http://www6.tomshardware.com/cpu/02q1/020227/

    Interesting - they tested one of the Hammer CPUs on Suse, but they only ran XP in 32-bit... :o

    1. Re:Interesting read by Phosphor3k · · Score: 3, Informative

      Just a clarification. Tom did not test these. This was a demonstration at a trade show that everyone and their brother has been reporting on. No one was allowed to test any software on these machines. However, this is the FIRST batch of Hammers tested in public. The series on the CPU was A0. Generally, AMD and Intel do not test or show off such early production CPUs to the public as they are still going through testing/debugging.

  23. intel compiler not free by morcheeba · · Score: 2

    Just a nit-pick, but Intel compilers actually cost: $500 for linux C/C++ compiler ($125 academic)

    Intel does provide a number of free open source products, including an Itanium assembler, library routines, vision routines, and a network performance analyzer.

  24. Freedom to Innovate by Paul+the+Bold · · Score: 5, Interesting

    You will recall that when AMD demoed hammer recently, they showed a 32-bit Windows system and a 64-bit Linux system. People were commenting on AMD preferring Linux over Windows, therefore showing a more powerful Linux demo than a Windows demo.

    The truth is that there is not a 64-bit version of Windows for the Hammer. AMD was able to modify the existing Linux code to create their own 64-bit version of Linux. This is the best example of the freedom granted by the GPL that I have seen in months. AMD is releasing a new product at the end of the year, and they are able to create a demand for it NOW by having software for it NOW.

    Do you remember the lag between the introduction of Intel's Itanium and a Windows version for Itanium? It was not well coordinated. AMD has done the opposite, they created a demand and a use several months before the release, and it's working. We are all drooling over a 64-bit architecture, and we will have 6-8 months to think about (and save up for) the purchase of a Hammer.

    This is the freedom to innovate that is granted by the GPL and denied by the MS EULA. GPLed software is going to make AMD some money.

    I feel all warm and fuzzy inside.

    1. Re:Freedom to Innovate by Flous · · Score: 1

      Yeah, you're probably right on that one. Although I wouldn't bet on a 64-bit linux version creating _that_ much demand for a processor. Sure, some geeks (including me) will want one ASAP, but remember, we're just talking about the kernel now. I'll feel warm and fuzzy inside when people who produce user-land software will compile AND optimise for the Hammer. Little OT: I _reeeeaaally_ hope AMD will keep using the hammer names (although I realise it's not likely)

  25. Open Source software vital to hammer success by AZPhysics · · Score: 3, Interesting

    While Hammer will fly at 32 bit code, the 64 bit code will really differentiate the processor. Two-way clawhammer Beowulfs should be a huge business. But the differentiation will not really show on Windows until (unless) they develop an x86-64 Windows. I wouldn't count on them doing that until Intel comes out with their version of x86-64 (note that I didn't say if). There will be great pressure to recompile and reoptimize Open software to take advantage of the Hammer.

    I think this is a wonderful advancement. I run Suse on an athlon now, and will run suse on a dual hammer in probably a year and a half (I can't afford to be bleeding edge). I can't find many optimizations for the Athlon in compilers and such. However, with the Hammer, the optimizations will be out there. Not only will the compilers have flags, but entire distributions will likely be built with re-compiled applications. That would be something I would pay more for.

  26. compilers by Khopesh · · Score: 2

    I agree with you. If AMD were to release its own set of x86-64 optimizations for GCC, very few of them would find themselves in the GNU release of GCC. HOWEVER, I am suggesting an AMD-GCC distribution/supplement; this would be released as a set of diffs on the current GNU GCC and neither "waste" GNU developers' time nor "bloat" the standard GCC distribution.

    --
    Use my userscript to add story images to Slashdot. There's no going back.
  27. Excuse me? by Inoshiro · · Score: 2

    When was the Itanium released? Where is Windows for the Itanium?

    --
    Internet Explorer (n): Another bug -- that is, a feature that can't be turned off -- in Windows.
    1. Re:Excuse me? by Paul+the+Bold · · Score: 2
      I don't remember exactly when it was released, but a search on pricewatch for "Itanium" brings up a lot of vendors. It's out there.

      As far as Windows for the Itanium, who cares? No, seriously, I don't know. I don't follow new Microsoft product releases unless something funny or terrible happens (Bill's demo crashes, or spyware). I am under the impression that there is a 64-bit Itanium Windows out there, but maybe I am wrong. That would be one hell of a lag.

  28. Isn't that only for floating-point code? by Goonie · · Score: 2
    I've seen that quoted before, but from what I've read the performance differences on integer code aren't anywhere near that great. If you're running Apache, for instance, the relative floating-point performance is neither here nor there.

    Anyway, who's to say AMD don't have a demon proprietary compiler for x86-64 up their sleeve for just this purpose?

    --

    Any sufficiently advanced technology is indistinguishable from a rigged demo
    --Andy Finkel (J. Klass?)
    1. Re:Isn't that only for floating-point code? by VAXman · · Score: 2

      Anyway, who's to say AMD don't have a demon proprietary compiler for x86-64 up their sleeve for just this purpose?

      Besides the fact that they've never developed a compiler before?

  29. Doing SIMD without SIMD hardware is possible by pslam · · Score: 2, Interesting
    Ignore the other replies - it is possible to do this, and it definitely is a speed increase. See the example code below. You just have to be careful about the packing arrangement of data in each word, and the overlap when performing operations on them.

    Multiplication is a bad example, but it is possible to multiply several numbers at the same time by one or more coefficients. This usually isn't worth it unless the numbers are very small compared to the word size - e.g. 4 bits vs 32 bits.

    However - there are a lot of operations which can be dramatically improved by packing data without any extra SIMD hardware. For example, you can perform some tricks with bit shifting to do pixel masking 32 bits (or 64!) at a time. You can do addition/subtraction trivially with the only thing to watch out for being the carry.

    Whether it's worth it is a case-by-case decision. Sometimes the packing/unpacking/carry correction takes longer than the performance gain.

    And here's an example where there's definitely a performance increase! I've used the code below to do motion blur in the past. It's slower than using MMX, but not by much. I wrote it so long ago I don't have any comparative figures though.

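    /* Assumes unsigned is 32 bits wide, i.e. four 8-bit values per word. */
    unsigned *bufin = (unsigned *) buffer;
    unsigned *bufout = (unsigned *) motionbuf;
    unsigned mask1 = 0xfcfcfcfc;  /* clears the low 2 bits of each byte before >> 2 */
    unsigned mask2 = 0xfefefefe;  /* clears the low bit of each byte before >> 1 */
    for(unsigned n = (width * height) >> 2; n; n--) {
        unsigned in = *bufin++;
        unsigned out = *bufout;
        in &= mask1;
        in >>= 2;                    /* in / 4 in every byte */
        out &= mask2;
        out >>= 1;                   /* out / 2 in every byte */
        out += (out & mask2) >> 1;   /* plus out / 4 in every byte */
        *bufout++ = in + out;        /* at most 63 + 127 + 63 per byte, so no overflow */
    }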

    The idea here is that the framebuffer persists the image. The input and output buffers are 8 bits per primary. Now, you could do this a single byte at a time, but that would suck for speed. Instead, 4 bytes are computed at once. The formula for each output byte is based on:

    out = (out * 3 + in) / 4

    This is actually performed here slightly less accurately:

    out = out / 2 + out / 4 + in / 4

    I remove some of the visible artifacts in practise by a post-processing stage where 1 bit of noise is added.

    The bit masks are applied to prevent the shifts "leaking" into the next byte in the word. Now, on the topic of 64bit - the above can be performed on 64bit words with no performance loss. This means it goes twice as fast. Although you'd be silly to do this on an architecture with SIMD instructions designed to do exactly this job.
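
For what it's worth, the 64-bit variant described above might look roughly like the sketch below. It assumes the buffers are 8-byte aligned and a whole number of 64-bit words long; the function names (blend8, motion_blur64, blend1) are invented for illustration and are not from the original code. blend1 is a plain one-byte reference of the same formula, handy for checking the packed version.

```c
#include <stddef.h>
#include <stdint.h>

/* Blend eight 8-bit channels at once: out/2 + out/4 + in/4 in every byte. */
static uint64_t blend8(uint64_t in, uint64_t out) {
    const uint64_t mask1 = 0xfcfcfcfcfcfcfcfcULL; /* guards the >> 2 */
    const uint64_t mask2 = 0xfefefefefefefefeULL; /* guards the >> 1 */
    in = (in & mask1) >> 2;        /* in / 4 per byte */
    out = (out & mask2) >> 1;      /* out / 2 per byte */
    out += (out & mask2) >> 1;     /* + out / 4 per byte */
    return in + out;               /* at most 253 per byte, no carry between bytes */
}

/* Apply the blend to a whole buffer, one 64-bit word at a time. */
static void motion_blur64(const uint64_t *bufin, uint64_t *bufout, size_t nwords) {
    for (size_t i = 0; i < nwords; i++)
        bufout[i] = blend8(bufin[i], bufout[i]);
}

/* One-byte reference of the same formula. */
static uint8_t blend1(uint8_t in, uint8_t out) {
    uint8_t half = out >> 1;
    return (uint8_t)((in >> 2) + half + (half >> 1));
}
```

The loop body has the same number of operations as the 32-bit version but covers eight bytes per iteration instead of four.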

    On architectures without SIMD, tricks like this can give you several times speed increase. If anyone's interested in any other tricks I can pull some code onto a web page somewhere.

    1. Re:Doing SIMD without SIMD hardware is possible by morbid · · Score: 0

      This sounds very interesting. I have a little itch-scratcher on-going at http://libsimd.sourceforge.net and I'd be interested in any little tricks you have :-)

      --
      I'm out of my tree just now but please feel free to leave a banana.
    2. Re:Doing SIMD without SIMD hardware is possible by spitzak · · Score: 2

      Another simple example is turning a 1-byte grayscale image into a 4-byte "color" image as needed by some hardware by multiplying each input byte by 0x1010101. I have measured this and it definitely is faster than storing the 4 characters one after another into the output buffer.
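
A minimal sketch of that multiply trick in C, assuming a 32-bit output pixel; the names replicate_mul and gray_to_rgbx are hypothetical. Since 0xFF * 0x01010101 is 0xFFFFFFFF, the product always fits in 32 bits.

```c
#include <stddef.h>
#include <stdint.h>

/* Copy one gray byte into all four bytes of a 32-bit word. */
static uint32_t replicate_mul(uint8_t g) {
    return (uint32_t)g * UINT32_C(0x01010101);
}

/* Expand n grayscale bytes into n 32-bit pixels, one word store each. */
static void gray_to_rgbx(const uint8_t *gray, uint32_t *out, size_t n) {
    for (size_t i = 0; i < n; i++)
        out[i] = replicate_mul(gray[i]);
}
```

Each pixel costs one multiply and one 32-bit store, instead of four separate byte stores.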

    3. Re:Doing SIMD without SIMD hardware is possible by pslam · · Score: 1

      Hmm, forgot about that one. Depending on the hardware, the multiply will end up single cycle because it'll early out after the first 8 bits of coefficient. I know recent ARM processors perform 12 bits per cycle, so you tune coefficients to be less than 4096. I think recent x86 processors are single cycle regardless of coefficient, possibly with some result latency.

      Alternatively you could do:

      unsigned inw; unsigned char *in; ...
      inw = *in++;
      inw |= inw << 8;
      inw |= inw << 16;

      This gets the job done in nearly the same time. It depends on the architecture and context I suppose. On ARM you get or+shift in one cycle so the above looks pretty much like the C code:

      ldrb r0, [r1]
      add r1, r1, #1
      orr r0, r0, r0, lsl #8
      orr r0, r0, r0, lsl #16

      On x86 it takes a ton more instructions, so the multiply ends up better. It's a shame compilers can't spot these kinds of optimisations.

      Of course, you shouldn't be storing to memory one byte at a time if you can pack the bytes into words and store words at a time. Modern x86's will merge writes for you but I'd guess at it not making the instruction scheduling any easier for the processor.
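
The shift-and-OR alternative above, written out as a complete function. This is just a sketch, with the function names (replicate_shift_or, gray_to_rgbx_shift) made up for illustration; which variant is faster depends on the architecture and compiler, as discussed in this thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Copy one gray byte into all four bytes of a word using shifts and ORs. */
static uint32_t replicate_shift_or(uint8_t g) {
    uint32_t w = g;   /* 0x000000gg */
    w |= w << 8;      /* 0x0000gggg */
    w |= w << 16;     /* 0xgggggggg */
    return w;
}

/* Expand n grayscale bytes, storing one full word per pixel. */
static void gray_to_rgbx_shift(const uint8_t *gray, uint32_t *out, size_t n) {
    for (size_t i = 0; i < n; i++)
        out[i] = replicate_shift_or(gray[i]);
}
```

It produces the same words as multiplying by 0x01010101, so the two can be swapped freely when measuring.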

    4. Re:Doing SIMD without SIMD hardware is possible by spitzak · · Score: 2

      I did try doing shifts and ORs into a word instead of the multiply, and the result was slower; this was on both MIPS and Pentium. I'm sure the compilers could have done better, but it is possible that the only better thing would have been to recognize the equivalence to the multiply and do that.