Slashdot Mirror


Red Hat Engineer Improves Math Performance of Glibc

jones_supa writes: Siddhesh Poyarekar from Red Hat has taken a professional look into mathematical functions found in Glibc (the GNU C library). He has been able to provide an 8-times performance improvement to slowest path of pow() function. Other transcendentals got similar improvements since the fixes were mostly in the generic multiple precision code. These improvements already went into glibc-2.18 upstream. Siddhesh believes that a lot of the low hanging fruit has now been picked, but that this is definitely not the end of the road for improvements in the multiple precision performance. There are other more complicated improvements, like the limitation of worst case precision for exp() and log() functions, based on the results of the paper Worst Cases for Correct Rounding of the Elementary Functions in Double Precision (PDF). One needs to prove that those results apply to the Glibc multiple precision bits.

5 of 226 comments (clear)

  1. Re:C versus Assembly Language by RingDev · · Score: 4, Insightful

    99.9% of the time, no.

    The purpose of the compiler is to identify and optimize the code structures in higher level languages. There are many, many tools, and generations of compilers that have been dedicated to just that. For the vast majority of cases, the compiler will do a better job and leave you with the much easier task of maintaining a high level language codebase.

    That said, there are specific operations, most frequently mathematical in nature, that are so explicitly well defined and unchanging, that writing them in ASM may actually allow the author to take procedural liberties that the compiler is unknowledgeable of or exist in such a way that the compiler is unaware of.

    The end result of such code is typically virtually unreadable. The syntax masks the math, and the math obfuscates the syntax. But the outcome is a thing of pure beauty.

    -Rick

    --
    "Most people in the U.S. wouldn't know they live in a tyrannical state if it walked up and grabbed their junk." - MyFirs
  2. Re:C versus Assembly Language by ClickOnThis · · Score: 5, Insightful

    Keep in mind: this is [w]hat the compiler tried to do; when you start down this path you are saying "that fancy compiler doesn't know what its doing, I'll do it all myself".

    Trying to outsmart a compiler defeats much of the purpose of using one.
    -- Kernighan and Plauger, The Elements of Programming Style

    --
    If it weren't for deadlines, nothing would be late.
  3. Re:C versus Assembly Language by TheGavster · · Score: 4, Insightful

    I think that saying "This piece of code is going to be called a lot, so I'll implement it in assembler" is inadvisable. The more reasoned approach is "after profiling, my program spends a lot of time in this routine, so I'll go over the assembler the compiler generated to make sure it optimized correctly". The upshot being, it is useful to be able to read and write assembler when optimizing, but it would be rare that you would produce new code in assembly from whole cloth.

    --
    "Because Science" is one step from "Because old book". Try "Because of my experiment testing my falsifiable assertion".
  4. Re:C versus Assembly Language by Pseudonym · · Score: 4, Insightful

    Not just every architecture. In general, you may need to write it for every major revision of every architecture. As CPU pipelines and instruction sets change, the hand-crafted assembler may no longer be optimal.

    (Exercise: Write an optimal memcpy/memmove.)

    --
    sub f{($f)=@_;print"$f(q{$f});";}f(q{sub f{($f)=@_;print"$f(q{$f});";}f});
  5. Re:Don't need to be an expert to beat compilers .. by angel'o'sphere · · Score: 4, Insightful

    I once found an Bresenham line drawing algorithm written in Assembly (68k) on a Mac SE (black and white graphics).

    That code was extremely complicated and the main mistake was that the author loaded each byte, manipulated it and rewrote it to memory each cycle.

    Just to load in many cases the exact same byte again in the next cycle.

    I immediately came to the idea to not load and store the bytes every cycle, and from that the jump to figure if you could also use words and long words depending on how steep the line was, was easy.

    So I first wrote code in C that used chars, ints and longs depending on steepness and changed consecutive bits until it needed to access another "word". Each "word" was only loaded once into a register and was only written after all necessary bits where changed.

    My C code rewrite was so fast (more than 100 times than the original assembly) that I never rewrote that in assembly itself.

    --
    Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.