Goto Leads to Faster Code
pdoubleya writes "There's an article over at the NY Times (registration required) about Kazushige Goto, the author of the Goto Basic Linear Algebra Subroutines (BLAS, see the wiki); his BLAS implementation is used by 4 of the current 11 fastest computers in the world. Goto is known for painstaking effort in hand-optimizing his routines; in one case, "when computer scientists at the University at Buffalo added Goto BLAS to their Pentium-based supercomputer, the calculating power of the system jumped from 1.5 trillion to 2 trillion mathematical operations per second out of a theoretical limit of 3 trillion." To quote Jack Dongarra, from the University of Tennessee, "I tell them that if they want the fastest they should still turn to Mr. Goto."" Ever get the feeling someone wrote an article merely for the pun?
I'd always been told that use of Goto led to a case of the BLAS in my code!
Those who can, do. Those who can't, write technology blogs.
Ever get the feeling someone wrote an article merely for the pun?
Good thing the headline didn't contribute to that at all.
Not Buzzword 2.0 compliant. Please speak English.
Although he also writes fast code, Mr. Bluescreen was criticised for the poor stability of his code.
It was CIS 150, and C++ was the language of the day (Pascal before, Java after). I was taking an exam that was all coding. I remembered extensive use of GOTO from my Commodore days, so I used one in a test (the objective was to code something in as few lines as possible).
;)
I had the shortest working code in the class, but the arsehole teacher failed me for it. He said something like "we don't teach goto for a reason. Yeah, it's in the book, but don't ever use it!"
Jerk. I should post his phone number on Slashdot.
Do not meddle in the affairs of sysadmins, for they are subtle, and quick to anger.
...To see who actually reads the article.
;)
Judging from the replies...not many people
Goto Considered Helpful?
-Loyal
I aim to misbehave.
DEC had an ultra-optimized math library (calculations on arrays, Fourier transforms, etc.), improved over decades by generations of PhDs. There were different versions of the routines for the different generations of CPUs, for the different cache sizes of the same model, maybe even for various speeds of RAM. Needless to say, simply linking against that library instead of the standard one improved the speed of math-intensive code by a good 10 to 20 percent (those numbers are from my fuzzy memory, but that's far from insignificant).
Add to that compilers that were producing top-notch machine language for the target architecture (producing images that ran twice as fast as what gcc gave you at best), CPUs that were spanking the rest of the world as far as floating-point performance was concerned, and you can understand why the scientific community has kept using Alphas for so long.
You might want to read up on this page for some human interaction hints.
Try out fish, the friendly interactive shell.
I believe you are referring to Kazushige's cousin, Mr. Gosub.
Considering the number of scientists who have been looking at this over a number of years, I think it really is a credit to Goto's work. Optimizing at this level is very challenging work on modern processors.
"Who is the Journal of Quantum Physics going to believe?" --Stephen Hawking
A lot of people complain about people never reading the actual articles before they comment, but it seems worse than that. People don't even bother reading the blurbs.
I wonder where the slashdot effect comes from then?
From the article:
"Robert A. van de Geijin, a computer scientist who works with Mr. Goto at the Texas Center,..."
All right, a Japanese programmer named Goto, working with a non-Japanese guy named Geijin. That's too much.
Which is certainly good, but to me says more about the previous implementation than it does about Goto's work.
Yeah, that previous implementation must have totally sucked. I know all my linear algebra software is written around an assembly language core, hand tuned for each new version of a half dozen processors, and designed from the start to minimize TLB misses instead of just naively trying to fit a dataset into L1 or L2 cache. I don't know why those retards at the universities and national labs were ever using anything else!
(closing Slashdot, going back to working on my shamefully unoptimized C++ numerics code...)
Which only goes to show that you haven't considered the implications of optimization in modern processors. A Pentium 4 can operate above 3 GHz. This means that light can travel no more than 10 centimeters in the duration of one clock pulse. With the spacing in the motherboard, this isn't enough for a pulse to go from the CPU to the RAM and come back. Even if the memory could operate at the same rate as the CPU, the computation would still be limited by light speed alone.
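For anyone who wants to check the 10-centimeter figure, here is a minimal sketch in C. The function names are made up for illustration; the only inputs are the speed of light in vacuum and the clock rate.

```c
#include <stdio.h>

/* Speed of light in vacuum, in metres per second (exact by SI definition). */
#define SPEED_OF_LIGHT_M_PER_S 299792458.0

/* Distance light covers during one clock period, in centimetres.
   clock_hz is the clock frequency in hertz. */
double light_cm_per_cycle(double clock_hz)
{
    return SPEED_OF_LIGHT_M_PER_S / clock_hz * 100.0;
}

/* Hypothetical helper: prints the per-cycle distance for a clock rate. */
void print_light_budget(double clock_hz)
{
    printf("%.1f GHz: %.2f cm per cycle\n",
           clock_hz / 1e9, light_cm_per_cycle(clock_hz));
}
```

At 3 GHz this works out to roughly 9.99 cm per cycle. Signals in real board traces move slower than light in vacuum, so the practical budget is tighter still.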
Optimization to get the full advantage of a Pentium 4 doing floating point calculations is one of the most difficult tasks one can do in computing. A P4 can do, in one clock pulse, four multiplications and four additions. To get 100% of this speed one needs to have a sophisticated handling of cache memory, among other requirements.
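To give a rough idea of what "handling of cache memory" can mean in practice, here is a minimal cache-blocking sketch in C. It is a generic tiling of matrix multiplication, not Goto's actual method (his routines are hand-optimized per processor); the name matmul_blocked and the TILE value of 32 are assumptions picked for illustration.

```c
#include <stddef.h>

/* Hypothetical tile edge; a tuned library would pick this per CPU and cache. */
#define TILE 32

static size_t min_size(size_t a, size_t b) { return a < b ? a : b; }

/* C += A * B for n x n row-major matrices. The work is split into tiles so
   that the data touched by the three inner loops is small enough to be
   reused while it is still likely to be in cache. */
void matmul_blocked(size_t n, const double *a, const double *b, double *c)
{
    for (size_t ii = 0; ii < n; ii += TILE)
        for (size_t kk = 0; kk < n; kk += TILE)
            for (size_t jj = 0; jj < n; jj += TILE) {
                size_t i_end = min_size(ii + TILE, n);
                size_t k_end = min_size(kk + TILE, n);
                size_t j_end = min_size(jj + TILE, n);
                for (size_t i = ii; i < i_end; i++)
                    for (size_t k = kk; k < k_end; k++) {
                        double aik = a[i * n + k];
                        for (size_t j = jj; j < j_end; j++)
                            c[i * n + j] += aik * b[k * n + j];
                    }
            }
}
```

The caller zeroes C first if a plain product is wanted. Whether this beats the naive triple loop, and by how much, depends on the matrix size, the tile size, and the machine.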
Oh, Goat-toe hell you spoilsport!
My favorite ever comment was, "If I ever saw this in the real world, I'd fire you" attached to an "A" test paper with a programming question on it I'd managed to reduce to one line of nearly incomprehensible recursion.
ad logicam Claiming a proposition is false because it was presented as the conclusion of a fallacious argument.
If wife has headache, GOTO sleep
If boss is on vacation, GOTO strip bar for long lunch
If in-laws are coming over, GOTO work and pretend there is a critical problem that requires your presence all night
If technical conference is in Vegas, GOTO it
loads of examples.
If work is boring, GOTO slashdot to kill an hour or two
"I have as much authority as the pope, I just
don't have as many people who believe it" - George Carlin
We have given birth to a new acronym: RPFH (Read Past the F**king Headline).
AT&ROFLMAO
http://news.com.com/Writing+the+fastest+code%2C+by+hand%2C+for+fun/2100-1022_3-5972844.html?tag=nefd.top
ATLAS is open source and is a pretty good alternative. It is only a few percent slower than libgoto in most cases.
Save the bandwidth. Don't use sigs!
Seriously. Computed goto is very useful for low-level
optimizations in things like high-throughput ethernet
drivers and such. It basically eliminates conditional
checks in cases where the condition stays the same
for a particular set of data. So instead of [code missing] one would have [code missing]. If the second part is executed in a loop, the savings of
not making an IF comparison accumulate fairly quickly.
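The code snippets from the comment above did not survive, so what follows is not a reconstruction of them. It is only a minimal sketch of the general idea, assuming the GCC/Clang "labels as values" extension (this is not standard C). The names transform and enum mode are hypothetical.

```c
#include <stddef.h>

/* Hypothetical per-batch mode, fixed for the whole data set. */
enum mode { MODE_COPY, MODE_NEGATE, MODE_DOUBLE };

/* Applies one operation to every element. The handler label is chosen once,
   before the loop; each iteration then jumps straight to it instead of
   re-testing `m`. Relies on the GCC/Clang labels-as-values extension. */
void transform(enum mode m, const int *in, int *out, size_t n)
{
    static void *const handlers[] = { &&do_copy, &&do_negate, &&do_double };
    void *handler = handlers[m];
    size_t i = 0;

next:
    if (i == n)          /* only the loop bound is checked per element */
        return;
    goto *handler;       /* computed goto: no comparison against m here */

do_copy:
    out[i] = in[i];
    i++;
    goto next;

do_negate:
    out[i] = -in[i];
    i++;
    goto next;

do_double:
    out[i] = 2 * in[i];
    i++;
    goto next;
}
```

The mode comparison is traded for an indirect jump, so whether this is actually faster than a plain if or switch depends on the compiler and the CPU's branch prediction; measure before relying on it.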
3.243F6A8885A308D313
I, like everyone else, was trained *never* to use the dreaded goto statement. I'll grant that Pascal was more readable than Basic (with unlabeled gotos).
But sometimes it is actually better to use a goto to make the code more readable. The Linux kernel, for example, uses gotos. I was pretty sceptical at first because it had been drilled into my head how unreadable code was with gotos in it. But, reading the code, I have to admit it is much more readable for exception handling, for example.
If the goto would not make your code more readable, then don't use it. But in the cases where it would avoid a bunch of silliness trying to get out of a bunch of nested loops in case some error happened, it makes a lot of sense.
Linus Torvalds (and others) explain the reasoning for this at:
http://kerneltrap.org/node/553
In short, there are both readability and efficiency reasons to use gotos.
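As a rough illustration of the error-handling style described above, here is a minimal sketch in C. It is not taken from the kernel; load_file and its parameters are hypothetical, and it uses only the standard library.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical example: reads up to `cap` bytes of `path` into a new buffer.
   Returns 0 on success and -1 on failure. Each failure path jumps to the
   label that releases exactly what has been acquired so far. */
int load_file(const char *path, size_t cap, char **out, size_t *out_len)
{
    int ret = -1;
    FILE *fp;
    char *buf;

    fp = fopen(path, "rb");
    if (!fp)
        goto out;

    buf = malloc(cap);
    if (!buf)
        goto out_close;

    *out_len = fread(buf, 1, cap, fp);
    if (ferror(fp))
        goto out_free;

    *out = buf;          /* caller now owns the buffer and must free() it */
    ret = 0;
    goto out_close;      /* success: keep the buffer, still close the file */

out_free:
    free(buf);
out_close:
    fclose(fp);
out:
    return ret;
}
```

The same function written without goto needs either nested ifs or the cleanup calls repeated at every early return, which is the kind of silliness the labels avoid.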
Randy.Flood@RHCE2B.COM