Actually, anyone flying left seat in the big iron *IS* a thoroughly and exhaustively trained, very senior and experienced person fully capable of making the correct decisions and acting on them. And I trust the guy at the front of the plane far more than I trust a team of software developers working in an office somewhere.
You're right -- lose it should have been -- and I did it more than once, so I can't blame it on a typo. I just dashed that post off in a hurry, stream of consciousness, before my morning coffee -- so it really was a stream of semi-consciousness :-) -- I should have proofread it.
Yes, that's exactly what they're doing. And it's a really butchered attempt too. So fine, they have this great new codec -- tiff has a well tested mechanism for specifying a new pixel codec. If they did it this way, they would loose absolutely no functionality - but no, they had to introduce gratuitous incompatibilities, new tags that duplicate exactly the capabilities of existing tiff tags, and remove baseline tiff capabilities. All while maintaining the 32 bit file size limitations of tiff.
What a hack job. I would recommend anybody to stay (far, far) away from supporting this format until there is a (very) strong business case for it (Be pragmatic -- don't loose money over it, but don't help this become standard).
In summary, the MS we've come to know and love is here in full force.
Even if you've got the source code, it won't help you determine if there is remote surveillance embedded in it. That source has to be compiled by a compiler that is controlled by MS. OK, so let's say you have the source for that. It was compiled by itself, and I'm sure everyone here knows of the paper by Ken Thompson concerning hiding code in a compiler such that it is no longer in the source code.
As Ken Thompson says: "No amount of source-level verification or scrutiny will protect you from using untrusted code."
No, the parentheses (missing, in this case) tell the compiler to call the function. The naked function name -- without the parentheses -- is the address of the function, and it will be nonzero.
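For anyone who wants to see it, here is a minimal sketch in C. The function `is_ready` and the two wrappers are hypothetical names made up for illustration, not anything from the code under discussion. The first wrapper tests the bare name (the address), the second actually calls the function.

```c
#include <stddef.h>

/* Hypothetical function for illustration; it always reports "not ready". */
static int is_ready(void)
{
    return 0;
}

/* The bug: without parentheses the name is just the function's address.
   A function's address is never NULL, so this returns 1 regardless of
   what is_ready() itself would return. */
int test_without_parentheses(void)
{
    int (*address)(void) = is_ready;   /* no call happens here */
    return address != NULL;
}

/* The fix: the parentheses call the function and test its result. */
int test_with_parentheses(void)
{
    return is_ready() != 0;
}
```

Writing `if (is_ready)` directly behaves like the first wrapper. GCC and Clang will typically warn about it when warnings are turned on, which is one more reason not to ignore warnings.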
> Calling them "unforgiveable" doesn't make them easy to fix.
No, once they've been there for a while, they become hard to fix -- as are many bugs that have been around for a while. But just because they are hard to fix, doesn't make them forgivable.
These sorts of bugs are generally indicative of architectural flaws and broken development processes. What needs to be done, right from the start of a project, is constant measurement of performance and memory footprint. And when things start going bad, you *stop* feature work right away and fix the problems. Before they become too hard to fix.
Agile works well for this -- every iteration should be shippable. No leaks or performance problems (or other bugs for that matter) should ever live more than two iterations, otherwise you hit the big virtual stop button and fix the damn problems before proceeding.
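As a rough illustration of what "constant measurement" can look like, here is a minimal sketch in C of a leak check that could run every iteration. Everything in it is hypothetical: `tracked_malloc`, `tracked_free` and `run_workload` are made-up names, and the workload merely stands in for something like loading a batch of images. A real project would hook its own allocator or use a leak-checking tool instead.

```c
#include <stdlib.h>

/* Hypothetical counter of blocks that were allocated but not yet freed. */
static long live_blocks = 0;

/* Allocate through this wrapper so every live block is counted. */
void *tracked_malloc(size_t size)
{
    void *p = malloc(size);
    if (p != NULL)
        live_blocks++;
    return p;
}

/* Free through this wrapper so the count goes back down. */
void tracked_free(void *p)
{
    if (p != NULL) {
        live_blocks--;
        free(p);
    }
}

long tracked_live_blocks(void)
{
    return live_blocks;
}

/* Hypothetical workload: allocate and release `images` buffers.
   If forget_last is nonzero, the last buffer is deliberately leaked.
   Returns how many blocks are still live compared to the start. */
long run_workload(size_t images, int forget_last)
{
    long before = tracked_live_blocks();
    for (size_t i = 0; i < images; i++) {
        void *image = tracked_malloc(1024);
        int leak_this_one = forget_last && (i + 1 == images);
        if (!leak_this_one)
            tracked_free(image);
    }
    return tracked_live_blocks() - before;
}
```

A per-iteration test would then fail the build whenever `run_workload` returns anything other than zero, which is the "big virtual stop button" in automated form.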
Once things like this creep into your code base, and the accepted wisdom is that they're too hard to fix, you get this tolerance for mediocrity setting in amongst your developers as they see the problems, and ignore them because it's just accepted as normal. This is the beginning of the end for a software project.
If you are an architect or development manager, never accept mediocrity -- fight it with everything you have. It's the only way to create software you can take pride in and that your users will love. And the people working on the project will appreciate it too.
Mine rises to 800 meg and more until the OS grinds to a halt and everything goes pear-shaped. Just load some pages that are really heavy with images -- it leaks them -- all of them, and nothing short of terminating the process releases the memory, and it never reuses the leaked memory. And I have the caching "feature" (can't remember it exactly) turned off, and it still leaks horribly.
See:
http://www.independencenow.com/home.html#
It can climb up and down stairs, raise you up to eye level with other standing humans, and handle gravel and other rougher terrain.
It costs 20k, but if I needed a wheelchair, that's the one I'd get.
It would be only around 10% or so (instead of 80%) if COBOL weren't so verbose. :-)
So what would you call an OO Cobol?
c; -> c++;
COBOL -> ADD ONE TO COBOL GIVING COBOL.
>Doing everything perfectly in a landing is the *hardest* part of flying.
Depends on the plane. Landing a Pitts will give you white knuckles and pucker you up but good. A Cessna or a Piper -- yawn.
I've done that and it STILL leaks hundreds of megabytes of memory. The only extension I have installed is the most recent version of AdBlock.
It leaks like a sieve, and the collective refusal to admit to this craptacular state of affairs is really annoying to those of us who experience it.
I still have 100 shares :-) (They were worth less than the commissions would have been to sell them almost the day I wound up getting them through the ESPP.) I also have a certificate on my office wall stating that I am the proud owner of 5000 options to purchase SGI shares at a strike price of (let me look and stop laughing long enough to type) $29.875/share :-) They were underwater from the moment I got them, and never once came up for air.
This Chapter 11 deal is (IMHO) the end result of a continuing and unbroken string of totally boneheaded decisions by Sr Management at SGI that started about 12 years ago and still hasn't let up.
The end of a once great and cool company.
http://kb.mozillazine.org/Browser.sessionhistory.max_total_viewers
I've changed that setting and it STILL leaks hundreds of megabytes, until it bogs my machine down to the point of unusability and crashes. How's THAT for "denial of service"?
>Firefox is one of the best browsers available.
That's very true -- and people wouldn't complain about it if they didn't care. And they wouldn't care if it was no good at all. But it does have problems. And some of those problems just do not belong in a "best-of-breed" application as you put it, regardless of its complexity or size -- like leaking a gigabyte of memory, for example... :-)
>Okay hotshot, name ONE project you've worked on that has the complexity and wide audience of a web browser? This example you give must incorporate a complex UI (not in terms of use, but necessary complexity on the back end), advanced embedded scripting, render quickly, and yet not ignore security concerns.
A couple of the projects I've worked on around here are as big or bigger than Mozilla, albeit not with the same audience size as Mozilla, but arguably a harder problem domain. (One of these is ~5 million lines of code, the other >30 million - with scripting, embedded languages, exceedingly complex UI -- over a thousand different tools and menu items, many hundreds of dialogs -- definitely quick rendering -- both on screen and off -- huge datasets -- Think Alias Maya and Studio Tools)
BTW I've been doing this stuff for ~20 years -- I'm not some newbie spouting off. I'm not as smart as some of the guys around here, but I do alright.
> In theory, theory and practice are the same thing. In practice... well... the real world works differently.
In many cases that's very true. But when it comes to Agile development, it's matching up pretty well for us these days. :-)
> Your solution works for small-scale, one-off solutions on a contract basis for one or a small set of customers.
I disagree -- It's working pretty well for our applications, and if you read above, they're pretty darn big. I suspect that if you applied a modicum of software engineering discipline to your projects, it might work not too badly for you too.
>Software will cease to be mediocre when users stop accepting mediocrity.
Such people are called Mac users, and their numbers are growing. (That ought to fan the flames a bit :-)
Cheers
It leaks horribly regardless of that setting
And the one that does work right is a memory leak.
Sorry for shouting, but I'd be happy if they did *nothing* but fix the memory leaks.
Memory leaks are unforgivable.
> God is a woman.
Then I'm going to hell, and I won't even know why. :-)
Where did I find the Evil Research(tm)? Where else but directly from the source of evil -- no, no, not Microsoft, the *other* source of evil -- Intel :-)
It's already in their compiler:
http://www.intel.com/software/products/compilers/clin/docs/main_cls/mergedprojects/optaps_cls/common/optaps_pgo_sspopt.htm
(Their compiler absolutely rocks BTW)
And their excellent paper titled "Speculative Precomputation: Long-range Prefetching of Delinquent Loads" by Jamison Collins, Hong Wang, Dean Tullsen, Christopher Hughes, Yong-Fong Lee, Dan Lavery, and John Shen can be found here:
http://www.intel.com/research/mrl/library/148_collins_j.pdf
(Those damn delinquent loads -- GET OFF OF MY LAWN YOU DELINQUENTS! :-)
There's also "Physical Experimentation with Prefetching Helper Threads on Intel's Hyper-Threaded Processors" by Dongkeun Kim, Steve Shih-wei Liao, Perry Wang, Juan del Cuvillo, Xinmin Tian, Xiang Zou, Hong Wang, Donald Yeung, Milind Girkar, and John Shen, which can be found here:
http://www.cgo.org/cgo2004/papers/02_80_Kim_D_REVISED.pdf
And also "Speculative Precomputation on Chip Multiprocessors" by Jeffery Brown, Hong Wang, George Chrysos, Perry Wang, and John Shen at:
http://www.cs.ucsd.edu/~jbrown/papers/sp-cmp.pdf
There, that ought to cure your insomnia and answer your question: "How could a thread possibly be executed far enough in advance to make the time savings worth while, yet be sure that it is "predicting" memory accesses correctly?"
Read the papers carefully -- there will be a quiz later.
AC writes: "Programmers better get used to writing parallel programs because it's the most power efficient thing we can come up with. The ground-breaking research will not be in architecture, but in new software, languages and programming paradigms designed to make the life of the parallel programmer easier."
In order for a CPU architecture to be commercially viable, it must run existing programs fast. And existing programs -- especially large ones, and they're all getting large now -- are very unwieldy. Imagine taking something like Alias' Maya -- over 25 million lines of C++ -- and making it run well on an explicitly parallel machine. It's just not going to happen. It's hard enough to get some parallelism out of some small fragments of it. On a large scale, parallel evaluation and traversal of the Maya Dependency Graph just does not seem feasible. And rewriting it is just not going to happen.
What Maya likes is exactly what Intel is doing with Conroe/Merom -- shorter pipelines, lower branch mispredict penalty, branch prediction for computed jumps, better memory disambiguation for better OoO around memory accesses, more execution resources, better OoO, etc, etc.
Some of those aggressive techniques I wrote about will help even more.
It runs like *crap* on explicitly parallel machines.
Excellent and informative post.
There are several techniques for increased performance or throughput that the designers of next gen microarchitectures are likely looking at.
There are extensions to known techniques:
A: more execution units, deeper reorder buffers, etc., trying to extract more Instruction Level Parallelism (ILP).
B: More cores = more threads
C: hyper threading -- fill in pipeline bubbles in an OoO superscalar architecture; also = more threads
I personally don't think any of these carry you very far...
Then there are some new ideas:
a: run-ahead threads -- use another core/hyperthread to perform only the work needed to discover what memory accesses are going to be performed and preload them into the cache - mainly a memory latency hiding technique, but that's not a bad thing as there are many codes that are dominated by memory latency
a': More aggressive OoO run-ahead where other latencies are hidden
Intel has published some good papers on these techniques, but according to those papers these techniques help in-order (read Itanic) cores much more than OoO.
b: aggressive peephole optimization (possibly other simple optimizations usually performed by compilers) done on a large trace cache. Macro/micro-op fusion is a very simple and limited start at this sort of thing. (Don't know if this is a good idea or not, or whether anyone is doing it)
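To give a feel for the latency-hiding idea in (a), here is a minimal sketch in C. It is NOT the helper-thread or hardware run-ahead scheme from the papers: it is a plain single-threaded software analogue that does the "discover the address early and preload it" step inline. The function name `sum_indirect` and the `distance` parameter are hypothetical, and `__builtin_prefetch` is a GCC/Clang builtin, so this assumes one of those compilers.

```c
#include <stddef.h>

/* Sum data[order[i]] for i in [0, n). The indirect access pattern is the
   kind of load that tends to miss the cache ("delinquent load").
   While working on element i, we compute the address needed `distance`
   iterations from now and ask the hardware to start fetching it.
   The prefetch is only a hint; it never changes the result. */
long sum_indirect(const long *data, const size_t *order,
                  size_t n, size_t distance)
{
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        if (distance < n - i) {
            /* 0 = we only intend to read, 1 = low temporal locality */
            __builtin_prefetch(&data[order[i + distance]], 0, 1);
        }
        total += data[order[i]];
    }
    return total;
}
```

A run-ahead thread does the same address computation on another core or hyperthread instead of inline, so the main thread does not pay for the extra instructions. Whether either version helps depends heavily on the access pattern and the prefetch distance, so measure before believing it.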
But it's far from clear what AMD is doing. Whatever it is, anything that improves single threaded performance will be very welcome. Threading is hard (hard to design, implement, debug, maintain, and hard to QA). And not all code bases or algorithms are amenable to it.
Intel's next gen (Nehalem) is likely going to do some OoO look-ahead, as they have Andy Glew working on it, and that's been an area of interest to him...
A very interesting new concept is that of "strands" (AKA: dependency chains, traces, or sub-threads). (The idea is instead of scheduling independent instructions, schedule independent dependency chains. For more info, see http://www.cse.ucsd.edu/users/calder/papers/IPDPS-05-DCP.pdf)
It's not clear how well it would apply to OoO architectures, but I would expect that likely approaches would also need large trace caches.
Applying this to an OoO x86 architecture, and detecting the critical strand dynamically in that processor could be very cool, and potentially revolutionary.
It will be very interesting to see what Intel and AMD are up to -- it would be even cooler if they both find different ways to make things go faster...
Your post is probably the first I've read on /. that I would use mod points to mod down. It is your good fortune (and possibly everyone else's bad) that I don't happen to have any at the moment.
I guess I'm feeling like I have karma to burn today :-)