We are all too scared to switch to F.P. or PROLOG (on "Where's My 10 GHz PC?", Score: 1)
Many good points by other posters, but the core reason is simple. We are too scared to do it.
Great ideas for how to speed things up by orders of magnitude have been around for a fair while. Examples? Anybody remember John Backus and F.P.? Remember Prolog?
The truth of the matter is that great ideas for ways around the performance limitations we are experiencing have existed for decades, and we just haven't got the GUTS to make the jump.
I dream sometimes of having the courage to throw away my imperative mind, and program in a world of logic (PROLOG) or make parallelism my mantra (F.P.). I haven't had the guts yet; I always fall back to LDA #$0F, STA $D20F. Have you had the guts to do it? If you have, then please follow up.
We need to hear from people who made the leap from the imperative world of constructs we are all so familiar with and immersed themselves in those other worlds.
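For anyone who hasn't seen the F.P. style, here is a rough sketch of the inner-product example from Backus's Turing lecture, transliterated into Python rather than written in FP itself. The helper names (compose, insert, apply_to_all, transpose, times) are my own stand-ins for his combining forms, not anybody's real API, and the imperative version is only there for contrast.

```python
from functools import reduce
from operator import add, mul

def compose(*fs):
    """Right-to-left composition: compose(f, g)(x) == f(g(x))."""
    return lambda x: reduce(lambda acc, f: f(acc), reversed(fs), x)

def insert(op):
    """Stand-in for Backus's 'insert': fold a binary operator over a sequence."""
    return lambda xs: reduce(op, xs)

def apply_to_all(f):
    """Stand-in for Backus's 'apply to all': map f over every element."""
    return lambda xs: [f(x) for x in xs]

def transpose(rows):
    """Turn a pair of vectors into a list of element pairs."""
    return [list(col) for col in zip(*rows)]

def times(pair):
    return mul(pair[0], pair[1])

# Inner product with no loop variable and no named intermediate state.
inner_product = compose(insert(add), apply_to_all(times), transpose)

def inner_product_imperative(a, b):
    """The same thing, word-at-a-time, the way most of us write it."""
    total = 0
    for i in range(len(a)):
        total += a[i] * b[i]
    return total

print(inner_product([[1, 2, 3], [6, 5, 4]]))
print(inner_product_imperative([1, 2, 3], [6, 5, 4]))
```

The point of the first form is that the apply_to_all step says nothing about the order the multiplies happen in, so in principle they could all run at once. Whether a real compiler or machine would exploit that is a separate question.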
Here are the pros:
1) Apple controls the entire software stack, so they can port it all to Itanium and optimize it for that.
2) Apple has experience doing a port like this (68K -> PPC).
3) Intel gets to ship more Itanium -- increasing volume (good for Intel).
4) Apple could soak up the low-end Itaniums, which nobody wants for servers or for the almost totally dead workstation market.
5) Intel can put some pressure back on Microsoft.
This is not an entirely crazy idea. There are advantages for both Intel and Apple. Apple controlling the entire software stack is in my view a point of critical importance. Intel getting to save Itanium and ship more Itanium volume is great for them.
Who does this really hurt if it happens? AMD with x86-64, and Dell. If I were Dell I would be sweating bullets right about now, because you may have to choose between staying with Microsoft and supporting multicore PPC (like the Xbox 360) for the future, or sticking with Intel as they stop focusing on x86 and go for Itanium.
Itanium looked like a turd because it was over-promised. If you have the compiler technology it can do damn well at some things. I'd bet an Itanium-optimized Photoshop -- or other Apple media development software -- would rock.
--Tarp
Week? It was still going after over a week on the P133 with 96 MB RAM I saved from the trash. On a 486/33 with say 16 MB (let's be generous) try increasing your units to at least MONTHS.
If you include KDE or GNOME make it YEARS.
I hope you have a really BIG swap file too and a watercooled hard drive :-)
--Tarp
Astronautix has an interesting article on exactly this stuff.
http://www.astronautix.com/craftfam/rescue.htm
Check out the Douglas Paracone...
http://www.astronautix.com/craft/paracone.htm
This stuff has been thought about *hard* for several years. NASA is really stuck between a rock and a hard place. The media and public treat each disaster as some kind of media circus. The astronauts themselves know the risks, and accept them. Everyone else bitching and whining should be told to deal with it. It's not like the astronauts have a gun held against their heads. They are smart people and know the risks.
Let NASA fly again -- quit the damn political ass covering and witch hunts. The astronauts won't fly the machine if they think the risk is too high. That includes the teachers they launched too. It's a rocket for god's sake, it isn't a bloody Volvo.
--Tarp
Nice! They made some real improvements on the T3E judging by the first paragraphs in that paper... Did you ever figure out what caused that weird spike in figure 3 at 32 bytes?
Didn't the T3D and T3E have the world's fastest scalar performance microprocessor (shameless historical plug for Alpha) in each node? ;-)
I don't think they thought that at all (Let's build a supercomputer). I think it falls naturally out of the problem they were trying to solve.
This is because when you have the following conditions:
-- Lots of memory bandwidth needed
-- Fast floating point
-- Parallelizable code
-- Hand tuned kernels OK
You end up with something that looks a lot like a supercomputer. You just turned your compute bound problem into an IO bound problem. We may want to revise that saying -- and say 'You turned your compute bound problem into a coding problem'. Supercomputer performance seems more bound by the feasibility of extracting decent performance from the iron than it used to be, judging by the stuff I have read by the old hands.
What's the scalar performance of one of these beasties?
Can an Athlon 64 / P4 beat it on scalar code? The whole HPC world has gotten boring since Cray died. Here's why I say that:
The Cray 1 had the best SCALAR and VECTOR performance in the world.
The Cray 2 was an ass kicker, the Cray 3 was a real ass kicker (if only they could build them reliably).
Cray pushed the boundaries; at some points he pushed them too far, designing and trying to build machines that they couldn't make reliable.
So it'll be a cold day in hell before I get all fired up over the fact that someone else managed to glue together a bazillion 'killer micros' and win at Linpack...
Now if someone would bring back the idea of transputers, or we saw some *real* efforts at Dataflow and FP, then I'd be excited. I'd love a PC with 8 small, simple, fast, in-order, tightly bound CPUs. Don't say CELL; all indications are that it will be a *real* PITA to program to get any decent performance out of.
IGNORE prior post... Total brain fart.... It tests to see if the value pointed to by ptr is 0 -- sorry... TGIF!
We should just be glad it isn't iAPX432 assembler... Actually no -- that would rock...
OK, so if ptr points to char, then we have to do an indirect LDA, something like
; test that the char pointed to by ptr is null
LDY #0 ; zero so we can use 0 offset for indirect
LDA (ptr),Y ; ptr has to live in zero page for this addressing mode
BNE not_null ; hypothetical label, the original post left the branch target out
; I know those are assumptions, but we are talking 6502 here, and assuming null == 0 on a 6502 and that a char uses a byte isn't unreasonable
Nope, I haven't assembled this, it's from memory... Or?
On the 6502 I think you could just do an LDA, and use the implicit setting of the zero flag to allow you to do your branch.
LDA ptr
BNE L1_not_null
--6502 forever.
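To make the difference between the two snippets above concrete, here is a toy Python model of just those two addressing modes. It is a sketch, not an emulator: only the accumulator value and the Z flag are modelled, and the zero-page location PTR and the address $2000 are made-up values for illustration.

```python
def lda_zero_page(mem, zp):
    """LDA zp: load the byte stored at zero-page address zp.
    Returns (accumulator, Z flag)."""
    a = mem[zp]
    return a, a == 0

def lda_indirect_y(mem, zp, y=0):
    """LDA (zp),Y: read a little-endian 16-bit pointer from zp and zp+1,
    add Y, and load the byte at the resulting address.
    Returns (accumulator, Z flag)."""
    base = mem[zp] | (mem[(zp + 1) & 0xFF] << 8)
    a = mem[(base + y) & 0xFFFF]
    return a, a == 0

PTR = 0x80                             # hypothetical zero-page home of ptr
mem = bytearray(0x10000)               # 64 KB of zeroed memory
mem[PTR], mem[PTR + 1] = 0x00, 0x20    # ptr = $2000 (low byte first)
mem[0x2000] = ord("A")                 # *ptr = 'A'

print(lda_indirect_y(mem, PTR))        # tests the char that ptr points to
print(lda_zero_page(mem, PTR))         # tests only the low byte of ptr itself
```

With these made-up values the plain LDA ptr sets the zero flag even though ptr is $2000 and the char it points to is 'A', because it only ever looks at the low byte of the pointer. Testing the pointed-to char needs the (ptr),Y form.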
If only they would re-release this. Or Star-Raiders. (Not Star-Raiders 2). Star Raiders was amazing at the time it came out.
Then I would think about buying one.
I don't care about movie quality football.
I want to hear the sound of Zylon ships attacking, crank the engines to full power and scream thru space to save another space station.
The A380 has one big strike against it. No track record. The same comment can be made for the 7E7 or any new airliner. One thing about the seven-four is it has flown a *LOT* of miles. It's a known quantity. So do you really want to be onboard an 800 person jet with no track record and an avionics system designed by the same people who managed to land^h^h^h^hcrash at the Paris air show?
This may seem unfair -- and it probably is, but one thing about Boeing is they have been building jets for a while and they have a *really* large sample of what the jets can hack -- both by operators who do maintenance and those who don't. I feel safe in a seven-three, because there are so damn many of them in service and the kinks are more likely to have been found and worked out.
P.S. Does anybody else wonder about the 7E7 and lightning? I know composites have flown a few hours in B-2s, but nothing like the total air-hours of an airliner.
--Tarp
The worst part was falling asleep at night and having dreams where you actually broke down your movements into quanta that would fit in one T.U.
I would dream that I couldn't possibly walk that far in one turn, so I should stop behind a tree for cover.
How does this stuff compare to Burton Smith's MTA/Tera stuff?
One word - Abstraction.
A modern machine does much more than the old C64 (I'm an Atari 8-bit and Ohio Scientific Challenger owner myself).
Those early machines weren't doing nearly as much as a modern PC is. It's still a valid point however. We have decided we would rather live in a world of abstracted API layer after layer, and we pay for it dearly. The flip side of the coin is that a good API allows us to run our software on many different machines. When the API is poor then you end up with a software mono-culture.
The AMIGA is an excellent example of a machine that had tight hardware/software synergy. PCs of the same era were laughable compared to the AMIGA, but who won? The PCs, because eventually you could upgrade them, you could change them. Mediocrity with expandability works.
Very true. Anybody remember Algol 68? It had the idea of coroutines in there too.
Exactly what I was thinking - UMA on the SGI O2.
This has been talked about and used on PCs before too:
http://www.byte.com/art/9607/sec18/art5.htm
There's one thing this may work well with: has anyone tried opening, say, a 16384x16384 32bpp bitmap on a PC with a typical $500 ATI / NVIDIA card (say X800 / 6800)? Try doing a smooth continuous zoom / roam. Can you do it smoothly at, say, 60 fps at 1280x1024?
In theory you'd only need 1280x1024 * 4 bytes (32bpp) * 60 frames per second of bandwidth. That's 300 MB/sec. Can you do that on an AGP card?
I think you can probably do it on an O2 without a problem.
--Tarp
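The back-of-envelope numbers above are easy to check. A quick Python sketch, assuming 4 bytes per pixel for both the viewport and the big bitmap, and counting a megabyte as 1024*1024 bytes (the function names are just for illustration):

```python
def frame_bytes(width, height, bytes_per_pixel):
    """Size of one uncompressed frame or bitmap in bytes."""
    return width * height * bytes_per_pixel

def stream_rate(width, height, bytes_per_pixel, fps):
    """Bytes per second needed to refresh a full frame fps times a second."""
    return frame_bytes(width, height, bytes_per_pixel) * fps

MIB = 1024 * 1024

viewport_rate = stream_rate(1280, 1024, 4, 60)   # the visible window
bitmap_size = frame_bytes(16384, 16384, 4)       # the whole source image

print(viewport_rate / MIB)   # MiB per second for the viewport
print(bitmap_size / MIB)     # MiB for the full bitmap
```

This only counts the pixels that end up on screen. A zoomed-out view has to touch more source pixels than it displays unless smaller copies are prepared ahead of time, so treat the viewport figure as a floor rather than an estimate.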
Get whatever webserver the vendor recommends, throw /. on it, find the biggest firehose you can and throw the IP of the test system on the /. homepage.
Measure the amount of sweat on the marketoids' foreheads.
It all comes down to a fundamental question computer architects have been asking for a while now. Is it better to try and do more dependency analysis at compile time, or at run time?
If you look at modern CPUs they are extremely complex, having piles of logic added to try and hide memory latency, pipeline stalls, and dependencies. You end up with a very complex O-O-O behemoth.
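To give a feel for what doing the dependency analysis at compile time means, here is a toy Python sketch of greedy list scheduling into fixed-width bundles. It is an illustration only, not how any real Itanium compiler works: the Instr type, the register names and the width of 3 are my assumptions, only true (read-after-write) dependencies are tracked, every register is assumed to be written once, and every instruction is assumed to take one cycle.

```python
from typing import NamedTuple, Tuple

class Instr(NamedTuple):
    dest: str                  # register written by this instruction
    srcs: Tuple[str, ...]      # registers it reads

def schedule(instrs, width=3):
    """Place each instruction in the earliest bundle that comes after all of
    its producers and still has a free slot. Returns a list of bundles."""
    produced_in = {}           # register -> index of the bundle that writes it
    bundles = []
    for ins in instrs:
        earliest = 0
        for src in ins.srcs:
            if src in produced_in:             # inputs like x, y are ready at start
                earliest = max(earliest, produced_in[src] + 1)
        slot = earliest
        while slot < len(bundles) and len(bundles[slot]) >= width:
            slot += 1                          # bundle full, try the next one
        while len(bundles) <= slot:
            bundles.append([])
        bundles[slot].append(ins)
        produced_in[ins.dest] = slot
    return bundles

prog = [
    Instr("a", ("x", "y")),    # a = x op y
    Instr("b", ("x", "z")),    # b = x op z
    Instr("c", ("a", "b")),    # c needs a and b
    Instr("d", ("w",)),        # d is independent of everything above
    Instr("e", ("c", "d")),    # e needs c and d
]

for i, bundle in enumerate(schedule(prog)):
    print(i, [ins.dest for ins in bundle])
```

An out-of-order core discovers the same independence (a, b and d can go together) at run time with hardware, every time the code runs. The compile-time approach pays once, but it has to guess at things like memory latency that are only known when the code actually executes.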
I for one believe that SGI / MIPS are actually a good indicator of how good / bad Itanium is (whether it still is, is a question for debate). SGI / MIPS looked at the Itanium and said 'We can't beat it'...
A little history is in order:
Recall the R4400 - MIPS basically went the P4 route with the R4400 - Super pipelined it, cranked up the clock speed and they all said 'Whoa! Memory latency is killing us'... So what did they do? They created the R10000. The R10000 performed OOO and could handle piles of outstanding memory requests without stalling. Look where Intel is going now.... MIPS/SGI went thru this stuff 10 years ago.
Ask yourself, how fast would an R10000/12000/14000/16000/18000 derivative run if it was fabbed in Intel's / AMD's latest and greatest process technology?
The real question still comes back to the first point I made: How far can we push dependency analysis on the compile side? Perhaps they pushed it too far on the Itanium.
All the comments about sales volume are things to be expected. Compare any architecture that pushes the envelope with an entrenched existing architecture in terms of cost. Let me spell it out:
Itanium cannot compete with x86/amd64 due to economies of scale in terms of price-performance.
Now do a s/Itanium/other_favorite_architecture/ and the above is still true. Hell, even the PowerPC can't compete with x86.
--tarp (please forgive bad formatting first time posting to slashdot)