..the crap on your code to get the speed from 1.5 trillion operations per second to 2 trillion. Or if you're smart you can sit on the beach drinking cocktails for 6 months and wait for the next generation of CPUs to come out.
Yes, this is one reason why optimization usually doesn't matter. It's also entirely irrelevant to this discussion. What we have here is an implementation of the BLAS API, which ultimately is the thing that (wild guess) about 50% of all supercomputer cycles in the world are spent on. As these supercomputers are very expensive, and users use them because they need maximum performance, it makes economic sense to spend quite some time on an optimized BLAS library.
We use BLAS for most of our Linpack runs, as it is the fastest set of generic libraries for the purpose.
BLAS is an API with multiple implementations, of which GOTO BLAS is one. Other implementations include ATLAS, ESSL, ACML and MKL, and of course the reference implementation on Netlib.
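To make the "API vs. implementation" point a bit more concrete, here is a rough sketch of what the level 3 GEMM operation computes, written as a naive triple loop. The function name `gemm` and the list-of-lists layout are made up for illustration; the real BLAS routine has a different calling convention (transpose flags, leading dimensions and so on). An optimized implementation computes the same result, it just organizes the loops around the caches and registers of the target CPU.

```python
# Naive reference-style GEMM: returns alpha * (a @ b) + beta * c.
# Matrices are plain lists of rows; "gemm" is a hypothetical name for this sketch,
# not the actual BLAS interface.

def gemm(alpha, a, b, beta, c):
    """Return alpha * (a @ b) + beta * c for list-of-lists matrices."""
    m, k, n = len(a), len(b), len(b[0])
    if any(len(row) != k for row in a):
        raise ValueError("inner dimensions of a and b do not match")
    if len(c) != m or any(len(row) != n for row in c):
        raise ValueError("c must be an m x n matrix")
    # Start from the scaled copy of c, then accumulate the product into it.
    result = [[beta * c[i][j] for j in range(n)] for i in range(m)]
    for i in range(m):
        for p in range(k):
            a_ip = alpha * a[i][p]
            for j in range(n):
                result[i][j] += a_ip * b[p][j]
    return result


a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]
c = [[1.0, 1.0], [1.0, 1.0]]
print(gemm(1.0, a, b, 0.0, c))  # plain matrix product
print(gemm(2.0, a, b, 1.0, c))  # scaled product added to c
```

Any code written against the interface can be relinked against a faster library without changes, which is why tuning one library pays off for so many applications.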
Methinks you've been playing too much Call of Duty
Uh, is that some video game? Well, not knowing what it is I guess I haven't played it.
It's damn hard to shoot long guns accurately on a ship that's heaving and rolling on the ocean.
Umm, having actually been aboard big ships, I can tell you that big ships don't heave very quickly, even in very high seas. Practically speaking, in the relatively calm weather required for these pirates to operate, a big ship is more or less a stationary platform as far as firing a rifle is concerned.
The rule of thumb when engaging vehicles, boats or hardened targets is that you need some decent firepower, and that means 7.62mm or .50 caliber heavy machine guns or bigger.
Uh, did you see the pictures of these pirates in their boats? We're talking about 6 m open wooden or fibreglass boats with an outboard motor, with about 3 people aboard. You don't exactly need a battleship to disable those. A rifle bullet will go through a boat like that like a hot knife goes through butter, for lack of a better analogy. But yeah, a machinegun would be even better (that's why I mentioned it in my previous post).
Don't forget that these pirates may be shooting at you as well, which makes your job that much harder.
Yes, of course. But my point was that a big ship is a significantly better gun platform, offering much better stability and protection from incoming fire, than a 6m open boat.
I'm wondering why these cargo ships are not defending themselves. Cargo ships are pretty stable even in choppy seas, and have lots of steel to take cover behind. With just a simple high-powered rifle with a scope, you could pick off these pirates as they come in their dinky open boats, way before they get into range to shoot anywhere near accurately. Hell, given a machine gun, everybody on that little pirate boat would be dead meat within seconds.
No. All a turbo allows you to do is burn the fuel in the engine more rapidly. You get more power, but at the cost of fuel economy. This solution makes use of the currently wasted byproduct of internal combustion, i.e. heat, to get more power from the same amount of fuel.
Well, it depends on how you use the turbine. As the exhaust gases expand through the turbine they cool down. Having a steam engine is just another way of extracting part of the heat that goes out the tailpipe.
So, usually with a turbocharger the turbine is used to compress the intake air, which as you said allows one to produce more power with the same engine. Because of lower friction, fuel consumption might be slightly lower than in a larger, equally powerful naturally aspirated engine.
But then, you can instead connect the turbine to the output shaft via reduction gearing. This is called turbocompounding, and was used in aircraft engines in the late 1940s (also, some modern truck engines by Scania use it today). That sure IMHO sounds like a simpler solution than adding a steam engine.
TFA is about a guy whose surname is 'Goto'.
"Learn ruby/perl/python/something and automate *everything*" ok so I should write a script to open liferea? I do this at least 10 times a day.
No, you should write a script that opens 10 instances of liferea at a time.
Ethanol has more energy per gallon than does gasoline
Nope. The energy density of ethanol is about 2/3 of that of gasoline.
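For anyone who wants to check the 2/3 figure, here is a small back-of-the-envelope sketch. The heating values are approximate, rounded numbers I'm assuming for illustration (roughly 21 MJ/L for ethanol and 32 MJ/L for gasoline); real fuels vary a bit with blend and temperature.

```python
# Approximate volumetric energy content (lower heating value), assumed rounded figures.
ETHANOL_MJ_PER_L = 21.0
GASOLINE_MJ_PER_L = 32.0


def energy_ratio(fuel_mj_per_l, reference_mj_per_l):
    """Energy per litre of a fuel relative to a reference fuel."""
    return fuel_mj_per_l / reference_mj_per_l


def litres_for_same_energy(reference_litres, fuel_mj_per_l, reference_mj_per_l):
    """Litres of fuel needed to match the energy in reference_litres of the reference."""
    return reference_litres * reference_mj_per_l / fuel_mj_per_l


ratio = energy_ratio(ETHANOL_MJ_PER_L, GASOLINE_MJ_PER_L)
tank = litres_for_same_energy(50.0, ETHANOL_MJ_PER_L, GASOLINE_MJ_PER_L)
print(f"ethanol / gasoline energy per litre: {ratio:.2f}")
print(f"litres of ethanol matching 50 L of gasoline: {tank:.1f}")
```

With these assumed values the ratio comes out close to 2/3, so you need about half again as much ethanol by volume for the same energy.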
My bigger concern is that I have heard people touting diesel as a replacement for these electric technologies rather than something to be used along with them.
I don't think it's an either/or proposition. The problem with electricity and hydrogen is storage. So why not use biodiesel, or any other carbon-based liquid fuel for that matter, as the energy storage medium? There are already high temperature fuel cells that can feed directly on natural gas. Give it some time and we might have fuel cells that munch on diesel fuel. Or you can have an onboard reformer that breaks down the diesel into something the fuel cell can use.
Unfortunately, europositron is a scam.
Cray does not hesitate to use Linux where it is appropriate. However, when you are doing something like designing your own vector processor from scratch, porting Linux to it just doesn't make sense.
My guess is that porting Linux is way less work than writing a new OS from scratch. However, in this case Cray already had UNICOS running on the predecessor of the X1 vector computer (SV1 IIRC), so it was probably easier, and less painful for existing customers, to port that than to port Linux.
If you were to start from scratch, I think Linux (or perhaps one of the *BSDs if you have GPL issues) is a pretty no-brainer choice to base your OS on.
You do seem to be implying that Linux-based computers running commodity hardware always make more sense than using things like proprietary interconnects.
I don't see the connection between the OS and whether the interconnect is proprietary or not. You can have Linux and a proprietary interconnect, or a proprietary OS with a standard interconnect.
One can get largely the same results with cfengine or something like that. Well, except for the diskless support, which I guess can be useful.
Note to hard-drive manufacturers:
Please come out with smaller (2.5") and faster (10krpm) desktop drives that don't cost a fortune (like laptop and enterprise 2.5" drives), and allow many fast, cool, and quiet drives in a SFF.
When will the desktop market transition to 2.5"? When oh when?
I had a ground loop problem once too.
I solved it by plugging the 'puter and the stereo into the same wall outlet. AFAIK that isn't a 100% foolproof solution, but it worked for me.
Apparently you failed to note the sarcasm. ;-)
As for the recipes thing, it was from the blurb. I haven't actually RTFA (now that's a surprise here on /.).
Obviously I *need* a quad cpu machine to handle my recipes database.
So, uh, it seems that AM is a professional, level-headed guy. No surprises there really.
But to my gripe: Starting the interview with "Do you think it was good to have had the time with BitKeeper in kernel development, or should they have stuck with CVS?" Gee, being so in tune with what's happening in kernel land just makes me want to run to the nearest newsstand and get the latest "Linux Format": "The essential read for all Linux users".
I believe that Sun incents its employees to blog.
Wouldn't surprise me if it's tacit. Say positive things about the company and products, advance up the corporate ladder, while publicly they can honestly say that "hey, we're just encouraging our employees to get in touch with our users, be more open blah blah blah" or something like that.
Anyway, it's transparent enough that it's really nothing more than the dotcom version of infomercials.
No they won't! They have no reason to.
Yes, you're probably right that it doesn't make sense for AMD economically. But I want to run numerical codes at more than 5% of peak performance on my cheap Opterons, so I want to believe. ;-)
The vector units that a cray uses aren't like altivec, sse, or other "bolt-on" vector units. The vector unit on a cray (or NEC) is a latency hiding mechanism. It's a method for forcing the programmer/compiler to structure the code such that the data loaded from memory is used a significant period of time after the load is initiated.
Yes, I know. And that's precisely the reason why I'd like to see real vectors instead of the SSE/AltiVec toy ones. Main memory latency is hundreds of cycles, and it's getting worse all the time.
Additionally, vectors have quite a few advantages from a microarchitecture perspective too.
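A back-of-the-envelope way to see why long vectors help hide latency is Little's law: the amount of data that has to be in flight equals bandwidth times latency. The sketch below uses illustrative numbers I'm assuming for the sake of argument (6.4 GB/s of memory bandwidth, 100 ns latency, 64-element vectors of doubles); they are not measurements of any particular machine.

```python
import math


def bytes_in_flight(bandwidth_bytes_per_s, latency_ns):
    """Little's law: data that must be in flight to keep the memory bus busy."""
    return bandwidth_bytes_per_s * latency_ns / 1_000_000_000


def requests_in_flight(bandwidth_bytes_per_s, latency_ns, bytes_per_request):
    """Number of independent outstanding loads of a given size that are needed."""
    needed = bytes_in_flight(bandwidth_bytes_per_s, latency_ns)
    return math.ceil(needed / bytes_per_request)


# Assumed, illustrative numbers (not taken from any datasheet).
BANDWIDTH = 6_400_000_000  # bytes per second
LATENCY_NS = 100           # main memory latency in nanoseconds

for label, size in [
    ("8-byte scalar loads", 8),
    ("64-byte cache lines", 64),
    ("64-element vector loads (512 bytes)", 512),
]:
    count = requests_in_flight(BANDWIDTH, LATENCY_NS, size)
    print(f"{label}: {count} in flight")
```

Under these assumptions a couple of outstanding vector loads cover the same latency as dozens of independent scalar loads, which is the whole point of having the instruction itself describe a long run of memory accesses.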
This works pretty well on the HPC code that is used on crays, but not at all for the everyday server/workstation code that opterons run.
I'm not sure about that. I guess technical apps vectorize just as well as HPC codes (well perhaps not the UI, but the code that runs the actual simulation or whatever). Heck, even some database code vectorizes nicely (sorting and hash joins).
Furthermore, to support that sort of vector unit, you need to have about eight times as much memory bandwidth as an opteron, which means many more pins on the socket, which are very expensive.
Yes, as I said, an Alpha Tarantula-like design is probably overkill for the vast majority of the market. My point was that a vector ISA extension with modest execution resources wouldn't need that much die area, and could help make better use of the available bandwidth, whatever that bandwidth is. As you said yourself, the expensive thing is IO. Transistors are cheap by comparison. So not having instructions that allow one to effectively use the available IO resources is a real shame.
I think you're much more likely to see the cray vector processor retooled with lots of hypertransport connections, so it can use an opteron as its scalar unit, and use the same seastar routers that the xt3 uses. On the X1, the scalar unit already runs ahead of the vector unit, so I bet it's not all that important for the scalar unit to be on-die.
Yes, that sounds feasible. IIRC it is something like this that Cray has cooked up for the Cascade project, i.e. a node consists of 8 (or was it 4) scalar processors connected to memory (I guess these could be Opterons or, further in the future, some kind of processor-in-memory (PIM) stuff), and a vector unit with its own cache and fast access to the main memory via the scalar CPUs.
As for the SeaStar thing, I think you're right that that's what they'll use for inter-node communication. Currently the X1(E) uses NUMAlink licensed from SGI, so they're certainly looking at replacing that with existing in-house tech. BTW, 2H2006 will see the XT4, with the new Opteron sockets with DDR2 memory and the SeaStar2 router that provides twice the bandwidth of the existing SeaStar.
Now, if only they could put four X1E CPUs into an air-cooled, rack-mount server and charge a reasonable amount for it. I'd much rather have a handful of vector processors than a few dozen Opterons, any day.
NEC sells an entry-level deskside SX-8, called IIRC SX-8i, with one processor. Unfortunately "entry-level" in this case means $100,000+. :(
Tera bought far more than a name when they bought us. They also bought a bunch of software and hardware people, many of whom (myself not included) have been with Cray Research (the original Cray) for many years. So, while it's certainly not the Cray of the mid-1980's, the tradition still goes back there, especially with the vector machines like the Cray X1/X1E and its impending follow-on.
You work for Cray. Cool.
Please tell me that this deal implies that AMD is going to add some proper vector instructions to amd64. Pretty pretty please. ;-)
Specifically, I'm thinking about something like Alpha Tarantula. Well, the huge bunch of execution units and memory bandwidth of Tarantula was perhaps a bit overkill for a general purpose processor, but the vector ISA extensions and the general architecture looked all right. On paper, of course.
How could your comment be moderated insightful? Interesting perhaps, but insightful???
You must be new around here, thinking that the moderation system generally works as it should. ;-)
I think that asking slashdot and expecting some insightful discussion about this issue is pretty stupid.
(you know, the type who read Tom's Hardware every day)
Yes I know the type. Rabid fanboys with strong opinions on everything, and no clue in sight. hey, that almost reminds me of
Linux is substantially more scalable now than it was even just 6 months ago (not the vanilla, but quite well tested scalability patches).
Perhaps it is, but it has nothing to do with BG, since a) BG doesn't have shared memory, and each 2-CPU node (1 dual-core processor) runs its own kernel, and b) Linux is only used on the service nodes (the nodes handling disk IO, interactive logins, compiling etc.), not the compute nodes (where the actual action takes place).
I'm quite sure that the improvements are due to tweaking the LINPACK benchmark itself (yes, this is allowed), the ESSL libraries (IBM's version of BLAS), and improving the XL Fortran compiler.
They can't use OpenMP since they don't have shared memory beyond 2 CPUs. HPF is dead, or at least dying, due to lackluster scaling beyond a few dozen CPUs.
What they use in practice is MPI.