These are all good points. I (as in "I who wrote the paper and presented the slides") did measure power, and for LINPACK you do hit TDP. See my other publications.
And unfortunately, we don't get to choose the voltage-frequency point with such flexibility, and neither do AMD, Intel, or NVIDIA. Operating voltage started at 5V and now it is at 1V. A silicon junction switches at 0.7V, and the closer you get to 0.7V the less reliable the junction is (that's why it once was 5V). So you have about 0.3V max to play with in terms of voltage. And frequency is capped at 4 GHz due to the voltage problem. So you have to live somewhere between 1 GHz and 4 GHz. Look up Dennard scaling and its demise for details of voltage, frequency and area scaling. I only make presentations about iPad apps so don't know much about hardware ;-)
This is a very poor quality article, I analyzed it
before. There are possibly better ones mentioned
by others.
Just look at the matrix multiplication case.
Look at the graph and see that 1000x1000
takes 30 seconds on CPU and 7 seconds on GPU.
Let's translate it to millions of operations
per second: CPU -> 33 Mop/s, GPU -> 142 Mop/s.
Matrix multiplication has cubic complexity,
so for CPU: 1000 * 1000 * 1000 / 30 seconds / 1000000 = 33 Mop/s
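For anyone who wants to redo the arithmetic, here is a minimal sketch. It assumes the timings read off the graph (30 s on CPU, 7 s on GPU for 1000x1000) and counts n^3 operations per multiply; the function name `mops` is made up for illustration.

```python
# Throughput implied by the quoted timings for a 1000x1000 multiply.
n = 1000
ops = n ** 3  # cubic operation count, one multiply-add counted as one op


def mops(seconds):
    """Millions of operations per second for one n x n multiply."""
    return ops / seconds / 1e6


cpu_mops = mops(30)  # quoted CPU time in seconds
gpu_mops = mops(7)   # quoted GPU time in seconds
print(f"CPU: {cpu_mops:.1f} Mop/s, GPU: {gpu_mops:.1f} Mop/s")
```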
Now think a while: 33 million operations per second on a 1.5
GHz Pentium 4 with SSE (I assume there is no SSE2).
Pentium 4 has a fused multiply-add unit which makes
it do two ops per clock. So we get 3 billion
ops per second peak performance! What they claim
amounts to the CPU running about 100 times below
its peak on matrix multiply. That is unlikely. You can get 2/3
of peak on Pentium 4. Just look at the ATLAS
or FLAME
projects. If you use one of these projects you can
multiply 1000x1000 matrices in half a second: 14 times
faster than the quoted GPU.
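The same back-of-the-envelope estimate as a short sketch. All inputs are the assumptions stated in the post (1.5 GHz clock, two ops per clock, 2/3 of peak sustained, 7 s quoted for the GPU); none of them are measured here.

```python
# Peak and sustained rates under the post's assumptions.
clock_hz = 1.5e9        # 1.5 GHz Pentium 4
ops_per_clock = 2       # assumed: one multiply-add per clock counts as two ops
peak_ops = clock_hz * ops_per_clock       # ops per second at peak
sustained_ops = peak_ops * 2 / 3          # assumed fraction reachable by a tuned library

n = 1000
cpu_seconds = n ** 3 / sustained_ops      # time for one 1000x1000 multiply
gpu_seconds = 7                           # quoted GPU time
speedup_vs_gpu = gpu_seconds / cpu_seconds

print(f"peak: {peak_ops / 1e9:.1f} Gop/s, tuned CPU time: {cpu_seconds:.2f} s, "
      f"ratio to quoted GPU: {speedup_vs_gpu:.0f}x")
```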
Another thing is the floating point arithmetic.
GPU uses 32-bit numbers (at most). This is too
small for most scientific codes. CPU can do
64-bits. Also, if you use 32-bits on CPU it will
be 4 times as fast as 64-bit (SSE extension).
So in 32-bit mode, Pentium 4 is 28 times faster
than the quoted GPU.
Finally, the length of the program. The reason
matrix multiply was chosen is because it can be
encoded in very short code - three simple loops.
This fits well with the 128-instruction vertex code
length. You don't have to keep reloading the code.
More challenging codes will exceed the
allowed vertex code length. The three-loop
matrix multiply implementation stresses memory
bandwidth. And CPU has MB/s and GPU has GB/s.
No wonder GPU wins. But I can guess that without
making any tests.
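For reference, the "three simple loops" look roughly like this. It is a plain Python sketch of the textbook algorithm, not the code from the article; `matmul_naive` is a name picked for illustration.

```python
def matmul_naive(a, b):
    """Textbook three-loop matrix multiply on lists of lists."""
    rows, inner, cols = len(a), len(b), len(b[0])
    c = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            for k in range(inner):
                # Two operands are read for every multiply-add, so with
                # no blocking or reuse the loop leans on memory bandwidth.
                c[i][j] += a[i][k] * b[k][j]
    return c


product = matmul_naive([[1, 2], [3, 4]], [[5, 6], [7, 8]])
print(product)
```

Tuned libraries get their speed by blocking these loops so operands stay in cache and registers, which is exactly what the naive version does not do.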
Microsoft is not an innovative company as far as
Computer _Science_ is concerned. This is a more or
less true statement. However, they shine in other
aspects. It is to be expected because they're
in a unique position in history and have
to come up with ideas to maintain this position.
Here's a list off the top of my head:
- Triple E: Embrace-Extend-Exterminate
- Psychological take-over: MS announces it will
start competing in market X. Customers in market
X stop updating their software, waiting for MS
to release their thing. Market X vendors'
stock goes down, MS acquires one of them only
to lay off everybody and butcher the software.
- Including the court system in the business process:
it's OK to be fined for being a monopoly as long
as Wall Street expectations are met.
Any others?
That's not correct. The parent poster is referring to the fact that methods in Python have an argument called self, which you sometimes have to type, sometimes to declare, sometimes not.
I'm well aware of that. But 'self' is not a keyword. If you don't like to type it, use something shorter, say: 's'.
I have no clue what you mean by "declaring" and "sometimes not". It is always there, it can be called differently. And you never declare things in Python.
For the C++/Java folks, self is the equivalent of this. No one likes code like that: self.doThis(); self.doThat();
Except when you have a class member 'thing', a method argument 'thing', and a local variable 'thing'. In C++ you can also have a global variable called 'thing'. You might argue that it's bad style, but I've seen code that does that and plays tricks with underscores to sort out the mess. In short, in C++/Java you're allowed to be inconsistent: 'thing' and 'this.thing' mean the same, sometimes. Python doesn't have to save keystrokes because it's relatively succinct. I like the 'self' idea because it defines the context for me without looking it up in other places. You don't like it and that's fine with me - that's life.
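A small sketch of both points: 'self' is only a naming convention for the first method argument, and the explicit prefix keeps the attribute apart from an argument of the same name. The class `Counter` and the short name `s` are made up for illustration.

```python
import keyword


class Counter:
    def __init__(s, thing):
        # 's' plays the role usually given to 'self'.
        s.thing = thing

    def bump(s, thing):
        # 'thing' is the method argument, 's.thing' is the instance attribute.
        s.thing += thing
        return s.thing


self_is_keyword = keyword.iskeyword("self")
counter = Counter(1)
result = counter.bump(2)
print(self_is_keyword, result)
```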
But how should a class of Python 3.0 load a class from Python 1.0 over installation boundaries?
You just start Python 1.0 as a subprocess of the Python 3.0 process and have a proxy class in Python 3.0 that looks like the Python 1.0 class and communicates with the subprocess. A page or two of coding.
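Here is a rough sketch of the proxy idea. Everything in it is an assumption made for illustration: the names `LegacyProxy`, `Legacy` and `CHILD_SOURCE`, the one-JSON-message-per-line protocol, and the use of the current interpreter (`sys.executable`) as a stand-in for the old one. A really old interpreter has no json module, so the wire format would have to be something it can parse.

```python
import json
import subprocess
import sys

# Program run in the child interpreter: it owns the "old" class and
# answers one JSON request per input line with one JSON line.
CHILD_SOURCE = r'''
import json, sys

class Legacy:
    def greet(self, name):
        return "hello " + name

obj = Legacy()
for line in sys.stdin:
    request = json.loads(line)
    result = getattr(obj, request["method"])(*request["args"])
    sys.stdout.write(json.dumps({"result": result}) + "\n")
    sys.stdout.flush()
'''


class LegacyProxy:
    """Looks like the child's class; forwards method calls over pipes."""

    def __init__(self, interpreter=sys.executable):
        self._proc = subprocess.Popen(
            [interpreter, "-c", CHILD_SOURCE],
            stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True)

    def __getattr__(self, method):
        # Only reached for names not found on the proxy itself.
        def call(*args):
            request = {"method": method, "args": list(args)}
            self._proc.stdin.write(json.dumps(request) + "\n")
            self._proc.stdin.flush()
            return json.loads(self._proc.stdout.readline())["result"]
        return call

    def close(self):
        self._proc.stdin.close()  # end of input lets the child loop finish
        self._proc.wait()


proxy = LegacyProxy()
greeting = proxy.greet("world")
proxy.close()
print(greeting)
```

Error handling, inheritance across the boundary and non-JSON argument types are all left out; the point is only that the forwarding part is short.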
And why does Python evolve that uncoordinated?
I don't think it can be called uncoordinated - at most you may call the changes too big. The changes go through a well-defined process:
PEPs. And there is a vote. So in a sense you should blame the majority. Guido only breaks the ties. With Ruby it's Matz's opinion that matters
(read his opinions in an interview). He hinders Ruby's support for international encodings (see this post).
IMHO it would make sense to define now a Python 7.0, or something. And put everything into it you want, and then let the implementations evolve from the current point towards 7.0. So everyone knows what the final language will look like and knows that the current state is only an interim state.
You've just described what's called Python 3k. The idea has been entertained for a while on Python lists. The problem I have is that I cannot code for something that doesn't exist so I prefer to code for what's out there today. I guess it works for me and it doesn't for you - that's life.
I read Python books/sites and they say with a straight face "the great thing about Python is there's only one way to do things".. what they fail to mention is, one way *per Python version*.
I first used Python at 1.5 when it was pushed as a "prototyping language". I'm not coming back until they finish figuring out their object model and scoping rules.
So you're saying Python evolves. Don't C, C++, Java, and... Ruby do the same? It is easy to have many Python versions installed and usable at the same time. It is just as easy to add packages to any of the installed versions. Is it easier to manage change in Ruby?
Someday, a bright Pythoner will get hit by lightning and realize, "hey, str(obj) just calls obj.__repr__().. why the heck don't we all just call obj.__repr__() directly? And do we really need *four* underscores? And do we really need to type 'self' all the time???" At that moment, the Rubification will begin.
You're confusing 'str(obj)' with 'repr(obj)' but that's OK - you're a Ruby zealot. I think you're too picky, and if I were as picky about Ruby, here's what I'd say about what I'm reading here: 'obj.__id__' is the same as 'obj#id' and 'obj.__send__' is the same as 'obj#send'. So not only do I see four underscores (was it borrowed from Python by any chance?) and two ways of doing the same thing (there are more examples on the web page), but there is another peculiarity: '#' has two meanings: you can use it as in 'obj#id' and you can use it to start comments!
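To make the str/repr distinction concrete, a short sketch with two made-up classes, `Point` and `Bare`. 'str(obj)' calls '__str__', 'repr(obj)' calls '__repr__', and 'str' only ends up in '__repr__' when the class does not define '__str__'.

```python
class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __repr__(self):
        # Unambiguous form, aimed at developers.
        return f"Point({self.x!r}, {self.y!r})"

    def __str__(self):
        # Readable form, aimed at users.
        return f"({self.x}, {self.y})"


class Bare:
    # No __str__ here, so str() falls back to __repr__.
    def __repr__(self):
        return "Bare()"


p = Point(1, 2)
print(str(p), repr(p), str(Bare()))
```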
Now about 'self'. It comes from Smalltalk
and Matz (creator of Ruby) claims that Ruby borrows ideas from Smalltalk. So I don't see your point. On top of that, you don't have to use 'self'. You can use 'this', 'that', and 'other' or even 'S' if you aim at brevity.
Cross language inheritance is nothing unique. Jython has it. Jython and C# came about at roughly the same time.
As other posters suggest, problems start when you
realize that Java has single inheritance and Python
has multiple. The same problem occurs when bridging C# and Python.
I guess everything looks better when it comes from M$ PR department.
Thanks for the link - I like it a lot.
But the site's goal is different. They want to keep good ideas free for everybody. I want to keep the bad (or trivial) ideas. Ideas that would not get through the reviewing process on your site but would get a stamp of approval from USPTO.
I'm going to submit my idea to ShouldExist.org. Let's see what they think about it. After all, it will also help them, because they won't have to look at trivial submissions if they end up in my trash bin first.
So what you're saying is, people should spontaneously do the jobs of federal employees for free?
Don't get me started on USPTO's hard work... But seriously:
The short answer is: yes. Have you heard about "neighbourhood watch"? People watch their neighbours' property - shouldn't police be doing that?
Imagine any USENET group at all, [...] It's like your website, but free and extant.
What about /.? I think there are plenty of good ideas here (I hope my karma gets bumped for saying that :)
I think people like you are the problem. And I
mean it in the good sense. Here's why:
I totally agree with you that in a perfect world
I'm wrong. You and I both want quality, not reinventing USENET or /., or anything else. But do you think that USPTO does a good job at that? I want honeyd for patents.
Another quick point against USENET: how do you look for it (assuming that somebody has archived all the USENET and is intending to keep it and
make it searchable)? How do I look for prior art for regifting? What I'm proposing is specifically for prior art for the obvious.
It seems that prior art doesn't matter much in
this case but it certainly helped in the Eolas
case.
Has anybody thought about creating competition to
USPTO? Imagine a site (like freshmeat) accepting
ideas with a prototype implementation (perl, python, Lisp, etc.) - nothing general (other than
the description of the idea). This would
constitute a library of prior art for trivial
ideas - The Prior Art Library (TPAL).
Here are some quick thoughts about TPAL:
- reinforcement would come from whoever wants
to make money on it: if company A wants to charge
company B for IP, B hires lawyers and references
prior art (that assumes that A filed for patent
after there was an entry in the TPAL)
- whenever a developer thinks of an idea that
would be granted a patent (= any idea), an entry
is submitted to TPAL with a 20-line hack. No
worries to get sued later if somebody decides
to take it to USPTO.
- Since trivial stuff ends up in TPAL, USPTO
gets worthy patents for a change.
- Ideas would not die with companies or retire
with people that had them.
I've heard about ports of Python to Perl's new bytecode, called Parrot. The Ruby port to Parrot is called Cardinal.
I don't know if it's a reason to wake up, though.