instruction 1 would be on processor one and instruction 2 would be on the second processor
Nope, because of data dependences. What if instruction 2 needs the result of instruction 1? How do you get it from processor A to processor B? Anything you can think of will be slower than simply running both instructions on the same processor.
If you structure your program in such a way that the even-numbered instructions have no dependences with the odd-numbered ones, then what you have is two programs. Any interesting nontrivial program will have data dependences, and then there is no magic bullet to achieve parallelism. Everywhere you look, there's another trade-off.
(I know: my research group does nothing but try to get decent performance, i.e. linear scalability, from multiprocessor machines. :-)
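To make the dependence problem concrete, here is a minimal sketch in Python. The toy instruction format (a destination plus a tuple of sources) and the helper name `cross_dependences` are made up for illustration and don't correspond to any real instruction set. It lists every case where an instruction would have to fetch a value produced on the other processor under the even/odd split.

```python
# Hypothetical toy "instructions": (destination, sources). Not a real ISA.
program = [
    ("a", ()),          # 0: a = constant
    ("b", ("a",)),      # 1: b = f(a)
    ("c", ("b",)),      # 2: c = f(b)
    ("d", ("a", "c")),  # 3: d = f(a, c)
]

def cross_dependences(instrs):
    """Return (producer, consumer) index pairs that end up on different
    processors when even-numbered instructions run on processor 0 and
    odd-numbered instructions run on processor 1."""
    last_writer = {}   # variable name -> index of the instruction that last wrote it
    crossings = []
    for i, (dest, srcs) in enumerate(instrs):
        for s in srcs:
            p = last_writer.get(s)
            if p is not None and p % 2 != i % 2:
                crossings.append((p, i))   # value must travel between processors
        last_writer[dest] = i
    return crossings

crossings = cross_dependences(program)
print(crossings)

# Two interleaved but unrelated chains: no crossings, i.e. really two programs.
independent = [("a", ()), ("x", ()), ("b", ("a",)), ("y", ("x",))]
print(cross_dependences(independent))
```

Every pair in the first result is a value that has to be shipped from one processor to the other before the consumer can run, which is exactly the cost that makes the naive split slower than a single processor.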
This leads me to your second suggestion:
or just divide the program into two halves to be executed on each
Sure, you can do that for trivial programs. Now show me how to divide Win98, with its 60 bazillion lines of code, into two halves. -- Patrick Doyle
You don't write code optimized for a certain cache size. You use general techniques for reducing cache footprint. You do optimize for a certain cache line size, but not for the cache size.
You're always best off making your cache footprint as small as possible, all else being equal. In multitasking OSes, if you have carefully used all the available cache, then one task switch kills your context anyway. OTOH, if you have two tasks with 64K working sets and a 128K cache, then a task switch doesn't cost much. Similarly, if you have a resident set of 32K, you can run four of these things.
So, different cache sizes don't make all that much difference for compilers or programmers. You'd use the same techniques to write code for 128K cache as for 512K cache.
Also, the 128K thing is not so much about a 95% hit rate; it's more about the resident set. If you graph performance of a program vs. cache size, you'll see an elbow in the graph where a small reduction in cache size causes a large reduction in performance. The location of this elbow in terms of cache size is the resident set of the program.
Studies show that most programs have a resident set of less than 128K, so that will do for most applications.
Actually, most programs have several elbows in the graph, and so have several resident sets. Usually most of them are less than 128K, so you get most of your performance benefit from a 128K cache size. -- Patrick Doyle
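The elbow can be reproduced with a toy simulation. This is only a sketch under loud assumptions: a fully associative cache with perfect LRU replacement, and a "program" that does nothing but sweep repeatedly over a fixed set of cache lines. Real caches are set-associative and real access patterns are messier, so the real elbow is softer. The names `miss_rate` and `WORKING_SET` are invented for the example.

```python
from collections import OrderedDict

def miss_rate(cache_lines, working_set_lines, passes=50):
    """Miss rate of a fully associative LRU cache when a program
    repeatedly sweeps over `working_set_lines` distinct cache lines."""
    cache = OrderedDict()
    misses = accesses = 0
    for _ in range(passes):
        for line in range(working_set_lines):
            accesses += 1
            if line in cache:
                cache.move_to_end(line)        # mark as most recently used
            else:
                misses += 1
                cache[line] = True
                if len(cache) > cache_lines:
                    cache.popitem(last=False)  # evict the least recently used line
    return misses / accesses

WORKING_SET = 64   # hypothetical resident set, measured in cache lines
for size in (16, 32, 48, 63, 64, 128):
    print(size, round(miss_rate(size, WORKING_SET), 3))
```

With these assumptions, every cache smaller than the working set misses on every access, and the miss rate collapses as soon as the cache holds the whole set. Making the cache bigger than that buys nothing further, which is the argument for not tuning code to one particular cache size.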
You could say that Unix systems are those that mostly conform to a POSIX standard, I suppose. Unix systems have some things in common (off the top of my head):
- Multi-user, with different access permissions for each user. By contrast, you can log in as anything you want in Windows 98 and still have access to everything.
- Separate address spaces. Some simpler OSes provide separate threads all running in the same address space. Separate address spaces are really implied by the multi-user requirement.
- Monolithic kernel. The kernel is not only responsible for managing processes and IPC, but also the filesystem, scheduling, virtual memory, etc. These things are typically in "server" processes in a microkernel architecture.
- Priority-based scheduling, with priority boosts for interactive processes. One of the distinguishing features of Unix when it was developed was that it was good at handling interactive applications by detecting their usage pattern and reducing their scheduling latency by boosting their priority.
- File semantics. The permission structure: read-write-execute for user, group, others. Reference-counted deletion: delete a file while someone's using it, and they get to keep using it; the file disappears when they're done. Directory structure: single tree-shaped namespace (modulo hard links) with devices mounted on certain branches. Symbolic links which act just like the actual file they name (as opposed to Windows shortcuts, which don't).
- Virtual memory. This is usually invisible to the user, so it wouldn't matter anyway. But Unix also provides such things as mmap, whose semantics would be very hard to duplicate without a real virtual memory system.
There are probably lots of things I have missed, but maybe that will get things rolling... -- Patrick Doyle
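Two of the items in the list above, reference-counted deletion and mmap, can be poked at from Python's standard library. This is a rough sketch that assumes a POSIX system (deleting an open file generally fails on Windows); the function names are made up for the example.

```python
import mmap
import os
import tempfile

def read_after_unlink(payload=b"still here"):
    """Delete a file while it is open, then keep reading it (POSIX behaviour)."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, payload)
        os.unlink(path)                    # the name is gone...
        gone = not os.path.exists(path)
        os.lseek(fd, 0, os.SEEK_SET)
        data = os.read(fd, len(payload))   # ...but the open descriptor still works
    finally:
        os.close(fd)                       # last reference dropped: storage is freed
    return gone, data

def edit_through_mmap(payload=b"hello world"):
    """Map a file into memory and modify it with ordinary slice assignment."""
    with tempfile.TemporaryFile() as f:
        f.write(payload)
        f.flush()
        with mmap.mmap(f.fileno(), 0) as m:   # length 0 maps the whole file
            m[:5] = b"HELLO"                  # a memory write, not a write() call
            m.flush()
        f.seek(0)
        return f.read()

print(read_after_unlink())
print(edit_through_mmap())
```

The second function is the part that leans on virtual memory: the file's pages are simply made to appear in the process's address space, and the kernel takes care of getting the changes back to the file.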
The halting problem is undecidable, not just hard: no algorithm can solve it in general. It doesn't matter how much computing power you throw at it.
This is totally different from things like factorization, which are merely impractical because they take so darn long. It's these kinds of problems that quantum computing may help solve. -- Patrick Doyle
Linux would, I think, be a good platform for the server end of the game, if not the client. It could use a more real-time scheduler, such as this one, but even without that, I think it beats Windows. -- Patrick Doyle
On the other hand fiber should be able to transmit a signal in 0.2 seconds to any place in the world.
Actually, the earth's circumference being 4e4 km, and light travelling at 3e5 km/s in a vacuum, that makes circumnavigation take 0.13 seconds. The other end of the earth is half that distance, requiring 0.067 seconds.
Of course, light in glass fiber only travels at roughly two-thirds of that speed, there will be additional delays from routers and switches, and not all traffic will travel in great circles, so 0.2 seconds is probably more realistic. -- Patrick Doyle
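The arithmetic is easy to check. The sketch below uses the round figures from the comment; the two-thirds factor for the speed of light in glass is an assumption added here (a typical value for optical fiber), and router and switch delays are ignored entirely.

```python
EARTH_CIRCUMFERENCE_KM = 4.0e4   # round figure used above
C_VACUUM_KM_S = 3.0e5            # round figure used above
FIBER_FRACTION = 2 / 3           # assumption: light in glass travels at about 2/3 c

def one_way_delay(distance_km, speed_km_s):
    """Propagation delay in seconds, ignoring routers and switches."""
    return distance_km / speed_km_s

half_way = EARTH_CIRCUMFERENCE_KM / 2   # distance to the far side of the earth
vacuum_delay = one_way_delay(half_way, C_VACUUM_KM_S)
fiber_delay = one_way_delay(half_way, C_VACUUM_KM_S * FIBER_FRACTION)

print(f"far side of the earth, vacuum: {vacuum_delay:.3f} s")
print(f"far side of the earth, fiber:  {fiber_delay:.3f} s")
```

Even before any switching delay, the fiber figure is already half of the 0.2 second estimate, so that estimate does not leave much slack for the rest of the network.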
I know it's off-topic, but here's my review of Titus...
Review of Titus
Patrick Doyle
Rating: 3/10
If you have ever wanted to see what would happen if Shakespeare dropped acid and decided to remake his play Titus Andronicus in the style of Spawn, then this is the movie for you. Otherwise, save your money; this one's a yawner.
The Oscar-level acting performances by nearly every member of the cast are the only saving grace of this cumbersome adaptation of the Shakespearean classic--a term I associate quite loosely with this, clearly not the greatest of the Great Bard's achievements.
Of course, Shakespeare is not renowned for the subtlety of his plot lines at the best of times; a fact which is only exacerbated by his use of monologues which tend to expose character traits already apparent. Yet, as if this were not enough, this film piles on ham-handed imagery and anachronism to drive home messages that had already left the ballpark long ago.
The obvious comparisons to Quentin Tarantino's work, particularly Pulp Fiction, do him no justice. The most comparable of his work could be From Dusk Till Dawn; however, I found Titus had less plot and more violence (which will be truly significant to anyone who has seen Tarantino's vampire slaughter spectacle). Not only would he never write such a straightforward story or tell it so bluntly; I imagine he would laugh (as did the audience watching the movie with me) at the humorously ineffective use of music.
However, it bears repeating that the performances of actors such as Anthony Hopkins (Titus), Jessica Lange (Tamora), and Alan Cumming (Saturninus) were riveting despite the disaster of a motion picture that surrounded them. (Hopkins' one apparent lapse into the Hannibal Lecter role could, perhaps, be attributed to poor direction.) And the daunting task of making this story's macabre plot developments seem feasible was carried out brilliantly by Matthew Rhys and Jonathan Rhys Meyers, whose portrayals of the psychotic brothers Demetrius and Chiron were wholly convincing.
In addition, the direction was not without occasional merit. <SPOILER> For example, the use of carnival-like music and characters to reveal to Titus the severed heads of his sons was quite effective in its augmented shock value. </SPOILER> However, for the most part, I found the devices employed to be obvious and tiresome.
All in all, this movie involved a lot of very talented people, and managed to bore despite them. -- Patrick Doyle
Please forgive my ignorance, but exactly what problem will new top-level domain names solve? Companies already get foobar.com, foobar.net, and foobar.org; wouldn't new TLDs just be more of the same?
Or are there supposed to be restrictions on who can register these new ones (like country codes)? -- Patrick Doyle
Because they make simple things easy. Hello World is two lines of code in C and C++:
#include <stdio.h>
int main(){ printf("Hello, world!\n"); return 0; }
Languages that make simple things easy will appeal to beginners. Beginners eventually become experts. Then you have a situation where all the experts are using a language not because it's a good language, but because it made simple things easy.
Popularity, then, is almost independent of quality.
The most appropriate language for a job is the one that makes that job easy. -- Patrick Doyle
Perhaps I misunderstand your remarks about the "delta" thing, but could you not make the same argument about the 3D image you see looking out a window? I don't get a headache from any fuzzy area around the edges of a window. -- Patrick Doyle
If your brain doesn't need rest, then why don't you lie down in a fully-conscious state and just rest your body?
I think probably there are parts of your brain that need less rest than others, so they spend the downtime running simulations (i.e. dreams) to train themselves. -- Patrick Doyle
First of all, the Earth's gravity is not much less in near-Earth orbit than it is on the ground. For instance, at 600 km, it would be within 20%.
Secondly, I think the terminal velocity argument is moot, since you (and I) have no idea what the terminal velocity of a rock would be.
However, here's something to notice...
Sans atmosphere, it would take about 350 seconds to hit the ground from 600 km. Thus, it would hit at about 3500 m/s, which is about 2 miles per second.
So, even without wind resistance, we're talking only 16% of the kinetic energy of the same rock at 5 miles/sec. Then you add atmospheric drag, and you're probably down in the single-digit percentages. -- Patrick Doyle
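Here is the back-of-the-envelope calculation spelled out, as a sketch with the same simplifications the estimate uses: no atmosphere, and gravity treated as constant at its surface value all the way down. The Earth radius and the unit conversion are standard figures added for the example.

```python
import math

G = 9.81                    # m/s^2, treated as constant for the whole fall (a simplification)
HEIGHT_M = 600e3            # 600 km, from the text
METRES_PER_MILE = 1609.344
EARTH_RADIUS_KM = 6371      # mean radius, used only for the gravity comparison

fall_time = math.sqrt(2 * HEIGHT_M / G)          # from h = g t^2 / 2
impact_speed = G * fall_time                     # v = g t, in m/s
impact_miles_per_s = impact_speed / METRES_PER_MILE
energy_ratio = (impact_miles_per_s / 5.0) ** 2   # kinetic energy scales with v^2
gravity_ratio = (EARTH_RADIUS_KM / (EARTH_RADIUS_KM + 600)) ** 2  # inverse-square law

print(f"fall time:      {fall_time:.0f} s")
print(f"impact speed:   {impact_speed:.0f} m/s = {impact_miles_per_s:.2f} miles/s")
print(f"energy vs 5 miles/s: {energy_ratio:.0%}")
print(f"gravity at 600 km vs surface: {gravity_ratio:.0%}")
```

Without rounding the impact speed down to 2 miles per second, the energy ratio comes out nearer 18% than 16%, which doesn't change the conclusion.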
Doubling the word size of your processor instantly doubles the chip area you need for most parts of the chip. The extra area you would use for going from 64 to 128-bit could probably be better spent in other ways, like caching or speculative execution.
However, 64-bit is definitely worthwhile over 32-bit because 32 bits can only address 4GB. Under Linux, for instance, you only get 3GB of those because the last GB is reserved for the system. This places a hard limit on the size of things you can map into your address space.
64 bits can address 16EB (that's Exabytes), which should stave off Moore's law for another 50 years or so. -- Patrick Doyle
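The numbers can be checked in a few lines. The 18-month doubling period is an assumption (one common reading of Moore's law) used here to show where a figure of about 50 years could come from; it is not stated in the comment itself.

```python
GIB = 2 ** 30   # bytes in a gibibyte
EIB = 2 ** 60   # bytes in an exbibyte

addr32 = 2 ** 32   # bytes addressable with 32-bit pointers
addr64 = 2 ** 64   # bytes addressable with 64-bit pointers

print(addr32 // GIB, "GiB with 32 bits")
print(addr64 // EIB, "EiB with 64 bits")

# Assumption: memory needs double every 18 months, so each extra
# address bit buys another year and a half.
DOUBLING_PERIOD_YEARS = 1.5
extra_bits = 64 - 32
years_of_headroom = extra_bits * DOUBLING_PERIOD_YEARS
print(years_of_headroom, "years of headroom")
```

Under that assumption the 32 extra address bits last 48 years, which is in line with the rough estimate above.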
No offence, but that's a really lame memory management scheme. Nobody would ever actually implement free() that way.
My guess was that Andy was referring to data flow analysis techniques that can determine that an object is never used once a certain function ends. Then, the object can be allocated on that function's stack, with very low overhead. -- Patrick Doyle
The argument was that compilers can be better because they can rewrite the whole thing each time. Ergo, if you are also willing to rewrite a given piece of code whenever certain conditions change, then you can beat the compiler.
However, for situations where you can't change the code every time anything changes (e.g. it's too big), my bet is on the compiler. -- Patrick Doyle
Um, Athlon IS AMD.
--
Patrick Doyle
Ok, I give up. Why didn't this get moderated up? :-)
--
Patrick Doyle
Can anyone tell me what information would be in the source code for a video driver which is not in the binary, that needs to be protected?
--
Patrick Doyle
The temperature at which water boils is directly proportional to air pressure.
Uh.. that's inversely proportional.. but yeah.
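Nope, not inversely: the boiling temperature goes up as the pressure goes up. It isn't strictly proportional, though. Doubling the pressure from 1 atm to 2 atm only raises water's boiling point from 100 °C to about 120 °C.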
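For a rough idea of how boiling temperature actually tracks pressure, here is a sketch using the integrated Clausius-Clapeyron relation. It assumes the heat of vaporisation of water is constant and the vapour behaves as an ideal gas, so it is an approximation rather than a steam table; the constants are standard textbook values and the function name is made up.

```python
import math

R = 8.314             # J/(mol K), gas constant
H_VAP = 40.66e3       # J/mol, heat of vaporisation of water near 100 C (assumed constant)
T_BOIL_1ATM = 373.15  # K, boiling point of water at 1 atm

def boiling_point_kelvin(pressure_atm):
    """Integrated Clausius-Clapeyron: 1/T = 1/T0 - R ln(P/P0) / H_vap, with P0 = 1 atm."""
    inv_t = 1.0 / T_BOIL_1ATM - R * math.log(pressure_atm) / H_VAP
    return 1.0 / inv_t

for p in (0.5, 1.0, 2.0):
    celsius = boiling_point_kelvin(p) - 273.15
    print(f"{p} atm -> {celsius:.1f} C")
```

The boiling point rises with pressure, but roughly with the logarithm of the pressure: doubling the pressure adds about twenty degrees rather than doubling the temperature.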
--
Patrick Doyle
Don't forget South Park!
Besides, Heavy Metal was just boobs.
--
Patrick Doyle
Maybe they curve a bit, but I'll pay you a jillion dollars to throw a baseball in a circle.
BTW, are you suggesting that each M&M was given just the right amount of spin to make it move in a circle of just the right radius?
--
Patrick Doyle
Try Unreal Tournament. It has female characters.
--
Patrick Doyle
Yes, Eiffel.
--
Patrick Doyle
What do you mean by a transaction? And what does word size have to do with how many of them you can do per second?
--
Patrick Doyle
I think you could get a sensible cross-product of six 7-dimensional vectors. Or, in general, n-1 vectors of n dimensions.
Remember how to get the cross-product manually? You make a matrix like this:
| i   j   k  |
| x1  y1  z1 |
| x2  y2  z2 |
...and take the determinant. To get a similar matrix with 7-dimensional vectors, you'd need six of them.
But I don't know how much use that would be.
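The determinant recipe generalizes mechanically. The sketch below is one way to write it in plain Python; `det`, `generalized_cross` and `dot` are names invented here. Component k of the result is the signed determinant you get by deleting column k from the stack of n-1 vectors, which is what expanding the symbolic matrix along its first row amounts to.

```python
def det(m):
    """Determinant by cofactor expansion along the first row (fine for small matrices)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for col, value in enumerate(m[0]):
        minor = [row[:col] + row[col + 1:] for row in m[1:]]
        total += (-1) ** col * value * det(minor)
    return total

def generalized_cross(vectors):
    """Cross product of n-1 vectors in n dimensions."""
    n = len(vectors) + 1
    if any(len(v) != n for v in vectors):
        raise ValueError("need n-1 vectors, each of length n")
    result = []
    for k in range(n):
        # Delete column k from every vector, then take the signed determinant.
        minor = [list(v[:k]) + list(v[k + 1:]) for v in vectors]
        result.append((-1) ** k * det(minor))
    return result

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# The ordinary 3D case: i x j = k.
print(generalized_cross([[1, 0, 0], [0, 1, 0]]))

# Six 7-dimensional vectors (arbitrary integers) give one 7-dimensional result.
vs = [
    [1, 2, 0, 0, 3, 0, 1],
    [0, 1, 4, 0, 0, 2, 0],
    [2, 0, 1, 1, 0, 0, 3],
    [0, 3, 0, 1, 2, 0, 0],
    [1, 0, 0, 2, 1, 1, 0],
    [0, 0, 2, 0, 1, 3, 1],
]
w = generalized_cross(vs)
print(w, [dot(w, v) for v in vs])
```

As with the familiar 3D cross product, the result is perpendicular to every input vector, so all the dot products in the last line come out as zero.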
--
Patrick Doyle