Actually, Linux 2.4 had threads as kernel-scheduled entities; it's just that they were treated as processes instead of threads, which worked out just fine save for some POSIX compliance issues.
Remember -- if you run DOS on a CPU with a large enough L2 cache, you can fit the entire address space minus extended or expanded memory (or whatever they called it) into L2 cache!
Branch prediction is a workaround. It is not a radical performance enhancing technology. It is there to keep the CPU busy when it would otherwise be starved for instructions and data. Branch prediction is simply there to allow the CPU to operate at an insanely high clock speed as compared to the memory bus. And it only works well when you have a relatively fixed target to optimize for (namely Windows.) Branch prediction is also needed because later generations of the i686 processor have insanely long pipelines.
Not so much. Branch prediction is needed for pipelining to be a win. I'll agree it's pretty much necessary for good performance in pipelined architectures, but I don't think I'd call it a workaround. On average, most applications have a basic block size (number of instructions between branches) of four to five. I don't know about you, but I don't think that four is "insanely long". To get the most out of a pipelined superscalar processor you (obviously) want to have as many instructions running in parallel as possible, and to keep as many of the slots in your pipeline as full as possible so that you continue to have many instructions running in parallel. If you have a processor whose pipeline is longer than four or five stages, you'll be better off just picking a direction to go on the branch than you will be waiting for the result while your pipeline empties -- if you're wrong you'll be no worse off than if you waited, but if you're right, you've just gotten a number of useful cycles of work in for free. Branch prediction provides a better option than just picking a direction and running with it.
And no, branch prediction does not work well only for Windows. It works well whenever the direction the branch will take is easy to predict; for example, loops will consistently jump backwards. In fact, loops are one of the places where branch prediction in a pipelined architecture is probably most useful: if the loop has any significant number of iterations, it should be pretty obvious which way it will go, and the branch prediction will always be a win (except for when we exit the loop).
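To put a rough number on the loop case, here's a minimal sketch of a 2-bit saturating-counter predictor watching a single loop branch. This is a textbook scheme I'm swapping in for illustration, not a model of any particular CPU, and the function name and parameters are made up.

```python
# Hypothetical sketch: a 2-bit saturating counter predicting one loop branch.
# Counter values 0-1 mean "predict not taken", 2-3 mean "predict taken".

def simulate_loop_branch(iterations, runs=1):
    """Return (correct, total) predictions for a loop's backward branch.

    The branch is taken on every iteration except the last one of each run,
    where the loop exits (not taken).
    """
    counter = 2  # assumption: start in the "weakly taken" state
    correct = 0
    total = 0
    for _ in range(runs):
        for i in range(iterations):
            taken = i < iterations - 1  # the final iteration falls through
            predicted_taken = counter >= 2
            if predicted_taken == taken:
                correct += 1
            total += 1
            # Saturating update toward the actual outcome.
            if taken:
                counter = min(3, counter + 1)
            else:
                counter = max(0, counter - 1)
    return correct, total


correct, total = simulate_loop_branch(iterations=100, runs=10)
print(f"{correct}/{total} predicted correctly ({100.0 * correct / total:.1f}%)")
```

With a 100-iteration loop run 10 times this prints 990/1000 (99.0%): the only misses are the loop exits, one per run.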
The problem in x86 is that there are only eight registers, and even those have locked meanings to some degree.
Locked meanings? I'm not so sure. If we do a MUL EAX then the result goes into EDX:EAX. Since EAX gets clobbered, it'll get renamed. Combine that with the fact that most compilers generate code that does not use instructions in which registers have special meaning anyway, and I don't think this is actually a problem.
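Here's a toy sketch of what I mean by renaming. The physical register names (P0, P1, ...), the instruction tuples, and the four-register subset are all made up for illustration, and it never frees a physical register (real hardware reclaims them at retirement), so treat it as the idea only.

```python
# Hypothetical sketch of register renaming. Each write to an architectural
# register is given a fresh physical register, so a later write to EAX does
# not have to wait on earlier users of the old EAX value.

ARCH_REGS = ["EAX", "EBX", "ECX", "EDX"]  # subset, for brevity


def rename(instructions, num_physical=16):
    """Map (name, sources, destinations) tuples onto physical registers."""
    free = [f"P{i}" for i in range(num_physical)]
    mapping = {reg: free.pop(0) for reg in ARCH_REGS}  # initial assignment
    renamed = []
    for name, sources, destinations in instructions:
        # Sources read whichever physical register currently holds the value.
        phys_sources = [mapping[s] for s in sources]
        # Every destination gets a brand-new physical register.
        phys_dests = []
        for d in destinations:
            mapping[d] = free.pop(0)
            phys_dests.append(mapping[d])
        renamed.append((name, phys_sources, phys_dests))
    return renamed


program = [
    ("MUL EBX", ["EAX", "EBX"], ["EDX", "EAX"]),  # EDX:EAX = EAX * EBX
    ("MOV EAX, ECX", ["ECX"], ["EAX"]),           # unrelated write to EAX
    ("MUL EBX", ["EAX", "EBX"], ["EDX", "EAX"]),  # second multiply
]

for name, srcs, dests in rename(program):
    print(f"{name:14} reads {srcs} writes {dests}")
```

The implicit EDX:EAX destination of each MUL lands in a different pair of physical registers, which is why the "locked" register isn't much of a bottleneck in this picture.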
If you have that, then (at least) zsh lets you define a function named "precmd" that is executed just before each prompt, so you could just source all the files from there. (The similarly named "preexec" hook runs just before each command is executed instead.)
Well, it depends on what you want to do. Until I started to admin a Linux cluster, I didn't really understand why this was done either.
1) Most of the folders have a PURPOSE. /bin has vital system binaries (sh, login, and so on), /sbin has binaries and daemons vital to starting up the system, /etc has files containing startup and default settings, /var has variable information (like logs), /tmp is for temporary files, and so on.
Why is this powerful? Well...
- Want your machines to behave similarly on startup? Replicate /etc on these machines or have them mount a shared /etc on top of the original early in the boot process.
- Want to have faster access to temporary files? Make /tmp be on a ramdisk.
- Want to limit log sizes so they don't fill up the disk? Make a separate partition for /var.
- Want to share data across a bunch of *nix boxen? Make /usr/share and friends NFS shares.
In general, you can do interesting things by taking advantage of the fact that directories are usually per-purpose rather than per-program. Granted, in the desktop world this isn't so useful, but it makes cluster management and system maintenance SO much easier.
2) The issue you complain about can be taken care of by a package management system or some arrangement of symlinks.
I want something that is uber-configurable. You want something that is not. My needs are a superset of yours. You can't make something that is not configurable be uber-configurable but you can make something that is uber-configurable behave in an easy to configure way.
I think the BIOS sets aside 16 MB or so of memory at boot time as a cache for translated code, so you're not just limited to cache alone... that'd be insane. I'm guessing that's why they do the 16 MB thing...
Well, strong but flexible silk gives the spiders that spin it a pretty good reproductive advantage over those that don't. Over time, natural selection will favor those spiders with strong but thin (and, as such, difficult to see) webs. It's not too surprising that scientists are no match for millions of years of evolution.
AFAIK, our entire CS department at Caltech is either on Linux or Solaris, and I would be willing to bet that the vast majority of students use PINE to check their mail. I don't think exposure on the college level is a problem. ;)
Right, after all, if X is the law, then X must be the "right thing".
I agree with you one hundred percent. We should let the RIAA sue the heck out of the nation's brightest students. Better yet, why don't we toss those bastards in jail! After all, it's not like they're doing anything useful at MIT.
Agreed, this is pretty impressive, especially since the Athlon 64 didn't have all 16 of the registers it uses in 64-bit mode available. Like you said, the extra registers can increase performance by up to 30%. And in 64-bit mode it doesn't use any of the x87 register-stack brain damage. I can't wait to get my hands on one of these!
Actually, many of DEC's engineers went to AMD after the company went under - so they are, in a sense, building something like the Alpha. Also, the x86-64's extra registers will be a big help and the bus is intelligently designed... what more could you ask for?
It's just an OS, for goodness' sake, it's not the second coming.
Speaking of the second coming... In early Christian art, one would sometimes find the letters "Chi" and "Rho" with images of Jesus (meaning "Christ", I would assume). The letters "chi" and "rho" look like X and P respectively. Now, what is this about Windows XP? I think that Microsoft is trying to say something here... or maybe it's an appeal by their marketing department to a higher being...
But also consider that the 3.06GHz P4 must hyperthread - run 2 threads simultaneously - with only 3 integer units. Dual Athlons have (in total) 6 integer units and a larger L1 cache. Assuming that the Athlons get 2 IPC each (not too unreasonable), you get a total of 4 IPC for both chips. The P4 can probably get 2.5-3 IPC with Hyperthreading, which is (obviously) better than a single Athlon, but not dual Athlons. (This is all assuming, of course, ideal conditions: we're in the inner loop and it fits in cache.)
Some wild / off-subject speculation: AMD's Hammers will be Multithreading... WTF else would you do with SIX integer units?
Although it is less probable, someone could put malicious code in closed source software. Just because someone works for a software company doesn't mean they won't slip something like this trojan in... Granted, I think it would be much harder to do, but it is possible.
Bah! An Athlon 2600+ can't stand up to my 666MHz Int-hell Pentagram 3. It has UNHOLY power! (You can even sell your soul to the chip to get 100% branch prediction accuracy.)
(the '386 architecture is also not quite as elegant as the PPC architecture. Most of the registers would have to be stored in RAM, and that would hurt you BIGTIME).
Actually, if the PPC's emulated registers are used frequently, they would pretty much always be in the L1 cache. Much less hurt that way, to the tune of 1-3 cycles of access time, which, due to the long pipelines, means you can get to the data almost as fast as registers.
AFAIK, gcc 4.0 will include support for this. See the -fmudflap option in the gcc manual.
In many cases, Gentoo packages will remove compiler options that have been found to produce buggy code, so this is usually not a problem.
(Screen goes black. White text appears.)
(Redundant,-1)
Oh, damn.
there is just a bit of evil in everyone's head.
No! You're wrong. Methlab is a mathematics program for computers.
No reason why you couldn't just write a program / modify a driver to look at video memory - there's no protection there ...
The filter seems to block my school's main site, but not the CS department's site. Any ideas?