Robert Love Explains Variable HZ
An anonymous reader writes "Robert Love, author of the kernel preemption patch for Linux, has backported a new performance-boosting patch from the 2.5 development kernel to the 2.4 stable kernel. This patch allows one to tune the frequency of the timer interrupt, defined in 2.4 as "HZ=100". Robert explains 'The timer interrupt is at the heart of the system. Everything lives and dies based on it. Its period is basically the granularity of the system: timers hit on 10ms intervals, timeslices come due at 10ms intervals, etc.' The 2.5 kernel has bumped the HZ value up to 1000, boosting performance."
To make a long story short, for number crunching machines, servers, and other applications which don't need much user interaction, larger timeslices are preferable because it doesn't matter how responsive the user interface is. For desktop systems, the timeslice can be decreased to improve the responsiveness of the user interface and give a better "feel" to the system at the expense of a minor performance loss. Being able to tune these parameters to meet your needs is one of Linux's great strengths.
I tried recompiling the stock RedHat kernel, and sure enough there was an option in there to increase the HZ for the internal timer.
Slashdot: Where people pretend to be twice as smart as they really are by behaving like children.
The reason is that across a scheduling tick the processor's cache gets flushed and reloaded. This means that you end up doing a burst of memory reads, and that will dominate if the clock tick is too short.
-WolfWithoutAClause
"Gravity is only a theory, not a fact!"
My first thought is, "It's about time..." FreeBSD has had this for ages, and it struck me as strange that Linux was nailed to HZ=100 when I started porting some apps over.
Among other things, streaming media is an important beneficiary of this change. Let's say you have a medium-bitrate video stream (about 2.5 to 5 megabits per second). That means that your packets should be spaced about 2 to 4 milliseconds apart. This is easy to schedule when your system has a 1 millisecond granularity, but is a disaster when your clock ticks are 10 milliseconds apart -- your packets end up going out in clumps. Your 100BaseT network may not care either way, but if you are pushing video over ADSL, 802.11b, or ATM, you may find your packets getting lost along the way.
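To put rough numbers on the clumping, here is a back-of-the-envelope sketch. It assumes 1500-byte packets (my assumption, the post doesn't give a packet size), and the function names are made up for illustration.

```python
import math

def packet_interval_ms(bitrate_bps, packet_bytes=1500):
    """Ideal spacing between packets, in milliseconds, for a constant-rate stream.

    packet_bytes=1500 is an assumed packet size, not a figure from the post.
    """
    return packet_bytes * 8 / bitrate_bps * 1000

def burst_size(bitrate_bps, hz, packet_bytes=1500):
    """Worst-case number of packets that fall due within one timer tick.

    If the sender can only wake on tick boundaries, everything that came due
    since the last tick goes out back to back.
    """
    tick_ms = 1000 / hz
    return math.ceil(tick_ms / packet_interval_ms(bitrate_bps, packet_bytes))

if __name__ == "__main__":
    for hz in (100, 1000):
        for mbps in (2.5, 5):
            bps = int(mbps * 1_000_000)
            print(f"HZ={hz:4d} {mbps} Mbit/s: "
                  f"interval {packet_interval_ms(bps):.1f} ms, "
                  f"burst of up to {burst_size(bps, hz)} packet(s)")
```

Under those assumptions the spacing comes out to roughly 2.4 to 4.8 ms, so a 10 ms tick forces several packets out together while a 1 ms tick lets each one go out on its own.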
The reason is that across a scheduling tick the processor's cache gets flushed and reloaded.
;)
Whoa! What architecture is that!
That just doesn't sound right. The register files get flushed (well, swapped), but if that 2 meg cache got flushed on every context switch there wouldn't be much point in having it at all. You can get cache thrashing if too many cache-hungry programs are running simultaneously, but that's why you get a bigger cache if you run lots of those programs: so that their working set is saved across context switches.
Perhaps you mean the L1 caches? They can get tossed out cuz they can only hold a few inner loops and a few small working sets at a time anyway, but all that stuff should still be in the L2 cache and get loaded very quickly into those puny L1 caches. The L1 data cache is practically a register file anyway on P4s; 64-bit moves to/from it happen in a cycle...
Those L2->L1 moves might start to affect you at 1,000,000 ticks per second, but no one is proposing that, right? Even so, in a typical environment the other context is just the scheduler, which I can't imagine filling the L1 cache... It's not that complicated on a mostly idle machine. (Quick & dirty schedulers have been written, some of which looked through the entire process list. Erm, but on my machine there are fewer than 100 processes right now, still not so bad for L1.)
Anyway, I think 1000 is just fine; if you're doing real-time music synthesis on lotsa channels a larger number might be better. Someone in Europe is working on a music distro, so maybe they will discover that 8000 is the magic number for 16 channels at 48 kHz on a P4 at 2 GHz.
It would be neat if someone came up with metrics so that the tick was set so that 99.999% of the time the sound systems got their slices once every 500 usec, but otherwise the timeslices were as large as possible. Then you could just tune that 500 usec thing: make it longer if you're on a 386, shorter if you really need finer than half-millisecond timings.
I guess any program that needed frequent time slices could write to some proc file how much more often it should be called, or whether it could afford to be called less often. For example, 1.2 if it wants to be called more often, 0.8 if its time needs were met. The kernel would only have to ensure all the numbers it got were less than 1.0, and if the largest one were less than 0.95 it could even afford fewer time slices.
The kernel might also want to ensure through process accounting that the time-sensitive processes never got more than a certain percentage of the cycles available, even if it meant they got called less often. This is to prevent a denial of service where you just always write 10 to that proc file whenever you get run, so the tick rate grows until you spend all your time in the scheduler. It might also want to set a floor, so that a human can interact with the machine. The tick rate should never be less than, say, 10 on a PC (or 250 if it's my machine). Though for some special-purpose interstellar Linux probe you might want to sleep for a whole second at a time before checking your direction once on your way, so a tick rate of 1 would be acceptable once out of your solar system. (You still want 64-bit uptimes for your interstellar probe; it would be so embarrassing if it arrived and the aliens were like, "Woah, this species can't develop an operating system with more than 3 day uptime for a space probe that took like 40 years to get here, what l0s3rs!")
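For what it's worth, the feedback idea could be sketched like this in userspace Python. This is only a toy model, not kernel code: the function name, the per-step clamp and the 8000 ceiling are all my own assumptions. Note that it blunts the "always write 10" trick with a simple clamp and ceiling rather than the process accounting suggested above.

```python
def next_tick_rate(current_hz, requests,
                   floor_hz=10, ceiling_hz=8000,
                   min_step=0.5, max_step=2.0):
    """Return an adjusted tick rate from the ratios reported by processes.

    requests: numbers such as 1.2 ("call me more often") or 0.8 ("my timing
    needs were met"). All names and limits here are hypothetical.
    """
    # With no time-sensitive processes, treat demand as exactly satisfied.
    worst = max(requests, default=1.0)
    # Clamp a single adjustment so one greedy writer cannot blow up the rate.
    worst = min(max(worst, min_step), max_step)

    if worst > 1.0:
        # Somebody is not being served often enough: tick faster.
        new_hz = current_hz * worst
    elif worst < 0.95:
        # Everybody has slack: afford fewer, larger time slices.
        new_hz = current_hz * worst
    else:
        # Inside the 0.95..1.0 dead band: leave the rate alone.
        new_hz = current_hz

    # Keep the machine usable by a human, and out of the scheduler.
    return int(min(max(round(new_hz), floor_hz), ceiling_hz))

if __name__ == "__main__":
    print(next_tick_rate(1000, [1.2, 0.8]))   # one process wants more
    print(next_tick_rate(1000, [0.8, 0.9]))   # everyone has slack
    print(next_tick_rate(1000, [10]))         # greedy writer gets clamped
```

The dead band between 0.95 and 1.0 is there so the rate doesn't oscillate when everyone is roughly satisfied.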
Way to go. Any binary that used the 'HZ' variable (a constant defined in a header file) will need to be recompiled for these new kernels. Way to go, Linux. Keep it up.
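One way a userspace program can avoid baking in the header constant is to ask for the tick unit at run time through sysconf. A minimal sketch, assuming a POSIX system; the helper names are made up, and keep in mind that the value reported to userspace need not be the same as the kernel's internal HZ.

```python
import os

def clock_ticks_per_second():
    """Ask the system for its clock-tick unit instead of hard-coding HZ.

    Uses the POSIX sysconf name SC_CLK_TCK; this is the unit used for tick
    counts reported to userspace, which may differ from the kernel's HZ.
    """
    return os.sysconf("SC_CLK_TCK")

def ticks_to_seconds(ticks):
    """Convert a tick count (e.g. CPU time reported in ticks) to seconds."""
    return ticks / clock_ticks_per_second()

if __name__ == "__main__":
    print("ticks per second:", clock_ticks_per_second())
    print("250 ticks =", ticks_to_seconds(250), "seconds")
```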
The point is that virtual memory reduces the amount of real memory you need for each thread: each only takes what it really needs. Sure, if memory is cheap, it may not matter so much. But even if it is cheap, do you really want to give each process 1 gig of space on the off-chance that it might need it? I don't think so.
Virtual memory is when a process thinks it has 1 gigabyte of memory, but it actually only has, say, 128 megabytes. It can read or write to any bit of it, and the OS does what is necessary to ensure that it never notices the difference, obviously up to the actual system limits.
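You can see the "only takes what it really needs" behaviour from userspace with an anonymous mapping. A small sketch, assuming a typical demand-paged OS such as Linux; the 64 MiB size and the function name are arbitrary choices of mine. The process reserves a large region of address space, but only the handful of pages it actually touches need real memory behind them.

```python
import mmap

REGION_BYTES = 64 * 1024 * 1024  # address space to reserve (arbitrary size)

def touch_sparse(region_bytes=REGION_BYTES, pages_to_touch=4):
    """Map a large anonymous region and write to only a few widely spaced bytes.

    Returns the values read back from the touched bytes and from one byte
    that was never written (anonymous memory starts out zero-filled).
    """
    buf = mmap.mmap(-1, region_bytes)  # -1 means anonymous, not file-backed
    try:
        stride = region_bytes // pages_to_touch
        for i in range(pages_to_touch):
            buf[i * stride] = i + 1        # first write faults the page in
        touched = [buf[i * stride] for i in range(pages_to_touch)]
        untouched = buf[stride // 2]       # reads back as zero
        return touched, untouched
    finally:
        buf.close()

if __name__ == "__main__":
    touched, untouched = touch_sparse()
    print("touched bytes:", touched, "untouched byte:", untouched)
```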
Virtual memory and swap space go together very nicely, but one does not imply the other. You can use virtual memory to implement garbage collection, for example, with no backing store at all.
I guess there are other ways to do similar things: for example, don't use virtual memory, use real memory and set up the MMU so that each thread can only see its own map. But there are issues with that, and it isn't necessarily faster.
-WolfWithoutAClause
"Gravity is only a theory, not a fact!"