Robert Love Explains Variable HZ
An anonymous reader writes "Robert Love, author of the kernel preemption patch for Linux, has backported a new performance-boosting patch from the 2.5 development kernel to the 2.4 stable kernel. This patch allows one to tune the frequency of the timer interrupt, defined in 2.4 as 'HZ=100'. Robert explains, 'The timer interrupt is at the heart of the system. Everything lives and dies based on it. Its period is basically the granularity of the system: timers hit on 10ms intervals, timeslices come due at 10ms intervals, etc.' The 2.5 kernel has bumped the HZ value up to 1000, boosting performance."
This is actually an easy-to-tune kernel config variable. Quick and easy performance boosts to be had by all!
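To make the "granularity" point concrete, here is a rough sketch of how a requested delay gets rounded to tick boundaries. It is a simplified model, not kernel code: the function names are made up for illustration, and real kernels may add a further tick of slack.

```python
def tick_period_us(hz):
    """Length of one timer tick (one jiffy) in microseconds."""
    return 1_000_000 // hz


def timer_expiry_us(requested_us, hz):
    """Round a requested delay UP to the next tick boundary.

    Simplified model (an assumption): a timer can only fire on a tick,
    so the effective delay is the smallest whole number of ticks that
    covers the request.
    """
    ticks = -(-requested_us * hz // 1_000_000)  # integer ceiling division
    return ticks * 1_000_000 // hz


if __name__ == "__main__":
    for hz in (100, 1000):
        print(f"HZ={hz}: tick = {tick_period_us(hz)} us, "
              f"15 ms request fires after {timer_expiry_us(15_000, hz)} us")
```

Under this model a 15 ms request waits 20 ms at HZ=100 but exactly 15 ms at HZ=1000.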
An overclockable operating system. You wouldn't believe how long I've waited for this. Now where do I get a software heatsink?
To make a long story short, for number crunching machines, servers, and other applications which don't need much user interaction, larger timeslices are preferable because it doesn't matter how responsive the user interface is. For desktop systems, the timeslice can be decreased to improve the responsiveness of the user interface and give a better "feel" to the system at the expense of a minor performance loss. Being able to tune these parameters to meet your needs is one of Linux's great strengths.
I tried recompiling the stock RedHat kernel, and sure enough there was an option in there to increase the HZ for the internal timer.
Note that although that looks like a tenfold change, by the time 2.6 is released processing power will have doubled about twice since 2.4, so the change is equivalent to running a 2.4 system with HZ=250 instead of 100.
My first thought is, "It's about time..." FreeBSD has had this for ages, and it struck me as strange that Linux was nailed to HZ=100 when I started porting some apps over.
Among other things, streaming media is an important beneficiary of this change. Let's say you have a medium-bitrate video stream (about 2.5 to 5 megabits per second). That means that your packets should be spaced about 2 to 4 milliseconds apart. This is easy to schedule when your system has a 1 millisecond granularity, but is a disaster when your clock ticks are 10 milliseconds apart -- your packets end up going out in clumps. Your 100BaseT network may not care either way, but if you are pushing video over ADSL, 802.11b, or ATM, you may find your packets getting lost along the way.
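The spacing figures above can be checked with a little arithmetic. This is only a sketch: the 1250-byte packet size is an assumption (it is what makes 2.5 to 5 Mbit/s come out at 4 to 2 ms), and the function names are illustrative.

```python
PACKET_BYTES = 1250  # assumed packet size: 1250 bytes = 10,000 bits


def packet_interval_ms(bitrate_bps, packet_bytes=PACKET_BYTES):
    """Ideal gap between packets for a constant-bitrate stream."""
    return packet_bytes * 8 * 1000.0 / bitrate_bps


def packets_per_tick(bitrate_bps, hz, packet_bytes=PACKET_BYTES):
    """Packets that come due during one timer tick.

    A value above 1 means the sender can only keep up by emitting
    several packets back to back on each tick (a clump).
    """
    packets_per_second = bitrate_bps / (packet_bytes * 8)
    return packets_per_second / hz


if __name__ == "__main__":
    for mbps in (2.5, 5.0):
        bps = int(mbps * 1_000_000)
        print(f"{mbps} Mbit/s: one packet every {packet_interval_ms(bps)} ms, "
              f"{packets_per_tick(bps, 100)} per tick at HZ=100, "
              f"{packets_per_tick(bps, 1000)} per tick at HZ=1000")
```

At HZ=100 a 5 Mbit/s stream has about five packets due per tick, which is the clumping described above; at HZ=1000 it is less than one per tick.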
...but Windows had this way back in '95. Ouch.
The only benefit of increasing HZ is latency.
Presumably you meant "The only benefit of increasing HZ is decreasing latency" which is not a bad thing unto itself. Most people run interactive desktop applications, not scientific number crunching jobs for days at a time.
Having a minimum granularity of 1/50th of a second for a select() when HZ=100 really sucks, quite frankly.
Music players and animation programs have to resort to busy-wait loops to get good response, and they tie up the entire CPU in the process. This is completely unnecessary in a modern OS.
It's 1/50th, not 1/100th, of a second with HZ=100 because, given the way POSIX defines select(), you have to wait at least two jiffies, according to Linus.
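The two-jiffy figure works out as follows. The first function just restates the arithmetic from the comment; the second is a hedged way to measure what your own system actually does, using select() with no file descriptors as a sleep (this works on Unix-like systems, not on Windows). The function names are illustrative, and the measured result depends entirely on the kernel you run it on.

```python
import select
import time


def min_select_wait_ms(hz, min_jiffies=2):
    """Shortest select() timeout under the two-jiffy rule described above."""
    return min_jiffies * 1000.0 / hz


def measure_select_ms(timeout_s=0.001, rounds=20):
    """Worst observed wall-clock duration of a short select() timeout."""
    worst = 0.0
    for _ in range(rounds):
        start = time.monotonic()
        select.select([], [], [], timeout_s)  # no fds: just wait for the timeout
        worst = max(worst, (time.monotonic() - start) * 1000.0)
    return worst


if __name__ == "__main__":
    print("HZ=100  ->", min_select_wait_ms(100), "ms minimum")
    print("HZ=1000 ->", min_select_wait_ms(1000), "ms minimum")
    print("measured worst case for a 1 ms request:",
          round(measure_select_ms(), 3), "ms")
```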
Anyway, HZ > 500 sure as hell is better than HZ=100.
A HZ-less kernel with on-demand timer scheduling would be much better, though. IBM has such a kernel patch for their mainframe version of Linux to improve responsiveness when hundreds of Linux VMs are running concurrently.
Pity about the USER_HZ = 100 thing to accommodate all the borken programs that pick up HZ from the Linux kernel header file and assume it is a) constant, or worse yet b) 100.
Had HZ been a proper syscall instead of a #define in the first place for user-land programs, this would not have been a problem today.
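For what it's worth, user-land code can already ask for the tick unit at run time instead of baking in a header constant: POSIX sysconf() exposes it as _SC_CLK_TCK. A minimal sketch in Python follows (helper names are illustrative). Note that the value returned is the user-visible clock-tick unit, which is not necessarily the kernel's internal HZ.

```python
import os


def clock_ticks_per_second():
    """Ask the system for its user-visible tick rate at run time."""
    return os.sysconf("SC_CLK_TCK")


def ticks_to_seconds(ticks):
    """Convert a tick count (e.g. CPU times reported in ticks) to seconds."""
    return ticks / clock_ticks_per_second()


if __name__ == "__main__":
    hz = clock_ticks_per_second()
    print("ticks per second reported by sysconf:", hz)
    print("250 ticks =", ticks_to_seconds(250), "seconds")
```

A program written this way keeps working whether the reported value is 100 or something else.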
Can someone do me a big favor and post RedHat 8.0's asm-i386/param.h file so I can see how they defined HZ, USER_HZ and friends? I'd like to see it without actually going to the trouble of installing RedHat 8.0.
Wow. Are you saying that linux pages out the running process at every context switch? I think I might have found an explanation for X's choppiness.
Yes. Say you have one thread running flat out and another that needs to do 100 microseconds of work. With 100 ticks per second you will lose 5 usec to context switching and 9900 usec to waiting for the next context switch.
No! The task does 100 microseconds of work and then calls the sleep command, or does I/O or whatever. This ultimately goes through the kernel and the kernel does an early context switch. It certainly doesn't waste the rest of the timeslice.
Incidentally, the overhead of doing the context switch is much bigger than you say here: one of the things that the kernel has to do is flush the caches as it swaps the virtual memory in and out, and that will slow the system for tens of thousands of instructions afterwards.
Anyway, you're wrong about it not improving performance; it certainly can improve latency, which is very definitely a performance metric; but obviously you'll lose some cpu time due to the more frequent context switches that will occur.
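The trade-off being argued in this subthread can be put into rough numbers. This is a toy model, not a measurement: the 5 usec per-tick cost is the figure quoted a few comments up (and disputed as too low once cache effects are counted), so treat it as a parameter to vary. The function names are illustrative.

```python
def tick_overhead_fraction(hz, cost_per_tick_us):
    """Fraction of CPU time spent servicing ticks, at a fixed cost per tick."""
    return hz * cost_per_tick_us / 1_000_000


def worst_case_wait_us(hz):
    """Longest a woken task waits for the next tick-driven reschedule."""
    return 1_000_000 / hz


if __name__ == "__main__":
    for cost_us in (5, 50):  # 5 usec as quoted above; 50 as a pessimistic guess
        for hz in (100, 1000):
            print(f"cost={cost_us} us, HZ={hz}: "
                  f"overhead {tick_overhead_fraction(hz, cost_us):.2%}, "
                  f"worst-case wait {worst_case_wait_us(hz):.0f} us")
```

In this model, going from HZ=100 to HZ=1000 cuts the worst-case wait tenfold and multiplies the tick overhead tenfold; whether that is a good deal depends on how large the real per-tick cost is.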
-WolfWithoutAClause
"Gravity is only a theory, not a fact!"
Way to go. Any binary that used the 'HZ' variable (a constant defined in a header file) will need to be recompiled for these new kernels. Way to go, Linux. Keep it up.
Windows 95 absolutely does have virtual memory. (Are you thinking of Mac OS 9??) It's true that it crashed a lot, and that's because the protections afforded by a real OS were not in '95 (it was easy to turn off virtual memory protections and trample on the address space of another process). But each process definitely had its own virtual address space, and most of the things that a real OS does (page table, TLB, paging to disk, etc.) were in 95. I don't know what this business is about not having to page out all the memory -- I never saw the 95 source code but it probably does what any other real OS does: set the page table to the one of the process and flush the TLB.
No, look swapping is not the same as virtual memory. Virtual memory is useful even in the absence of any disk or swap space at all.
I wasn't clear enough, I see 0% chance that virtual memory will disappear from Linux because it provides protection from one application playing with another's memory.
I do fear for swapping, but only years from now when it's not so common. I do not fear for the loss of MMU support including virtual memory.
It isn't clear this is what I'm saying from that post, but if you read what I said before I think it's clear. I was agreeing with you on the point of virtual memory not being a big deal, but adding that swapping was in dirge territory on the modern systems that will benefit from upping HZ. Your original comment on swapping is what inspired me to write the comment, cuz I thought you were making the point that it's not a performance loss to use virtual memory even if you never swap, while my point on swapping had nothing to do with performance, but code maintenance. If a signal never fires, who cares how long it takes to handle it, after all.
If you have to do any swapping to disk I don't care how much you try to tune HZ, you need to buy more memory or run fewer apps to get a snappier system.
But enough on this point, it's tangential and I think I agree with everything you said in this last comment without exception.
Robert Love will be giving a talk on 2.5 and the preemption patches at the Southern California Linux Expo
If you use the promo code: F633F you can get into the expo free.
I find this funny... Solaris defaults to 100 interrupts per second. As a matter of fact, in the book "Solaris Internals" RMC and JM basically say "be *very* careful when increasing the interrupt rate, because this can reduce system performance dramatically" (pg. 56). On Solaris this is accomplished by adding the line:
set hires_tick = 1
to /etc/system. Setting hires_tick to true increases the programmable clock interrupt frequency to 1000 interrupts per second. Now I realize that comparing Solaris to Linux is like comparing apples to oranges, but I am pretty sure that the functions that take place on interrupt are pretty standard:
- timeslicing
- tracking of various resources (mem, cpu, etc)
- checking/calculating of paging parameters
and so on. I can understand that this would be good for real time systems, but for the average desktop or server, will this really increase performance? Are Solaris and Linux really that different on what would seem to be a rather fundamental issue?
- Andrew
Windows 95 absolutely does have virtual memory. (Are you thinking of Mac OS 9??)
Mac OS 7 had virtual memory. It just wasn't protected virtual memory until Mac OS X.
- Something is polling that should be event-driven. Some applications (Older versions of Netscape come to mind) like to do something on every tick. (For Netscape, that was a lousy architectural decision made so it would work on the classic MacOS and 16-bit Windows.) There are also some really crappy interprocess communication systems that are polled. Find and fix.
- Thread scheduling priorities are wrong. This is a subtle issue, but basically, the threads that aren't CPU bound but have tight latency requirements have to have priority over the threads that are CPU bound and don't have tight latency requirements. Smarter schedulers try to achieve this automatically, but some of the guesses made in the UNIX world are tied, for historical reasons, to the TTY end of the system and are no longer appropriate.
A useful exercise is to turn the tick rate way down (maybe 1HZ) and put a compute loop job in the background. Everything that's broken according to the above criteria will turn into a toad, which helps debug the problem.