Syscall Speed On Linux And Windows
1010011010 writes: "IBM has tested the syscall speed of Linux 2.2.16, 2.4.2 and Windows 2000. As it turns out, Linux is a little more than twice as fast. This may be interesting to people who have been reading the LKML recently, as a debate has been going on about syscall speed. Also, a method ("magic page") for further improving syscall speed is being developed by the kernel developers. The rate at which all aspects of Linux are improving -- kernel, GUIs, etc. -- is phenomenal. I think Linux is pretty cool now; I can't wait to see it in 18 months."
Did you actually read the article? It does not by any means "test the syscall speed" of Linux vs. Windows! It introduces timing routines for Linux and Windows which will be used for future articles comparing various things between Linux and Windows. The point of the article is not to reveal that Windows QueryPerformanceCounter() takes 1.945 usec and is therefore less than half as fast as a Linux gettimeofday(), but rather to demonstrate that BOTH systems are capable of providing sub-2-microsecond timing resolution, and that therefore the benchmarks to be performed in future articles will be accurate!
Feel free to interpret this as "Linux r0x, Windoze suxx!!", but really, it's about as significant as saying "gettimeofday() is only 14 characters long, and only lower-case, and can therefore be typed faster than the Windows equivalent, QueryPerformanceCounter(), which is 25 characters and mixed-case! Therefore programming under Linux is quicker and easier!".
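For anyone who wants to check the per-call cost of gettimeofday() on their own box, here is a minimal sketch of that kind of measurement. It is not the article's code: the helper names timeval_to_usec() and mean_gettimeofday_cost_usec() are made up for illustration, and it assumes a POSIX system with <sys/time.h>.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>

/* Convert a struct timeval to microseconds as a double. */
static double timeval_to_usec(const struct timeval *tv) {
    return (double)tv->tv_sec * 1e6 + (double)tv->tv_usec;
}

/* Hypothetical helper: mean wall-clock cost, in microseconds, of one
   gettimeofday() call, averaged over `iterations` back-to-back calls. */
static double mean_gettimeofday_cost_usec(long iterations) {
    struct timeval start, scratch, end;
    long i;

    assert(iterations > 0);
    gettimeofday(&start, NULL);
    for (i = 0; i < iterations; i++)
        gettimeofday(&scratch, NULL);   /* the call being timed */
    gettimeofday(&end, NULL);

    return (timeval_to_usec(&end) - timeval_to_usec(&start))
           / (double)iterations;
}
```

Calling mean_gettimeofday_cost_usec(1000000) and printing the result gives a rough per-call figure. Averaging over many calls matters because a single call is shorter than the microsecond granularity of struct timeval.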
Anyway, both methods are a wank. They should just use some inline asm to query the performance counters directly. Same code for both OS then.. :-)
High-resolution timers in Linux are a joke. The early POSIX-RT patches simply multiplied gettimeofday() by 1000.
The best way to get performance data on Linux or Windows is via the Intel chip's time-stamp counter; here's some example gcc code to do it:
static unsigned long long rdtsc(void) {
    unsigned int lo, hi;
    /* rdtsc returns the low 32 bits in EAX and the high 32 bits in EDX;
       the "=A" constraint only means the EDX:EAX pair on 32-bit x86 */
    __asm__ __volatile__ ("rdtsc" : "=a"(lo), "=d"(hi));
    return ((unsigned long long)hi << 32) | lo;
}
The previous method takes about 13 cycles on an Athlon 750. (DO NOT try and make it inline -- or gcc might optimize your to-be-timed code out from between the rdtsc() calls.) It is a straightforward matter to read the CPU clock speed from /proc/cpuinfo and use that to convert cycles into milli- (or micro-) seconds.
As with any timing method, take care to execute it a few times before you gather any information, to prime the i-cache.
Apparently the lameness filter believes that this is a "junk character post", so I'll type some more. Intel has a useful whitepaper that describes how to do this in an M$ compiler, available here: http://developer.intel.com/software/idap/resources/technical_collateral/pentiumii/RDTSCPM1.HTM
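The /proc/cpuinfo conversion mentioned above could look something like the sketch below. The function names are hypothetical, and it assumes an x86 Linux box whose /proc/cpuinfo has a "cpu MHz" line; other architectures may not have that field at all.

```c
#include <assert.h>
#include <stdio.h>

/* Parse one /proc/cpuinfo line. Returns 1 and stores the value in *mhz
   if it is a "cpu MHz : <number>" line, otherwise returns 0. */
static int parse_cpu_mhz_line(const char *line, double *mhz) {
    return sscanf(line, "cpu MHz : %lf", mhz) == 1;
}

/* Return the first "cpu MHz" value found in the file at `path`,
   or -1.0 if the file cannot be opened or has no such line. */
static double read_cpu_mhz(const char *path) {
    char line[256];
    double mhz = -1.0;
    FILE *f = fopen(path, "r");

    if (f == NULL)
        return -1.0;
    while (fgets(line, sizeof line, f) != NULL) {
        if (parse_cpu_mhz_line(line, &mhz))
            break;
    }
    fclose(f);
    return mhz;
}

/* A clock of N MHz ticks N times per microsecond,
   so microseconds = cycles / MHz. */
static double cycles_to_usec(unsigned long long cycles, double mhz) {
    assert(mhz > 0.0);
    return (double)cycles / mhz;
}
```

Usage would be along the lines of cycles_to_usec(end - start, read_cpu_mhz("/proc/cpuinfo")), where start and end are two rdtsc() readings taken around the code being timed. One caveat: on machines with frequency scaling the reported "cpu MHz" can change from moment to moment, so treat the conversion as approximate.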