Posted by michael from the make-kpkg-kernel_image dept.
Sivar writes "Andrew Morton of EXT3 fame has posted benchmarks of Linux 2.5.47 prerelease compared with the latest from the current 2.4 series. With some tasks more than tripling in performance, the future looks very promising."
Well, according to Linux Format the reason you need an NVidia card for UT2003 is that only a commercial driver can implement the patented S3 texture compression used by UT2003! Sounds like the beef there is with Epic, not ATI. btw, is there a good source on how to get cvs X up and running (on debian for preference)? I got a new laptop with an ATI M9 a couple of weeks ago and know that this is only supported in CVS at the mo. The last time (4+ years ago) I tried compiling X, things worked far less than wonderfully!
--
Never underestimate the dark side of the Source
Re:Can't get a speedup of more than 10
by
Zorton
·
· Score: 3, Insightful
So in other words the more time you spend rescheduling things the less time you have for executing the code you have scheduled.
I'm glad you put the "of EXT3 fame" bit, I was worried the article might be talking about the infamous author.
Although he might end up on the front page of /. if he writes an unauthorized biography of Mr. Gates, what kind of juice could be dragged up from the past... I wonder?
-- Are you local? There's nothing for you here!
2.5
by
Anonymous Coward
·
· Score: 5, Funny
Will it make the internet faster?
I'm really sorry.
by
FreeLinux
·
· Score: 5, Informative
Try it again.
In a reply on lkml to Aaron Lehmann's praising of the contest results of the latest 2.5-mm kernel Andrew Morton [interview] explains some of the important performance and design differences between the 2.4 stable series and the 2.5 development series accompanied by illustrating benchmarks.
Most significant gains can be expected at the high end such as large machines, large numbers of threads, large disks, large amounts of memory etc. [...] For the uniprocessors and small servers, there will be significant gains in some corner cases. And some losses. [...] Generally, 2.6 should be "nicer to use" on the desktop. But not appreciably faster.
From: Aaron Lehmann
To: linux-kernel
Subject: Re: [BENCHMARK] 2.5.47{-mm1} with contest
Date: Mon Nov 11 2002 - 18:04:53 AKST
On Tue, Nov 12, 2002 at 10:31:38AM +1100, Con Kolivas wrote:
> Here are the latest contest (http://contest.kolivas.net) benchmarks up to and
> including 2.5.47.
This is just great to see. Most previous contest runs made me cringe when I saw how -mm and recent 2.5 kernels were faring, but it looks like Andrew has done something right in 2.5.47-mm1. I hope the appropriate get merged so that 2.6.0 has stunning performance across the board.
From: Andrew Morton
To: linux-kernel mailing list
Subject: Re: [BENCHMARK] 2.5.47{-mm1} with contest
Date: Tue Nov 12 2002 - 02:04:23 AKST
Aaron Lehmann wrote:
>
> On Tue, Nov 12, 2002 at 10:31:38AM +1100, Con Kolivas wrote:
> > Here are the latest contest (http://contest.kolivas.net) benchmarks up to and
> > including 2.5.47.
>
> This is just great to see. Most previous contest runs made me cringe
> when I saw how -mm and recent 2.5 kernels were faring, but it looks
> like Andrew has done something right in 2.5.47-mm1. I hope the appropriate get merged so that 2.6.0 has stunning performance across
> the board.
Tuning of 2.5 has really hardly started. In some ways, it should be tested against 2.3.99 (well, not really, but...)
It will never be stunningly better than 2.4 for normal workloads on normal machines, because 2.4 just ain't that bad.
What is being addressed in 2.5 is the areas where 2.4 fell down: large machines, large numbers of threads, large disks, large amounts of memory, etc. There have been really big gains in that area.
For the uniprocessors and small servers, there will be significant gains in some corner cases. And some losses. Quite a lot of work has gone into "fairness" issues: allowing tasks to make equal progress when the machine is under load. Not stalling tasks for unreasonable amounts of time, etc. Simple operations such as copying a forest of files from one part of the disk to another have taken a bit of a hit from this. (But copying them to another disk got better).
Generally, 2.6 should be "nicer to use" on the desktop. But not appreciably faster. Significantly slower when there are several processes causing a lot of swapout. That is one area where fairness really hurts throughput. The old `make -j30 bzImage' with mem=128M takes 1.5x as long with 2.5. Because everyone makes equal progress.
Most of the VM gains involve situations where there are large amounts of dirty data in the machine. This has always been a big problem for Linux, and I think we've largely got it under control now. There are still a few issues in the page reclaim code wrt this, but they're fairly obscure (I'm the only person who has noticed them;))
There are some things which people simply have not yet noticed.
Andrea's kernel is the fastest which 2.4 has to offer; let's tickle its weak spots:
Run mke2fs against six disks at the same time, mem=1G:
2.4.20-rc1aa1:
0.04s user 13.16s system 51% cpu 25.782 total
0.05s user 31.53s system 63% cpu 49.542 total
0.05s user 29.04s system 58% cpu 49.544 total
0.05s user 31.07s system 62% cpu 50.017 total
0.06s user 29.80s system 58% cpu 50.983 total
0.06s user 23.30s system 43% cpu 53.214 total
2.5.47-mm2:
0.04s user 2.94s system 48% cpu 6.168 total
0.04s user 2.89s system 39% cpu 7.473 total
0.05s user 3.00s system 37% cpu 8.152 total
0.06s user 4.33s system 43% cpu 9.992 total
0.06s user 4.35s system 42% cpu 10.484 total
0.04s user 4.32s system 32% cpu 13.415 total
Write six 4G files to six disks in parallel, mem=1G:
2.4.20-rc1aa1:
0.01s user 63.17s system 7% cpu 13:53.26 total
0.05s user 63.43s system 7% cpu 14:07.17 total
0.03s user 65.94s system 7% cpu 14:36.25 total
0.01s user 66.29s system 7% cpu 14:38.01 total
0.08s user 63.79s system 7% cpu 14:45.09 total
0.09s user 65.22s system 7% cpu 14:46.95 total
2.5.47-mm2:
0.03s user 53.95s system 39% cpu 2:18.27 total
0.03s user 58.11s system 30% cpu 3:08.23 total
0.02s user 57.43s system 30% cpu 3:08.47 total
0.03s user 54.73s system 23% cpu 3:52.43 total
0.03s user 54.72s system 23% cpu 3:53.22 total
0.03s user 46.14s system 14% cpu 5:29.71 total
Compile a kernel while running `while true;do;./dbench 32;done' against the same disk. mem=128m:
2.5.46:
Throughput 19.3907 MB/sec (NB=24.2383 MB/sec 193.907 MBit/sec)
Throughput 16.6765 MB/sec (NB=20.8456 MB/sec 166.765 MBit/sec)
make -j4 bzImage 412.16s user 36.92s system 83% cpu 8:55.74 total
2.5.47-mm2:
Throughput 15.0539 MB/sec (NB=18.8174 MB/sec 150.539 MBit/sec)
Throughput 21.6388 MB/sec (NB=27.0485 MB/sec 216.388 MBit/sec)
make -j4 bzImage 413.88s user 35.90s system 94% cpu 7:56.68 total - fifo_batch strikes again
It's the "doing multiple things at the same time" which gets better; the straightline throughput of "one thing at a time" won't change much at all.
Corner cases....
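One rough way to read the mke2fs and 4G-write figures in the mail above is to compare when the last of the six parallel jobs finished on each kernel. The Python sketch below does that arithmetic. Choosing the slowest job as the metric is an assumption made here for illustration, not something the mail states, and `to_seconds` is a hypothetical helper name.

```python
def to_seconds(total):
    """Convert a wall-clock total such as '14:46.95' or '53.214' to seconds."""
    seconds = 0.0
    for field in total.split(":"):
        seconds = seconds * 60 + float(field)
    return seconds

# Slowest of the six parallel jobs in each run, copied from the figures above
# as (2.4.20-rc1aa1, 2.5.47-mm2).
slowest = {
    "mke2fs x6": ("53.214", "13.415"),
    "write 6 x 4G": ("14:46.95", "5:29.71"),
}

ratios = {}
for name, (old, new) in slowest.items():
    ratios[name] = to_seconds(old) / to_seconds(new)
    print(f"{name}: 2.4.20-rc1aa1 took {ratios[name]:.2f}x as long as 2.5.47-mm2")
```

Other summaries (mean completion time, fastest job) would give different ratios, so treat these numbers as one view of the data rather than a headline result.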
Nice to see Linux "Growing Up"
by
zanerock
·
· Score: 4, Interesting
Nice to see Linux doing good on big machines with standard packages and such. I love linux, and it's the only thing I use at home for anything serious, but commercial software has always had the edge on *big* things (big disks, large processes, etc.). With recent advances in process management, and now this, a lot more people will be able to use Linux top to bottom.
I think one interesting thing that could come out of this is that IBM (and others) will be pushed more and more towards a pure service or application only niche. They won't always be able to say, "Sure Linux is great for the workstation, but what about your 8 TB database?" There's a ways to go, but a lot of the features are falling into place.
Having a unified OS from your palmtop to your TB file server will open up a lot of possibilities for people. My personal interest is in a next level of integration which is more natural to use and easier to develop, and we're getting close.
Re:Nice to see Linux "Growing Up"
by
iabervon
·
· Score: 3, Insightful
IBM is also going to stay in the high-end hardware department; it'll be "Sure commodity hardware is great for the workstation, but what about your 8 TB database that has to survive even if someone saws it in half down the middle?" This also puts them in, essentially, the BIOS department for these machines (you want to run your web site off of whatever portion of your database machine isn't actually being used by the database, without risking problems if the web server gets hacked).
Re:Nice to see Linux "Growing Up"
by
zanerock
·
· Score: 3, Interesting
I don't disagree at all. I said that this would begin to push others more to service. With each new thing that you can get for free that works just as well as what others charge for, you capture a little bit more of the market. This alone, and what has been developed to date, will not push IBM out, nor has everything that needs to be done for such things as you describe been done. But it's a step.
I'm not talking now, nor even tomorrow, but in 5-10 years, I think we could see a very different landscape in how old school commercial software and hardware companies (or, in IBM's case, departments) work.
If you can spend $1 million on developing your whizzy new file system, or you can use something that's freely available (or spend $100,000 to tweak it), then the economics of it start to push people out of commercial development in some areas, especially around OS and OS functionality. Instead, you just consult, or deploy, or support and such.
Hello moderators! This is not a troll. He pointed out something that has a nice graph to support what he's saying. Not only that, but it's very well known. AIO on Linux has never been stellar...but should be soon enough.
Someone, please mod the parent back up. He wasn't trolling and was simply stating fact!
Then it's just karma whoring then?
Look, he pulled the gif out of his ass, and tried to pass himself off as an expert.
You might look at some of his other posts.
Meanwhile I do recall seeing such a graph, much like, if not that graph, by Open Bench Labs (name may not be correct -- it's been a while). Not only that, but the graph EXACTLY matches known expectations on a common Linux AIO implementation (which is purely userland). This is why kernel level AIO implementations have been underway for some time now.
In other words, you may not like the message but it EXACTLY matches the current state of some AIO implementations on Linux. Call it a lie if you like; meanwhile those of us that know will simply nod, move on, and occasionally laugh at those that continue to hide their heads in the sand.
Believe it or not, Linux isn't the be-all, end-all of OS's. If it were, there wouldn't be a need for the 2.6 kernel.
Re:Can't get a speedup of more than 10
by
certron
·
· Score: 3, Informative
"It's impossible to get a speedup of more than 10 with any processor-related activities.
Using Amdahl's Law, one can find that Speedup = (s + p ) / (s + p / N ) where N is the number of processors, s is the amount of time spent (by a serial processor) on serial parts of a program and p is the amount of time spent (by a serial processor) on parts of the program that can be done in parallel."
While I'm no expert in software engineering (and I haven't really looked over the equation you put too closely) I think it assumes the original was written with some sort of intelligence behind it. I bet I could write some really atrocious code that would be so incredibly inefficient that almost anyone else could get a huge performance gain from it.
I'm not sure if I would have to try hard or not try at all to write really bad code. :-)
So what does this mean for the everyday linux user
by
pardasaniman
·
· Score: 3, Interesting
For a guy such as myself, who does all his daily tasks on a linux box, what does this mean? Will it mean faster loading times or better stability? Or will it make little difference at all?
Performance gains mostly for high-end
by
Dacmot
·
· Score: 5, Interesting
I'm a huge linux fan and I love to brag about how much better than Windows it is, etc. However I don't think it's right to say false truth like "linux 2.6 will be 3 times faster!!!!!" KernelTrap mentions that:
Most significant gains can be expected at the high end such as large machines, large numbers of threads, large disks, large amounts of memory etc. [...] For the uniprocessors and small servers, there will be significant gains in some corner cases. And some losses. [...] Generally, 2.6 should be "nicer to use" on the desktop. But not appreciably faster.
Some of the biggest improvements for desktop responsiveness can be found (for Kernel 2.4.x) at Con Kolivas' web site of performance linux patches.
--
Re:Performance gains mostly for high-end
by
Sivar
·
· Score: 2
However I don't think it's right to say false truth like "linux 2.6 will be 3 times faster!!!!!"
That would be why I said: "With some tasks more than tripling in performance,..."
I doubt any semi-knowledgeable person is going to take that statement to mean that kernel 2.6 makes a Linux system three times faster, but depending on what they use that system for, it may do just that. The performance figures are very respectable alone, but when you consider that the kernel hasn't even been frozen yet and that tuning hasn't begun, as I said, the future looks very promising.
-- Computer Science is no more about computers than astronomy is about telescopes. --E. W. Dijkstra
Re:Performance gains mostly for high-end
by
blakestah
·
· Score: 4, Informative
Yes.
The fine-grained locking improvements on SMP will make it noticeably better for SMP boxes.
A very big improvement is that IDE has been parallelized, meaning that if you use multiple IDE devices at once you will see a "night and day" difference in performance.
If you are uniprocessor and all SCSI and already use low-latency patches, well, as you were.
Re:So what does this mean for the everyday linux u
by
FreeLinux
·
· Score: 2
It means that you won't see too much speed-up on your desktop machine. But, if you run a big server that does multiple processes at once, say Oracle, you could see significant performance gains.
Re:Make it simple please
by
straponego
·
· Score: 2, Insightful
You mean, when will compiling a Linux kernel, which most users will never need to do, become as straightforward as recompiling your Windows kernel, which you can't do?
make xconfig && make dep && make bzImage && make modules && make modules_install && make install
For those of you wondering, this is not a proof that you cannot optimize something to be more than 10 times faster in general.
For example, suppose you have an algorithm A that takes X time. And then suppose you change it to algorithm B that takes 11X time by making it do algorithm A 11 times. Well algorithm B can be optimized to be 11 times faster by making it algorithm A instead, since they give the same result.
Anyway, just wanted to make sure no one was missing the "processor-related activities" clause in your statement.
-- You know where you are? You're in the $PATH, baby. You're gonna get executed!
Re:Make it simple please
by
jericho4.0
·
· Score: 5, Informative
It'll be quite a while before recompiling a kernel gets any simpler. Recompiling assumes that you know (somewhat) what you're doing. Keep at it. It took me at least 10 tries before I compiled a bootable kernel.
Quick hint: install the kernel sources that came with your dist. Use the .config file found in these to compile first. These are the settings that your kernel was compiled with. Then you can use make xconfig to alter a known working config. Good luck.
-- "A language that doesn't affect the way you think about programming, is not worth knowing" - Alan Perlis
Re:By Sturgeon's Law
by
rangek
·
· Score: 2, Insightful
The parent post should be ignored. The information content, while real, is misapplied, and that "10" number is pulled out of his ass.
That is what I thought at first, too. But the original poster is right (in a way): a factor of 10 is about the best you can hope for when parallelizing code, since Amdahl's (or some other guy's) law also says something like 90% of the time is spent in 10% of the code. That makes s=10 and p=90. The limit of his equation, (s+p)/(s+p/n), as n goes to infinity is 10. A number not pulled out of anyone's ass.
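To make the arithmetic in this sub-thread concrete, here is a minimal Python sketch of the quoted formula, Speedup = (s + p) / (s + p / N). The function name `amdahl_speedup` and the sample processor counts are illustrative assumptions; s=10 and p=90 are the values used in the comment above.

```python
def amdahl_speedup(s, p, n):
    """Speedup = (s + p) / (s + p / n) for serial time s, parallel time p, n processors."""
    return (s + p) / (s + p / n)

# s=10 and p=90 come from the comment above; the processor counts are arbitrary.
for n in (1, 2, 10, 100, 10_000):
    print(f"N={n:>6}: speedup {amdahl_speedup(10, 90, n):.2f}")

# As N grows, p / N shrinks toward zero, so the speedup approaches (s + p) / s.
limit = (10 + 90) / 10
```

The ceiling of 10 follows only from assuming that 10% of the run time is serial; a different split gives a different ceiling.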
Maybe the original poster should be moderated down because I don't think the stuff here is really about parallelization (they talk about speed ups on uniproc systems too), but for the parallel case, he seems to be right.
Re:This is This is the exact opposite of my findin
by
be-fan
·
· Score: 5, Informative
Um, doing benchmarks between an Athlon XP and a Pentium 4 is folly. The P4 has notoriously slow context switching performance. Also, if you are running a small number of threads, your computer isn't spending a whole lot of time thread switching anyway, so the hit doesn't really affect you. When you have lots of threads, scheduling becomes far more important, and so the increase is much more noticeable.
-- A deep unwavering belief is a sure sign you're missing something...
make xconfig dep clean bzImage modules modules_install
-adnans
-- "In short: just say NO TO DRUGS, and maybe you won't end up like the Hurd people." --Linus Torvalds
Re:Can't get a speedup of more than 10
by
Daniel Phillips
·
· Score: 4, Informative
Informative? I don't think so. (Moderators, please check the crack that you are smoking)
Amdahl's law makes a (wrong) statement about the amount of speedup that can be obtained through parallel as opposed to serial execution. (By the way, the number 10 doesn't come into it anywhere. You might as well have mentioned the speed of sound.)
Here, we are talking about the comparative performance of two operating systems running on the same number of processors. Since there is no limit on how stupidly the original could have been implemented, there is correspondingly no limit on the amount of possible speedup due to a better implementation.
Anyway, if you think you know something about Amdahl's law, you need to google for "Gustafson's law". Executive summary: Amdahl was wrong. Exactly how wrong is still a matter of debate, but it's generally agreed that it lies somewhere between "very" and "completely". Please don't quote this nonsense in support of anything, just don't do it.
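For readers who would rather not search, here is a hedged Python sketch of Gustafson's law as it is usually stated: if a fraction `serial` of the run time on an N-processor machine is serial, the scaled speedup is N - serial * (N - 1). The function name and the sample values are assumptions for illustration and do not come from the posts above.

```python
def gustafson_speedup(serial, n):
    """Scaled speedup N - serial * (N - 1), with `serial` measured on the parallel machine."""
    return n - serial * (n - 1)

# Illustrative values only: a 10% serial fraction at a few processor counts.
for n in (1, 10, 100, 1000):
    print(f"N={n:>5}: scaled speedup {gustafson_speedup(0.10, n):.1f}")
```

The difference from the fixed-size formula is the assumption that the problem grows with the machine, so the parallel part expands while the serial part stays the same. Whether that assumption holds depends on the workload.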
Re:So what does this mean for the everyday linux u
by
iabervon
·
· Score: 5, Informative
You'll get better interactive performance under load. So if you're encoding an mp3 and writing your home directory to a CD, your mouse cursor won't stick and your windows will refresh reasonably well. Unless you're doing something kind of disk/processor intensive, you won't notice the difference, because 2.4 is too good already for there to be much improvement. If you try to encode 32 mp3s at the same time, 2.6 will actually do worse than 2.4, but at least it won't make ls quite so slow.
The main goals are interactivity (input gets handled quickly), low latency (your mp3 player gets a chance to send the next second of audio to the sound card before this second is over), and fairness (every program makes at least a little progress after a short amount of time).
Re:So what does this mean for the everyday linux u
by
Azar
·
· Score: 5, Informative
Overall throughput has not increased (actually, it is believed to have decreased). So the overall speed of the system is relatively equal to the 2.4 series of kernels. You probably won't see any major performance speedups in any apps you use.
However, the overall responsiveness of the system is improved. Most people who have used it have claimed that it felt much faster than the 2.4 series. You won't have starved processes.
This means if you're running XMMS and you compile a kernel, XMMS won't just hang until the compilation is done. The kernel developers have done a great job in improving -fairness- between processes.
Mostly, the results will be seen on Big Iron and server applications, but the overall desktop experience is expected to improve.
Re:Linux Benchmarks
by
geogeek6_7
·
· Score: 4, Funny
Great, now people compiling PHP are committing patches to linux.... ;)
Re:By Sturgeon's Law
by
Daniel Phillips
·
· Score: 2, Insightful
the original poster is right (in a way), a factor of 10 is about the best you can hope for when parallelizing code. Since Amdahl's (or some other guy's) law also says something like 90% of the time is spent in 10% of the code. That makes s=10 and p=90.
No it doesn't. How do you know the 90% is serializable and the 10% isn't? Answer: you don't, there is no relationship whatsoever.
There was at least one performance bug with Mandrake 8 that resulted in extremely slow X performance. I don't remember the details but maybe someone will share them...
Re:Make it simple please
by
Wavicle
·
· Score: 5, Interesting
It is simple, tar -xvzf linux-{current}.tar.gz. cd linux; make menuconfig; make dep bzImage modules modules_install
You're joking, right? How many options in 2.5.47 must be selected in order for your run of the mill $9 generic PS/2 keyboard to work? I can't tell you how much fun it was building 2.5.47, missing one *somewhere* and suddenly I couldn't do anything because my keyboard stopped working.
The kernel only has an expert mode. It would be nice if there were a higher order config that asked you basic questions and built the things you were most likely to need, with the option of going into a more expert mode if you needed to fine tune something.
-- Education is a better safeguard of liberty than a standing army. Edward Everett (1794 - 1865)
Re:inexperience
by
myz24
·
· Score: 2, Informative
The short answer is that KDE is written in C++.
The long answer is that anything written in C++ on Linux will load slowly (but should run fairly quickly once loaded) because of something to do with loading the C++ libraries and some other compiler gook. I can't remember where I read it, or how I found it on google, but apparently this will be fixed soon in glibc.
Of course, I could be WAY off, so if someone could back me up...
Wow, you can disprove Amdahl's law?
by
Fefe
·
· Score: 5, Informative
Please write and publish a paper about it!
This is a major breakthrough in computer science.
It also is quite unlikely, since Amdahl's law is a trivial observation that is completely independent of parallelization or even software engineering (it also applies to hardware design or even accounting). Basically, it says: if initially only 10% of X (CPU cycles, money, whatever you are trying to save) is spent in the part you are optimizing, there is an upper bound of 10% to the X you can save.
I'm very interested in how you can disprove that.
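The bound described in this comment can be checked with a few lines of Python. This is only a sketch of that general observation: `fraction_saved` is a hypothetical helper, and the local speedup factors are arbitrary.

```python
def fraction_saved(part, local_speedup):
    """Fraction of the total saved when a `part` share of it becomes `local_speedup` times cheaper."""
    return part * (1 - 1 / local_speedup)

# Illustrative: the optimized part is 10% of the total, as in the example above.
for k in (2, 10, 1000):
    print(f"{k}x faster part: saves {fraction_saved(0.10, k):.2%} of the total")
```

However large the local speedup gets, the saving stays below the 10% share that the optimized part started with.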
Re:Wow, you can disprove Amdahl's law?
by
Daniel Phillips
·
· Score: 3, Informative
Please write and publish a paper about it!
Such rhetoric, oh my.
This is a major breakthrough in computer science.
It also is quite unlikely, since Amdahl's law is a trivial observation that is completely independent of parallelization or even software engineering (it also applies to hardware design or even accounting). Basically, it says: if initially only 10% of X (CPU cycles, money, whatever you are trying to save) is spent in the part you are optimizing, there is an upper bound of 10% to the X you can save.
Sorry, wrong law. You seem to be thinking "90% of the time in 10% of the code", a rule of thumb that nobody to my knowledge has ventured to dignify with the term "law". Amdahl's Law (which IMHO doesn't deserve the dignity either) was an attempt to make a statement about the limitations of parallel computing. Relying on wrong assumptions, he drew wrong conclusions, and in the event, parallel clusters have gone on to scale nearly linearly into the tens of thousands of processors, a result he would have liked to have proved impossible.
make oldconfig dep clean modules modules_install install
Yes, oldconfig is nice when you already have a .config file from a previous kernel. But I have really been missing xoldconfig, which would give me the xconfig interface but with only the questions I'd need to answer when using oldconfig.
It looks like if you are talking about large internet operations and large CPU-intensive tasks and large I/O tasks all at the same time... then YES! it will make the internet faster.
So maybe this will help some people stand up to being /.'ed.
--
Liberty.
Re:This is This is the exact opposite of my findin
by
Sivar
·
· Score: 2, Insightful
The P4 has notoriously slow context switching performance.
The Pentium IV has notoriously slow performance in some areas, but a processor being slow in context switching doesn't make sense. Depending on the context (English context, not computer context), context switching is either the system switching from kernel mode (running kernel code) to user mode (user applications) or vice-versa, OR it is simply moving from one execution path to another (as was scheduled by the, um, scheduler).
The processor has nothing to do with it. Context switching in BOTH instances is handled entirely by the operating system. While Windows NT 3.1 may have "slow context switching" and Linux with the O(1) scheduler may have "fast context switching", the Pentium IV cannot "have fast or slow context switching" because it doesn't have anything to do with the Pentium IV.
One might theorize that the original poster's comment was referring to the Pentium IV being particularly slow at the actual instructions used in context switching. Regarding the discussion of the kernel scheduler, the meaning of "context switching" that we are using probably refers to switching between tasks (AKA multitasking), so the important instructions would simply be jump instructions like "jmp", which AFAIK are not particularly slow on the Pentium IV like, say, bit shifting (which is glacially slow on the Pentium IV).
-- Computer Science is no more about computers than astronomy is about telescopes. --E. W. Dijkstra
I like this one better...
by
Some Dumbass...
·
· Score: 2, Flamebait
How about this image, from the same article. Note how green, which is SUSE Linux, is winning :)
Needless to say, context is everything.
Re:Make it simple please
by
momobaxter
·
· Score: 3, Insightful
A home user (meaning non hacker) never has the need to recompile a kernel. NEVER. Your distribution has all the modules available and if you're running the more popular distros, they will even detect your hardware and load the module for you.
Sometimes people shouldn't mess with stuff, and the kernel is one of those things. RedHat does a good job with their builds and an average user doesn't need to rebuild it at all. A more experienced user might want to tweak, but then he can use make menuconfig or make config...and choose his options.
My grandmother will never recompile her kernel.
-- "Full sources for linux currently runs to about 200kB compressed" --Linus Torvalds 31-Jan-1992
I know exactly how you feel. I actually use linux quite a bit, but it's all precompiled suse packages for the most part except when I need oddball stuff like gif support for GD. Then it's time to compile php.
I'm blessed to have friends that know more than I do and are willing to help me out when I get stuck.
Compiling the kernel is something I haven't attempted since 386DX40 days.
-- The man who trades freedom for security does not deserve nor will he ever receive either. - Benjamin Franklin
Re:Make it simple please
by
Anna Merikin
·
· Score: 4, Informative
I grew up with DOS, too. If you installed Borland's Sidekick (many did) successfully, you can compile. That's the stuff that went on in Sidekick's install process: it used Borland's compiler -- and that's why it ran so well.
I just finished *this morning* compiling a 2.2.22 (yes, RH-6.2) for my box. Use the .config file from the stock kernel sources for your distro, usually in /usr/src/linux* (you may have to install them). Open a root terminal window in /usr/src, issue `make xconfig', choose the .config from the load configuration file box and start disabling everything you KNOW you do not need. The help buttons are mostly very helpful. If your box is used for web surfing, compile in ppp, same with lpd if you need to print. Unless you have a SCSI drive, disable all SCSI boxes. Load as much of your equipment into the kernel as you can, and disable the modules that enable hardware and features you don't have or use, like firewire or USB. Make sure equipment you DO HAVE is supported either in the kernel or as a module. Keep doing `next' until the end, when there is no `next.' Choose Main Menu, then save the new configuration.
Do a 'make dep bzImage modules modules_install' and copy the ~/System.map file as System.map-new.kernel.number and drill down to /usr/src/linux/arch/i386/boot and copy bzImage as vmlinuz-number.of.kernel to /boot.
From /usr/src/linux, do make modules_install. Modify /etc/lilo.conf to include the new kernel and System.map. Activate lilo (/sbin/lilo -v -v).
Reboot into the new kernel. If you get lots of error messages about modules not loading, reboot at the command prompt, and everything will have been rewritten magically. Use your new kernel for testing. You may find you want to try another configuration. Do it all again, changing the Makefile each time under line 3 EXTRAVERSION with another digit or letter to keep it from overwriting a working kernel when you copy it in to /boot and to keep the modules straight (though they appear not to care....)
Frankly, I've tried nine builds and although my kernels are smaller than stock, use about 5Kb less RAM and benchmarks seem to indicate about 5-6 per cent increase in speed, I feel no difference in use.
I do feel better knowing I am using the latest (and perhaps the last) kernel in the 2.2.x series, though. FWIW.
Re:Make it simple please
by
Istealmymusic
·
· Score: 2
This is even easier:
make buildkernel KERNCONF=GENERIC
make installkernel KERNCONF=GENERIC
Of course, you have to be using a BSD kernel. There's nothing wrong with using GNU userland tools and a BSD kernel...
-- "The lesson to be learned is not to take the comments on slashdot too literally." --Vinnie Falco, BearShare
Re:This is This is the exact opposite of my findin
by
cpeterso
·
· Score: 2
I think some processors have multiple register sets, so threads do not have to thrash the same set of registers for every thread context switch.
Re:Can't get a speedup of more than 10
by
ctr2sprt
·
· Score: 2
It actually doesn't matter, because speedups are calculated using algorithm speed, not clock ticks or anything concrete like that. In other words, speedups only give you part of the story. A poorly-written program using an O(log n) search algorithm may be slower than a well-written program using an O(n) one; but in the normal case, the programmers will be sufficiently competent for the better algorithm to make the better program as well.
There are a whole bunch of ways you can conceal information or mislead readers by claiming really good big-oh times, but this isn't really one of them. (How about a perfect hash table that calculates keys using an O(m^n) hashing algorithm?)
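The point about a sloppy O(log n) search losing to a tight O(n) one can be shown with a toy cost model in Python. The per-step costs below (1 versus 1000) are invented numbers chosen only to produce a crossover, and both function names are hypothetical.

```python
import math

def linear_cost(n, per_step=1):
    """Modeled cost of a tight O(n) scan: n cheap steps."""
    return per_step * n

def binary_cost(n, per_step=1000):
    """Modeled cost of a sloppy O(log n) search: few steps, each one expensive."""
    return per_step * math.log2(n)

# The per-step costs are made-up values, not measurements of any real program.
for n in (100, 10_000, 1_000_000):
    winner = "O(n)" if linear_cost(n) < binary_cost(n) else "O(log n)"
    print(f"n={n:>9}: linear={linear_cost(n):>9}, binary={binary_cost(n):>9.0f}, cheaper: {winner}")
```

Under this model the linear scan wins for small inputs and the logarithmic search wins once n is large enough, which is why big-oh alone gives only part of the story.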
Re:Can't get a speedup of more than 10
by
Ondo
·
· Score: 2
Anyway, if you think you know something about Amdahl's law, you need to google for "Gustafson's law". Executive summary: Amdahl was wrong.
If you had actually tried using google for "Gustafson's law" you would have seen as the first link a paper claiming it and Amdahl's law are identical, not that Amdahl was wrong.
Re:This is This is the exact opposite of my findin
by
puetzk
·
· Score: 3, Interesting
Well, this guy is apparently a troll, but just for the sake of argument... Anyone repeating his test would probably find very similar results. HZ (the constant controlling how often the scheduler runs) has been changed from 100 to 1000, improving smoothness for many things (multimedia apps especially) at the cost of making the scheduler overhead 10 times what it was before.
Luckily, it was very small before, and it's still very small. Maybe it went from taking 0.001% of your CPU power to 0.01% :-). The *only* times the scheduler was really a problem before were a) when it made bad choices and b) when there were gazillions of tasks. The rest of the time, it was totally negligible.
So, even if the scheduler did slow down by a factor of 2 as he claimed (and in fact, it would have slowed down by a factor of 10 due to the HZ changes, so his claim would leave O(1) 5 times faster than the old scheduler), it really wouldn't matter to an ordinary desktop/server. The scheduler time is too small to be important on normal machines.
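The arithmetic above, as a minimal sketch. The per-tick cost of 0.1 microseconds is a hypothetical figure chosen only so that the result lines up with the 0.001% guess; it is not a measurement of any kernel.

```python
def tick_overhead(hz, seconds_per_tick):
    """Fraction of one CPU spent on periodic work that runs hz times a second."""
    return hz * seconds_per_tick

# Hypothetical cost of one scheduler tick: 0.1 microseconds.
COST_PER_TICK = 1e-7

old = tick_overhead(100, COST_PER_TICK)    # HZ=100
new = tick_overhead(1000, COST_PER_TICK)   # HZ=1000
print(f"HZ=100: {old:.4%}   HZ=1000: {new:.4%}")

# If total scheduler time only doubled while it runs 10 times as often,
# each individual run must cost 2/10 of what it did before.
per_run_ratio = 2 / 10
print("per-run cost relative to the old scheduler:", per_run_ratio)
```

Ten times the ticks at the same per-tick cost gives ten times the overhead, and a measured 2x slowdown would imply each run got 5 times cheaper.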
-- The Matrix is going down for reboot now! Stopping reality: OK. The system is halted.
Re:This is This is the exact opposite of my findin
by
be-fan
·
· Score: 4, Informative
The instructions involved in the context switch are slow on the Pentium 4. The P4 has a long internal pipeline to flush, and a huge amount of internal state to synchronize, which makes context switches slow. For example, an interrupt/return pair take 2000 clock cycles on the P4!
-- A deep unwavering belief is a sure sign you're missing something...
Re:Linux Benchmarks
by
Anonymous Coward
·
· Score: 5, Funny
I don't know about troll, but perhaps just an overactive imagination:)
Apparently he works for a development firm, studies meteorology, works at a Verizon store at a local mall, owns a chain of pet stores in London, and has a thing for CmdrTaco.
Read together, they make amusing reading:)
Disk buffers & memory subsystem updated??
by
Anonymous Coward
·
· Score: 5, Interesting
I'm a big time VMware user (I use it for testing and Windows). I usually have 2 or 3 VMware machines running at any given time and I have plenty of memory (usually 1GB, sometimes more). However, the disk buffer (or disk caching) of Linux sucks ass. I'm not kidding: if I have 1GB of memory, 900+ megs will be used for disk buffers and my very important interactive VMware processes will be swapped out to the slow disk swap file. Just using one of the VMware processes causes a lot of disk I/O, and all that I/O gets loaded into the disk buffers in memory; then when I go to use another VMware process it has to come out of swap. Linux is pretty bad about this with normal processes, but VMware exacerbates the problem.
To boil it down: The disk buffering in 2.4 is way, way too aggressive and I haven't figured out a way to fix it. I need to be able to either limit the total amount of memory the buffers will use, or (a better method) tag certain processes so that they will never be moved into swap for disk buffers (moving to swap "normally" is OK, just not for disk buffers). Or maybe just make it never swap out any process for disk buffers.
It seems Windows uses a more reasonable disk buffering technique and VMware works better there (especially when using several instances). I don't want to use Windows as my primary OS though because I like the built-in disk encryption and network security of Linux (the ip filter stuff is much better than Windows).
Anyone know if 2.5 has got any better disk buffering?
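If you want to put a number on how much memory the disk cache is holding, here is a minimal sketch that reads the `Buffers` and `Cached` fields in the style of /proc/meminfo. The SAMPLE text is hypothetical (made up to resemble the 1GB box described above), and the helper names are mine.

```python
def parse_meminfo(text):
    """Map field names to sizes in kB from /proc/meminfo-style 'Name: value kB' lines."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Skip anything that is not a plain "Name: <number> ..." line.
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields

def cache_share(fields):
    """Share of total RAM currently used for buffers plus page cache."""
    return (fields["Buffers"] + fields["Cached"]) / fields["MemTotal"]

# Hypothetical sample, not taken from a real machine.
SAMPLE = """MemTotal: 1048576 kB
MemFree: 20480 kB
Buffers: 102400 kB
Cached: 840000 kB
"""

print(f"{cache_share(parse_meminfo(SAMPLE)):.1%} of RAM is buffers/cache")
```

On a live box you would pass `open("/proc/meminfo").read()` instead of SAMPLE. This only measures the cache; it does not change how aggressively the kernel swaps.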
> would be to tag certain processes so that they > will never be moved into swap for disk buffers
I believe that this is what the sticky bit was intended for. Before I go about explaining what it is and how to use it, does anyone know if Linux actually *honors* the sticky bit or does it just have it for compatibility?
man chmod: ...and the Linux kernel ignores the sticky bit on files. Other kernels may use the sticky bit on files for system-defined purposes. On some systems, only the superuser can set the sticky bit on files.
-- Matt
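For reference, a minimal sketch of checking whether the sticky bit is set in a mode value. It only inspects the permission bits; per the man page quoted above, Linux ignores the bit on regular files, so this says nothing about swap behaviour. The function name is mine.

```python
import stat

def has_sticky_bit(mode):
    """True if the sticky bit (S_ISVTX, octal 1000) is set in a file mode."""
    return bool(mode & stat.S_ISVTX)

# Typical modes: a world-writable sticky directory and an ordinary binary.
print(has_sticky_bit(0o1777))  # sticky bit set
print(has_sticky_bit(0o755))   # sticky bit clear
```

For a real path you would pass `os.stat(path).st_mode` as the argument.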
Re:Why did they get rid of the old make xconfig?
by
gmack
·
· Score: 3
Because the old system had 3 different parsers, each with their own bugs, and it had become a maintenance nightmare.
Making new configurators is simple with the new system and I'm sure there will be gtk/whatever else configurators available.
the stability of these releases are questionable, and they have been known to cone dump into various output files.
Cone dumping is a problem... hopefully, this new vehicle solves it.
--
-- Karma is overrated, whoring is ok.
Re:Can't get a speedup of more than 10
by
mdechene
·
· Score: 3, Interesting
Amdahl's law is used to predict speed increases for multi-processor systems. In this case, you can see a gain of more than 10 if you have enough processors in use, and the majority of the work is in parallel.
I think it assumes the original was written with some sort of intelligence behind it. I bet I could write some really atrocious code that would be so incredibly inefficient that almost anyone else could get a huge performance gain from it.
It doesn't really assume anything. The equation pertains to gains simply by increasing the number of parallel processors, not the strength of the code.
Anyways, this is probably redundant, but the big gain from the new kernel is that the amount of work done in parallel is increased and the amount done serially is decreased. In a single-processor system, performance decreases because there is more overhead in swapping processes in and out. In multi-processor systems, the gains would be enormous.
--
Karma: Not Particularly Funny.
Re:This is This is the exact opposite of my findin
by
UncleFluffy
·
· Score: 2
The Pentium IV has notoriously slow performance in some areas, but a processor being slow in context switching doesn't make sense.
Well, those of us who actually design CPUs and stuff rather than pretend we know about them use the term "context switch" to describe dumping the current CPU state (to memory, other registers, whatever) then loading a new state, or something logically equivalent. This can be for a thread switch, interrupt handling, whatever.
The processor has nothing to do with it.
A CPU level context switch is part of what happens during an OS level context switch, and therefore has a significant effect on OS performance.
It may be very interesting to run the same tests on various other free operating systems, especially BSD.
-- {{.sig}}
You are arguing the wrong point
by
The+Creator
·
· Score: 2
He never said that Linux is worse, just that Linux has an Achilles heel.
--
FRA: STFU GTFO
Re:You are arguing the wrong point
by
Some+Dumbass...
·
· Score: 2
He never said that Linux is worse, just that Linux has an Achilles heel.
On the contrary, you stated that there was a reason why serious Linux users needed the 2.5 kernel. But if Linux and XP perform the same overall, why do serious Linux users need the 2.5 kernel? They don't.
I'm not saying that this particular article doesn't make certain conclusions about asynchronous I/O. It's a simple fact that it does. But you made a conclusion based on those conclusions, and your conclusion is what I disagreed with.
I also pointed out that presenting a single image without understanding the full context of the article is silly. The tests in the article you quoted were run on laptop computers. What kind of "serious Linux users" with great need for asynchronous I/O use a laptop for that purpose? The image I quoted showed performance in a more typical laptop use pattern (according to the article's author). For that matter, would they use ReiserFS, or would they use XFS, JFS, or another filesystem?
Re:So what does this mean for the everyday linux u
by
FooBarWidget
·
· Score: 2
How does it compare to 2.4 with the low-latency, preemptive and O(1) patches?
Interesting, but not for me...
by
pavera
·
· Score: 2
Inspired by the numbers and the new "snappiness" under load, I decided to download and compile the 2.5.47 kernel and see for myself. Disappointed is all I can say: 2.4.19 with preempt and low-latency is snappier by quite a bit than 2.5.47. My test isn't quite as numeric as the story's... I simply start ripping a DVD (oops, did I say that...) to avi and compiling something (in this case xmms), then take my term window, open limewire, and drag the term window around on the maximized limewire window. As I drag it, the term window acts as an eraser on the limewire window until that window is redrawn. Under 2.4.19 I can never get the whole window grey; I can maybe grey out about 1 term window worth of area in the limewire window before it is redrawn. Under 2.5.47 I can easily grey out the entire limewire window, normally for 2 or 3 seconds before it redraws...
Of course it states in the story that 2.5 has not been tuned at all really, so hopefully this will improve, but for now I'm sticking with 2.4.19 preempt low latency
Re:Interesting, but not for me...
by
GooberToo
·
· Score: 2
You wouldn't happen to know where I can get a pre-patched 2.4.19 kernel that has O(1), low latency, preempt and XFS all rolled together would ya?
Re:Interesting, but not for me...
by
pavera
·
· Score: 2
I use gentoo. The gentoo-sources kernel has low latency and preempt pre-patched, but it doesn't have xfs. They have an xfs kernel as well, but I don't know if it has low latency and preempt already. I seem to remember something about low latency and/or preempt causing problems with the xfs kernel, but I might just be smoking crack. Check forums.gentoo.org. OK, I just did, and yeah, preempt + XFS is a bad idea: the patches fight with each other (XFS trying to journal, preempt trying to let something else use the cpu), and the result is massive instability. So no, I don't think you can get all three to play nice. You can run low-latency+XFS, or low-latency+preempt, but you can't throw preempt in with XFS... gentoo is nice and patches the kernel automatically; if not running gentoo, you'd have to patch the kernel yourself...
Re:Interesting, but not for me...
by
GooberToo
·
· Score: 2
Okay. Thanks.
I've tried manually patching XFS, O(1) and a couple of other odds and ends (latency, preempt) and wind up with something that doesn't even boot or crashes/panics right afterwards.
So thanks...that pretty much confirms that it's not something I'm doing...;)
Disable your swap.
by
Effugas
·
· Score: 3, Informative
Buy more RAM and disable swap. Or just disable it -- at 1GB, you're close to what you need anyway.
I'm serious. With another gig costing a hundred dollars -- maybe less -- the overhead of disk-based VM is just no longer justified.
WinXP benefits from this optimization even more than Linux.
Yours Truly,
Dan Kaminsky
DoxPara Research
http://www.doxpara.com
Older boxen?
by
soupforare
·
· Score: 2, Interesting
I'm still mulling over which kernel to use with my old 486's.
Right now, they're running 2.2.10, iirc; whatever the debian stable had on her boot disks.
I'm not going to compile any kernels until my dual ppro is fixed, because compiling a kernel on a PoS 486 portable is not fun:P
Anyone have any comments/recommendations on if/which new kernels are good to run on old shite?
I've just installed NetBSD+Apache on a 33MHz 486 laptop with 8MB RAM. I did
recompile the kernel on another machine, though. This was essential
because the generic laptop kernel was taking too much memory. The end
result is nice and quite smooth.
Of course there are lighter webservers such as Boa, but I needed PHP,
and the Apache process is taking less than a megabyte of memory.
--
--
If you moderate this, then your children will be next.
Most distributions go with a more or less specific kernel (586/686/Athlon, etc) but only i386 applications. Newer processors only really sing with specially compiled code.
A distribution such as Gentoo may not be the easiest to install but you get the whole gubbins, X, Gnome or KDE and the apps compiled for your system.
Microsoft tend to distribute generic code, and if you are lucky you may get a model specific dll. What Microsoft can not do is to distribute code that can be compiled for a specific model, well not until they deliver code that gets compiled during setup. Note, this can be done with optimisable intermediate code rather than source, but it wouldn't be easy.
Please note that I *did* state that it wasn't for the newbie. However, if the packaging was improved (i.e., loading of groups of programs), I don't see why compilation from source can't be hidden from the installer and a better configuration shell added for configuration. I wouldn't recommend Gentoo to someone who was a novice. Did you try configuring Linux in the early days? A learning experience, but not necessarily a bad one. However, if someone doesn't want to learn a lot about Linux, then, yes, Gentoo isn't the best.
Installation from source allows a system to be customised and is extremely powerful. With many home systems equipped with significant hard disk and memory, why can't a system rebuild itself overnight automatically?
The user stated that they were inexperienced. Inexperience is not an absence of intelligence and this is clearly a person who is at least willing to try things.
I'm sorry that you consider that an inexperienced person should be afraid of other ways of installing. Please remember that the idea of a GUI installer for an operating system is quite new. Haven't you ever tried to get an operating system up and running with inadequate documentation, a lot of unwritten dependencies and nothing but the command line?
I don't consider myself an evangelist for Gentoo but I want to explain that there is a faster way to run Linux. Wouldn't you agree? Would you be happier putting in a slow distro, getting the hang of it and then moving onto something faster, if you know that things *will* get better?
I found Gentoo relatively easy to get up. It took a lot loooonger to get it do what I wanted though.
No, I have worked on a lot of computers in the past and have had to struggle through a lot worse than this. OTOH, my notebook is running RH and I keep Gentoo for some other systems.
I've been in a much worse situation courtesy of Redmond with their dependency hell which forced an operating system reinstall.
Re:So what does this mean for the everyday linux u
by
GooberToo
·
· Score: 2
The use of "big server" is somewhat misleading.
Fact is, anyone that heavily uses their Linux box will see some difference. It's just that the heavier your box gets used, the bigger difference you'll see.:)
Those that do little serious multitasking may see "smoother" multitasking but little more. Those that perform concurrent compiles, heavy CPU or I/O database servers, big time-share systems, etc, will see larger and larger noteworthy gains.
I'd LOVE to try out the 2.5 series, but because LVM is still not in there (not a week ago at least), and I have all my data (movies, oggs, etc) on LVM, I'm unable to use it...:(
Does anyone have a clue when there will be LVM for 2.5?
With some tasks more than tripling in performance, the future looks very promising
:-(
Damn, I wish my video card had kernel updates
In college, really poor, need a flatscreen.
So in other words the more time you spend rescheduling things the less time you have for executing the code you have scheduled.
:)
Hmm..... sounds like modern business management
I'm glad you put the "of EXT3 fame" bit, I was worried the article might be talking about the infamous author.
Although he might end up on the front page of /. if he writes an unauthorized biography of Mr. Gates, what kind of juice could be dragged up from the past... I wonder?
Are you local? There's nothing for you here!
Will it make the internet faster?
Try it again.
;))
In a reply on lkml to Aaron Lehmann's praising of the contest results of the latest 2.5-mm kernel Andrew Morton [interview] explains some of the important performance and design differences between the 2.4 stable series and the 2.5 development series accompanied by illustrating benchmarks.
Most significant gains can be expected at the high end such as large machines, large numbers of threads, large disks, large amounts of memory etc. [...] For the uniprocessors and small servers, there will be significant gains in some corner cases. And some losses. [...] Generally, 2.6 should be "nicer to use" on the desktop. But not appreciably faster.
From: Aaron Lehmann
To: linux-kernel
Subject: Re: [BENCHMARK] 2.5.47{-mm1} with contest
Date: Mon Nov 11 2002 - 18:04:53 AKST
On Tue, Nov 12, 2002 at 10:31:38AM +1100, Con Kolivas wrote:
> Here are the latest contest (http://contest.kolivas.net) benchmarks up to and
> including 2.5.47.
This is just great to see. Most previous contest runs made me cringe when I saw how -mm and recent 2.5 kernels were faring, but it looks like Andrew has done something right in 2.5.47-mm1. I hope the appropriate get merged so that 2.6.0 has stunning performance across the board.
From: Andrew Morton
To: linux-kernel mailing list
Subject: Re: [BENCHMARK] 2.5.47{-mm1} with contest
Date: Tue Nov 12 2002 - 02:04:23 AKST
Aaron Lehmann wrote:
>
> On Tue, Nov 12, 2002 at 10:31:38AM +1100, Con Kolivas wrote:
> > Here are the latest contest (http://contest.kolivas.net) benchmarks up to and
> > including 2.5.47.
>
> This is just great to see. Most previous contest runs made me cringe
> when I saw how -mm and recent 2.5 kernels were faring, but it looks
> like Andrew has done something right in 2.5.47-mm1. I hope the appropriate get merged so that 2.6.0 has stunning performance across
> the board.
Tuning of 2.5 has really hardly started. In some ways, it should be tested against 2.3.99 (well, not really, but...)
It will never be stunningly better than 2.4 for normal workloads on
normal machines, because 2.4 just ain't that bad.
What is being addressed in 2.5 is the areas where 2.4 fell down: large machines, large numbers of threads, large disks, large amounts
of memory, etc. There have been really big gains in that area.
For the uniprocessors and small servers, there will be significant gains in some corner cases. And some losses. Quite a lot of work has gone into "fairness" issues: allowing tasks to make equal progress when the machine is under load. Not stalling tasks for unreasonable
amounts of time, etc. Simple operations such as copying a forest of files from one part of the disk to another have taken a bit of a hit from this. (But copying them to another disk got better).
Generally, 2.6 should be "nicer to use" on the desktop. But not appreciably faster. Significantly slower when there are several processes causing a lot of swapout. That is one area where fairness really hurts throughput. The old `make -j30 bzImage' with mem=128M takes 1.5x as long with 2.5. Because everyone makes equal progress.
Most of the VM gains involve situations where there are large amounts of dirty data in the machine. This has always been a big problem
for Linux, and I think we've largely got it under control now. There are still a few issues in the page reclaim code wrt this, but they're
fairly obscure (I'm the only person who has noticed them ;))
There are some things which people simply have not yet noticed.
Andrea's kernel is the fastest which 2.4 has to offer; let's tickle its weak spots:
Run mke2fs against six disks at the same time, mem=1G:
2.4.20-rc1aa1:
0.04s user 13.16s system 51% cpu 25.782 total
0.05s user 31.53s system 63% cpu 49.542 total
0.05s user 29.04s system 58% cpu 49.544 total
0.05s user 31.07s system 62% cpu 50.017 total
0.06s user 29.80s system 58% cpu 50.983 total
0.06s user 23.30s system 43% cpu 53.214 total
2.5.47-mm2:
0.04s user 2.94s system 48% cpu 6.168 total
0.04s user 2.89s system 39% cpu 7.473 total
0.05s user 3.00s system 37% cpu 8.152 total
0.06s user 4.33s system 43% cpu 9.992 total
0.06s user 4.35s system 42% cpu 10.484 total
0.04s user 4.32s system 32% cpu 13.415 total
Write six 4G files to six disks in parallel, mem=1G:
2.4.20-rc1aa1:
0.01s user 63.17s system 7% cpu 13:53.26 total
0.05s user 63.43s system 7% cpu 14:07.17 total
0.03s user 65.94s system 7% cpu 14:36.25 total
0.01s user 66.29s system 7% cpu 14:38.01 total
0.08s user 63.79s system 7% cpu 14:45.09 total
0.09s user 65.22s system 7% cpu 14:46.95 total
2.5.47-mm2:
0.03s user 53.95s system 39% cpu 2:18.27 total
0.03s user 58.11s system 30% cpu 3:08.23 total
0.02s user 57.43s system 30% cpu 3:08.47 total
0.03s user 54.73s system 23% cpu 3:52.43 total
0.03s user 54.72s system 23% cpu 3:53.22 total
0.03s user 46.14s system 14% cpu 5:29.71 total
Compile a kernel while running `while true;do;./dbench 32;done' against
the same disk. mem=128m:
2.4.20-rc1aa1:
Throughput 17.7491 MB/sec (NB=22.1863 MB/sec 177.491 MBit/sec)
Throughput 16.6311 MB/sec (NB=20.7888 MB/sec 166.311 MBit/sec)
Throughput 17.0409 MB/sec (NB=21.3012 MB/sec 170.409 MBit/sec)
Throughput 17.4876 MB/sec (NB=21.8595 MB/sec 174.876 MBit/sec)
Throughput 15.3017 MB/sec (NB=19.1271 MB/sec 153.017 MBit/sec)
Throughput 18.0726 MB/sec (NB=22.5907 MB/sec 180.726 MBit/sec)
Throughput 18.2769 MB/sec (NB=22.8461 MB/sec 182.769 MBit/sec)
Throughput 19.152 MB/sec (NB=23.94 MB/sec 191.52 MBit/sec)
Throughput 14.2632 MB/sec (NB=17.8291 MB/sec 142.632 MBit/sec)
Throughput 20.5007 MB/sec (NB=25.6258 MB/sec 205.007 MBit/sec)
Throughput 24.9471 MB/sec (NB=31.1838 MB/sec 249.471 MBit/sec)
Throughput 20.36 MB/sec (NB=25.45 MB/sec 203.6 MBit/sec)
make -j4 bzImage 412.28s user 36.90s system 15% cpu 47:11.14 total
2.5.46:
Throughput 19.3907 MB/sec (NB=24.2383 MB/sec 193.907 MBit/sec)
Throughput 16.6765 MB/sec (NB=20.8456 MB/sec 166.765 MBit/sec)
make -j4 bzImage 412.16s user 36.92s system 83% cpu 8:55.74 total
2.5.47-mm2:
Throughput 15.0539 MB/sec (NB=18.8174 MB/sec 150.539 MBit/sec)
Throughput 21.6388 MB/sec (NB=27.0485 MB/sec 216.388 MBit/sec)
make -j4 bzImage 413.88s user 35.90s system 94% cpu 7:56.68 total - fifo_batch strikes again
It's the "doing multiple things at the same time" which gets better; the
straightline throughput of "one thing at a time" won't change much at all.
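To turn the timings in the mail above into ratios, here is a minimal sketch. The "total" values are copied from the mail; the helper names and the choice to compare means are my own, so treat the ratios as a rough summary rather than anything Andrew computed.

```python
def to_seconds(total):
    """Convert a `time` total such as '47:11.14' or '25.782' to seconds."""
    seconds = 0.0
    for part in total.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds

def mean_seconds(totals):
    """Average of a list of `time` totals, in seconds."""
    return sum(to_seconds(t) for t in totals) / len(totals)

def ratio(old, new):
    """How many times longer the old kernel took, on average."""
    return mean_seconds(old) / mean_seconds(new)

# "total" columns from the mail: 2.4.20-rc1aa1 first, 2.5.47-mm2 second.
mke2fs_24 = ["25.782", "49.542", "49.544", "50.017", "50.983", "53.214"]
mke2fs_25 = ["6.168", "7.473", "8.152", "9.992", "10.484", "13.415"]
write_24 = ["13:53.26", "14:07.17", "14:36.25", "14:38.01", "14:45.09", "14:46.95"]
write_25 = ["2:18.27", "3:08.23", "3:08.47", "3:52.43", "3:53.22", "5:29.71"]
build_24, build_25 = ["47:11.14"], ["7:56.68"]

print(f"mke2fs on six disks:   {ratio(mke2fs_24, mke2fs_25):.1f}x")
print(f"six 4G file writes:    {ratio(write_24, write_25):.1f}x")
print(f"kernel build + dbench: {ratio(build_24, build_25):.1f}x")
```

On these numbers the 2.5.47-mm2 runs finish in roughly a fifth, a quarter and a sixth of the 2.4.20-rc1aa1 wall-clock time respectively.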
Corner cases....
Nice to see Linux doing good on big machines with standard packages and such. I love linux, and it's the only thing I use at home for anything serious, but commercial software has always had the edge on *big* things (big disks, large processes, etc.). With recent advances in process management, and now this, a lot more people will be able to use Linux top to bottom.
I think one interesting thing that could come out of this is that IBM (and others) will be pushed more and more towards a pure service or application only niche. They won't always be able to say, "Sure Linux is great for the workstation, but what about your 8 TB database?" There's a ways to go, but a lot of the features are falling into place.
Having a unified OS from your palmtop to your TB file server will open up a lot of possibilities for people. My personal interest is in a next level of integration which is more natural to use and easier to develop, and we're getting close.
This was a major reason that 2.5 is, put simply, needed by any and all serious Lunix users.
Based on this image (0202_lab_xp_4.gif), one can see that large volumes of asynchronous I/O is, as the author puts it, the "Achilles' Hell" of Linux.
The Linux kernel itself in all versions 2.5 serializes disk Input/Output with a single spinlock.
(The yellow is the Windows XP box; the green line is the data for the SuSE Linux pee sea)
Department of Physics and Atmospheric Science, Dalhousie University, Halifax, N.S., Canada, B3H 3J5
"It's impossible to get a speedup of more than 10 with any processor-related activities.
:-)
Using Amdahl's Law, one can find that
Speedup = (s + p ) / (s + p / N ) where N is the number of processors, s is the amount of time spent (by a serial processor) on serial parts of a program and p is the amount of time spent (by a serial processor) on parts of the program that can be done in parallel."
While I'm no expert in software engineering (and I haven't really looked over the equation you put too closely) I think it assumes the original was written with some sort of intelligence behind it. I bet I could write some really atrocious code that would be so incredibly inefficient that almost anyone else could get a huge performance gain from it.
I'm not sure if I would have to try hard or not try at all to write really bad code.
fair.org counterpunch.com truthout.com indymedia.org salon.com
eff.org guerrilla.net debian.org gentoo.org
For a guy such as myself, who does all his daily tasks on a linux box, what does this mean? Will it mean faster loading time/stability. Or will it make little difference at all?
Some of the biggest improvements for desktop responsiveness can be found (for Kernel 2.4.x) at Con Kolivas' web site of performance linux patches.
--
It means that you won't see too much speed-up on your desktop machine. But, if you run a big server that does multiple processes at once, say Oracle, you could see significant performance gains.
make xconfig && make dep && make bzImage && make modules && make modules_install && make install
For example, suppose you have an algorithm A that takes X time. And then suppose you change it to algorithm B that takes 11X time by making it do algorithm A 11 times. Well algorithm B can be optimized to be 11 times faster by making it algorithm A instead, since they give the same result.
Anyway, just wanted to make sure no one was missing the "processor-related activities" clause in your statement.
You know where you are? You're in the $PATH, baby. You're gonna get executed!
Quick hint: install the kernel sources that came with your dist. Use the .config file found in this to compile first. These are the settings that your kernel was compiled with. Then you can use make xconfig to alter a known working config. Good luck.
"A language that doesn't affect the way you think about programming, is not worth knowing" - Alan Perlis
That is what I thought at first, too. But the original poster is right (in a way): a factor of 10 is about the best you can hope for when parallelizing code, since Amdahl's (or some other guy's) law also says something like 90% of the time is spent in 10% of the code. That makes s=10 and p=90. The limit of his equation, (s+p)/(s+p/n), as n goes to infinity is 10. A number not pulled out of anyone's ass.
Maybe the original poster should be moderated down because I don't think the stuff here is really about parallelization (they talk about speed ups on uniproc systems too), but for the parallel case, he seems to be right.
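A minimal sketch of the equation being argued about, using the formula exactly as quoted, (s+p)/(s+p/n). The values s=10 and p=90 are the poster's assumption, not a property of any particular program.

```python
def amdahl_speedup(s, p, n):
    """Speedup on n processors for serial time s and parallelizable time p."""
    return (s + p) / (s + p / n)

def amdahl_limit(s, p):
    """Limit of the speedup as n goes to infinity: the p/n term vanishes."""
    return (s + p) / s

for n in (1, 2, 9, 100, 10000):
    print(n, round(amdahl_speedup(10, 90, n), 3))
print("limit:", amdahl_limit(10, 90))
```

The cap of 10 comes entirely from choosing s=10; a smaller serial share raises the ceiling, which is why the 10 is not a universal constant.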
Um, doing benchmarks between an Athlon XP and a Pentium 4 is folly. The P4 has notoriously slow context switching performance. Also, if you are running a small number of threads, your computer isn't spending a whole lot of time thread switching anyway, so the hit doesn't really affect you. When you have lots of threads, scheduling becomes far more important, and so the increase is much more noticeable.
A deep unwavering belief is a sure sign you're missing something...
Your command line can be much shorter:
make xconfig dep clean bzImage modules modules_install
-adnans
"In short: just say NO TO DRUGS, and maybe you won't end up like the Hurd people." --Linus Torvalds
Informative? I don't think so. (Moderators, please check the crack that you are smoking)
Amdahl's law makes a (wrong) statement about the amount of speedup that can be obtained through parallel as opposed to serial execution. (By the way, the number 10 doesn't come into it anywhere. You might as well have mentioned the speed of sound.).
Here, we are talking about the comparative performance of two operating systems running on the same number of processors. Since there is no limit on how stupidly the original could have been implemented, there is correspondingly no limit on the amount of possible speedup due to a better implementation.
Anyway, if you think you know something about Amdahl's law, you need to google for "Gustafsons's law". Executive summary: Amdahl was wrong. Exactly how wrong is still a matter of debate, but it's generally agreed that it lies somewhere between "very" and "completely". Please don't quote this nonsense in support of anything, just don't do it.
Have you got your LWN subscription yet?
You'll get better interactive performance under load. So if you're encoding an mp3 and writing your home directory to a CD, your mouse cursor won't stick and your windows will refresh reasonably well. Unless you're doing something kind of disk/processor intensive, you won't notice the difference, because 2.4 is too good already for there to be much improvement. If you try to encode 32 mp3s at the same time, 2.6 will actually do worse than 2.4, but at least it won't make ls quite so slow.
The main goals are interactivity (input gets handled quickly), low latency (your mp3 player gets a chance to send the next second of audio to the sound card before this second is over), and fairness (every program makes at least a little progress after a short amount of time).
Overall throughput has not increased (actually, it is believed to have decreased). So the overall speed of the system is relatively equal to the 2.4 series of kernels. You probably won't see any major performance speedups in any apps you use.
However, the overall responsiveness of the system is improved. Most people who have used it have claimed that it felt much faster than the 2.4 series. You won't have starved processes.
This means if you're running XMMS and you compile a kernel, XMMS won't just hang until the compilation is done. The kernel developers have done a great job in improving -fairness- between processes.
Mostly, the results will be seen on Big Iron and server applications, but the overall desktop experience is expected to improve.
Great, now people compiling PHP are committing patches to linux.... ;)
the original poster is right (in a way), a factor of 10 is about the best you can hope for when parallelizing code. Since Amdahl's (or some other guy's) law also says something like 90% of the time is spent in 10% of the code. That makes s=10 and p=90.
No it doesn't. How do you know the 90% is serializable and the 10% isn't? Answer: you don't, there is no relationship whatsoever.
Sheesh.
Have you got your LWN subscription yet?
If I pulled a gif completely out of context as proof of anything would you trust me?
"think of it as evolution in action"
There was at least one performance bug with Mandrake 8 that resulted in extremely slow X performance. I don't remember the details but maybe someone will share them...
It is simple: tar -xvzf linux-{current}.tar.gz.
cd linux; make menuconfig ; make dep bzImage modules modules_install
You're joking, right? How many options in 2.5.47 must be selected in order for your run of the mill $9 generic PS/2 keyboard to work? I can't tell you how much fun it was building 2.5.47, missing one *somewhere* and suddenly I couldn't do anything because my keyboard stopped working.
The kernel only has an expert mode. It would be nice if there were a higher order config that asked you basic questions and built the things you were most likely to need, with the option of going into a more expert mode if you needed to fine tune something.
Education is a better safeguard of liberty than a standing army.
Edward Everett (1794 - 1865)
The short answer is that KDE is written in C++.
The long answer is that anything written in C++ on Linux will load slowly (but should run fairly quickly once loaded) because of something to do with loading the C++ libraries and some other compiler gook. I can't remember where I read it, or how I found it on google, but apparently this will be fixed soon in glibc.
Of course, I could be WAY off, so if someone could back me up...
Please write and publish a paper about it!
This is a major breakthrough in computer science.
It also is quite unlikely, since Amdahl's law is a trivial observation that is completely independent of parallelization or even software engineering (it also applies to hardware design or even accounting). Basically, it says: if initially only 10% of X (CPU cycles, money, whatever you are trying to save) is spent in the part you are optimizing, there is an upper bound of 10% to the X you can save.
I'm very interested in how you can disprove that.
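The "upper bound of 10%" reading can be checked with a minimal sketch. The function names and the example speedup factors are illustrative assumptions; only the 10% share comes from the comment above.

```python
def remaining_cost(fraction, factor):
    """Cost left, as a share of the original, after making one part `factor` times cheaper."""
    return (1.0 - fraction) + fraction / factor

def saving(fraction, factor):
    """Share of the original cost that the optimization saves."""
    return 1.0 - remaining_cost(fraction, factor)

# The optimized part is 10% of the total; try ever larger improvements.
for factor in (2, 10, 1000, 1e9):
    print(factor, round(saving(0.10, factor), 6))
```

However large the factor gets, the saving creeps toward 0.10 and never passes it, because the other 90% is untouched.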
This is easier:
make oldconfig dep clean modules modules_install install
Yes, oldconfig is nice when you already have a .config file from a previous kernel. But I have really been missing an xoldconfig that would give me the xconfig interface but with only the questions I'd need to answer when using oldconfig.
Do you care about the security of your wireless mouse?
So maybe this will help some people stand up to being /.'ed.
Liberty.
The P4 has notoriously slow context switching performance.
The Pentium IV has notoriously slow performance in some areas, but a processor being slow in context switching doesn't make sense. Depending on the context (English context, not computer context), context switching is either the system switching from kernel mode (running kernel code) to user mode (user applications) or vice versa, OR it is simply moving from one execution path to another (as was scheduled by the, um, scheduler).
The processor has nothing to do with it. Context switching in BOTH instances is handled entirely by the operating system. While Windows NT 3.1 may have "slow context switching" and Linux with the O(1) scheduler may have "fast context switching", the Pentium IV cannot "have fast or slow context switching" because it doesn't have anything to do with the Pentium IV.
One might theorize that the original poster's comment was referring to the Pentium IV being particularly slow at the actual instructions used in context switching. Regarding the discussion of the kernel scheduler, the meaning of "context switching" that we are using probably refers to switching between tasks (AKA multitasking), so the important instructions would simply be jump instructions like "jmp", which AFAIK are not particularly slow on the Pentium IV, unlike, say, bit shifting (which is glacially slow on the Pentium IV).
Computer Science is no more about computers than astronomy is about telescopes. --E. W. Dijkstra
How about this image, from the same article. Note how green, which is SUSE Linux, is winning :)
Needless to say, context is everything.
A home user (meaning non hacker) never has the need to recompile a kernel. NEVER. Your distribution has all the modules available and if you're running the more popular distros, they will even detect your hardware and load the module for you.
Sometimes people shouldn't mess with stuff, and the kernel is one of those things. RedHat does a good job with their builds and an average user doesn't need to rebuild it at all. A more experienced user might want to tweak, but then he can use make menuconfig or make config... and choose his options.
My grandmother will never recompile her kernel.
"Full sources for linux currently runs to about 200kB compressed" --Linus Torvalds 31-Jan-1992
I know exactly how you feel. I actually use linux quite a bit, but it's all precompiled suse packages for the most part except when I need oddball stuff like gif support for GD. Then it's time to compile php.
I'm blessed to have friends that know more than I do and are willing to help me out when I get stuck.
Compiling the kernel is something I haven't attempted since 386DX40 days.
The man who trades freedom for security does not deserve nor will he ever receive either. - Benjamin Franklin
I grew up with DOS, too. If you installed Borland's Sidekick (many did) successfully, you can compile. That's the stuff that went on in Sidekick's install process: it used Borland's compiler -- and that's why it ran so well.
I just finished *this morning* compiling a 2.2.22 (yes, RH-6.2) for my box. Use the .config file from the stock kernel sources for your distro, usually in /usr/src/linux* (you may have to install them). Open a root terminal window in /usr/src, issue `make xconfig', choose the .config from the load configuration file box and start disabling everything you KNOW you do not need. The help buttons are mostly very helpful. If your box is used for web surfing, compile in ppp, same with lpd if you need to print. Unless you have a SCSI drive, disable all SCSI boxes. Load as much of your equipment into the kernel as you can, and disable the modules that enable hardware and features you don't have or use, like firewire or USB. Make sure equipment you DO HAVE is supported either in the kernel or as a module. Keep doing `next' until the end, when there is no `next.' Choose Main Menu, then save the new configuration.
Do a 'make dep bzImage modules modules_install' and copy the ~/System.map file as System.map-new.kernel.number and drill down to /usr/src/linux/arch/i386/boot and copy bzImage as vmlinuz-number.of.kernel to /boot. From /usr/src/linux, do make modules_install. Modify /etc/lilo.conf to include the new kernel and System.map. Activate lilo (/sbin/lilo -v -v).
Reboot into new kernel. If you get lots of error messages about modules not loading, reboot at the command prompt, and everything will have been rewritten magically. Use your new kernel for testing. You may find you want to try another configuration. Do it all again, changing the Makefile each time under line 3 EXTRAVERSION with another digit or letter to keep it from overwriting a working kernel when you copy in to /boot and to keep the modules straight (though they appear not to care....)
Frankly, I've tried nine builds and although my kernels are smaller than stock, use about 5Kb less RAM and benchmarks seem to indicate about 5-6 per cent increase in speed, I feel no difference in use.
I do feel better knowing I am using the latest (and perhaps the last) kernel in the 2.2.x series, though. FWIW.
Of course, you have to be using a BSD kernel. There's nothing wrong with using GNU userland tools and a BSD kernel...
"The lesson to be learned is not to take the comments on slashdot too literally." --Vinnie Falco, BearShare
I think some processors have multiple register sets, so threads do not have to thrash the same set of registers for every thread context switch.
cpeterso
There are a whole bunch of ways you can conceal information or mislead readers by claiming really good big-oh times, but this isn't really one of them. (How about a perfect hash table that calculates keys using a O(m^n) hashing algorithm?)
Anyway, if you think you know something about Amdahl's law, you need to google for "Gustafson's law". Executive summary: Amdahl was wrong.
If you had actually tried using google for "Gustafson's law" you would have seen as the first link a paper claiming it and Amdahl's law are identical, not that Amdahl was wrong.
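The disagreement is easy to test numerically. The sketch below (Python, with function names invented for illustration) uses the textbook forms of both laws: Amdahl's takes the serial fraction as measured on one processor, Gustafson's takes it as measured on the n-processor run. Converting one fraction into the other makes the two formulas give the same speedup, which is what the "identical" claim amounts to. The values n = 64 and 10% serial time are arbitrary assumptions.

```python
def amdahl_speedup(serial_fraction_single, n):
    """Speedup on n processors; serial fraction measured on ONE processor."""
    return 1.0 / (serial_fraction_single + (1.0 - serial_fraction_single) / n)


def gustafson_speedup(serial_fraction_parallel, n):
    """Scaled speedup; serial fraction measured on the n-processor run."""
    return serial_fraction_parallel + (1.0 - serial_fraction_parallel) * n


def to_single_processor_fraction(serial_fraction_parallel, n):
    """Re-express the serial fraction relative to a one-processor run.

    On one processor the parallel part takes n times longer, so the
    total time grows and the serial share of it shrinks.
    """
    single_total = serial_fraction_parallel + (1.0 - serial_fraction_parallel) * n
    return serial_fraction_parallel / single_total


n = 64          # hypothetical processor count
s_par = 0.10    # hypothetical: 10% of the parallel run's time is serial

s_single = to_single_processor_fraction(s_par, n)
g = gustafson_speedup(s_par, n)
a = amdahl_speedup(s_single, n)
print(g, a)     # the two values agree up to floating-point rounding
```

The laws only look contradictory when the same number is plugged into both as "the serial fraction"; they are measuring that fraction against different baselines.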
Well, this guy is apparently a troll, but just for the sake of argument... Anyone repeating his test would probably find very similar results. HZ (the constant controlling how often the scheduler runs) has been changed from 100 to 1000, improving smoothness for many things (multimedia apps especially) at the cost of making the scheduler overhead 10 times what it was before.
Luckily, it was very small before, and it's still very small. Maybe it went from taking 0.001% of your CPU power to 0.01% :-). The *only* times the scheduler was really a problem before were a) when it made bad choices and b) when there were gazillions of tasks. The rest of the time, it was totally negligible.
So, even if the scheduler did slow down by a factor of 2 as he claimed (and in fact, it would have slowed down by a factor of 10 due to the HZ changes, so his claim would leave O(1) 5 times faster than the old scheduler) it really wouldn't matter to an ordinary desktop/server. The scheduler time is too small to be important on normal machines.
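The arithmetic behind that, as a rough sketch in Python. The per-run cost is a made-up figure picked so that HZ=100 comes out at the 0.001% guessed above; it is not a measurement, and the model (one scheduler run per tick, fixed cost per run) is a simplifying assumption.

```python
def scheduler_cpu_share(hz, cost_per_run_seconds):
    """Fraction of one CPU spent in the scheduler, assuming it runs once
    per timer tick and each run costs a fixed amount of time."""
    return hz * cost_per_run_seconds


# Hypothetical cost per run: 0.1 microseconds (chosen to match 0.001% at HZ=100).
COST_PER_RUN = 1e-7

old_share = scheduler_cpu_share(100, COST_PER_RUN)    # 2.4-style HZ
new_share = scheduler_cpu_share(1000, COST_PER_RUN)   # 2.5-style HZ
print(f"HZ=100:  {old_share * 100:.3f}% of the CPU")
print(f"HZ=1000: {new_share * 100:.3f}% of the CPU")

# If total scheduler overhead was observed to be only 2x worse while the
# scheduler runs 10x as often, each individual run must have become cheaper.
observed_total_slowdown = 2.0
run_frequency_ratio = 1000 / 100
per_run_speedup = run_frequency_ratio / observed_total_slowdown
print(f"each run is about {per_run_speedup:.0f}x cheaper")
```

Either way, both shares are tiny fractions of a percent, which is the whole argument.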
The Matrix is going down for reboot now! Stopping reality: OK. The system is halted.
The instructions involved in the context switch are slow on the Pentium 4. The P4 has a long internal pipeline to flush, and a huge amount of internal state to synchronize, which makes context switches slow. For example, an interrupt/return pair takes 2000 clock cycles on the P4!
A deep unwavering belief is a sure sign you're missing something...
I don't know about troll, but perhaps just an overactive imagination :)
Apparently he works for a development firm, studies meteorology, works for a Verizon store at a local mall, owns a chain of pet stores in London, and has a thing for CmdrTaco.
Read together, they make amusing reading :)
I'm a big time VMware user (I use it for testing and Windows). I usually have 2 or 3 VMware machines running at any given time and I have plenty of memory (usually 1GB, sometimes more). However, the disk buffer (or disk caching) of Linux sucks ass. I'm not kidding: if I have 1GB of memory, 900+ megs will be used for disk buffers and my very important interactive VMware processes will be swapped out to the slow disk swap file. Just using one of the VMware processes causes a lot of disk I/O, and all that I/O gets loaded into the disk buffers in memory; then when I go to use another VMware process it has to come out of swap. Linux is pretty bad about this with normal processes, but VMware exacerbates the problem.
To boil it down: the disk buffering in 2.4 is way, way too aggressive and I haven't figured out a way to fix it. I need to be able to either limit the total amount of memory the buffers will use or, better, tag certain processes so that they will never be moved into swap for disk buffers (moving to swap "normally" is OK, just not for disk buffers). Or maybe just make it never swap out any process for disk buffers.
It seems Windows uses a more reasonable disk buffering technique and VMware works better there (especially when using several instances). I don't want to use Windows as my primary OS though because I like the built-in disk encryption and network security of Linux (the ip filter stuff is much better than Windows).
Anyone know if 2.5 has got any better disk buffering?
Because the old system had 3 different parsers, each with its own bugs, and it had become a maintenance nightmare.
Making new configurators is simple with the new system and I'm sure there will be gtk/whatever else configurators available.
the stability of these releases are questionable, and they have been known to cone dump into various output files.
Cone dumping is a problem... hopefully, this new vehicle solves it.
--
Karma is overrated, whoring is ok.
Amdahl's law is used to predict speed increases for multi-processor systems. In this case, you can see a gain of more than 10 if you have enough processors in use, and the majority of the work is in parallel.
I think it assumes the original was written with some sort of intelligence behind it. I bet I could write some really atrocious code that would be so incredibly inefficient that almost anyone else could get a huge performance gain from it.
It doesn't really assume anything. The equation pertains to gains simply by increasing the number of parallel processors, not the strength of the code.
Anyways, this is probably redundant, but the big gains from the new kernel come from increasing the number of parallel processes and decreasing the number of serial ones. In a single-processor system, performance decreases as there is more overhead in swapping processes in and out. In multi-processor systems, the gains would be enormous.
Karma: Not Particularly Funny.
The Pentium IV has notoriously slow performance in some areas, but a processor being slow in context switching doesn't make sense.
Well, those of us who actually design CPUs and stuff rather than pretend we know about them use the term "context switch" to describe dumping the current CPU state (to memory, other registers, whatever) then loading a new state, or something logically equivalent. This can be for a thread switch, interrupt handling, whatever.
The processor has nothing to do with it.
A CPU level context switch is part of what happens during an OS level context switch, and therefore has a significant effect on OS performance.
What would Lemmy do?
It may be very interesting to run the same tests on various other free operating systems, especially BSD.
{{.sig}}
He never said that Linux is worse, just that Linux has an Achilles heel.
FRA: STFU GTFO
How does it compare to 2.4 with the low-latency, preemptive and O(1) patches?
Inspired by the numbers and new "snappiness" under load, I decided to download and compile the 2.5.47 kernel and see for myself. Disappointed is all I can say.
2.4.19 with preempt and low-latency is snappier by quite a bit than 2.5.47. My test isn't quite as numeric as the story's... I simply start ripping a DVD (oops, did I say that...) to avi, and compiling something (in this case xmms), and then get my term window, open limewire, and drag the term window around on the maximized limewire window. Under 2.4.19 I can never get the whole window grey (as I drag the term window it acts as an eraser on the limewire window, until that window is redrawn). Under 2.5.47 I can easily grey out the entire limewire window, normally for 2 or 3 seconds before it redraws... under 2.4.19 I can maybe grey out about 1 term window worth of area in the limewire window before it is redrawn...
Of course it states in the story that 2.5 has not been tuned at all really, so hopefully this will improve, but for now I'm sticking with 2.4.19 preempt low latency
Buy more RAM and disable swap. Or just disable it -- at 1Gb, you're close to what you need anyway.
I'm serious. With another gig costing a hundred dollars -- maybe less -- the overhead of disk-based VM is just no longer justified.
WinXP benefits from this optimization even more than Linux.
Yours Truly,
Dan Kaminsky
DoxPara Research
http://www.doxpara.com
I'm still mulling over which kernel to use with my old 486's. :P
Right now, they're running 2.2.10, iirc; whatever the debian stable had on her boot disks.
I'm not going to compile any kernels until my dual ppro is fixed, because compiling a kernel on a PoS 486 portable is not fun.
Anyone have any comments/recommendations on if/which new kernels are good to run on old shite?
--- Do you believe in the day?
A distribution such as Gentoo may not be the easiest to install but you get the whole gubbins, X, Gnome or KDE and the apps compiled for your system.
Microsoft tend to distribute generic code, and if you are lucky you may get a model specific dll. What Microsoft can not do is to distribute code that can be compiled for a specific model, well not until they deliver code that gets compiled during setup. Note, this can be done with optimisable intermediate code rather than source, but it wouldn't be easy.
See my journal, I write things there
The use of "big server" is somewhat misleading. :)
Fact is, anyone that heavily uses their Linux box will see some difference. It's just that the heavier your box gets used, the bigger difference you'll see.
Those that do little serious multitasking may see "smoother" multitasking but little more. Those that perform concurrent compiles, or run heavy CPU or I/O database servers, big time-share systems, etc., will see larger and larger noteworthy gains.
I'd LOVE to try out the 2.5 series, but because LVM is still not in there (not a week ago at least), and I have all my data (movies, oggs, etc) on LVM, I'm unable to use it... :(
Does anyone have a clue when there will be LVM for 2.5?
My other account has a 3-digit UID.