Tuning The Kernel With A Genetic Algorithm

Futurology by Cros13 · 2005-01-08 01:06 · Score: 1, Funny

So the terminator will run linux!

--
--cros13

Re:Futurology by Moonbird · 2005-01-08 01:10 · Score: 2, Interesting

You know, in the german translation of Terminator 2 Arnold talks about how his brain is made of "Neutrale Netze" or "neutral nets" re-translated.

It's pretty funny considering he is talking in his grave and totally serious robo-voice...

--

--
All extremists should be taken out and shot.
Re:Futurology by wertarbyte · 2005-01-08 02:08 · Score: 3, Funny

It's pretty funny considering he is talking in his grave and totally serious robo-voice...

You mean there is another voice?

--
Life is just nature's way of keeping meat fresh.
Re:Futurology by Eric+Giguere · 2005-01-08 02:11 · Score: 1

If one of Linus' kids takes over from his father then that could be considered a kind of tuning of the kernel using genetic algorithms!
Eric
Why the Vioxx recall reduced spam (well, maybe temporarily) (see also my William Shatner All-Bran humor
Re:Futurology by BJH · 2005-01-08 02:17 · Score: 1

Try watching "Twins" or "Kindergarten Cop" - he's also got a I'm-trying-to-be-funny robo-voice.

Innovation and open source by Anonymous Coward · 2005-01-08 01:09 · Score: 4, Insightful

A common criticism of Open Source is the accusation that it lacks innovation.

I mean, common. Just look at this. Amazing.

And even if it turns out to be not that good, it was still a good read :-)

Re:Innovation and open source by Anonymous Coward · 2005-01-08 01:14 · Score: 0

What innovative about a genetic algorithm?

They've been around in various forms for over over ten years. Popularised by Koza early nineties, they existed in a more mathmatical form in the fifties.

Choosing the objective function was hardly a chore.
Re:Innovation and open source by leehwtsohg · 2005-01-08 01:18 · Score: 1

No no no!
John Holland popularized genetic algorithms.
Koza popularized genetic programming which according to him is totaly different.
Re:Innovation and open source by Flaming+Foobar · 2005-01-08 01:20 · Score: 1

What innovative about a genetic algorithm?
Innovative != invention
This is very, very, innovative. It's easily the most innovative thing to hit a kernel since the 70's.

--
while true;do echo -e -n "\033[s\n\033[u\134_\033[B";done
Re:Innovation and open source by Anonymous Coward · 2005-01-08 01:24 · Score: 1, Insightful

No, this isn't amazing.

It's KEWL.

No different than every open source freak over the years proclaiming how 3D desktops were so obviously better than the existing 2D desktops of the day.

And all the while cut/copy/paste still doesn't work properly...
Re:Innovation and open source by Anonymous Coward · 2005-01-08 01:36 · Score: 0

I dunno, my powerbook has a 3d desktop and it looks and runs great. Even copy/paste works.
Re:Innovation and open source by Anonymous Coward · 2005-01-08 02:38 · Score: 0

the accusation that it lacks innovation.

Never hear that outside of the users who consider OSS as "alternative" (thus looking exactly for most clonest apps). Face it, if it's not from MS, it would never be seen as innovative here.
Re:Innovation and open source by devillion · 2005-01-08 02:54 · Score: 1

This is not really that amazing. I had a similar idea a year ago but never had time to test/implement it.
This is also not a new idea. Someone wrote a program which optimized gcc's flags to give the best possible combination of optimization flags.
In general the use of optimization (statistical inference methods usually work better) allows adaption of programs and algorithms to hardware.
I would be more impressed if if systems could actively develop/learn (adation in a much more general sense and with 'memory') and plan ahead.
Example: software which learns which combinations of parameters cause function to crash (causal inference), has (or builds) a model how actions (instructions, interrupts) change system (learning bayesian networks + modelling by hand) and can use models & gathered information to workaround or fix (simple/some) bugs (planning, logic or statistical 'rule' models (I'm working on (hobby) [implementation&theory] these ones at the moment)).
This idea is really not that new. Try googling 'self-healing systems'.
Want some fame? Then just make above idea work in practice and release it OSS.
This is much more complicated and you must have studied AI, statistics and machine learning to be able to understand how to implement above.
Re:Innovation and open source by LarsWestergren · 2005-01-08 04:17 · Score: 1

>> [...]theoretically allowing them to automatically tune themselves for the best possible performance for any given workload.

> A common criticism of Open Source is the accusation that it lacks innovation.

> I mean, common. Just look at this. Amazing.

Well, not to denigrate the efforts of the people working on the Linux kernel, but Java has had Hotspot and other Just in Time compiling tricks since 2001, and I'm sure there are others before it.

Java Hotspot does not use the same technology of course, but the goal is the same - automatic tuning.

Still, this was a very interesting read.

--
Being bitter is drinking poison and hoping someone else will die
Re:Innovation and open source by gaj · 2005-01-08 04:39 · Score: 1

Troll.
Just repeating a lie over and over doesn't make it true.
Cut/Copy/Paste works just ducky, and has for years. In fact, both methods of cut/paste work just ducky -- highlight/middle-click and highlight/shift-ctrl-x/shift-ctrl-v. And copy with highlight/shift-ctrl-c.
Re:Innovation and open source by MediumWare · 2005-01-08 04:51 · Score: 1

so you did RTFA???
Re:Innovation and open source by Anonymous Coward · 2005-01-08 05:06 · Score: 0

You fail it retard.

There is only one form of cut/copy/paste.
Re:Innovation and open source by Anonymous Coward · 2005-01-08 05:16 · Score: 0

RTFA, What you desribe is what it does.
Re:Innovation and open source by Anonymous Coward · 2005-01-08 05:31 · Score: 0

This is not really that amazing. I had a similar idea a year ago but never had time to test/implement it.
me too
Self learning system are cool for example i had this idea of a self learning lighter, after some learning period it should automatically ignit when holding it near a sigarette or whatever you like to burn.
Re:Innovation and open source by ColaMan · 2005-01-08 07:40 · Score: 3, Insightful

Sooo...... can I copy and paste an image from gPhoto into The Gimp yet? No? Call me when you guys manage to do what windows 3.0 could.

--

You are in a twisty maze of processor lines, all alike.
There is a lot of hype here.
Re:Innovation and open source by SunFan · 2005-01-08 08:40 · Score: 1

OSS is a perfect stomping ground for new ideas. As these ideas mature, the bad ones get weeded out, and they gain popularity, even the big commercial vendors adopt them.

For example, Solaris now officially includes and supports with Netscape/Mozilla, Perl, GTK, Python, Tcl/TK, Ghostscript, Apache, Ant, Bash, BIND, bzip, cdrw, Evolution, FreeType, lots of GNU utilities, GNOME, ImageMagick, IP Filter, SSH, MySQL, Ogg Vorbis, OpenSSL, Samba, Sendmail, Tomcat, X.org, and others, I'm sure. There's also sunfreeware.com and blastwave.org for pretty much everything else.

--
-- Microsoft is the most expensive commodity operating system and office suite vendor in the marketplace.
Re:Innovation and open source by Anonymous Coward · 2005-01-08 10:30 · Score: 0

yeah, and compilers have had keyhole optimizing for ever too. ohh and looky its not the same thing, yet it has the same goal, to make the program faster.
Re:Innovation and open source by AstroDrabb · 2005-01-08 12:56 · Score: 2, Informative

gPhoto is an image _viewer_ not an editor. In WinXP the default image viewer is "Windows Picture and Fax Viewer" and it doesn't allow you to copy an image from it and paste it into paint or photoShop, so it is no different than gPhoto. Have you ever tried to drag an image into Gimp? It works quit well.
Oh, and X supports many formats on the clipboard
One of the really cool, yet rarely used, features of the selection mechanism is that it can negotiate what data formats to use. It's not just about text. When one application asks another for the selection, part of their communication involves the requester asking the owner for the list of types in which they are capable of delivering the selection data; then the requester picks the format they like best, and asks for it that way.
As a simple example, suppose there is a program displaying text in multiple fonts. When pasting that into a text-only program, you'd want to paste only the text. But when pasting that into a word processor, you'd want to keep the font information: if both applications spoke HTML, they could use that as the intermediate format by which they transferred the data.
More complex things are possible, too: for example, when an image is selected on a web page, the web page displayer could offer to serve that up as raw image bits; or as JPEG data; or as the original URL of the image. When trying to copy and paste an image into a text editor that can't do images, the text editor might decide that the next best thing would be to paste the filename of the image, or the URL.

--
If Tyranny and Oppression come to this land,
it will be in the guise of fighting a foreign enemy. -James Madison
Re:Innovation and open source by TheKarateMaster · 2005-01-08 15:35 · Score: 1

But then there's always print screen / aquire...
Re:Innovation and open source by Chandon+Seldon · 2005-01-08 16:01 · Score: 0

So... why is this a nessisary feature? Really?

Just because Windows 3.0 had a workflow model built around copy and paste (and now has gone so far that that's what you do with *files*) doesn't mean that copy and paste of anything other than text is appropriate for a efficient desktop environment.

--
-- The act of censorship is always worse than whatever is being censored. Always.
Re:Innovation and open source by Anonymous Coward · 2005-01-08 16:50 · Score: 0

Except that it's been done before in at least five operating systems that I know of, three of them UNIX variants.
Re:Innovation and open source by Anonymous Coward · 2005-01-08 18:23 · Score: 0

The day you don't have to print screen/acquire to paste a jpeg into an application, Windows will be ready for the desktop.
Re:Innovation and open source by LarsWestergren · 2005-01-08 21:14 · Score: 1

yeah, and compilers have had keyhole optimizing for ever too. ohh and looky its not the same thing, yet it has the same goal, to make the program faster.

The difference is that Hotspot and this genetic algoritm performs the optimizing at runtime, not at compile time.

--
Being bitter is drinking poison and hoping someone else will die
Re:Innovation and open source by ColaMan · 2005-01-11 12:58 · Score: 1

Points noted.

I *was* being a little bit trollish :-)

Perhaps I should have just put "Generic Image Application A" and "Generic Image Application B" in there. I've had a lot more sucess in the windows world with it's clipboard and random applications-at-large than I've ever had with X.

--

You are in a twisty maze of processor lines, all alike.
There is a lot of hype here.

Complexity? by BurntNickel · 2005-01-08 01:20 · Score: 5, Insightful

So how much additional complexity is added for a 1-3% perfomance improvement? I'm all for more speed, but keeping thinks simple can often be more improtant when it comes to maintainablity and adding additional features.

--
And the knowledge that they fear is a weapon to be used against them...

Re:Complexity? by bersl2 · 2005-01-08 01:56 · Score: 2, Insightful

It's only one guy (I think; didn't RTFA), and it's nowhere near being included in the mainline kernel. Your observation may be correct in general, but what's the problem here? If an experiment pans out, the internals will be changed down the line to incorporate the new idea; if this idea were to have yielded a 5-10% increase in performance, would you have made that comment?
Re:Complexity? by Albinofrenchy · 2005-01-08 02:07 · Score: 1

Innovation sometimes requires complexity. Besides, genetic algorithms are not that complex.

--
"A man is but the product of his thoughts what he thinks, he becomes." -Mahatma Gandhi
Re:Complexity? by BurntNickel · 2005-01-08 02:12 · Score: 1

I still may have made the comment at 10%. It is really a matter of how much complexity you are willing to accept for a given performance improvement.

--
And the knowledge that they fear is a weapon to be used against them...
Re:Complexity? by arivanov · 2005-01-08 02:14 · Score: 1

I hate to sound like Victor Meldrew, but I seriously dislike the way 2.6 is going. It should have settled by now, but instead of that you see major changes to scheduler, network drivers in every changelog. The only thing that is being left alone for now is the VM (thanks god for that).

While I have to admit that none of them is anything as scary as some of the stuff that happened to 2.4 around 2.4.7-2.4.13 when the VM got changed, it is not right.

The supposedly ready and released kernel 2.6 by now from minor version 0 to minor version 10 has had its entire ethertool interface changed at least twice, the process scheduler changed at least once, the IO scheduler changed at least once, it still does not support things that 2.4 used to support in the USB subsystem (anyone running a scanner on it raise your hand). Shall I continue?

In btw, does anyone know of any place on the network where you can view the changelog split by subsystem/kernel tree directory? It is 1.5 on the average (so much for "stable" kernel) between minor version and it is getting quite hard to read.

--
Baker's Law: Misery no longer loves company. Nowadays it insists on it
http://www.sigsegv.cx/
Re:Complexity? by devillion · 2005-01-08 02:24 · Score: 2, Insightful

Understanding (basic of) GAs is easy and so is implementation. They also work quite well. That's why they are so popular.
IMHO, if GA implementation can be made really reliable it could maybe replace other code which may be (I don't know) even more complicated.
Re:Complexity? by m50d · 2005-01-08 02:29 · Score: 2, Insightful

Like I've said before, kernel 2.6 is simply not stable yet. Wait until they fork off 2.7, then with luck it will settle down.

--
I am trolling
Re:Complexity? by Sophrosyne · 2005-01-08 02:33 · Score: 1

don't worry, the new Kernel also comes on DVD.
Re:Complexity? by Morosoph · 2005-01-08 02:59 · Score: 2, Interesting

Genetic algorithms are pretty simple, compared to what the bulk of what the kernel is doing. Furthermore, the technology is a known quantity, and probably won't be running at run time. Given the existing size of the kernel (6 million lines), I don't think that it'll add a lot to complexity.

--
Wikileaks, no DNS
Re:Complexity? by BurntNickel · 2005-01-08 03:16 · Score: 1

I think the sheer size of the kernel is a good reason to reconsider adding every feature into it. There may very well be a lot of value in adding a GA scheduler but I think careful consideration needs to taken to be sure that features aren't added solely beacuase they are new and fancy.

I will be honest though and admit that I think there is already too much stuff in the Linux kernel.

--
And the knowledge that they fear is a weapon to be used against them...
Re:Complexity? by gnuLNX · 2005-01-08 03:44 · Score: 2, Insightful

GA's a pretty trivial to code. So very little complexity is added.

--
what?
Re:Complexity? by Carewolf · 2005-01-08 04:11 · Score: 1

It is stable. And if you check out the new development model, you can see there will not be a 2.7 anytime soon.
Re:Complexity? by Anonymous Coward · 2005-01-08 04:22 · Score: 0

The supposedly ready and released kernel 2.6 by now from minor version 0 to minor version 10 has had its entire ethertool interface changed at least twice, the process scheduler changed at least once, the IO scheduler changed at least once, it still does not support things that 2.4 used to support in the USB subsystem (anyone running a scanner on it raise your hand). Shall I continue?
No, you should start. Start describing something that is an actual problem. Well, okay, missing USB features/drivers is a problem, but it's the opposite of a stability problem. (And, BTW, I run a scanner with 2.6 and don't have any trouble with it).
The 2.6 kernel has continued evolving, but has remained stable, fast and reliable. Linus is so happy with the new development process that allows him to continue making significant changes in the stable branch that he's indicated there might not be a 2.7 at all.
Re:Complexity? by Anonymous Coward · 2005-01-08 04:30 · Score: 0

"I am hoping a scheduler guru will able to rework them to give higher gains."

Perhaps we will see better performance, due to this idea. Having new ideas is hard, making old ideas work better is easier.
Re:Complexity? by Dan+Ost · 2005-01-08 05:56 · Score: 1

If you don't count device drivers and other modules, how big is the kernel
now days?

--

*sigh* back to work...
Re:Complexity? by marcosdumay · 2005-01-08 06:02 · Score: 1

In a sense, don't asking the user numerical kernel parameter is simplifying it's instalation. So, i does make the kernel bigger (not very big, ineed), but simpler to tune.
But I also prefer the hill climbing for that, it is hard to trust a genetic algorithm at run time.

--
Rethinking email
Re:Complexity? by gnuLNX · 2005-01-08 06:28 · Score: 1

"it is hard to trust a genetic algorithm at run time"

Having developed several applications which rely an GA's I would have to agree with that statement. However someone could opimizing several scenarios with a GA and then having the schedular check for the closest scenario and then apply the parameters most appropriate for that scenario.

--
what?
Re:Complexity? by Billly+Gates · 2005-01-08 07:05 · Score: 1

Then why does Debian still include kernel 2.4?

2.6 is still beta.

BSD looks alot better.

--
http://saveie6.com/
Re:Complexity? by Apro+im · 2005-01-08 07:16 · Score: 1

Yeah, we see how well that worked out with 2.6.8's CD burning.

Not that I agree with the grandparent, but it ticks me off that there's no 2.7 branch, and i mean like awful. I'm all for developing the stable branch, but to your average user (like me!), you'd rather know your kernel is going to work.

How about instead of 2.7 being the groundwork for 2.8, why don't we put changes into 2.7, let the early adopters and kernel hackers use it, and when we find bugs like the Cd-burning problem, we can fix them before we release into 2.6
Re:Complexity? by jnelson4765 · 2005-01-08 07:55 · Score: 1

The best thing about this, IMHO, is screwball steups.
Under a normal load, perhaps, the standard schedulers work fine, but with some < 1% kind of setup, it will probably show much greater improvement.
The place where any GA shines is when it has a large number of generations - perfect for stick-in-a-corner-and-run kind of machines. Not something that I would put on a machine with wildly varying loads - it'd be impossible for it to stabilize, and depending on the time between generations, you could have some nasty harmonics develop.
But for clusters/backup servers/batch-processing systems, it's great.

--
Why can't I mod "-1 Idiot"?
Re:Complexity? by torpor · 2005-01-08 07:56 · Score: 1

Then why does Debian still include kernel 2.4?

Since when has Debian been the paragon of 'bleeding edge linux distros'?

--
; -- the corruption of government starts with its secrets. a truly free people keep no secrets. --
Re:Complexity? by grammar+fascist · 2005-01-08 07:58 · Score: 4, Informative

Understanding (basic of) GAs is easy and so is implementation.

I'm a machine learning researcher, and I'll second this. Also, the code for it will be quite self-contained (if done right), and shouldn't affect any parts of the kernel except where it's called to run an iteration.

They also work quite well. That's why they are so popular.

Sure they do. For a lot of problems, though, they're not so hot compared to other optimization methods like hill climbing and empirical gradient descent - they tend to run slowly - and I would like to ask Mr. Moilanen why he didn't use one. GAs are especially good with nominal parameters (discrete, unordered). But I would expect tuning parameters to be either discrete or continuous.

GAs are theoretically capable of finding global optima, except that's kind of a red herring. The only reason you can prove that is that, theoretically, no matter how small the probability, you'll eventually get exactly the right sequence of mutations to produce a global optimum. In practice, they tend to produce local optima like most other optimization algorithms - especially as Moilanen describes it:

All of the tunings are then ordered from best performance to worst, and the worst half of the performers are replaced with new possible tunings derived from the best half of the performers.

You generally have to be a little less selective (more random) than this.

--
I got my Linux laptop at System76.
Re:Complexity? by Billly+Gates · 2005-01-08 08:32 · Score: 1

Debian is not bleeding edge because they count bugs and only something bug free on all platforms is added into stable.

If 2.6 was production ready the debian folks would include it.

--
http://saveie6.com/
Re:Complexity? by xenocide2 · 2005-01-08 08:32 · Score: 2, Interesting

Genetic programming is a known quanitity, but if it "wont be running at run time" then its borderline useless. The whole idea is that the genetic algorithm adapts to a workload. Currently, scheduling is about fine tuning at the expense of flexibility. You tune the scheduler for a desktop setup, or for a database server, or whatever you have. Most tweaks yield a few percentage point gains in theory and maybe a single percent in practice for the given problem set, at the expense of about a ten percent penalty in the worst case scenario. Essentially, the genetic algorithm attempts to fine tune the kernel for you in ways.

Its not really that complex, but reasoning about how well it could perform is VERY difficult comparitively speaking. Furthermore, it introduces a much fuzzier notion of fine tuning. Rather than playing with variables like "swappiness" and cpu affinity, you're messing with a fitness function, where minute changes can move you from a percentage gained over a stock kernel to a percentage (or worse) loss. Certainly, I'm impressed that he's managed to make an improvement over stock with it, which puts interested kernel developers on a good first step. Nobody wants to chase technology that doesn't even show a hint of unlocking performance potential.

--
I Browse at +4 Flamebait
Open Source Sysadmin
Re:Complexity? by m50d · 2005-01-08 09:27 · Score: 1

It's not stable. When it runs on my system, and everyone's system, without crashing more than, say, hourly, then it's stable. 2.6 is nowhere near that.

--
I am trolling
Re:Complexity? by RedWizzard · 2005-01-08 10:05 · Score: 1

It's not stable.
What you meant to say is "it's not stable for me". I'm having no problems with it.
Re:Complexity? by m50d · 2005-01-08 10:24 · Score: 1

No. That's why I mentioned "everyone's system". A program which is stable on some of the systems it runs on is not a stable program. Doubly so for something as critical as a kernel.

--
I am trolling
Re:Complexity? by Anonymous Coward · 2005-01-08 10:34 · Score: 0

What a crock. Debian is loaded with buggy old programs that have been superceeded with better versions (KDE/Gnome 1.x for example).

2.6 is certainly less buggy than previous versions on numerous modern platforms. It's only Debian's perverse rules that elevate Atari ST bugs above those experienced by people buying new systems.
Re:Complexity? by (negative+video) · 2005-01-08 10:36 · Score: 1

If you don't count device drivers and other modules, how big is the kernel now days?
This is for a 2.6.5 kernel as patched by SuSE.
The very core part of the kernel and memory manager:
wc -l `find kernel mm -name "*.c" -or -name "*.h"` | tail -n 1 70669 total
The main part of the kernel:
$ wc -l `find crypto fs include/linux init ipc kernel lib mm net security sound -name "*.c" -or -name "*.h"` | tail -n 1 1605707 total
The i386-specific stuff:
$ wc -l `find arch/i386 -name "*.c" -or -name "*.h"` | tail -n 1 71276 total
The x86-64-specific stuff:
$ wc -l `find arch/x86_64 -name "*.c" -or -name "*.h"` 2> /dev/null | tail -n 1 25434 total
The drivers:
$ wc -l `find drivers -name "*.c" -or -name "*.h"` | tail -n 1 3102875 total
All the architectures:
wc -l `find arch -name "*.c" -or -name "*.h"` | tail -n 1 814266 total
Re:Complexity? by sketerpot · 2005-01-08 11:11 · Score: 1

To give some idea of how simple genetic algorithms are, here's a fun fact: if you know what you're doing, you can write one in a few hours. It'll work, too. Maybe not that well, or quickly, but it can work.
Re:Complexity? by RedWizzard · 2005-01-08 12:02 · Score: 1

No. That's why I mentioned "everyone's system". A program which is stable on some of the systems it runs on is not a stable program. Doubly so for something as critical as a kernel.
Unless the problem is your system: i.e. hardware.
Re:Complexity? by Dan+Ost · 2005-01-08 13:46 · Score: 1

How does that compare to other kernels (solaris, *bsd, etc)?

--

*sigh* back to work...
Re:Complexity? by m50d · 2005-01-08 21:50 · Score: 1

If that's the problem, not that I think it is, it should still be able to handle it as gracefully as 2.4 does.

--
I am trolling
Re:Complexity? by WNight · 2005-01-09 09:42 · Score: 1

Why did you upgrade to 2.6.8? Was it the version shipped with your distro? Did you require it to fix a specific problem? A new kernel version is *not* like a new game. It's not something you have to run out.

Stable in kernel speak means stable interfaces - as in you can code against any 2.6.x interchangably. Of course they can't guarantee that new code doesn't have any consequences so that's why they don't say "Hey Newbies! Come get your brand-new and untested kernel!"

It really is better - it means that because the new code is being developed as part of a kernel in mass use it means that it's easier for the new features to be stabilized. It means that if you really do need 2.6.8 you can get it by the time 2.6.10 is out, as opposed to if you needed a feature from 2.7.8 - you might have to wait till 2.8 is out, or until RedHat or someone else goes to probably no small ammount of work to backport it to the older kernel.

The only problem with this development model is that because new dev kernels are compatible with stable ones there's less of a technical barrier which means that average users try these kernels without understanding the consequences.
Re:Complexity? by arkanes · 2005-01-10 02:42 · Score: 1

10% improvement in scheduler performance would be pretty impressive, actually - the low-hanging fruit in the scheduler has pretty much been picked over in the last few years, and signifigant new improvements are likely to take this sort of complexity.
Re:Complexity? by aav · 2005-01-10 05:15 · Score: 1

From his email address, the guy works at IBM, perhaps in some research department. I would suspect this was published because the project needed more money and they had to show they produced something. Along the lines of "our results are inconclusive. Please give us more money to continue".

As the recent trends show it, in both academia and industry, it's better to publish crap than not to publish at all. It is actually commonly said "publish or perish".

So, the work of this guy is marginal at best. It un-necessarily complicates the code in the scheduler while the gains are minimal.

Even your 5-10% would be useless. They'd be easily compensated for by a slightly faster CPU.

All in all, this seems to follow a trend in OS research nowadays: there are preciously few who have good ideas, and the rest believe they are improving things by throwing in something exotic (e.g. genetic algorithms) they don't even understand, hoping that everyone will be impressed.
Re:Complexity? by aav · 2005-01-10 05:21 · Score: 1

Your point is short sighted. How many times do you actually have to run the optimiser to adapt the parameters to the current load ? If you had to do so one hundred times per minute (not unlikely on a web server, for instance, where processes constantly start and end), then the overhead from the GA would become quite substantial.

A method that takes a microsecond to run once, will take a millennium to finish if ran sufficiently many times.
Re:Complexity? by Apro+im · 2005-01-10 07:36 · Score: 1

I didn't. But unless I had checked various boards, there would be no way for me to *know* that there were problems with 2.6.8. And if I had been tracking kernel-image-2.6 in debian, I would have updated automagically. All I'm calling for is an easy way to differentiate untested and tested kernels.

good luck with that by Illserve · 2005-01-08 01:21 · Score: 4, Insightful

If a parameter space is complex enough that you can use a genetic algorithm to tune it, the solutions it finds may have all sorts of unexpected potholes, bugs, etc.

In other words, non-competitive genetic algorithms are only as smart as the fitness function you give them. If your fitness criteria aren't smart enough to cover all the bases, your solutions will have unpredictable deficiencies.

Re:good luck with that by Anonymous Coward · 2005-01-08 02:31 · Score: 0

The solution to that is obvious. You just need to use a genetic algorithm to determine the fitness function.
Re:good luck with that by Anonymous Coward · 2005-01-08 08:05 · Score: 0

Inside the computer, Tron is wondering, "why does that program have its eyes wired backwards, and have a neck nerve that passes down through the neck, around the heart, and back up into the neck." Progressing though local optimums -- yuck! Engineers that really understand a problem, should be able to outdo evolution.

not a panacea by DrSkwid · 2005-01-08 01:22 · Score: 2, Interesting

They might converge on a point of attraction that is not the highest possible.

Sure the only way is to exhaustively search the "chromosome" space for every possibile combination, and computers are good at brute force!

--
There are places where the networks are not touching,and there are places where they are-Boeing's Lori Gunter

Re:not a panacea by Ada_Rules · 2005-01-08 01:28 · Score: 1

They might converge on a point of attraction that is not the highest possible.
I agree. One of my favorite quotes comes from a Babylon 5 novel and is applicable to genetic algorithms.
"Evolution proceeds at a snail's pace toward imperfection and ends in extinction".
Still this is a pretty interesting approach to kernel tuning.

--
--- Liberty in our Lifetime
Re:not a panacea by nothingx · 2005-01-08 04:03 · Score: 1

Heh, did you read the article? In his high level description of what is going on, he says:

"7.) Mutate a set number of children to keep variance."

Random mutation is exactly what keeps this thing from converging on a local minima. If the trend was going toward a local minima, or suboptimal solution, the random mutations could generate just enough entropy to bump the trend out of that solution and on to a better one.
Re:not a panacea by John+Little+John · 2005-01-08 05:24 · Score: 2, Insightful

They might converge on a point of attraction that is not the highest possible.

Mutation helps with convergence issues.

Sure the only way is to exhaustively search the "chromosome" space for every possibile combination, and computers are good at brute force!

Um, I don't even know how many parameters are involved to be adjusted, but it does not take that many to make exhaustive search useless, even for a computer.

--
The sharp edge of a razor is difficult to cross. Thus the wise say the path to salvation is hard...
Re:not a panacea by Anonymous Coward · 2005-01-08 06:18 · Score: 0

The whole point of this is to avoid brute force algorithms for providing solutions. It is true that the entire search space should be available for the GA, but you need to allow it to "find the optimal", not try every combination. This is done through the mutation and recombination rates and methods, in addition to the number and length of the runs. GA's are really good for finding unique, non-intuitive solutions. If you want to fine tune the parameters for something already in place, brute force should work fine (assuming the number of possiblities is finite). If you want a uniqe algorithm or approach for scheduling, use a GA or GP. I have often seen GA's as a method for training Neural Networks. This is a waste - let the GA create the Neural Net, not just monkey with the wieghts. I kind of got off subject here, but you seem to be right and wrong at the same time. Check out Genetic programming: An introduction, On the automatic evolution of computer programs and its applications by Wolfgang Banzhaf, Peter Nordin, Robert Keller, and Frank Francone
Re:not a panacea by DrSkwid · 2005-01-08 06:19 · Score: 1

randomizing the variables doesnt guarantee anything.

It might even mean that no local maxima are *ever* found

--
There are places where the networks are not touching,and there are places where they are-Boeing's Lori Gunter
Re: not a panacea by Black+Parrot · 2005-01-08 10:36 · Score: 1

> Sure the only way is to exhaustively search the "chromosome" space for every possibile combination, and computers are good at brute force!

Problem is, the search space grows exponentially with the size of the chromosome. No problem if you have a short chromosome, but brute force is intractable for long chromosomes.

For example, if you are trying to optimize a set of 100 binary parameters for some problem, brute force requires you to evaluate 2^100 combinations.

--
Sheesh, evil *and* a jerk. -- Jade

Other kernel parameters? by Feint · 2005-01-08 01:24 · Score: 5, Interesting

Could this be extended to include other kernel parameters as well? Depending on your app, things like TCP timeouts and other muck can have a large impact. Tuning this stuff is currently somewhat of a black art. Then as the user community of the app becomes familiar after rollout, a lot of the usage patterns change. In a few cases, this means we end up having to re-tune the kernel.

If this package could be extended to the other parameters, it would save my customers a *lot* of time and money.

If nothing else, this could be a deciding factor for some of our clients to use linux instead of windows.

Re:Other kernel parameters? by Corpus_Callosum · 2005-01-08 02:02 · Score: 2, Interesting

Now your talking. Adaptive tuning is definitely the future. While I disagree that a general purpose GA is useful here (you should not let a GA hit an operational system, you need to let it hit a simulation first to build up a certain amount of fitness in it's solution space), many adaptive techniques would be useful and could eliminate the need to hand tune in many environments.

--
The reason that it can be true that 1+1 > 2 is that very peculiar nonzero value of the + operator
Re:Other kernel parameters? by bvankuik · 2005-01-08 04:36 · Score: 1

What packages do your customers run that kernel tuning on a case-by-case basis is necessary?? Also, what kind of admins do these clients have? I've met quite a few knowledgable admins, but never one that felt the need to deviate from the parameter values mentioned in the manual that came with the db/appserver... That is, if they bothered to read it.

And besides, I don't really feel that the OS is very important on the decisionmaking level, let alone some obscure feature of something called a kernel.
Re:Other kernel parameters? by burns210 · 2005-01-08 09:19 · Score: 2, Interesting

That is a great idea. Now here is a dumb one:

What about adding hooks for applications to to send/recieve performance changes after tweaks? Services, daemons, etc, need to communicate how the GA's latest tweak adjusted performance, right?

So.... by mstefanus · 2005-01-08 01:36 · Score: 2, Funny

So will this means that if I install this kernel on my computer I will have baby Pentiums or baby Athlons soon?

Re:So.... by LiquidCoooled · 2005-01-08 03:36 · Score: 2, Funny

If you nurture your computer, and occasionally sit it next to another computer then maybe, just maybe, when you wake up one morning, you will have little PDAs running around the place :)

--
liqbase :: faster than paper
Re:So.... by stor · 2005-01-08 13:13 · Score: 1

Yes.

And if you get a Mac, the promise of free Ipods will suddenly come true.

Cheers
Stor

--
"Yeah well there's a lot of stuff that should be, but isn't"
Re:So.... by temojen · 2005-01-09 10:00 · Score: 1

No, It'll make little kernels. Soon you'll have a HURD.

Not worth it... by mikelang · 2005-01-08 01:36 · Score: 2, Insightful

1-2% gain is in the borders of statistical error. Definitely not worth the increased complexity of the solution.

Re:Not worth it... by Corfe · 2005-01-08 02:03 · Score: 5, Insightful

It's a unique idea - what's wrong with running it for a while with your typical load (say, for a fileserver), finding some better-than-average parameters for the kernel, then running an unpatched kernel with those parameters manually entered?

What is "on the borders of statistical error" depends on how many times the test was run, and how much variation there had been in his results before. I think it's pretty safe to assume that if he knows how to implement a genetic algorithm into the linux kernel, he knows how to handle statistics properly.
Re:Not worth it... by gatkinso · 2005-01-08 02:07 · Score: 1

Slighty disagree - I think it is worthy of evaluation... probably better place for this is on the 2.7 branch.

--
I am very small, utmostly microscopic.
Re:Not worth it... by gl4ss · 2005-01-08 02:32 · Score: 1

..which would be why he's asking for scheduler gurus to work in it, no?

being mostly just a proof of concept at this stage.

--
world was created 5 seconds before this post as it is.
Re:Not worth it... by UMMCsci · 2005-01-08 06:27 · Score: 1

Increased complexity is not a given. It is true that GA's and Genetic Programming tend to 'bloat' the solution, but the fitness can also be defined by evaluating simplicity as well as effectiveness. This would weed out more complex solutions even if they are somewhat better. If you let the process run for enough generations, you may find a more simple solution that runs better.
Re:Not worth it... by xenocide2 · 2005-01-08 08:59 · Score: 2, Insightful

Toms hardware consistantly favors computer hardware that only pushes above the competition by 1 percent or less. People spend an extra 40 dollars for this performance, and you're not willing to consider that people might like a FREE performance boost of a percent?

--
I Browse at +4 Flamebait
Open Source Sysadmin
Re:Not worth it... by jelle · 2005-01-08 09:51 · Score: 2, Interesting

How do you know the margin of error? I've seen systems/measurements where 50% difference is a statistical error, and systems where it needs to be less than 0.2% to be a statistical error.

Pragmatism and statistics are _not_ a good mix.

Note that, for example, many hosting providers host hundreds of web sites per system. Adding a couple of percent in performance then adds a couple of percent to the bottom line of the cost picture for those companies. The same is true for supercomputer clusters used by many companies and universities with hundreds of nodes.

Even though 1-2% sounds like 'next to nothing', but that's not how you should look at it. If you ignore 2% only five times, you've really already ignored 10.5%...

There is a dutch saying that, when translated in english is like this: "If you don't honour the small things, you're not worth the big ones".

--
--- Hindsight is 20/20, but walking backwards is not the answer.
Re:Not worth it... by Anonymous Coward · 2005-01-09 17:23 · Score: 0

If you ignore 2% only five times, you've really already ignored 10.5%...
It might help your argumentation skills to learn simple arithmetics.
0.98^5 ~= 0.904 1 - 0.904 = 0.096
In other words: 9.6%
Get hard facts wrong;
watch credibility hit ground with big splat.
(Made up Chinese saying.)
Re:Not worth it... by aav · 2005-01-10 05:32 · Score: 1

Nothing wrong with running it off line, then using the found parameters. However, there may be major drawbacks: a possibly good solution for a particular load, might turn out to be unusually bad for a slightly different load. Given that there seems to be a need for genetic algorithms to search for the solution, it follows that the search space is sufficiently complex and not all that "well behaved". This means that the solutions will be rather unstable. In this case, perhaps it's better to accept a solution that's moderately bad for all tasks, instead of picking one that's excellent for a rare configuration and quite bad for most.

Moreover, the theory that genetic algorithms are easy is a myth perpetuated by those with shallow understanding of the concept. Sure, the concept and the code are simple. However, setting the parameters for genetic algorithms is a bitch and their meaning is not entirely related to the problem at hand.

So, all that this guy managed (with virtually no improvement) was to get rid of some understandable parameters coded in the original scheduler, and replace them with the mutation rate, crossover probability and population size, which don't have any meaning in terms of process scheduling.

And, no, kernel hacking does not imply any knowledge of statistics or statistical experiment design.
Re:Not worth it... by Anonymous Coward · 2005-01-17 20:42 · Score: 0

Asshole moron.

1.02^5 = 1.104...

Ignore 2% IMPROVEMENTS was implied.

This has been done before by drsmack1 · 2005-01-08 01:45 · Score: 2, Funny

http://tinyurl.com/6pkzc
"Daystrom felt that such an act was an offense against the laws of God and man, and the computer that carried his engrams also believed it."
--Kirk

--

Humor from a Genetically Molested Mind

Re:ooooh man by johannesg · 2005-01-08 01:45 · Score: 0

Surely you mean "pirated my intellectual property"?

Re:Dear Kernel Coders by gclef · 2005-01-08 01:46 · Score: 2, Insightful

Nice troll, but your sarcasm presents a common fallacy: that work on one issue (adding features like this) means that less work is being done on some other issue (cleaning up security problems). The fact is, throwing more people at a problem does not always make it better, especially if the people you throw at it don't know the subject (which the author of this algorithm may not, can't speak for him).

In other words: if you have someone who's good at writing Genetic Algorithms, but not so good at searching for kernel vulns, why devote that person to security? Why not let them write their contribution, and have the guys who are good at security do their part separately? Why should we only do one thing at once?

If the algorithm is buggy ... by Anonymous Coward · 2005-01-08 01:47 · Score: 0

... it'll come up with Windows!

Re:Dear Kernel Coders by Anonymous Coward · 2005-01-08 01:48 · Score: 0

Blatant troll, besides which the vunerability is already fixed.

Re:Dear Kernel Coders by kneeless · 2005-01-08 01:48 · Score: 3, Informative

As mentioned previously on Slashdot, uselib() comes from Linux 0.13. It was kept for the a.out to ELF transition. Someone recently noticed it and said, "What is _that_ doing in my system?" This is new code that's being looked at by hundreds of developers. It's pretty hard to get root kernel exploits when it's like that. Plus, this code doesn't introduce any calls with user level priviliges. (Read: no exploit)

--
Bran muffins and whiskey.

Simulated Annealing by Mark_MF-WN · 2005-01-08 01:50 · Score: 2, Interesting

That's where 'Simulated Annealing' comes in. It can often avoid local maxima that aren't optimal.

Re:Simulated Annealing by aav · 2005-01-10 05:40 · Score: 1

And simulated annealing is just as good as genetic algorithms. Instead of setting the mutation rate and crossover probability, you have to set the probability to accept a worse solution that takes you out of the local minimum, and you have to define a neighbouring function (which can be a bitch!! - more so than defining the crossover).

Fancy != useful
Re:Simulated Annealing by Mark_MF-WN · 2005-01-10 09:14 · Score: 1

Simulated annealing is just a particular WAY of approaching genetic algorithms -- the idea being that you start with an extremely high mutation rate, and gradually reduce it over time.

One question remains: by Qbertino · 2005-01-08 01:52 · Score: 2, Funny

Did SCO allow him to modify their kernel?

--
We suffer more in our imagination than in reality. - Seneca

Darwin's Law by adeydas · 2005-01-08 01:54 · Score: 1

Looks more like Darwin's law to me. The bottom performace children are eliminated, etc etc. Nice work but the efficiency is far too less.

Re:Darwin's Law by Anonymous Coward · 2005-01-08 02:12 · Score: 0

Well, that's what genetic algorithms are about :) Modelling darwin law and genetic inheritance, to find fairly good solutions
Re:Darwin's Law by gonzo-wireless · 2005-01-08 02:12 · Score: 0

I'm much prefer exaptation over evolution. I'm not religious, but, I think Darwin had it all wrong.

A kernal supporting exaptation would be the start of a new form of AI. Possibly the really bad stuff we see in films. At any rate, it'll be more powerful than this.

As a side note, I would have tried to use Bayesian statistics in this genetic algorithim. I imagine, however, that the strain on resources would be too high.
Re:Darwin's Law by DrKyle · 2005-01-08 06:07 · Score: 1

Just great, now my computer can win a Darwin award when it's mutated optimization wipes my hard disk. Seriously though, this whole "genetic algorithm" may be an advance, but how do they keep the code from getting stuck in local fitness peaks.

Let's say you start with a simple a-b-c-d
you then find that A-b-c-d has an increased fitness and a-B-c-d, a-b-C-d and a-b-c-D all have decreased fitness from the original. Then to optimize you always keep the big A because it is responsible for the performance gain. The problem is, sometimes 2 "wrongs" or less fit mutations when combined can create an even higher fitness. So perhaps a-B-C-d is actually the optimal fitness, but it would not likely be reached because you have to first decrease the fitness to get from A-b-c-d to a-b-c-d (worse) to a-B-c-d (even worse) to a-B-C-d (best)
This phnomenon of local fitness peaks is well understood in biology, but do they have the algorithms to handle it in computing?
Re:Darwin's Law by Anonymous Coward · 2005-01-08 10:20 · Score: 0

Random mutation. You test A-b-c-d, a-B-c-d, a-b-C-d, a-b-c-D, and mutation (say) a-B-C-d. Sometimes (this is known as simulated annealing) you even test children of a-B-C-d, even if a-B-C-d didn't result in a performance increase. (Sometimes being shorthand for "with calculated probability.")

Different setting domains by Anonymous Coward · 2005-01-08 01:54 · Score: 0

Generally the different setting domains should be kept separate. But, there's a problem in that. Extreme settings in one domain can and probably will affect other areas, so there should be control at atleast two levels.

Anyways, nice to see that someone is actually implementing this kind of functionality.

Re:ooooh man by 2$+Crack+Whore · 2005-01-08 01:54 · Score: 1

No! Pirates are people who board naval vessels without consent. Its "you copyright infringed my intellectual property"

Re:Oh, Oh by Sweetshark · 2005-01-08 01:55 · Score: 1

that is so sad.
kudos to SpanKY, who found the right words for this poor soul ....

GA + Hill Climbing... by Corpus_Callosum · 2005-01-08 01:56 · Score: 3, Interesting

First thing: A GA is only truly effective if you let it exhaustively search the search space - which is why GAs are run against simulations rather than in operational systems. Imagine trying to tune a kernel at runtime by occassionally switching to random tuning parameters. I think this is extremely non-optimal. Of course, if most of the heavy lifting is done before-hand and the GA is simply examining pre-defined parameter sets on your machine, it could work. But it's not really much of a GA anymore.

As an alternative, perhaps using some form of pseduo-GA that tries to find pre-tuned parameters that most closely match your operating environment and then letting a Hill-Climbing algorithm hit it would be a better solution.

Hill climbing can also be used in a GA type manner by letting the GA determine witch parameters to climb and in what order. The climbing itself is pretty straightforward, allow vectors to interact with individual parameters. If the result is worse, reverse the vectors or switch to new parameters. Rinse, repeat.

Yes, GA can produce odd bugs and potholes. Yes, it is the fitness test that determines if that will be true. But a good GA will generally find solutions that are as good or better than hand tuning for search spaces that are very complex. Overall, this is a good idea but is probably more complex than advertised.

--
The reason that it can be true that 1+1 > 2 is that very peculiar nonzero value of the + operator

Re:GA + Hill Climbing... by Anonymous Coward · 2005-01-08 02:28 · Score: 0

What parent said is largely true. I suspect the fitness function to be a bear, lots of correlated dicontinuities will appear. Any time some memory resource crosses a page boundary there will be a discontinuity due to the kernel running off to satisfy the page fault. Now consider this situation: some memory block sits near the edge of a page. Making it a little larger is strongly penalized. Moving it wholesale onto new page and growing it may be a very good thing. Moreover once you move it some other block may benefit from expansion. Now if you have killed off the GA and are using a straight forward hill climber (a la Simplex) you won't find the optimal solution. Either you need the GA to continue or your need a really good classical minimizer to move from your pre-defined seed points.

Minimization is not easy. There's a reason, for example, that many particle physicists rely on a minimization package that was first started in the 60's under FortranIV -- it's been pounded on for 30+ years, works well and most importantly it is good at detecting failure. The latter is crucial since many of the regions in parameter space are strongly correlated when fitting forms based on Cauchy's or multiple-exponentials or complicated partial wave expansions.
Re:GA + Hill Climbing... by Corpus_Callosum · 2005-01-08 02:46 · Score: 1

Now if you have killed off the GA and are using a straight forward hill climber (a la Simplex) you won't find the optimal solution. Either you need the GA to continue or your need a really good classical minimizer to move from your pre-defined seed points.

I agree. However, we have a problem: Running a GA on an operational system will (MUST) test more highly sub-optimal solutions than it will near-optimal solutions. The performance penalty for searching the space will overwhelm the performance gains from ocassionally running with highly tuned solutions.

This may be solved, perhaps, by only rarely searching - allow the system to run the most fit solution in acquiesence most of the time (perhaps only letting the GA run for 1 minute every hour or something similar). Combine that with a highly conservative mutator (that resembles a hill climber) and a modestly conservative recombination algorithm for solution mating and perhaps this solution could have legs.

As I point out in another post, we still have the problem of performance metric acquisition. It is the applications that are running that we need to optimize performance for and those metrics are not available to the kernel. It would be interesting to standardize some form of adaptive performance metric publishing for applications to get that data to the kernel.

--
The reason that it can be true that 1+1 > 2 is that very peculiar nonzero value of the + operator
Re:GA + Hill Climbing... by Anonymous Coward · 2005-01-08 03:14 · Score: 0

This may be solved, perhaps, by only rarely searching - allow the system to run the most fit solution in acquiesence most of the time (perhaps only letting the GA run for 1 minute every hour or something similar).

That solution would mean only tuning the performance for whatever the system happens to be doing during that 1 minute. Perhaps a better solution would be to keep track of the current performance (via the fitness function) and begin genetic tuning only when it falls below a certain benchmark (ie: the average of the last 50 fitness function evaluations).

So, for example, you're running a game server for a couple hours so everyone on your LAN can play some UT2k4 or whatever. It tunes your settings to optimize these conditions. Then it stops and just observes, so as to maximize the benefits; "The performance penalty for searching the space will overwhelm the performance gains from ocassionally running with highly tuned solutions." Next, you quit the game and start copying some files to your laptop, a totally different usage pattern. The fitness function returns less-than-running-mean numbers so the GA kicks on and starts testing new settings. Within seconds, everything is tuned up and it stops again, leaving the settings in place and just observing passively.
Re:GA + Hill Climbing... by Corpus_Callosum · 2005-01-08 03:51 · Score: 1

That solution would mean only tuning the performance for whatever the system happens to be doing during that 1 minute. Perhaps a better solution would be to keep track of the current performance (via the fitness function) and begin genetic tuning only when it falls below a certain benchmark (ie: the average of the last 50 fitness function evaluations).

Yes, you are right about that. I suppose I am imagining that this would be most useful in a data-center rather than the desktop. The desktop scenario is so varied as to pose significant problems for the GA. However, let's examine the general case a bit (in a sec).

Your suggestion of flipping the GA back on when current fitness drops below a critical threshold has some inherent assumptions that need to be examined. First, it assumes that we are continually calculating the fitness of the currently running parameter set - an operation that degrades performance. Second, it assumes that we have a fitness test that is temporally monotonically consistent (e.g. over time, the fitness score has the same meaning in an absolute rather than relative sense). Third, it assumes that the fitness score will drop below a critical value (what if it doesn't?). All of these assumptions are suspect, I believe.

So, for example, you're running a game server for a couple hours so everyone on your LAN can play some UT2k4 or whatever. It tunes your settings to optimize these conditions. Then it stops and just observes, so as to maximize the benefits; "The performance penalty for searching the space will overwhelm the performance gains from ocassionally running with highly tuned solutions."

But it is still calculating fitness, which is most likely to be the most expensive part of running the GA.

Next, you quit the game and start copying some files to your laptop, a totally different usage pattern. The fitness function returns less-than-running-mean numbers so the GA kicks on and starts testing new settings.

Here is where the question of temporal monotonicity must be examined. Under completely different usage patterns, are the fitness values meaningful when compared to the previous set? Or are they only meaningful when compared to other fitness values calculated under the same load? Coming up with a fitness test that has the characteristic of temporal monotonicity would be very challenging, I believe.

It is just as likely that while other "tuning solutions" are now better under the new load, the fitness value that is calculated under the new load is even better than it was when the UT server was running (when it was the best solution). You see the problem?

Within seconds, everything is tuned up and it stops again, leaving the settings in place and just observing passively.

Observing actively, since it is still calculating fitness. But I see where you were trying to go with this.

Here is a different solution that may be more effective:

Allocate to every solution (tuning parameter) in the current population a statistically relevent number of timeslices to run, based on it's fitness. So the most fit solution is allocated the highest number of timeslices and the least fit solution is allocated the lowest number of timeslices. The statistical curve for this distribution may also be tuned using the same GA.

For each timeslice, randomly (or following some predetermined pattern) select a tuning set (solution) according to our statistical distribution. Calculate performance metrics.

Compute next generation and repeat.

One variation would be to only compute performance metrics the first time a solution is run in a generation - it would be less accurate but also lower overhead.

The point of this (or my previous suggestion of 1 min / hour, or anything similar) would be to compare apples to apples, since I don't believe we can assume temporal monotonicity.

Thoughts?

--
The reason that it can be true that 1+1 > 2 is that very peculiar nonzero value of the + operator
Re:GA + Hill Climbing... by TapeCutter · 2005-01-08 05:07 · Score: 2, Interesting

Moment preezz, I think the both of you are missing something. I have had some experience writing commercial software that collects performance stats for things like capacity planning, tuning, etc. As you can imagine it would be bad press for an application like that to be a performance hog, but (in my experience) when used to "collect all" the machine will take ~7% performance hit (many competitors were worse for less data).

Granted it was a user level app and stored it's data in an sql db, but roughly half that data is from running processes.

The collection of timely data will overwhelm anything else. I have no idea of the details of what has been done except from the article. I think the only way it could be usefull would be as a tuning tool for servers (most heavy duty servers have regular patterns of usage).

You run the tool on your server to collect data. You then use the data and a seperate machine to "evolve" a solution. The solution may involve many changes at various times, whatever, but these could be sheduled into the server. In most places you would want to collect data for at least a month,preferably for 13 months.

Note that the 3% was against particular benchmarks. If you based your benchmarks on data collected from your servers the "evolution" stage could be a snap. A lot of large companies won't even spend money to collect and analyse basic data because they run on a "fix on fail" basis and just throw hardware at it because memory/disk space is relatively cheap and they think it might help.

Very interesting stuff and a great thread but I can't see the GA-kernel as a significant improvement.

--
And did you exchange a walk on part in the war for a lead role in a cage? - Pink Floyd.
Re:GA + Hill Climbing... by Corpus_Callosum · 2005-01-08 05:22 · Score: 1

Moment preezz, I think the both of you are missing something. I have had some experience writing commercial software that collects performance stats for things like capacity planning, tuning, etc. As you can imagine it would be bad press for an application like that to be a performance hog, but (in my experience) when used to "collect all" the machine will take ~7% performance hit (many competitors were worse for less data).

Actually, that is precisely what I am talking about. The fitness test that produces fitness values is responsible for data collection and analysis. It is precisely that activity that I am stating must be minimized in order for adaptive tuning to be useful.

That being said, I do believe there is promise to this technique. The solution must minimize it's data-collection and analysis. Perhaps the other poster's assertion that "critical events" would initiate further tuning is along the right track.

Virtual machines (such as Sun's HotSpot for Java) have shown us that run-time tuning and optimization is a powerful technique. Making use of it in the Linux Kernel is a quite good idea. Doing it correctly and minimally is the challenge.

IMHO

--
The reason that it can be true that 1+1 > 2 is that very peculiar nonzero value of the + operator
Re:GA + Hill Climbing... by Illserve · 2005-01-08 05:53 · Score: 4, Insightful

First thing: A GA is only truly effective if you let it exhaustively search the search space

If you have the resources to exhastively search the space... you don't need a GA.

A GA is generally used when the search space is hopelessly huge and you need to chart a course from A to(wards) B but you don't know the way.

And in finding this solution, which is "grown", not engineered, it's much easier for unintended wierdnesses to creep in. A GA might solve problem X by ruining performance on problem Y, something that you, as a software engineer, would never even consider viable, and hence you forgot to force the fitness function to check problem Y along with problem X.
Re:GA + Hill Climbing... by Anonymous Coward · 2005-01-08 06:17 · Score: 0

A GA is only truly effective if you let it exhaustively search the search space

I don't think that's true. What do you consider effective? I consider the algorithm to be effective if it finds a better solution, which this does.

Even if you consider "effective" to mean "finds the optimal solution", you don't have to do an exhaustive search for it to be effective - you'd have to be really unlucky to find the optimal solution on the very last iteration. You only have to do an exhaustive search to know that you've found the optimal solution.

In any case, brute forcing the solution is often impossible/infeasible, which is where genetic algorithms work best.

which is why GAs are run against simulations rather than in operational systems.

If by that you mean that the fitness function uses a simulation rather than measures its actual performance in the real world, there are two reasons for that, and neither of them are what you describe.

The first reason is that what actually happens in the real world in one iteration may not be representative of usual behaviour. You don't want to optimise your whole system based upon measurements from an out-of-memory condition.

The second reason is that there is no guarantee that any given chromosome will give acceptable behaviour - in fact it's virtually certain that some will be completely broken. The idea behind genetic algorithms is that the population as a whole produces some improvements from generation to generation, not that each individual is an improvement.

Hill climbing can also be used

If hill-climbing worked, then people wouldn't bother with the more complex genetic algorithms, would they? Hill-climbing algorithms tend to get stuck at local maxima rather than find the optimal solution.
Re:GA + Hill Climbing... by Anonymous Coward · 2005-01-08 06:25 · Score: 0

Within seconds, everything is tuned up and it stops again, leaving the settings in place and just observing passively.

Observing actively, since it is still calculating fitness. But I see where you were trying to go with this.

All of the points brought up in the parent in response to my original post were valid concerns. However, if you look at the patches that Jake wrote (and credit to him, being so new to the field and going out on a significant limb) you can see in the function disk_calc_fitness(void), just as an example, that he's calculating the average throughput for the amount of time each child has been implemented. Also, it appears to be "passive" because, as I actually meant, there are no adjustments being made, but when running the fitness function, you can use metrics that are always calculated and record a very basic fitness for the current parameters. I base this on the usage of disk_stat_read(), et al. Since he's using in-kernel metrics already present, I'm not sure 'overhead' will be a primary concern. After doing a good bit of research myself over the years, I'd also venture to say that, in this very specific case, there's no reason to think the fitness function(s) will be the most expensive part(s) of this patch. But that all depends on which implementation you go with (ie: when do you let the GA 'spawn' new children).

He uses a queue to ensure each is tested evenly, though it would certainly be interesting to see, as it was suggested, the performance if we favored the most fit at a given time. I'd be willing to test some of this stuff in a sandbox if you guys want to throw around some idea/code contact me (angel.one\@gmail.com).
Re:GA + Hill Climbing... by Corpus_Callosum · 2005-01-08 07:00 · Score: 1

If you have the resources to exhastively search the space... you don't need a GA.

Ahh, semantics. "Allow the GA to exhaustively search vs. The GA searches exhaustively"

Of course the GA will not do an exhaustive search, but it must have the ability to do so (e.g. explore all the really bad solutions as well as the good ones - the solution space should not be artificially constrained by eliminating random solutions from testing).

It is pretty obvious that I was trying to say this if you actually look at my post: "First thing: A GA is only truly effective if you let it exhaustively search the search space - which is why GAs are run against simulations rather than in operational systems. Imagine trying to tune a kernel at runtime by occassionally switching to random tuning parameters. I think this is extremely non-optimal."

The point is simple: If you free the GA to do what it is supposed to do, you will often be running random (very suboptimal) tuning parameters which will wash out the performance gains that are achieved.

I guess: Listen to what I mean, not what I say ;-)

And in finding this solution, which is "grown", not engineered, it's much easier for unintended wierdnesses to creep in. A GA might solve problem X by ruining performance on problem Y, something that you, as a software engineer, would never even consider viable, and hence you forgot to force the fitness function to check problem Y along with problem X.

Of course, this can be true. That is why debugging the fitness test is the most important activity in implementing a GA. We agree.

--
The reason that it can be true that 1+1 > 2 is that very peculiar nonzero value of the + operator
Re:GA + Hill Climbing... by Corpus_Callosum · 2005-01-08 07:17 · Score: 1

"A GA is only truly effective if you let it exhaustively search the search space"

I don't think that's true. What do you consider effective? I consider the algorithm to be effective if it finds a better solution, which this does.

Effective: Sure, any gain is welcome. That is correct. But for the GA to be effective at what it does, it needs to be able to search the search-space. If the GA starts with 20 predetermined tuning parameter sets and mates and mutates those, then all we have is a fancy hill-climber. In order to find novel Maxima in the search-space, the GA should not be constrained - it must have the ability to exhaustively search, even though it does not (of course) do such a thing.

The first reason is that what actually happens in the real world in one iteration may not be representative of usual behaviour. You don't want to optimise your whole system based upon measurements from an out-of-memory condition.

I am simply stating that running a kernel with random tuning parameters some percentage of the time (which is basically required in a non-simulation GA) will produce ugly results.

The second reason is that there is no guarantee that any given chromosome will give acceptable behaviour - in fact it's virtually certain that some will be completely broken. The idea behind genetic algorithms is that the population as a whole produces some improvements from generation to generation, not that each individual is an improvement.

Actually, the point behind a genetic algorithm is to adapt without human intervention, to evolve solutions that are "good enough", often novel and difficult to create in other ways. You DO NOT WANT to make use of your weak solutions, however. Populations should be diverse and that in no way guarantees you high population fitness. In fact, you will get better results in a GA if you always generate some random solutions. The average fitness over all solutions in a population, in a case like this (when taking random + highly evolved solutions and giving them time-slices) is likely to produce worse system performance and at best will give only modest gains with sporadic somewhat unpredictable performance.

If hill-climbing worked, then people wouldn't bother with the more complex genetic algorithms, would they? Hill-climbing algorithms tend to get stuck at local maxima rather than find the optimal solution.

GA finds the hill. Hill climbing gets to the top of it. Mutation in GAs (if written correctly) often perform some amount of hill-climbing. My point was not to eliminate GAs. My point in the original post was to use them correctly and sparingly and let Hill-climbing do some of the work. You want to minimize random behavior in the kernel as well as tuning overhead.

But I think GA is an excellent idea for tuning the kernel.

--
The reason that it can be true that 1+1 > 2 is that very peculiar nonzero value of the + operator
Re:GA + Hill Climbing... by Illserve · 2005-01-08 07:29 · Score: 1

Nothing about properly implemented GA's can be considered "exhaustive". They are, by definition, a means of effectively navigating a hopeless search space. For example, it is impossible for a GA to "explore all the really bad solutions", as there are far too many in any interesting problem space.
Re:GA + Hill Climbing... by Anonymous Coward · 2005-01-08 07:41 · Score: 0

What part of "Has the ability to exhaustively search the search-space" do you not understand?

Did I ever assert that a GA actually performs an exhaustive search?

A GA samples datapoints in a search-space and uses the results of those samples to refine it's searching algorithm. But to be effective, it must not be artificially constrained - it must have the opportunity to look at all regions of the search space equally.

Am I really not making sense to you?
Re:GA + Hill Climbing... by Illserve · 2005-01-08 07:50 · Score: 1

Having the ability to "exhaustively search" imples not only access to the entire problem domain, but also the computational power to try them all (or at least a significant percentage).

GA's, IMHO, are designed for problems in which the amount of computational firepower is practically insignificant in comparison to the size of the domain.

I think you'd do better finding a different term than "exhaustive" for what you are trying to say.
Re:GA + Hill Climbing... by Anonymous Coward · 2005-01-08 07:55 · Score: 0

You are stuck on this word, exhaustive...

Free your mind. For the forth time, I am stating that a GA must have ACCESS TO THE ENTIRE SEARCH SPACE WITHOUT PREJUDICE OR BIAS.

Sorry I didn't say it in a way that you like. Are you one of these people who pushes your interpretation of the use of the Enlish language on everyone you talk to? Did you not understand what I meant or did you just want to pick a fight over my choice of words? Are you feeling intimidated by me and wanted to give yourself a little boost?

I consider you a troll.
Re:GA + Hill Climbing... by Illserve · 2005-01-08 08:02 · Score: 1

No, it's just that the word exhaustive has a very particular meaning in the context of search algorithms.
Re:GA + Hill Climbing... by Corpus_Callosum · 2005-01-08 08:41 · Score: 1

A GA does exhaustively search a search space, given enough time. It just so happens that the solutions it finds while searching are generally very useful. These solutions may be used immediately and in some cases, you can stop searching further. But if you let the algorithm continue long enough, it WILL SEARCH THE ENTIRE SPACE.

If the GA is not able to reach the entire space or if it is biased in such a way as to make portions of the space hard to reach, then it is not an effective GA. If I wanted to state this in one sentence, I could say "A GA is only truly effective if you let it exhaustively search the search space." This was the exact statement that you took issue to. Please note that nowhere did I state that you had to let the seach finish before the results were useful.

Do you still not understand?

--
The reason that it can be true that 1+1 > 2 is that very peculiar nonzero value of the + operator
Re:GA + Hill Climbing... by Illserve · 2005-01-08 08:52 · Score: 1

These statements apply to all search algorithms for parameterized problem domains. No search algorithm is very effective if it is bottled off from significant portions of the domain.
Re:GA + Hill Climbing... by Corpus_Callosum · 2005-01-08 08:58 · Score: 1

These statements apply to all search algorithms for parameterized problem domains. No search algorithm is very effective if it is bottled off from significant portions of the domain.

Thank you. We agree. That was exactly my point - the implications are that in a scenario like this, if you allow the search to progress as it should, you will constantly be testing very sub-optimal solutions which will result in negative side-effects to system performance. For that reason, it would be necessary to place heavy restrictions on the percentage of time that the GA is testing new solutions and use other techniques such as hill-climbing to augment the GA.

--
The reason that it can be true that 1+1 > 2 is that very peculiar nonzero value of the + operator
Re:GA + Hill Climbing... by Anonymous Coward · 2005-01-08 11:01 · Score: 0

If the GA starts with 20 predetermined tuning parameter sets and mates and mutates those, then all we have is a fancy hill-climber.

This isn't true. The mutation alone differentiates it from a hill-climber. The problem of hill-climbers is that they get stuck at local peaks. Genetic algorithms don't.

In order to find novel Maxima in the search-space, the GA should not be constrained - it must have the ability to exhaustively search, even though it does not (of course) do such a thing.

I don't understand what you consider to be the constraint in this situation. The genetic algorithm does have the ability to find any part of the search space as far as I can tell.

I am simply stating that running a kernel with random tuning parameters some percentage of the time (which is basically required in a non-simulation GA) will produce ugly results.

We agree there then.

The average fitness over all solutions in a population, in a case like this (when taking random + highly evolved solutions and giving them time-slices) is likely to produce worse system performance

We agree here too.

GA finds the hill. Hill climbing gets to the top of it.

I don't see how a fitness function could determine whether or not it had found "the hill".
Re:GA + Hill Climbing... by Corpus_Callosum · 2005-01-08 12:36 · Score: 2, Insightful
"If the GA starts with 20 predetermined tuning parameter sets and mates and mutates those, then all we have is a fancy hill-climber."

This isn't true. The mutation alone differentiates it from a hill-climber.
Mutation in Genetic Algorithms are supposed to act as hill-climbers *most of the time*. Mutations are not supposed to make drastic changes - that is the job of random individulas inserted into populations and recombination (mating). Mutation (of the bit flipping variety) is mostly there to provide the opportunity make good solutions better.
The problem of hill-climbers is that they get stuck at local peaks. Genetic algorithms don't.
Genetic Algorithms get stuck at local maxima all the time, especially in populations with low population count and low genetic diversity.
"In order to find novel Maxima in the search-space, the GA should not be constrained - it must have the ability to exhaustively search, even though it does not (of course) do such a thing."

I don't understand what you consider to be the constraint in this situation. The genetic algorithm does have the ability to find any part of the search space as far as I can tell.
I need to take a look at the code, but my understanding from reading the article was that the GA starts with predefined tuning solutions and uses recombination and mutation to change them. In order to *effectively* search the search-space, more randomness needs to be present (e.g. generate a couple of random solutions each generation). Otherwise, the GA will pretty quickly draw to a local maxima and stick there (even with mutation, altough mutation *can* kick it out of a local maxima eventually - assuming there is higher peak to be found within the range of the mutation function).
I don't see how a fitness function could determine whether or not it had found "the hill".
Why does it have to? The point is, don't use the GA *all the time*. Use it until the system stabilizes and then hill climb for a while. The results will be better for a couple of reasons: (1) You are not running all the bad solutions at the bottom of your population all the time while hill climbing (only the best solution) and (2) Your best solution will be getting deterministically better through hill-climbing. A GA is a bad hill-climber but good at finding hills. Augmenting a GA with a hill climber will improve solutions (often dramatically).

Writing a GA is simple. Writing a GA that delivers results, now that is a different beast alltogether. My experience has shown me that there are some rather more complicated things that need to be taken into account such as:
- Remembering previous good solutions instead of breeding them out of the genepool. In this scenario, it is likely that there are multiple eigenloads that a machine finds itself in. Having to re-generate the correct tuning parameters each time the load switches between these *magic* states is cumbersome. But if the genes that worked in the other eigenstate are still in the genepool, it becomes trivial. Often, this is achievable by augmenting the fitness function to take into account the previous aggregate fitness of a solution or somesuch.
- Keep the right amount of randomness flowing. If the GA does not have a good source of randomness to draw on, it will operate much like a hill-climber and will oscillate between a few solutions. It will get stuck.
- Keep the population correctly sized
- Have multiple populations that occassionally interact
- Statistical mating is important
- In a case like this, the best current solution should get the most timeslices. Period. And this should be like 10x the total number of timeslices for the entire population (e.g. population=20, each generation is 220 timeslices - top solution gets 200, each other gets 1)
- Tuning mutation and crossover is extremely important
- Fitness test. Fitness test. Fitness test.
- many many other things to consider
--
The reason that it can be true that 1+1 > 2 is that very peculiar nonzero value of the + operator

Single best post by Uber+Banker · 2005-01-08 01:58 · Score: 1

Of 2005, so far.

Re:Dear Kernel Coders by Xpilot · 2005-01-08 01:59 · Score: 5, Informative

Go grab the patches. They're commited into the BK repositories already. Sheesh.

Patches for 2.4 can be found in this changeset.

Patches for 2.6 can be found in this changeset.

Click on the little "diff -Nur style" link for a an actual usable patch.

In the course of a few hours, you have the fixes already. Yay for open source.

Btw, nice troll :p

--
"Backups are for wimps. Real men upload their data to an FTP site and have everyone else mirror it." -- Linus Torvalds

Re:Oh, Oh by Leffe · 2005-01-08 02:06 · Score: 1

He's not using -fno-exceptions and -fno-rtti for his CXXFLAGS, I'm disappointed.

Sure, it's quite annoying having to edit the ebuilds that don't filter the flags, but it's worth it.

Re:Oh, Oh by maskedbishounen · 2005-01-08 02:07 · Score: 1

Pfft. You forgot the most important Gentoo flag (mainly for gaming).

APPLY_NUDE_PATCH="1"

Wait, what do you mean it's just me?

--
"An infinite number of monkeys typing into GNU emacs would never make a good program."

Genetic packet scheduler by City+Jim+3000 · 2005-01-08 02:07 · Score: 3, Interesting

Would it be possible to apply a genetic algorithm on a packet scheduler? IMO the packet schedulers available today needs too much manual tweaking.

Re:Dear Kernel Coders by Anonymous Coward · 2005-01-08 02:12 · Score: 0

Bzzt! Not a troll clown.

Nice to see the average SD poster is completely clueless about actually creating and shipping a software product in the real world.

Just wouldn't be the same if suddenly everyone had real world software engineering experience around here.

I have DNA by Anonymous Coward · 2005-01-08 02:19 · Score: 0

I believe you owe me £2 for a cup of tea guvnor

It might be useful to find good default settings by Anonymous Coward · 2005-01-08 02:20 · Score: 1, Interesting

and also find good values of tunables for common workloads, eg: web server, database, misc server, desktop and put them as templates in /Documentation.

The problem: Determining Performance by Corpus_Callosum · 2005-01-08 02:23 · Score: 3, Insightful

The main problem with this or any other adaptive tuning mechanism is actually acquiring performance metrics.

What is the system using to decide if a new parameter set is better than a previous? What is the fitness test?

Some applications are disk-bound, others are CPU-bound, others are network bound. The performance dance is often non-obvious (e.g. some applications will achieve generally higher performance by allowing I/O higher priority than context switching, while others that appear to perform in a similar manner will achive higher performance by reversing that order).

The kernel does not have any mechanism to determine if a particular application is performing better or worse, it can only really get a guage of throughput and load. While this MAY be enough to get small systemwide performance gains, in order to really acheive significant application-specific performance gains, I think that applications would need to explicitly add support for adaptive tuning by logging relevant performance metrics for the kernel to tune around.

Thoughts?

--
The reason that it can be true that 1+1 > 2 is that very peculiar nonzero value of the + operator

Oh no! by brainnolo · 2005-01-08 02:23 · Score: 2, Funny

Does this means Linux will be effected by genetical diseases sooner or later?

Re:Dear Kernel Coders by Anonymous Coward · 2005-01-08 02:26 · Score: 0

In the real world, software sucks. So take that "real world' condescension and shove it up your ass, you pompous dick.

Monte Carlo w Bayesian Stats by G3ckoG33k · 2005-01-08 02:27 · Score: 1

Monte Carlo simulations w Bayesian Stats may explore very large otherwise intractable parameter spaces. Perhaps an alternative path?

Re:Oh, Oh by bcmm · 2005-01-08 02:28 · Score: 1

WTF are those use flags near the bottom of the bugzilla page?

I think you are allowed to just say USE="*"

If you want the binaries to be slower than RPMs...

--
# cat /dev/mem | strings | grep -i llama
Damn, my RAM is full of llamas.

Re:Oh, Oh by m50d · 2005-01-08 02:31 · Score: 1

-funroll-all-loops usually slows things down compared to -funroll-loops.

--
I am trolling

Re:Dear Kernel Coders by Anonymous Coward · 2005-01-08 02:31 · Score: 0

Awww, the little teenage open source nut had his feelings hurt!

Don't worry little guy. The grownups in the commercial world are taking over and finally getting things done.

Go back to jerking off to TCatB.

GAs aren't rocket science by Earlybird · 2005-01-08 02:33 · Score: 5, Insightful

Because most people aren't intimately familiar with genetic algorithms, and because GAs are associated with machine learning/artificial intelligence, GAs are seen as somewhat mysterious and magical, and are therefore either accepted with "whoa, cool!" or rejected with "whoa, complex!" While GAs are indeed novel compared to many long-established algorithms, both mentalities are overreactions.

In reality, the basic GA framework is "just" another efficient search algorithm, no cooler or more complex than, say, a hash table or a binary search tree; at its simplest, a GA is a way to find an optimal configuration of components without looking at all possible (potentially explosively exponential) combinations; instead, you look at just some permutations, and as you iterate through generations, applying breeding and mutation, you arrive at a generation which is statistically close to optimal.

GAs are also in no way new or unproven technology; a nice example of mainstream use is PostgreSQL's query planner, which uses GAs to optimize query plans.

Re:GAs aren't rocket science by zhiwenchong · 2005-01-08 06:02 · Score: 1

GAs are not typically efficient on their own... and while their solutions are usually better than the nominal case (with no optimization), those solutions don't always qualify as "optimal".

quote:

When to Use GAs

GAs treat the optimization problem as a black box, and are therefore very flexible. Because GAs use very little problem structure they will inevitably perform poorly relative to an algorithm which is is designed for a specific problem type. GAs are therefore best suited to messy problems which do not admit more conventional optimization schemes. GAs can also be hybridized with other solution methods so that aspects of a problem which are amenable to a solution are solved using specialized techniques.
Re:GAs aren't rocket science by Illserve · 2005-01-08 07:52 · Score: 1

Actually, some flavors of GA's are fundamentally different from other types of search, particularly those with an open ended problem space.

But in the case of simple parameter-fitting problems (ie this one) , you're right.

practical applications by memoryband · 2005-01-08 02:33 · Score: 2, Interesting

while performance gains of 1-3% in a well defined set of tasks (in this case the benchmarks) is a marginal improvement well inside statistical error...

this is a really interesting idea.

Moving the genetic algorithm processing to another machine may be warranted. If you had a good idea of what you were going to be doing (heavy database work for instance), a dedicated machine could be used to find an optimal scheduling solution and then that could be implemented on the production machine.

or maintain a list of optimal solutions on the production machine and change based on context.

This implementation might not be all that useful but I hope it spurs interest and a lot more development in the area.

The author did an awesome job coming up with the idea.

Re:practical applications by Corpus_Callosum · 2005-01-08 03:05 · Score: 1

Moving the genetic algorithm processing to another machine may be warranted. If you had a good idea of what you were going to be doing (heavy database work for instance), a dedicated machine could be used to find an optimal scheduling solution and then that could be implemented on the production machine.

Ahh.. interesting idea..

If I am running in a cluster environment, I could dedicate one or more machines in the cluster to evolve tuning parameters. That machine could publish "discoveries" to the other machines in the cluster as they come up. Of course, the dedicated machine(s) that are using GA in the kernel would be handling a standard workload just like all the other machines in the cluster, but the penalities for searching the searchspace would be restricted to the machines that are tuning.

Generally, such clusters exist because they are performance bound in some way. So the benefits that come out of adaptive tuning for such a cluster would be quite welcome.

It's a damned good idea.

--
The reason that it can be true that 1+1 > 2 is that very peculiar nonzero value of the + operator
Re:practical applications by HolyCoitus · 2005-01-08 08:08 · Score: 1

The genetic algorithm is supposed to create a situation where you no longer have to select a scheduler. What you are proposing is using it to find the best scheduler for the task. This is already known and the scheduler can be designed to take advantage of the environment if it is known. This is best to be thrown into a mixed environment where you do not know the types of processes.

The implementation is horribly useful and needs improvement on the scheduler end as is said. The schedulers that it can use are the weakest point since the guy who wrote it doesn't do those. He just needs to do the algorithm that selects which scheduler is used.

Oh, and in case anyone thinks I know what I am talking about, I don't. I read the kernel mailing lists, so I know a bit I believe, but I am not an expert in any way.

--
That's scary.
Re:practical applications by tdelaney · 2005-01-08 10:38 · Score: 1

There's a severe flaw with this methodology though.

The two machines are different (even if their specs are theoretically the same). Minor differences in characteristics can result in highly sub-optimal results if you move the determined parameters from one machine to another.

To obtain *useful* parameter sets, you will need to do it on the machine in question. Your second option points towards this.

Re:Dear Kernel Coders by Trillan · 2005-01-08 02:35 · Score: 1

See Brooks' Law.

Shouldn't this be done in userspace? by dpilot · 2005-01-08 03:02 · Score: 1

I can see a kernel patch to export some extra information and/or extra tuning hooks via proc or sysfs, but IMHO the algorithms themselves should be outside the kernel, running in a daemon.

--
The living have better things to do than to continue hating the dead.

Re:Oh, Oh by red+tiger · 2005-01-08 03:08 · Score: 1

-ffast-math is sometimes dangerous - it assumes that the floating point operations never result in neither "nan" (Not A Noumber) nor "inf" (infinity). Thus, some functions start giving incorrect answers. Of course, if you use only integers in your code, then it's not a problem. But it seems like you have this flag attached to every single compilation on your system... :-/ To be honest, I wouldn't do that.

as someone who knows about GA by garaged · 2005-01-08 03:15 · Score: 0

WONT WORK !! intelligence is way much better than GA, there are limited cases where they are good, but in general a good design would be much better than depending on GA

--
I'm positive, don't belive me look at my karma

Re:as someone who knows about GA by Anonymous Coward · 2005-01-08 03:44 · Score: 0

Facts are useful in trying to convince people. The GA kernel tuning guy posted his facts and code - where are yours?
Re:as someone who knows about GA by Carewolf · 2005-01-08 04:15 · Score: 1

His code shows that adapting tuning parameters to the current workflow is beneficial even when doing it with something as rough as a genetic algorithm. And new well thought out design could use and improve the results of the genetic algorithm, even if GA is not going to be used in the final result.

CPU's and compilers by cybrthng · 2005-01-08 03:16 · Score: 1

have been doing this kind of logic for a while. Itanium is almost built entirely on this type of logic with combination of intel compilers and code technology.

I think a hardware design that does more logic controls would be best...

Parent is overrated FUD ! by hernick · 2005-01-08 03:23 · Score: 1

Mr. Illserve, you imply that problems best solved by genetic algorithms may well be full of potholes and bugs. Your comment is overrated FUD; you are trying to scare unwary slashdotters away from genetic algorithms !

First of all, the possible parameters that the genetic scheduler can affect can all be changed safely. The genetic scheduler does not try to invent new scheduling algorithms; rather, it tests different parameter sets and existing algorithms to find which works best with the current workload.

You can think of it as acting on variables that control how the scheduling is performed. It does not do the scheduling itself, nor does it try to evolve new algorithms; it merely tweaks scheduler operation in an attempt to better configure it.

Genetic algorithms are not inherently more prone to bugs than other kinds of algorithms. And there is a problem space in which they perform admirably. It appears that the linux kernel scheduler may well be on such problem space.

Re:Parent is overrated FUD ! by Illserve · 2005-01-08 07:43 · Score: 1

Admittedly, I don't understand this problem space very well. My point was just that I am wary of GA's in any situation in which reliability is a concern. But it could be that in this case the worst a GA can do is come up with a solution that is 10% slower for some unexpected type of problem.

There's just this sense among lay people though that GA's are some kind of magical cure-all, and I think it's because this type of search is a bit of a black box. Put something in, turn the crank, and wait for a solution.

But this black-boxiness is also the Achilles heel, as our GA's can often outsmart us by finding a loophole in our fitness function that solves the letter of the problem, but violates the spirit of it.
Re:Parent is overrated FUD ! by Anonymous Coward · 2005-01-08 09:27 · Score: 0

I make commercial EP systems for a living and fitness functions are definitly the weak point of a lot of EP or GA systems, but it's sometimes easier to gauge sucess than design sucess.

With regards to kernel tuning, there's often a clear indication of whether you are sucessful or not - "Is stuff getting done?" - So the fitness function problems are less of an issue in this case.

As long as the mutators are adaptive and designed intelligently, this is a very good application of GAs.

Count the lines in the GA patch before commenting by Anonymous Coward · 2005-01-08 03:41 · Score: 0

It's less than a few hundred lines and very straightforward and easy to understand and has the potential to lead to huge gains under constantly changing loads. So what are you complaining about? I guess you would also have been opposed to past scheduler patches that have improved performance. Linux will continue to improve whether you think there is already "too much stuff in the kernel" or not. Use an older and simpler kernel if you wish. Many embedded devices use the 2.2 series for this very reason.

Tweaking the algorithm or using other ones? by moz25 · 2005-01-08 03:49 · Score: 1

Having worked with a variety of optimization algorithms, would it make sense to consider the optimization itself as the 'innovation', where the actual algorithm used is secondary? Perhaps this can be optimized further with other algorithms or using the most appropriate one for different cases... ?

--
see a Text Widget

Re:Tweaking the algorithm or using other ones? by Anonymous Coward · 2005-01-08 09:57 · Score: 0

'the optimization itself as the 'innovation', where the actual algorithm used is secondary'

Excellent point. You are refering to choosing and representing the fitness function? This is where the 'innovation' is.

'Perhaps this can be optimized further with other algorithms or using the most appropriate one for different cases... '

Hmm, you could have a GA to pick the best algorithm ;)

Re:ooooh man by johannesg · 2005-01-08 03:53 · Score: 1

I was making a joke. Gee, learn about sarcasm people...

Re:Dear Kernel Coders by cduffy · 2005-01-08 03:56 · Score: 1

The grownups in the open source world are getting a little annoyed at the teenagers in both worlds bickering at each other. Jeesh.

- Some dude whose last 3 jobs have involved doing open source work professionally.

MOD parent UP by TapeCutter · 2005-01-08 03:57 · Score: 1

Not sure about "bugs" as such but it is definitely not gaurenteed to reach an optimal solution even for the evaluation function it is using.

If the solution space has peaks and troughs of varying height then a genetic algorithim will often stabalise on a sub-optimal peak.

I THINK (but would have no hope of proving) that GA's are mathematically equivalent to the "deep blue" type chess programs.

Another more mundane drawback is working out why it runs slow one-day and fast the next (if the %3 improvement is on a SLA boundry, PHB's will want to know).

--
And did you exchange a walk on part in the war for a lead role in a cage? - Pink Floyd.

Consistency by Xemu · 2005-01-08 04:06 · Score: 1

1-3% gains in speed will rarely be more important than consistent performance. In most real-life situations, one would rather oversize the server with 10% extra capacity knowing than it will consistently perform, than sometimes after successful GA tuning have an additional 3% capacity that won't always be available.

--
Tell your friends about xenu.net

Re:Oh, Oh by ZonaldRumzfeld · 2005-01-08 04:30 · Score: 1

Holy crap! I can't believe someone submitted a bug report like that. I don't even have 1/10th of that in my make.conf for Gentoo :p

Also: why tune a startup routine? by tallbill · 2005-01-08 04:44 · Score: 1

Most startup routines runs once and never run again until the system is restarted.
Thus optimization of code is often only done for things that run repeatedly.

The genetic algorithm also could have a way to tell it what code is merely startup or intialization code and what is code that runs repeatedly. Otherwise the algorithm could be working on code that doesn't really matter as far as overall long term performance goes.

I still think that the best way to deal with the efficiency issue is to use a Profiler and perform code reviews.

Re:ooooh man by Anonymous Coward · 2005-01-08 04:45 · Score: 0

There is no joking here! This is slashdot! Stuff that matters! Serious all the time! SERIOUS!

Re:Oh, Oh by Stevyn · 2005-01-08 04:53 · Score: 1

actually, that would be a USE flag

No kidding? by Shazow · 2005-01-08 04:55 · Score: 1

...I'm using gentoo myself and love it.

Wow, I never would have guessed...

That is sarcasm, btw. I, too, am using gentoo and love it. :-)

Intelegent people consider a GA by tallbill · 2005-01-08 04:58 · Score: 2

Obviously any tuning must involve data collection on what is going on in a process. This means that some kind of logging must go on in the software.

Anyone who has ever created a logging interface understands that a log will slow down a system.
And so, even if one were to run the GA and have it show things, it will still mean that later that GA should be removed or swithced off. Otherwise the workings of the GA and its requisite log will slow down the system.

So, use a GA,and then intellegently remove it after you decide what it has shown how it will actually increase performance.

It is very common to do one build with logging enabled, and another that does not have logging at all. And then you link in both. At the whim of the engineers (because this is usually done in real-time systems that are designed by degreed engineers) the one can be turned on.

or you can design a system with lists of modules that each have two versions, a logging one and a non-logging one. Then you use a function pointer based calling method and switch the modules between logging and non-logging at your whim.

When you say things like Wont work you shut yourself off from possibly useful tools. Naturally you will need an intellegent person to oversee the use of this tool of Genetic Algorithms. You could be that person.

Re:Intelegent people consider a GA by ant_slayer · 2005-01-08 09:08 · Score: 1

"Intelegent"?

-Ant Slayer-
Re:Intelegent people consider a GA by garaged · 2005-01-08 11:50 · Score: 1

What i mean is that there are better-faster algorithms, GA is like shooting flys with cannons most of the time.

--
I'm positive, don't belive me look at my karma
Re:Intelegent people consider a GA by Anonymous Coward · 2005-01-08 16:21 · Score: 0

and its requisite log will slow down the system.

Most of the info is being tracked already for the underlying schedulers to be able to do their job, so it may not be the data collection that is problematic as much as the iterative churning of the population generations.
Re:Intelegent people consider a GA by tallbill · 2005-01-10 18:56 · Score: 1

I am so sorry, I meant to say
Intel agent.

Spys who work for intel.

Re:Dear Kernel Coders by thinkninja · 2005-01-08 04:59 · Score: 1

Also 5 linux kernel advisories from the grsecurity team.

--
"The number of Unix installations has grown to ten, with more expected." (Unix Programmer's Manual, 2nd ed.; june 1972)

Re:Oh, Oh by Stevyn · 2005-01-08 05:07 · Score: 1

I think that's why ricers are made fun of so much. They add in flags without any knowlege of what they do.

Mine are somewhat conservative:
CFLAGS="-O2 -march=pentium4 -fomit-frame-pointer -pipe -ftracer -mmmx -msse -msse2 -mfpmath=sse"

I don't use O3 and some people claim it can make things slower. For some systems any O* will turn on -fomit-frame-pointer even though every how-to recommends it.

I do use gcc-3.4.3 which isn't marked stable in portage for x86, but I haven't noticed any trouble with it and the system does "feel" a little faster.

hmmm. now I'm slightly interested in linux by djfray · 2005-01-08 05:14 · Score: 1

seriously. I just might put this on my laptop for the fun of it. does anyone know if any of the popular OS's(including bsd and whatnot) use genetic algorithms?

--
This sig is o Unfunny o Funny

Re:Dear Kernel Coders by Anonymous Coward · 2005-01-08 05:43 · Score: 0

I don't get it. If it's from Linux 0.13, how is it new code?

One morning... by Tablizer · 2005-01-08 05:50 · Score: 1

"Dave, I want to be a Linux box. Get NT off of me or I'll turn the heat off."

--
Table-ized A.I.

Remember, people! by Anonymous Coward · 2005-01-08 06:04 · Score: 0

Evolution is only a theory! There's no reason to assume this will actually work :).

Re:Oh, Oh by MarcQuadra · 2005-01-08 06:17 · Score: 1

um, you shoould turn OFF -mmmx -msse -msse2 when using the -march=pentium4 option, they're already set properly.

To see what flags are set 'behind the scenes' you can run 'gcc -Q -v -O3 -march=pentium4 helloworld.c' on a 'hello world' file. Here's an example:

#gcc -Q -v -O3 -march=pentium4 helloworld.c ...
options enabled: -feliminate-unused-debug-types -fdefer-pop
-foptimize-sibling-calls -funit-at-a-time -fcse-follow-jumps
-fcse-skip-blocks -fexpensive-optimizations -fthread-jumps
-fstrength-reduce -funswitch-loops -fpeephole -fforce-mem -ffunction-cse
-fkeep-static-consts -fcaller-saves -fpcc-struct-return -fweb -fgcse
-fgcse-lm -fgcse-sm -fgcse-las -floop-optimize -fcrossjumping
-fif-conversion -fif-conversion2 -frerun-cse-after-loop -frerun-loop-opt
-fdelete-null-pointer-checks -fschedule-insns2 -fsched-interblock
-fsched-spec -fsched-stalled-insns -fsched-stalled-insns-dep
-fbranch-count-reg -freorder-blocks -freorder-functions -frename-registers
-fcprop-registers -fcommon -fregmove -foptimize-register-move
-fargument-alias -fstrict-aliasing -fmerge-constants
-fzero-initialized-in-bss -fident -fpeephole2 -fguess-branch-probability
-fmath-errno -ftrapping-math -m80387 -mhard-float -mno-soft-float
-mieee-fp -mfp-ret-in-387 -maccumulate-outgoing-args -mmmx -msse
-mno-red-zone -mtls-direct-seg-refs -mtune=pentium4 -march=pentium4

--
"Sometimes, I think Trent just needs a cup of hot chocolate and a blankie." -Tori Amos on Nine Inch Nails

Re:The problem: Determining Performance by angel'o'sphere · 2005-01-08 06:25 · Score: 1

You are very right with this: The kernel does not have any mechanism to determine if a particular application is performing better or worse, it can only really get a guage of throughput and load.

I thought long how the "fittness" part of that GA might work. Ultimately the only "fittness" the kernel can with high precision measure is "load" isn't it?

Any better ideas?

angel'o'sphere

--
Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.

Behavioral complexity makes it harder to debug.... by mikelang · 2005-01-08 06:40 · Score: 1

What I see here is solution that can behave in a way hard to predict and fails to achieve defensible speed up.
I don't think that "feature accretion" is a good way to develop operating system. It would be hard to reason about it's behaviour.

BS by Anonymous Coward · 2005-01-08 07:08 · Score: 0

Apple has been doing this kind of thing for YEARS now. Linux is only just catching up to where Apple was with OS X version 1. Plus OS X just works.

Re:BS by koreaman · 2005-01-08 14:24 · Score: 1

OSX Version 1 (and onward) are at the core BSD. Apple didn't come out with their OS from scratch you know.

(That being said I love OSX and would use it if it ran on x86)

--
Le français vous intéresse?
Re:BS by GraemeDonaldson · 2005-01-09 23:44 · Score: 1

I'm not being pedantic or anything like that... just curious. Is "OSX Version 1" the correct thing to call it? I was always under the impression it was OS (roman numeral) X, i.e. successor to OS 9. Am I wrong?

--
I think, therefore I am. I think?
Re:BS by koreaman · 2005-01-10 01:29 · Score: 1

I was referring to the first edition of OS X, but yes you are right.

--
Le français vous intéresse?
Re:BS by GraemeDonaldson · 2005-01-10 01:33 · Score: 1

Cool, thanks. ;-)

--
I think, therefore I am. I think?

ITS OFFICIAL! LINUX BEATS BSD ON DISK I/O by Anonymous Coward · 2005-01-08 07:18 · Score: 0

IT IS OFFICIAL; WIRED NEWS CONFIRMS: LINUX IS SUPERIOR TO *BSD
*BSD is Dying, Says Respected Journal

Linux advocates have long insisted that open-source development results in better and more secure software. Now they have statistics to back up their claims.

According to a four-year analysis of the 5.7 million lines of Linux source code conducted by five Stanford University computer science researchers, the Linux kernel programming code is better and more secure than the programming code of *BSD.

The report, set to be released on Tuesday, states that the 2.6 Linux production kernel, shipped with software from Red Hat, Novell and other major Linux software vendors, contains 985 bugs in 5.7 million lines of code, well below the average for *BSD software. NetBSD, by comparison, contains about 40 million lines of code, with new bugs found on a frequent basis.

*BSD software typically has 20 to 30 bugs for every 1,000 lines of code, according to Carnegie Mellon University's CyLab Sustainable Computing Consortium. This would be equivalent to 114,000 to 171,000 bugs in 5.7 million lines of code.

The study identified 0.17 bugs per 1,000 lines of code in the Linux kernel. Of the 985 bugs identified, 627 were in critical parts of the kernel. Another 569 could cause a system crash, 100 were security holes, and 33 of the bugs could result in less-than-optimal system performance.

Seth Hallem, CEO of Coverity, a provider of source-code analysis, noted that the majority of the bugs documented in the study have already been fixed by members of the Linux development community.

"Our findings show that Linux contains an extremely low defect rate and is evidence of the strong security of Linux," said Hallem. "Many security holes in software are the result of software bugs that can be eliminated with good programming processes. When we looked at BSD, we found a high defect rate that indicates very poor programming practices. It is clear to us that BSD is a dying operating system. We predict that it'll be dead within a year or two, except among OS dilettante dabblers."

The Linux source-code analysis project started in 2000 at the Stanford University Computer Science Research Center as part of a large research initiative to improve core software engineering processes in the software industry.

The initiative now continues at Coverity, a software engineering startup that now employs the five researchers who conducted the study. Coverity said it intends to start providing Linux bug analysis reports on a regular basis and will make a summary of the results freely available to the Linux development community.

"This is a benefit to the Linux development community, and we appreciate Coverity's efforts to help us improve the security and stability of Linux," said Andrew Morton, lead Linux kernel maintainer. Morton said developers have already addressed the top-priority bugs uncovered in the study.

Does anyone use this one? by oops.sgw · 2005-01-08 07:21 · Score: 2, Interesting

Just compiled this stuff on an old testbox, now running it for about 100 generations. At first it was feeling very slow, ok, it's a Pentium2 ;-) but it was much slower than running vanilla 2.6.9 or 2.6.10, for example.

But it is getting much better now, I don't know how much generations there will be needed to get things right. It feels pretty much the same as with the vanilla kernels, let's see where this leads ...

Anyone else with experiences? AFAIK this thingy can only be tweaked by editing the code and recompiling, there are a few hardcoded parameters ...

Re:Does anyone use this one? by Anonymous Coward · 2005-01-08 07:37 · Score: 0

I just finished patching and compiling, i dont notice much of a difference, it was a little slow in the beginning but thats was always normal for me when i start up on a new kernel.
Re:Does anyone use this one? by oops.sgw · 2005-01-08 08:08 · Score: 1

I watch the output of "cat /proc/genetic/as-ioscheduler" as I am using the machine. On the Pentium2 I have:

generation_number: 281
num_children: 8
child_number: 4
num_mutations: 8
avg_fitness: 28787
last_gen_avg_fitness: 6760

I am also testing it on a notebook (P4-M 1.8) right now, compiling a kernel while I watch it tuning, there the avg_fitness is at about 3000 (generation ~120).

Have to read that code and try to understand it, very interesting so far ...

Re:The problem: Determining Performance by Corpus_Callosum · 2005-01-08 07:28 · Score: 1

I thought long how the "fittness" part of that GA might work. Ultimately the only "fittness" the kernel can with high precision measure is "load" isn't it?

Any better ideas? I suppose the only way would be to have some "tuning" hooks that applications could use to announce their own performance metrics. The GA could optimize around load when these metrics are not available and take them into account if they are.

--
The reason that it can be true that 1+1 > 2 is that very peculiar nonzero value of the + operator

Moderators on drugs? by Anonymous Coward · 2005-01-08 07:34 · Score: 0

Why would that load of crap be moderated upward? It's a troll.

> Cut/Copy/Paste works just ducky

Have you actually used X Windows? If you pick any two programs at random, you probably won't be able to cut and paste between them. It's probably the most obvious defect that most users of Linux encounter.

Re:Moderators on drugs? by JFitzsimmons · 2005-01-08 09:26 · Score: 2, Informative

Have you??

Proof

--
Beware he who would deny you access to information, for in his heart he dreams himself your master. -Anonymous
Re:Moderators on drugs? by Karma+Farmer · 2005-01-08 10:02 · Score: 0, Troll

WTF was that? Some damned codec that neither quicktime nor windows media player understands, that's what.

Which means that you proved your point beautifully -- people who use linux for a desktop often have absolutely no idea what the rest of the world does with computers.

That's what the -mm patchset is for. by Phong · 2005-01-08 07:36 · Score: 1

How about instead of 2.7 being the groundwork for 2.8, why don't we put changes into 2.7, let the early adopters and kernel hackers use it, and when we find bugs like the Cd-burning problem, we can fix them before we release into 2.6

What you have just described is the workings of the -mm patchset that Andrew Morton maintains. So, the process is already in place--maybe it just needs to be refined a bit more (and perhaps more folks need to be actively testing it).

--
..wayne..

PostgreSQL too uses GA... by mikelang · 2005-01-08 07:48 · Score: 1

...but I guess they can demonstrate considerable speedup at the very least.
It is not the case here - few percentage points are negligible.

Not a real progress... by mikelang · 2005-01-08 07:50 · Score: 1

The problem is: running one app in the background can deviate test results much more than these few percents.
I wouldn't even try to use it if I were looking for speedup.

Wait until they fork off 2.7 by oliverthered · 2005-01-08 08:11 · Score: 1

I've had lots of problems with the 2.6 series, I kinda hoped it would be a bit more stable after 2.4 and the VM. The sooner the fork to 2.7 the better.

--
thank God the internet isn't a human right.

Re:Wait until they fork off 2.7 by WNight · 2005-01-09 10:43 · Score: 1

Stable doesn't mean it doesn't crash - for that, wait until your distro release it if you aren't a kernel hacker. Stable means that the interfaces don't change. That software written for that kernel (device drivers, module managers, etc) will interface with the whole series.
Re:Wait until they fork off 2.7 by oliverthered · 2005-01-09 13:52 · Score: 1

Well, the linux kernel has never had a stable ABI.

My version of stable is bugs being fixed not new features being added or the function of existing ones completely changed.

I have worked on and fixed some drivers and areas of the kernel, things like ADSL modem drivers, a PCMCIA network card, and a couple of broken things in USB, so i don't have a problem fixing things (so long as the target isn't moving faster than I am!)

--
thank God the internet isn't a human right.
Re:Wait until they fork off 2.7 by WNight · 2005-01-09 15:11 · Score: 1

ABI, no. But recompiling a 2.4 driver with the 2.4 kernel is much easier than trying to make it work in 2.2 or 2.6...

I want them to stick with the same branch as long as possible, because that means I can easily try cutting-edge kernels to work with weird stuff (RAID 6, etc) without having to switch to a whole new set of userland tools to do it.
Re:Wait until they fork off 2.7 by oliverthered · 2005-01-09 18:30 · Score: 1

Trying to make it work in 2.6.1 is much easier than trying to make it work in 2.6.10.

Stabler ABI's would mean that it was just as easy to get the driver working in 2.6.1 as 2.6.10.

If you want 'cutting-edge' use cj patch-sets to the main branch. If you want 'bleading-edge' go for unstable.

The kernel is getting too many drivers for the level of ABI stability, someones got-to support those drivers or the quality of the kernel drops.
I don't mind poking around and fixing a few bugs here and there but I don't want to make a day job of it (well unless someone will pay me!).

--
thank God the internet isn't a human right.

Re:Also: why tune a startup routine? by HolyCoitus · 2005-01-08 08:22 · Score: 1

Startup code or a program initializing would not have an effect on the genetic portion. The effects from code being ran are very slow, and tuning things in that code will help. The idea is so that you would not have an adverse effect from a piece of code being ran for a minute. It would take time for the scheduler to change over. This is really designed for a computer that would be doing database work for a time on a website then at night may do some compiling. A desktop would gain little from this in my opinion as it's mostly idle.

For instance, here is the output of vmstat for my desktop:
User: 20
System: 2
Idle: 77
IO Wait: 1

Notice the idle time. The scheduler should be setup on a desktop to always give priority to the task that is running and new. That's what is currently done. This really has a larger impact on something that would do a single task for a long time and then move over to another task, with other tasks possibly sprinkled in. It can optimize for those specific work loads.

It would help a desktop possibly though. I'm not an expert, I'm just using the bits of knowledge I have. Ideas?

--
That's scary.

Re:Dear Kernel Coders by simpleguy · 2005-01-08 08:37 · Score: 1

My beef was about "avoiding" security issues like this, not about fixing them quickly.

Please take time to read next time.

Re:The problem: Determining Performance by Anonymous Coward · 2005-01-08 08:42 · Score: 0

The benchmark is "MAXIMUM IDLE TIME"

optimization is just a way of doing more with less. sure, you can jerk-off and measure the 'more' and get all mired in the complexities, the bottom line is that you want LESS LOAD FOR THE SAME WORK. That's why you optimize: MORE IDLE-TIME.

Yes?

Re:The problem: Determining Performance by Anonymous Coward · 2005-01-08 08:48 · Score: 0

If you are running in an environment with a significant amount of idle time, you really don't need to optimize, don't you think?

Good Idea, Bad Method by Anonymous Coward · 2005-01-08 08:53 · Score: 0

The idea isn't bad, but of all choices of methods,
GA is one of the worst for this problem space.

Re:Dear Kernel Coders by Anonymous Coward · 2005-01-08 08:55 · Score: 0

The GA changes are.

Re:Count the lines in the GA patch before commenti by Anonymous Coward · 2005-01-08 09:47 · Score: 0

The problem ain't complexity it's predictability of the GA.

Most times it'll work. Producing the tiny speed up.

But some times it'll get it so SO wrong.

Outliers, basically.

Probably? by Anonymous Coward · 2005-01-08 10:11 · Score: 0

Try most likely. About 90% of my time at work is spent in four different X-Win programs. I can't cut and paste between any of them. It's almost criminal that the developers don't even try.

Re:hmmm. now I'm slightly interested in linux by jj110888 · 2005-01-08 11:23 · Score: 1

I don't know about the other OSes besides linux, but you would need to manually compile your linux kernel to get this in, Gentoo provides a program to compile it for you, with other distros you can just use the config file their kernel (should be /proc/config or /proc/config.gz) as a base (if you've never compiled and used your own kernel before, um, find some guides) and patch it with all the [patch] links at http://kerneltrap.org/node/4493

Or any startup routine. by tallbill · 2005-01-08 12:36 · Score: 2, Interesting

I was not thinking about just desktop computers.
In general any software has initialization routines. My work has been in robotics, automation, and telecom switching equipment written in C and C++. We spent a lot of time worrying about efficiency, but not for startup routines unless the start lag became annoying. But for the most part people expect things to take a little time to start up. There are a lot of things to do, especially if you have no way of knowing how the thing shut down. I am digressing.

I think that a genetic algorithm to tune sounds like a great idea. I would just want a way to say check this code and not this other code.

I hope that there is much success with this GA.

Not a GA, but a better way to plot a spiral by tallbill · 2005-01-08 13:09 · Score: 1

Year and years ago I was working on antenna design algorithms.
These result in spiral designs that are the pattern of the antenna output.
In those days the calculations took a lot of time becuase it was a 8 MegHz computer (a tandy 1000).

The equations were all f of theta.

And the thing would plot out very slowly.

I had to keep running the thing to try and come up with some reasonable designs.

What I realized was that if I picked random thetas that I could see the shape much quicker. And then if I saw something I liked, then I could plot it from 0 to 2 pi and get the graph for the antenna output.

As these were sine based trigonometric equations they were repetative so 0 to 2pi would produce the same output as 2pi to 4pi, or 4pi to 6pi, etc.

For any sine based equation that is periodic with 2pi you can do the same thing.

Re:The problem: Determining Performance by Dr+Nick30 · 2005-01-08 15:25 · Score: 1

Maybe that's the best time to do some work?

Re:Dear Kernel Coders by Anonymous Coward · 2005-01-08 19:10 · Score: 0

There are teenagers in "the real commercial world"? I've met too many pompous pricks like the parent poster. In "the real world", things don't work this way or that. Fuck their patronizing kind. That's not even an argument, it's just some "I know better than you" type crap. Fucking pompous shitheads.

Three Percent Here. Three Percent There. Adds Up by Anonymous Coward · 2005-01-08 19:13 · Score: 0

Three precent performance increase here and three percent performance increase there. It all adds up.

what if the global optimum is a moving target? by wdebruij · 2005-01-08 21:36 · Score: 1

from the original blurb:
automatically tune themselves for the best possible performance for any given workload

as I understand it, and as you mention, GAs try to converge a solution to a global optimum. However, it appears that they're trying to use it in a situation where the optimal solution is a moving target.

While the idea is great I'm pretty sure that there must be more applicable optimization strategies than GAs for this problem. While on the topic of AI techniques I would suggest simple neural nets (feed-forward with back-propagation), as these tend to quickly adapt to changes in variables such as workload.

NB: my knowledge on the topic is limited to a single course a few years back.

Re:Dear Kernel Coders by Anonymous Coward · 2005-01-08 22:15 · Score: 0

I think the kernel programmers try their best to "avoid" security issues. Who the heck would intentionally put them in? It just so happens bugs are discovered, and then fixed promptly. That's how this "open source" thing works.

Good and all, but... by TheClarkey · 2005-01-08 22:34 · Score: 1

While this is quite an interesting application I do have to question the choice of a GA in this instance. Half the point of using a GA is that the fitness function should be something nice and quick. In this instance I can't see how it could be to give your kernel a "typical load".

At the heart of a GA is the selection criteria, mutation rates, mutation operators, crossover operators and the representation.

On first reading it seems to be quite aggressive and I would suspect that there is a possibility that you'd lose a lot of diversity early on. It'd be intersting to try a different selection strategy (such as roulette wheel) and see how that compares.

not 2.6.10? by Anonymous Coward · 2005-01-08 23:19 · Score: 0

this patches against 2.6.9 - but 2.6.10 was released
ages ago - why the old version used? anyway, i'll have a play with this next week I think! ;-)

Re:Dear Kernel Coders by cduffy · 2005-01-09 08:30 · Score: 1

Stop, go back, and read what you wrote. Yes, there are plenty of teenagers in the commercial world, at least if one goes by attitude rather than age -- you're acting one yourself.

Let the kiddies make their silly claims -- by painting everyone remotely related to them with the same brush and making sweeping unbacked generalizations, you're exhibiting behaviour on the same level as those you criticize.

And anyhow, he's right. Programming talent isn't a commodity that can be applied to any kind of problem. The commercial software company I'm currently employed by[1] has some folks who are better at UI work, folks who are better at integration work, folks who are better at writing database backends, and so forth. Assigning someone who's mastered writing screen scrapers to writing database backends is a waste; likewise, assigning someone who could be working on a new memory management algorithm (for instance) to doing security audits would be stupid when folks who are better at security are also the ones who are more likely to actually be doing it.

[1] - Yes, this commercial sofware company pays me to do open source work from time to time when it's in their best interests to do so.

Algorithm by Anonymous Coward · 2005-01-09 11:22 · Score: 0

Algorithm, funny word

Algorithm = Al Gore rhythm???

afterall he did invent the internet :^P

Fake article (obviously). by Anonymous Coward · 2005-01-09 21:42 · Score: 0

Needless to say, this "article" is fake.
Here are some actual facts about *BSD, that some (most?) Linux advocates wouldn't like people to know. ;)

look at VM system by tallbill · 2005-01-10 19:02 · Score: 1

The Virtual Memory system will slow things down if you have very large objects in your applications. If you have memory structure that are constantly being swapped out, this can result in some severe harddrive trashing.

I saw this problem where I last worked.

And so, even if you have an efficient scheduler, you still have to deal with the VM.

If you put your memory objects into kernel space, then they don't get swapped out.
I am not an expert about VM, just that I wish that I could tag a user space memory object as non-swappable, but you can not (has this changed in newer kernels?).

You might consider a timeslicing algorithm and weave your processes together. If you rely on the kernel to do this for you, you probably will want to explore the real-time extensions to the kernel. Again, I don't have an inside line of this either. I hear you can get all the pieces if you do it yourself, or you can license them from a real-time Linux vendor. If you have critical processes I would consider a timeslicing algorithm and a single process that runs at a higher priority.

Also explore creating a SHEM object and compileing it into your kernel.

Patent? by HogynCymraeg · 2005-01-10 21:36 · Score: 1

This is where I'd like to see the EFF, FSF, or heck, even IBM step forward and put up the cash to get this thing Patented. Oh yes, we're all against software patents, except this is where Linux would have one over M$. Give Microsoft free technology and they will always stay ahead of the game through marketing. Create patents that are open for free software and you kill 2 birds with one stone.

Slashdot Mirror

Tuning The Kernel With A Genetic Algorithm

251 comments