Slashdot Mirror


What Happens To -AC (And Other) Kernel Mods?

RedLeg wrote with this poser: "So, looking at the changelog for the 2.4.9 kernel release, I see a few '- Alan Cox: driver merges' entries. Intelligent consumers of (or those of us who modify them for our own uses) RedHat Kernel src.RPMs look at the patches in the RH kernel builds. Alan's (and other persistent RH) patches don't seem to be integrated into Linus' 'mainstream' kernel trees on any kind of a predictable basis, and this frequently causes projects like freeswan to have difficulty merging their patches (not intended for kernel inclusion) with kernels that appear 'in the wild' like the kernel RPMs from RedHat. Often, kernel patches for obviously older kernel versions continue to be applied (in the RPMs) to newer kernel versions. Alan is a RedHat-er, so he obviously has an inside track to RedHat kernel builds, but he's also Linus' Right-Hand man, but his patches are not (apparently) consistently making it into the 'mainstream' kernel. What am I missing?" Who better to answer this question than Alan Cox? Alan was kind enough to write an explanation of the (still complicated) process of merging -- and it's not as simple as who works for what distro maker ;)

Note: Here's what Alan passed on in response to this question. As usual, things aren't quite as simple as they first appear. -T.

Alan Cox: Probably the first thing to explain is the Red Hat kernel. That actually isn't something I am responsible for. Arjan van de Ven is the keeper of the distribution kernel, and has the unenviable task of getting a kernel together that will actually pass all the brutal QA testing. Arjan is perfectly entitled to (and sometimes does) throw out bits of -ac changes.

You'll see Red Hat patches being merged into -ac and Linus trees when appropriate, often from Arjan or Pete Zaitcev. Many of the other patches in the RH tree are considered "fixups" - they are workarounds for problems but not generalised or clean enough to feed into the main tree without further work. Others are RH specific patches for things like packaging.

With the -ac tree I try and do rapid rolling releases, sucking in new code to test it and also its interactions with other new code. By doing releases every few days I get a high number of people testing and reporting bugs before there are too many possible causes. This is how Linus trees used to work long ago, and I still think it's the better technique.

At regular intervals I take stuff from the -ac tree and feed it to Linus. Sometimes Linus doesn't want to take other changes in case they confuse other things being done, sometimes they just vanish and fairly often they get applied.

I'm actually limited in the rate I can forward patches because I need to feed Linus blocks that are debuggable. Thus I don't want to feed Linus both file system and disk driver changes at once or I won't know which to blame if there are corruption reports.
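The batching constraint described above can be sketched roughly as follows. This is only an illustration in Python, not Alan's actual tooling: the patch names, the subsystem labels, and the `CONFUSABLE` pairing are all hypothetical.

```python
# Hypothetical illustration: split pending patches into batches so that no
# batch mixes subsystems whose bugs would produce the same symptom.
from collections import defaultdict

# Assumed: pairs of subsystems that must not be forwarded together, because
# a corruption report could not be pinned on either one.
CONFUSABLE = {frozenset({"filesystem", "disk-driver"})}

def plan_batches(patches):
    """patches: list of (name, subsystem) tuples.

    Returns a list of batches (lists of patch names) in which no two
    confusable subsystems share a batch.
    """
    by_subsystem = defaultdict(list)
    for name, subsystem in patches:
        by_subsystem[subsystem].append(name)

    batches = []  # each entry: (set of subsystems, list of patch names)
    for subsystem, names in by_subsystem.items():
        for subsystems, batch in batches:
            clash = any(frozenset({subsystem, other}) in CONFUSABLE
                        for other in subsystems)
            if not clash:
                subsystems.add(subsystem)
                batch.extend(names)
                break
        else:
            # No existing batch can take this subsystem; start a new one.
            batches.append(({subsystem}, list(names)))
    return [batch for _, batch in batches]

pending = [
    ("ext2-fix", "filesystem"),
    ("ide-dma-fix", "disk-driver"),
    ("net-driver-merge", "network"),
]
batches = plan_batches(pending)
```

With these made-up inputs the file system and disk driver patches end up in separate batches, so a corruption report after either batch points at only one of them.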

I also don't feed Linus code that has active maintainers unless the maintainer has asked me to do so. Thus the USB diverges quite a lot because Johannes Erdfelt has chosen not to feed chunks of the USB and input changes on. Similarly, the user-mode-linux port in -ac has not been fed on to Linus because Jeff Dike wishes to improve it further before submitting it.

I have been concentrating on getting the driver code and some architectures synchronized with Linus, and that is now mostly done. The next big challenge is getting all the file system work on to Linus, and Al Viro has begun that and fed Linus the first blocks of the superblock handling cleanup.

Finally we have changes that are down to fundamental disagreements, perhaps in part stemming from the fact that my background is real production systems rather than OS design work. Linus decided to update the 3D support without keeping back compatibility - I kept both. Linus I suspect will never accept a patch to do that. Secondly he decided that he didn't wish to allocate new device major numbers but look for a saner solution over time. Laudable, but not in the middle of a stable release. The -ac tree has drivers allocated "non-Linus" major numbers that are recognized by LANANA and thus common across vendors. These drivers, like the HPT370 and Promise IDE RAID, will thus always be part of the -ac tree only.
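For readers unfamiliar with device major numbers: a Unix device number combines a major part, which selects the driver, with a minor part, which selects the device instance handled by that driver. A minimal sketch using Python's standard `os` module on a Unix-like system follows; the values 8 and 1 are arbitrary examples, not the numbers assigned to the drivers mentioned above.

```python
import os

# Build a device number from an example major/minor pair, then split it
# back apart. The values 8 and 1 are arbitrary illustration values.
dev = os.makedev(8, 1)
major = os.major(dev)  # identifies the driver
minor = os.minor(dev)  # identifies the device instance for that driver
```

Because the major number is how a device node is matched to its driver, two trees that assign different majors to the same driver cannot share device nodes, which is why a registry common across vendors matters.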

The -ac tree also tries hard to avoid any incompatibilities. Having applications that require -ac or Linus trees is simply not an acceptable situation. The only specific exception to that right now in 2.4.x is deep at the system level and is for quota tools. That one was unavoidable to get 32-bit UID quotas working.

35 of 164 comments

  1. Post-Mortem debugging of multithreaded processes by sllort · · Score: 5, Interesting

    One thing that's been in the -ac kernels for quite some time is the ability to post-mortem debug multithreaded processes. That is, under the production kernel, when you core dump, all the threading information is lost. You can't get the call stack of each thread. With the -ac kernels you get one core file per pid, with each LWP (lightweight process) getting its own core file.

    Considering that Solaris has had this (what seems to be BASIC) functionality for years, why do we see the continued insistence on keeping this functionality out of the production kernel? Are we waiting for the gdb team to catch up?

    Until this is fixed, multithreaded programming under Linux will remain a black art - only developers willing to apply hordes of -ac patches to a homegrown development kernel have a chance of successfully developing a multi-threaded application under Linux. Considering that many commercial software development packages (RogueWave, for instance) won't even support you if you're not using a RedHat released kernel, this puts multi-threaded development "out-of-bounds" for many.

    Merge the -ac kernel mods!

  2. Wait a minute by bconway · · Score: 2

    With the -ac tree I try and do rapid rolling releases, sucking in new code to test it and also its interactions with other new code. By doing releases every few days I get a high number of people testing and reporting bugs before there are too many possible causes. This is how Linus trees used to work long ago, and I still think it's the better technique.

    Perhaps he meant the unstable series Linus releases. I sure as hell would NOT like to see a new "stable" kernel release every few days. The current faux-schedule of a new release every couple of weeks seems a bit too quick for decent testing to me, to tell you the truth.

    --
    Interested in open source engine management for your Subaru?
    1. Re:Wait a minute by Alan+Cox · · Score: 4, Informative

      You make an assumption that the right way to test code is in big lumps. That is something any engineer will tell you is bogus.

      You test continually, you test each changeset, and then every so often you run a several day shakedown test.

      You are right that you can't QA a kernel to vendor production grade in two weeks. Some of the RH test runs take several days per run for example.

    2. Re:Wait a minute by bluGill · · Score: 2

      Sure, but sometimes you have to test all at once.

      I've worked on projects before where the hardware was only partially stable, and the ROM code was changed daily and normally couldn't boot. Those doing higher level work on the system were forced to write code for the specs, and debug later. It isn't the best, but it works. (Yes there is simulation, but simulation tends to comment out all the hardware interaction, which changes things drastically)

    3. Re:Wait a minute by spudnic · · Score: 2

      So you do that in the development tree, not the stable one.

      It's crazy to have stable releases coming out as often as we do. There is no way that you could change enough in the kernel to justify a new stable release 2 weeks (or less) later unless it was to fix a major oversight (read: bug) in the previous stable release. You just don't have time to test things out.

      --
      load "linux",8,1
  3. Re:Post-Mortem debugging of multithreaded processe by scorpioX · · Score: 3, Informative

    Mac OS X has this feature as well. If you set CRASHDEBUG=-YES- in /etc/hostconfig, you will get a dump of all thread stacks and the CPU(s) registers when a process bombs out. Very handy. I believe that HP-UX also has this feature. Surprising that Linux doesn't.

  4. from the cyfrifiadurol dept... by JJGreenaway · · Score: 4, Informative

    In case anyone's wondering, 'cyfrifiadurol' isn't a typo. It's Welsh, roughly meaning 'to do with computers'.

    And before anyone says it, yes, computers have reached Wales now...

    1. Re:from the cyfrifiadurol dept... by scrytch · · Score: 3, Funny

      > And before anyone says it, yes, computers have reached Wales now

      yeah but the cost of classified ads in the paper is prohibitive when they're looking for programmers in llyncyrfdlywrfldycrlycywwcrynrfrwnr...

      i'd like to buy a vowel.

      (oh crap i think i just called someone's mother a really nasty name)

      --
      I've finally had it: until slashdot gets article moderation, I am not coming back.
  5. I think by wiredog · · Score: 2

    That their database is still Having Issues. We need a story on what's going on there.

  6. Re:I had no idea so much still needed work by Mr.Phil · · Score: 2

    well, everyone was pretty pushy about getting the 2.4 kernel out the door. Maybe linux suffered from the hype of the internet economy too?

  7. Re:Yeah but by Alan+Cox · · Score: 5, Interesting

    The goal there is to make it unnecessary. 2.4.8-ac7/ac8 have slightly smarter VM merging behaviour done by Ben LaHaise for example.

  8. I can appreciate the problem by jd · · Score: 4, Interesting
    The FOLK project (gratuitous plug!) runs into all sorts of problems, all the time, from inconsistencies, patch for A being out of sync with patch for B, etc, etc, etc.


    That's one of the reasons I started that project, in the first place. Because it's mind-numbingly tedious to massage patches from different groups together. If you can get the whole thing in one gigantic gloopy splodge, life would be much easier.


    Unfortunately, I've discovered a number of things along the way:

    • Debugging said gloopy splodge is a Royal Pain!
    • Finding others who will help debug said gloopy splodge is not easy.
    • Finding others who will even -report- bugs in said gloopy splodge isn't easy, either.


    That's not to say that FOLK is a disaster. Quite the opposite! I'm learning a huge amount about the Linux kernel, for a start, and the sheer complexity of juggling hundreds of patches is really giving my C coding skills a workout and a half!


    My hat is off to Alan Cox who not only manages his patch set with far more grace than I ever could, but actually keeps it so that it runs!


    I know the Royal Web Admin uses Linux (cos that was on an interview, some time ago), so if he's reading & has any influence, I honestly think Sir Cox would not be an undeserved title for his amazing computing skills and his contribution to both computing and Britain.

    --
    It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
    1. Re:I can appreciate the problem by 4of12 · · Score: 2

      Lameness filter encountered. Post aborted!

      Thanks. I'll keep my thoughts to myself from now on. Between this Innovation® and the new unavoidable link to the front page of the site that started to afflict "older stuff" I'm about ready to fugged about /. entirely.

      --
      "Provided by the management for your protection."
  9. Promise IDE RAID by blackwizard · · Score: 2

    I didn't know the -ac trees had Promise IDE RAID support. I looked all over for a good solution for using that thing, after I carefully soldered on my resistor, only to find out that there was no good Linux support for the darned thing. Has anyone had any success using this thing in Linux (not using the closed-source drivers?) This is something I'd really like to see in a stable kernel.. does anybody know how to tell what -ac releases are stable, if any? **pounding head on cubicle wall** I suppose I can use software RAID if that's what it comes down to.

    1. Re:Promise IDE RAID by Tony+Hoyle · · Score: 3, Informative

      The Promise RAID *is* software RAID. All the kernel can give you is access to the extra IDE ports (which it does).

    2. Re:Promise IDE RAID by Sunthalazar · · Score: 2, Informative

      I don't know specifically about the Promise IDE RAID, but I do know about the HPT370 RAID. They are actually included together.
      All I can really say is that it's not quite perfect. I'm using an on-board HPT370 [it's part of my Abit VP-6], and under Win2K I don't think I've had any crashes [well other than some reproducible ones that are things that I've done]
      But I've used the -ac patches in 2.4.6-2.4.8 and so far there are still a couple of times when my machine will just lock up. It seems to be related to disk access, but it also only happens when I'm running X. Without X, I haven't had any problems [although I can't run Mozilla or XMMS, etc without X]
      In general, though, it recognizes and runs fine. I haven't had any general data inconsistency [I run ReiserFS on the RAID partitions]
      Again, this is for the HPT370, not the Promise IDE RAID, but since they are in the same kernel patch, I figured their results would be similar.

    3. Re:Promise IDE RAID by EvlG · · Score: 2

      That is because the highpoint chips are crap.

      I had a HPT366 in my Abit BE6 and have had nothing but problems with it. I won't buy another board that uses a highpoint - they are junk.

  10. Re:I had no idea so much still needed work by NNKK · · Score: 2, Interesting

    "Competing" ?
    Without these alternate kernel trees, nothing would ever get done. The -ac trees really aren't something I'd recommend for a production server unless they have a fix or driver that the server desperately needs... the -ac's are to test and implement things in advance to KEEP the "stable" tree "stable", and keep Linus happy that the patches he's putting in the mainstream tree have been tested.

    And no, I don't feel the 2.4 series deserves to be called stable, but I damn well use it anyway on my primary desktop box :)

    Linus, GIVE US 2.5 TO PLAY WITH!

  11. Don't Feed the Linus by grammar+fascist · · Score: 2, Funny

    At regular intervals I take stuff from the -ac tree and feed it to Linus.

    ...because I need to feed Linus blocks that are debuggable. Thus I don't want to feed Linus both file system and disk driver changes at once...

    I also don't feed Linus code that has active maintainers unless the maintainer has asked me to do so.

    So Linus eats code. Everything is so clear now...

    --
    I got my Linux laptop at System76.
    1. Re:Don't Feed the Linus by Red+Moose · · Score: 2, Funny

      Didn't you know? Linus is like that dolphin from Johnny Mnemonic, only he's a penguin instead. And he's got loads of hi-tec stuff strapped on his head and can track submarines and doing linux is only a part-time thing, really.

      --

      Acting stupid isn't much fun when there's someone around who knows better

    2. Re:Don't Feed the Linus by ethereal · · Score: 2, Funny

      So does that mean Alan Cox is really Ice-T? [shudder]

      --

      Your right to not believe: Americans United for Separation of Church and

  12. Re:Post-Mortem debugging of multithreaded processe by n0ano · · Score: 5, Informative
    The thread core dump patch was originally put into Alan's tree around the 2.4.3 time frame. It was quite correctly labeled experimental at the time (it took a few iterations to get it right.) The intent is to merge it into Linus' tree at some time, it just hasn't gotten there yet.


    In the meantime, if you're desperate, I can give you a patch that provides this capability to any Linus tree.

    --
    Don Dugger
    "Censeo Toto nos in Kansa esse decisse." - D. Gale
  13. Re:Post-Mortem debugging of multithreaded processe by cnkeller · · Score: 2

    Someone mod this parent up. Seems like a worthwhile patch.

    --

    there are no stupid questions, but there are a lot of inquisitive idiots

  14. All my confidence in Linux is lost forever by fobbman · · Score: 4, Funny

    What Happens To -AC (And Other) Kernel Mods?

    I'm sorry, but if the kernel has a bunch of modifications done by people who find it necessary to be referred to as the initials for Anonymous Coward then how can we trust the security of the kernel?

    They get modded down on /. but then get merged into the kernel source? Let's make a stand and stick to it!

    Oh, and I copied these comments to a text file so I can repost them in the event that /. pukes up its guts again.

  15. Re:I think ... it ate my journal entry! by Sun+Tzu · · Score: 2

    Yes, indeed it is having problems.... My journal entry was made about 24 hours ago. It was gone this morning so we've had maybe over a day of problems now.

    Give us an update, CT!

    Oh, and ignore the 'game client' link below -- if it is still there. Yesterday it pointed to my journal entry.

  16. A Canonical Response by llywrch · · Score: 2

    > "At regular intervals I take stuff from the -ac tree and feed it to Linus."

    > Take that out of context and think about it.

    Just because Alan Cox is east of the US doesn't mean he is a snake.

    And why are all of you handing me my coat?

    Geoff

    --
    I think I see a trend here. Maybe for them it really would be easier to muzzle the entire internet than to produce p
  17. Re:Athlon CPU support should (EXPERIMENTAL) by OblongPlatypus · · Score: 2

    I thought that was a bug in a VIA chipset? I'm pretty sure the Athlon optimization stuff is pretty rock solid and definitely not experimental. I'd hate for them to start slapping the (EXPERIMENTAL) label on anything which might cause problems or conflicts on some specific platforms. (Although that's what MS did with their latest OS, didn't they...)

    --
    -- If no truths are spoken then no lies can hide --
  18. Re:I had no idea so much still needed work by gaydot · · Score: 2, Insightful



    Now I don't want to start a flame war here, but when someone says "Linux" what do they mean?

    There are so many distributions, each with little kernel tweaks (RedHat IMHO is especially bad) and different userland applications. So, I have kernel 2.4.9 with what patches, and what userland apps, what compiler, etc.? I'm totally confused. What *is* Linux?

    If I go to kernel.org and download a 30MB tarball of the kernel.... now what about the userland apps? Where do those come from? Who wrote "ps"? What version of "netstat" should I use? Which gcc compiler should I use?

    However, with an OS like FreeBSD... "FreeBSD 4.3-RELEASE" refers to one and only one kernel & userland. No questions. No confusion. If there are problems with a particular version, I can retrieve a CVS snapshot of the kernel and userland sources at any point in time since the beginning of the project. Very impressive.

    ~g.

  19. Re:Post-Mortem debugging of multithreaded processe by Mark+Kettenis · · Score: 2, Informative

    As a member of the GDB team (maintainer of the Linux/i386 port and co-maintainer of the threads support in GDB) I'm not aware of any coordination between the kernel folks and GDB at all. On top of that I'm not inclined to add support for this to GDB until it ends up in Linus' kernel. Anyway, the one-core-file-per-pid approach seems wrong to me. It's a waste of disk space since you're duplicating the VM for every pid. It also isn't well suited to how GDB deals with multi-threaded core files on other platforms. A better approach would be to add an additional note with the register contents for each LWP to the same core file.
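The disk-space argument above can be made concrete with a rough back-of-the-envelope sketch in Python. All sizes here are made-up assumptions for illustration, not measurements of real core files, and the function names are hypothetical.

```python
# Assumed, illustrative sizes only.
VM_IMAGE_BYTES = 64 * 1024 * 1024   # memory image shared by all threads
REGISTER_NOTE_BYTES = 512           # per-LWP register note (assumed size)

def per_pid_cores(num_lwps):
    """One full core file per LWP: the shared VM image is written each time."""
    return num_lwps * (VM_IMAGE_BYTES + REGISTER_NOTE_BYTES)

def single_core_with_notes(num_lwps):
    """One core file: the VM image once, plus one register note per LWP."""
    return VM_IMAGE_BYTES + num_lwps * REGISTER_NOTE_BYTES

lwps = 10
separate = per_pid_cores(lwps)
combined = single_core_with_notes(lwps)
```

Under these assumptions the per-pid scheme grows with the full image size for every extra thread, while the single-file scheme grows only by the size of one register note.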

  20. Re:I had no idea so much still needed work by Cramer · · Score: 2

    If anyone remembers, back when linux rolled to the spankin' version "1.0", Linus said, "Calm down children. It's just a number." The quote may not be 100% correct as this was many years ago, but the part about the version being "just a number" is as true today as it was then.

    Linus and company may know how to write code, but it's very obvious they were never taught to manage all of their code. Talk about a black art. Linus' hatred of CVS doesn't make things any easier.

  21. Alan's Kernel is becoming more "official" by trevorcor · · Score: 2, Interesting

    This is very interesting:
    http://www.uwsg.indiana.edu/hypermail/linux/kernel/0108.2/0416.html

    It seems Alan Cox is considering his -ac kernel tree to be a legitimate alternative to the official "Linus" tree, rather than a playground for testing patches. He's actually perpetuating a difference between -ac and the official tree, in a way that breaks source compatibility between the two (albeit in a very small way.) The fact that RedHat's kernels are all based on -ac now bears this out.

    Alan has forked the Linux kernel.

    ::meyhem::

    I think he has Linus' blessing in this though. Reading between the lines, I think Alan has been taking on more and more work in the past year or so that had previously been Linus'. Linux is ten years old now; I suppose Linus is burning out.

    And Alan works for RedHat too, which is one of the two distributions that I *know* will be around in ten more years -- they have a solid business plan. Alan is voracious, just tireless, and RedHat would hire the entire core kernel development team if they had to. Linux will not die for lack of a maintainer if Linus gets hit by a bus tomorrow.

    --
    "That's all I have to say about that" --Forrest Gump
  22. Re:I had no idea so much still needed work by Cramer · · Score: 2

    Well, there's stable and then there's stable... 2.4 was, what, over a year past due for release? How many 1.3 kernel revisions were there before 2.0 was declared?

    A lot of things could be done differently. They, of course, won't because Linus hates CVS. Personally, I'd prefer something along the lines of ClearCase, but I'm paying for the licenses :-) It's time consuming work -- over 50% of work load at Make Systems was configuration management related stuff and there were only a few dozen people working on code.

    Anyone who knows anything about code management knows what a "code freeze" is. They also know to begin working within new branches (long) before freezing previous branches. 2.5 should have been open for development when 2.3 was "frozen". And certainly parts of the current 2.4 tree don't belong there -- bug fixes and isolated back ports from the development branch are all that should be going into the 2.4 line. Additionally, it's hard to construct a schedule when you have no clear direction -- a list of features to be in X and which features are pushed further down the line.

    Of course, I don't want the kernel versions to start looking like Cisco IOS tags :-)

  23. Re:How about getting XFS in -ac? by Brian+Knotts · · Score: 2
    Thanks for the information! I'd sure like to hear from the other side; why do they seem to be ignoring the XFS team?

    I really am very impressed with XFS; it seems like solid, proven code. I think XFS has the best chance at being the heavy-duty file system for Linux.

  24. HPT366 needs hdparm for stability by korpiq · · Score: 2

    This is what I use with an external HPT366 to have it run stable:

    /sbin/hdparm -c 3 -m 16 -u 0

    cheers

    --

    I think, therefore thoughts exist. Ego is just an impression.
  25. Re:I had no idea so much still needed work by Hard_Code · · Score: 2

    CVS has well earned its hate. I'm all for version control, but CVS is an ugly hackish piece of crap. I sure wish Subversion would get completed (or at least releasable). As it is, CVS has too much momentum behind it because it "just works" (or "sorta" works). One of the curses of the Unix mentality (have separate tools which only do one specific thing) is tools that just *barely* do the job enough to scratch the particular developer's itch. Unfortunately version control is not that sexy.

    --

    It's 10 PM. Do you know if you're un-American?