The 25-Year-Old BSD Bug
sproketboy writes with news that a developer named Marc Balmer has recently fixed a bug in a bit of BSD code which is roughly 25 years old. In addition to the OSnews summary, you can read Balmer's comments and a technical description of the bug.
"This code will not work as expected when seeking to the second entry of a block where the first has been deleted: seekdir() calls readdir() which happily skips the first entry (it has inode set to zero), and advance to the second entry. When the user now calls readdir() to read the directory entry to which he just seekdir()ed, he does not get the second entry but the third. Much to my surprise I not only found this problem in all other BSDs or BSD derived systems like Mac OS X, but also in very old BSD versions. I first checked 4.4BSD Lite 2, and Otto confirmed it is also in 4.2BSD. The bug has been around for roughly 25 years or more."
Except that the bug had been triggered many times before, seeing as how Samba had code in place to work around it.
This bug has been around for a long time, but is only visible if you have large directories and delete files from them in between calls to readdir and seekdir.
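For anyone who wants to see the mechanics, here is a minimal sketch of the failure described in the quoted report. It is a toy model, not the BSD libc source: the structures, the function names (`model_readdir`, `model_seekdir_buggy`, `model_seekdir_fixed`, `name_after_seek`) and the use of slot indices instead of byte offsets are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of one directory block. These are NOT the real BSD
 * structures; ino == 0 marks a deleted slot, as in the quoted report. */
struct model_entry { unsigned ino; const char *name; };

struct model_dir {
    const struct model_entry *ents;
    size_t count;
    size_t pos;               /* index of the next slot to examine */
};

/* readdir-like: skip deleted slots and return the next live entry. */
const struct model_entry *model_readdir(struct model_dir *d)
{
    while (d->pos < d->count) {
        const struct model_entry *e = &d->ents[d->pos++];
        if (e->ino != 0)
            return e;
    }
    return NULL;
}

/* Old behaviour as described: rewind to the block start and call the
 * readdir-like routine until the target is reached. A deleted first slot
 * is skipped silently, so the live target entry is consumed as well. */
void model_seekdir_buggy(struct model_dir *d, size_t target)
{
    d->pos = 0;
    while (d->pos < target)
        if (model_readdir(d) == NULL)
            break;
}

/* Assumed shape of a fix: step over every slot, live or deleted, so the
 * walk stops exactly at the target. */
void model_seekdir_fixed(struct model_dir *d, size_t target)
{
    d->pos = 0;
    while (d->pos < target && d->pos < d->count)
        d->pos++;
}

/* Seek to `target` in a block whose first slot is deleted, then report
 * the name the next readdir-like call returns. */
const char *name_after_seek(int use_fixed, size_t target)
{
    static const struct model_entry block[] = {
        { 0, "a" },           /* deleted */
        { 11, "b" },
        { 12, "c" },
    };
    struct model_dir d = { block, sizeof block / sizeof block[0], 0 };

    assert(target < d.count);
    if (use_fixed)
        model_seekdir_fixed(&d, target);
    else
        model_seekdir_buggy(&d, target);

    const struct model_entry *e = model_readdir(&d);
    return e ? e->name : NULL;
}
```

In this model, `name_after_seek(0, 1)` hands back the third entry and `name_after_seek(1, 1)` hands back the second. Real directory blocks hold variable-length records, so an actual implementation has to walk by record length rather than by index.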
But that's exactly my point, isn't it? The bug was only "visible" through its behavior, not its manifestation in code. The shallow bugs argument basically says that if enough people stare at the code, they will find the bugs. Clearly that did not happen here.
Whether the bug fix can propagate rapidly has nothing to do with what I'm talking about. I'm not trying to disparage the concept of open source, I'm arguing that the shallow-bugs argument should be rejected.
It never takes very long for the BSD zealots to start finger pointing. Yes, folks, BSD had a bug that was there for 25 years that many people knew about and no one bothered to fix, but it is SAMBA's fault for not doing.... what exactly.... to force the theo the rats of the world to acknowledge it? You guys really are shamelessly arrogant.
What really intrigues me is that this had been discovered, years earlier, by the Samba folk.
Did the Samba folk not tell the BSD folk of the issue?
Hmmm....if the bug was found in 4.2BSD, then how do we know that that bug was not also in original AT&T UNIX that 4.2 BSD is derived from? One could always look in the source released by Caldera (now known as "The SCO Group") some years back.
Yes, Samba did pass on what it found and it appears they were promptly shot down by someone on the *BSD side.
The Samba e-mail archives contain a message from over 3 years ago, but it doesn't give attribution to the *BSD source.
The Samba Bugzilla also has a bug reported more recently involving the same issue. Reading through the bug history, you can see there was one FreeBSD dev involved in the bug discussion, and he referenced a prior conversation between Tridge (Samba) and PHK (FreeBSD) where PHK said there was no bug in FreeBSD.
I am sure you will agree that the correct statement, sans flamebait modifications, does not warrant a "clear contradiction", whatever is claimed by the many detractors of FOSS who are jumping at this opportunity to point out an example of a fixed bug that was not necessarily a security risk and say "see, the OSS model is clearly flawed! BSD has a 25-year-old bug that was only fixed now!"
Take off your paranoid hat. Holy crap. I am an open source author myself. I just have always hated this particular argument.
Am I the only one who thinks it's quite impressive to have 25 year old code still being used and employed on new systems?
What?
Is the MPEG Chroma bug. That was created by someone who wrote one of the original MPEG decoders that was eventually sold/distributed to most of the companies making the first DVD players (pre-1993). This one just won't go away either - initially most of the DVD manufacturers refused to acknowledge it even existed (probably because they didn't want to recall millions of DVD players with non-upgradeable firmware). I still see it every now and then on TV (indicating one of the upstream broadcasting companies is still using equipment afflicted with the bug). I notice it most often when diagonal red lines end up staircased like they're poorly interlaced (see pictures in the above link).
Let's take Microsoft's word that it did: Try find "California" < %windir%\system32\ftp.exe
Nothing shady about it either; that's the beauty of BSD code.
Yes.
People were aware that the notion of a directory being a sequence of entries, with each entry having a position such that you can get the position of an entry and later seek to that position and have the next read return that entry even if changes were made to the directory in the interim, was wrong.
That doesn't mean that they were just trying to avoid having the standard imply that particular bug was fixed - it means that they were trying to avoid making a promise that some reasonable implementations of directories can't keep even if those implementations have no bugs.
The problem is that the stdio directory scanning routines cache multiple directory entries with a single getdirentries() system call, but then may try to 'seek' into the middle of that buffer later on.
Any filesystem based on a non-linear-file directory format, such as a B-Tree, will simply never produce consistent offsets or indices within such a buffer.
The only way to *REALLY* fix this is to add a cookie field to the filesystem-independent dirent structure (and if your BSD isn't using a filesystem-independent dirent structure, it needs to be first fixed to do that). lseek()ing to a directory cookie works just fine, and always will (or at least will far more robustly than trying to scan a re-cached buffer from getdirentries()).
When DragonFly went to a filesystem-independent dirent structure I very stupidly only added ~40 reserved bits to the dirent structure, instead of the 64 we need to properly implement per-entry directory cookies. I'm still pissed at myself for that gaffe.
In any case, a per-entry directory cookie effectively solves the problem. The only other way to get such cookies, if it can't be embedded in the dirent structure, is to create a new system call similar to getdirentries() but which also populates an array of directory cookies. FreeBSD and DragonFly have kernel implementations of readdir which supply per-entry directory cookies so it is really just a matter of creating the new system call and then making libc use it.
-Matt
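To make the cookie idea concrete, here is a rough sketch of why a per-entry cookie survives a deletion where a position inside a cached buffer does not. Everything in it is hypothetical: `struct cookie_entry`, the two sample arrays and the `resume_by_*` helpers are illustration only and are not the DragonFly or FreeBSD dirent layout. It also assumes cookies come back in increasing order within a listing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical directory entry carrying a stable 64-bit cookie. */
struct cookie_entry { uint64_t cookie; const char *name; };

/* The same directory as returned by two separate reads: "a" is deleted
 * in between, so every remaining entry shifts down one slot in the
 * re-cached buffer while its cookie stays the same. */
const struct cookie_entry before_delete[] = {
    { 10, "a" }, { 20, "b" }, { 30, "c" },
};
const struct cookie_entry after_delete[] = {
    { 20, "b" }, { 30, "c" },
};

/* Resume from a remembered slot index in the cached buffer. */
const char *resume_by_index(const struct cookie_entry *ents, size_t n,
                            size_t index)
{
    return index < n ? ents[index].name : NULL;
}

/* Resume from a remembered cookie: first entry at or past that cookie. */
const char *resume_by_cookie(const struct cookie_entry *ents, size_t n,
                             uint64_t cookie)
{
    for (size_t i = 0; i < n; i++)
        if (ents[i].cookie >= cookie)
            return ents[i].name;
    return NULL;
}

/* "b" was remembered as index 1 / cookie 20 while reading before_delete. */
void cookie_demo(void)
{
    /* The index now points at "c", one entry too far. */
    assert(resume_by_index(after_delete, 2, 1) == after_delete[1].name);
    /* The cookie still lands on "b". */
    assert(resume_by_cookie(after_delete, 2, 20) == after_delete[0].name);
}
```

The point of the sketch is only that the cookie names the entry itself, while the index names a place in one particular snapshot of the buffer.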
I wonder if one could sue SCO for crappy code?
I mean, a burglar can sue the owner of the property they're burgling for leaving it in a dangerous condition, so why not this too?
That's if it were true, of course.
Max.
Thanks for the link, it made for an interesting read. The "allowed by POSIX" argument is amusing given that if you flip the manual page (in the linked POSIX site) over to seekdir it clearly states that seekdir(telldir()) works as the Samba code needed it to. Ah well, at least the fix finally happened three years later.
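For reference, the round trip being argued about looks roughly like this with the standard POSIX calls (`opendir`, `readdir`, `telldir`, `seekdir`, `closedir`). The helper name `seekdir_roundtrip_ok` is made up for this sketch, and it does not modify the directory between the calls, so it only checks the basic contract and will not reproduce the old BSD bug.

```c
#define _XOPEN_SOURCE 700   /* ask for telldir()/seekdir() declarations */
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if seekdir(telldir()) brings back the same entry, 0 if it
 * does not, and -1 if the directory cannot be opened or has fewer than
 * two entries. Hypothetical helper, for illustration only. */
int seekdir_roundtrip_ok(const char *path)
{
    assert(path != NULL);

    DIR *d = opendir(path);
    if (d == NULL)
        return -1;

    int result = -1;
    if (readdir(d) != NULL) {            /* consume the first entry */
        long pos = telldir(d);           /* position of the second entry */
        struct dirent *e = readdir(d);   /* read the second entry */
        if (e != NULL) {
            char expected[256];
            /* Copy the name: the next readdir() may overwrite *e. */
            snprintf(expected, sizeof expected, "%s", e->d_name);

            seekdir(d, pos);             /* go back to the saved position */
            e = readdir(d);
            result = (e != NULL && strcmp(expected, e->d_name) == 0);
        }
    }
    closedir(d);
    return result;
}
```

Calling `seekdir_roundtrip_ok("/")` on a system with a populated root directory should report a successful round trip. The interesting case in this thread is the one where entries are deleted between the `telldir()` and the `seekdir()`.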