Interview with Matthew Dillon of DragonFly BSD

Posted by michael on Saturday March 13, 2004 @02:06PM from the laziness-impatience-and-hubris dept.

JigSaw writes "Well-known FreeBSD/DragonFly/Linux/Amiga system hacker Matthew Dillon discusses a number of interesting points regarding where the BSDs are going, the status and goals of his latest project DragonFly BSD, the status of his innovative Backplane distributed database, his exciting plans to develop DragonFly into a transparently cluster-capable system implementing native SSI (Single System Image) which is something that no other operating system can do today, and more."

Re:Different threading model by Mr. Darl McBride · 2004-03-13 14:14 · Score: 5, Informative

It looks like the gist of the threading model for Dragonfly is that threads all stay on one processor. I assume this is for user processes only, and that this isn't pervasive through the kernel?
Nevermind, found an overview here.
Re:Different threading model by Kaladis Nefarian · 2004-03-13 14:14 · Score: 5, Informative

No, this has to do with kernel threads. The userland threading is the same as in FreeBSD 4.x atm, AFAIK. The idea is to keep the model simple, unlike in FreeBSD 5.x, where they are having trouble keeping it all sane with their fine-grained mutex model. Have a look at the dragonfly.kernel newsgroup on nntp.dragonflybsd.org for more details on the SMP model; Matt has talked about it there regularly.

--
* Several monkeys are here, playing banjos and wearing small hats.
Something no other OS can do? by fmayhar · 2004-03-13 14:34 · Score: 5, Informative

It's simply not true that "a transparently cluster-capable system implementing native SSI" is "something that no other operating system can do today." We were doing it at Locus in 1994 with SVR4, then with Tandem in 1996 with NonStop Clusters for Unixware. Now some of the same folks at HP have introduced OpenSSI, which is essentially the same code, less all the Unixware-related bits, ported to Linux and placed under the GPL. They are coming up hard on their 1.0 release, which is not bad for five people and such a large task.

OpenSSI is the real thing: it has processes that migrate from node to node, distributed file systems, the works. And it's running now on clusters literally all over the world. (Not many clusters, true, but maybe that will change if the Slashdot crowd finds out about it.)

I'm happy to say that there's a lot of my code in that system, as well.

I know a little about what Matt wants to do with his SSI in Dragonfly, but he should certainly take a look at OpenSSI; we had to solve a lot of the problems you run into when you build such a beast.

(And a beast it is. As complex as a kernel can be, when you have what is essentially a distributed kernel across several nodes, the complexity goes up by orders of magnitude. Makes tracking down those weird hangs pretty exciting, in a painful, time-consuming kind of way.)
Re:SSI? by Kaladis Nefarian · 2004-03-13 14:36 · Score: 5, Informative

If you read the article, Matt says (about SSI): "It is something that no non-commercial system today can do"...

--
* Several monkeys are here, playing banjos and wearing small hats.
Re:Different threading model by m.dillon · 2004-03-13 15:30 · Score: 5, Informative

Not exactly. All this means is that threads do not migrate preemptively, nor do they migrate while blocked or switched out while in kernel mode. Threads only migrate if (a) the thread itself wants to move to another cpu or (b) the thread is returning to user mode and the userland scheduler decides to migrate the thread to balance the load out (which only applies to threads associated with user processes since no other type of thread can 'return to usermode').
Kernel threads almost universally stay on the cpu they were originally assigned to. High performance threaded subsystems, such as the network stack, are replicated. That is, the network stack creates multiple threads (one per cpu) and those threads do not migrate because, obviously, they do not need to.
Generally speaking, the purpose of making thread migration explicit instead of automatic is to partition a larger data set across available cpu caches rather than cause the same data to be shared amongst all cpu caches. The processors operate a lot more efficiently and SMP scales a lot better. Most people do not realize the horrendous cost of moving threads between cpus because the cache mastership change is invisibly handled by hardware, but the cost is still there and still very real.
-Matt
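
The replication idea in the comment above (one non-migrating thread per cpu, each owning its own slice of the data) can be sketched roughly as follows. This is only an illustration in Python, not DragonFly kernel code: the names `PerCpuWorker`, `owner_cpu`, `run_demo` and the `NUM_CPUS` value are made up for the example, Python threads are not actually pinned to cpus, and the modulo mapping is an assumed partitioning rule.

```python
import queue
import threading

NUM_CPUS = 4  # hypothetical cpu count, chosen only for this sketch


class PerCpuWorker(threading.Thread):
    """One worker per simulated cpu. It never hands its state to another worker."""

    def __init__(self, cpu_id):
        super().__init__(daemon=True)
        self.cpu_id = cpu_id
        self.inbox = queue.Queue()
        # Touched only by this thread while it runs, so no lock is needed.
        self.packet_counts = {}

    def run(self):
        while True:
            conn_id = self.inbox.get()
            if conn_id is None:  # sentinel: stop the worker
                break
            self.packet_counts[conn_id] = self.packet_counts.get(conn_id, 0) + 1


def owner_cpu(conn_id, num_cpus=NUM_CPUS):
    """Assumed partitioning rule: a connection always maps to the same worker."""
    return conn_id % num_cpus


def run_demo(packets, num_cpus=NUM_CPUS):
    """Route each packet to the worker that owns its connection, then collect counts."""
    workers = [PerCpuWorker(i) for i in range(num_cpus)]
    for w in workers:
        w.start()
    for conn_id in packets:
        workers[owner_cpu(conn_id, num_cpus)].inbox.put(conn_id)
    for w in workers:
        w.inbox.put(None)
    for w in workers:
        w.join()
    return [w.packet_counts for w in workers]
```

Calling `run_demo([0, 1, 2, 3, 4, 5, 4, 4], 4)` sends every packet for a given connection to the same worker, so each worker's dictionary holds a disjoint part of the data set. That is the partitioning effect the comment describes, without modelling the cache behaviour itself.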