Protothreads and Other Wicked C Tricks
lwb writes "For those of you interested in interesting hard-core C programming
tricks: Adam Dunkels' protothreads library
implements an unusually lightweight type of threads. Protothreads are
not real threads, but rather something in between an event-driven
state machine and regular threads. But they are implemented in 100%
portable ANSI C and with an interesting but quite unintuitive use of the switch/case
construct. The same trick has previously been used by Simon Tatham to implement
coroutines
in C. The trick was originally invented by Tom Duff and dubbed Duff's
device. You either love it or you hate it!"
I used a Lifeboat lib back in the late 80's that this reminds me of. Cooperative multitasking. Eventually ported the whole thing to OS/2 and used that threading instead. All the code pretyy much worked as-is.
The revolution will NOT be televised.
The PPC architecture has a special-purpose count register with specialized branch instructions relating to it; e.g., the assembly mnemonic 'bdnz' means "decrement the count register by one, and branch if it has not reached zero." I've used this in some pretty weird loops, including this one that broke the Codewarrior 9.3 compiler (fixed in 9.4.) This computes the location of the n'th trailing one in a 32-bit integer. Pardon my weak attempt at formatting this in HTML:
static uint32 nth_trailing_one(register uint32 p, register uint32 n) { end: }
return __cntlzw(p ^ (p - 1));
}
The idea was that the instruction stream should stay as linear as possible; most of the time the branches are not taken, and execution falls through to the next line of code. Ironically (siliconically?), the entire function could probably be implemented in a single cycle in silicon; shoehorning bitwise functions like this into standard instructions tends to be extremely wasteful. Perhaps FPGA's will make an end run around this at some point. I've also tried this function with a dynamically-calculated jump at the beginning, similar to the case statement logic in the article.
Hmm, I had a point I was trying to make with this post, but now it's escaped my mind...
Weeks of coding saves hours of planning.
I got to this little gem:
My English parser thread shut down at that point . . .
Seriously, this looks like a handy little thing for low-memory systems, though I'd be a bit hesitant about pushing at the C standard like that--the last thing you need is a little compiler bug eating your program because the compiler writers never thought you'd do crazy things to switch blocks like that.
Weightless threads in Python:
y thrd.html
http://www-128.ibm.com/developerworks/library/l-p
They are cooperative but far more efficient than Python's own threading model. You can easily create hundreds of thousands of concurrent threads.
Get the book Obfiscated C and Other Mysteries by Don Libes. Explanations of various Obfuscated C contest entries, and alternate chapters illustrate neat corners of C, including a few things similar to this little library. Occupies a place of honor on my shelf.
PHEM - party like it's 1997-2003!
FYI this technique is heavily exploited in the programming language Felix:
.. just a tad faster than Linux. Both MLton and Haskell also support this style of threading with high thread counts and switch rates (although the underlying technology is different).
http://felix.sf.net/
to provide user space threading. The main difference is that all the 'C tricks' are generated automatically by the language translator. If you're using gcc then the switch is replaced by a computed jump (a gcc language extension). On my AMD64/2800 time for creating 500,000 threads and sending each a message is 2 seconds, most of the time probably being consumed by calls to malloc, so the real thread creation and context switch rate is probably greater than Meg/sec order
John Skaller mailto:skaller@users.sf.net