Protothreads and Other Wicked C Tricks
lwb writes "For those of you interested in interesting hard-core C programming
tricks: Adam Dunkels' protothreads library
implements an unusually lightweight type of threads. Protothreads are
not real threads, but rather something in between an event-driven
state machine and regular threads. But they are implemented in 100%
portable ANSI C and with an interesting but quite unintuitive use of the switch/case
construct. The same trick has previously been used by Simon Tatham to implement
coroutines
in C. The trick was originally invented by Tom Duff and dubbed Duff's
device. You either love it or you hate it!"
I used a Lifeboat lib back in the late 80's that this reminds me of. Cooperative multitasking. Eventually ported the whole thing to OS/2 and used that threading instead. All the code pretyy much worked as-is.
The revolution will NOT be televised.
Duff's device is a way of forcing C to do a form of loop unrolling. It has nothing to do with coroutines.
Protothreads are quite limited in what they can do - they are just a way of expressing a state machine using co-routines.
Real threads are much better for things like interpreters. JIT compilation with native threads is even better.
The PPC architecture has a special-purpose count register with specialized branch instructions relating to it; e.g., the assembly mnemonic 'bdnz' means "decrement the count register by one, and branch if it has not reached zero." I've used this in some pretty weird loops, including this one that broke the Codewarrior 9.3 compiler (fixed in 9.4.) This computes the location of the n'th trailing one in a 32-bit integer. Pardon my weak attempt at formatting this in HTML:
static uint32 nth_trailing_one(register uint32 p, register uint32 n) { end: }
return __cntlzw(p ^ (p - 1));
}
The idea was that the instruction stream should stay as linear as possible; most of the time the branches are not taken, and execution falls through to the next line of code. Ironically (siliconically?), the entire function could probably be implemented in a single cycle in silicon; shoehorning bitwise functions like this into standard instructions tends to be extremely wasteful. Perhaps FPGA's will make an end run around this at some point. I've also tried this function with a dynamically-calculated jump at the beginning, similar to the case statement logic in the article.
Hmm, I had a point I was trying to make with this post, but now it's escaped my mind...
Weeks of coding saves hours of planning.
I got to this little gem:
My English parser thread shut down at that point . . .
Seriously, this looks like a handy little thing for low-memory systems, though I'd be a bit hesitant about pushing at the C standard like that--the last thing you need is a little compiler bug eating your program because the compiler writers never thought you'd do crazy things to switch blocks like that.
Is this similar to Stackless Python and green threads? I spend most of my time with interpreted languages, so my C is very lacking.
random underscore blankspace at ya know hoo dot comedy.
Weightless threads in Python:
y thrd.html
http://www-128.ibm.com/developerworks/library/l-p
They are cooperative but far more efficient than Python's own threading model. You can easily create hundreds of thousands of concurrent threads.
From the above Duff on Duff's Device: I have another revolting way to use switches to implement interrupt driven state machines but it's too horrid to go into.
Perhaps this is the Duff's Device equivalent of a proof of Fermat's Last Theorem? Or is my ignorance of the history of Evil Computing showing?
//Information does not want to be free; it wants to breed.
Get the book Obfiscated C and Other Mysteries by Don Libes. Explanations of various Obfuscated C contest entries, and alternate chapters illustrate neat corners of C, including a few things similar to this little library. Occupies a place of honor on my shelf.
PHEM - party like it's 1997-2003!
"The competent programmer is fully aware of the limited size of his own skull. He therefore approaches his task with full humility, and avoids clever tricks like the plague." -Edsgar Dijkstra
Fetch Text URL - Firefox Extension
As the prothread homepage says, it's for extremely small embedded systems, where there are no operating systems, with tiny amount of memory (You can't use DRAMs on systems that cost something less than $1). Want to use threads on those kind of systems, you have no choice.
Another advantage is its portability. Small embedded systems, whether they have operating systems or not, usually can't support some fully-blown threading standard. Those operating systems seem to implement some kind of 'specially tuned' thread APIs.
Using these kind of threads on a full-blown PC (or servers) would have almost no benefit. However, in the embedded software engineer's perspective, it's great to see a ultra-lightweight thread library without any platform-dependent code.
I downloaded it. But the version that is there does not, in fact, compile: there's an obvious typo at example-buffer.c, line 137:That aside, I believe your claim is wrong now that I've read the code.right? The reason is that you are simply case-statementing into the code and bypassing the whole subroutine calling mechanism entirely -- so the variable initialization just isn't called.
Now consider the following code:Suppose the compiler decided to extract out the common y*(2*pi)/360 term as an optimization. The assignment to that common term will not occur when you jump into the middle of that routine -- so your code will not work as it appears.
Okay, I'll play the n00b. I understand most of this, but my coding background is not that great, and mostly in C++, Java, and PHP, and I'm having problems with the switch from Duff's Device...
switch (count % 8)
{
case 0: do { *to = *from++;
case 7: *to = *from++;
case 6: *to = *from++;
case 5: *to = *from++;
case 4: *to = *from++;
case 3: *to = *from++;
case 2: *to = *from++;
case 1: *to = *from++;
} while (--n > 0);
}
What the hell is up with that do { applying only in case zero? It's in several places on the net just like that and Visual Studio compiles this just fine, so it's not an error. I checked K&R, and they don't even hint at what could be going on there... I'm lost. Help?
Telltale Games: Bone, Sam and Max
FYI this technique is heavily exploited in the programming language Felix:
.. just a tad faster than Linux. Both MLton and Haskell also support this style of threading with high thread counts and switch rates (although the underlying technology is different).
http://felix.sf.net/
to provide user space threading. The main difference is that all the 'C tricks' are generated automatically by the language translator. If you're using gcc then the switch is replaced by a computed jump (a gcc language extension). On my AMD64/2800 time for creating 500,000 threads and sending each a message is 2 seconds, most of the time probably being consumed by calls to malloc, so the real thread creation and context switch rate is probably greater than Meg/sec order
John Skaller mailto:skaller@users.sf.net
My experiences contradict your statement. State machines are both easy to implement, and easy to debug, if you do it the right way. I have seen many entirely wrong implementations, including one where you can go from any of about two dozen "states" to any other. I have seen some that just switch states when they feel like it, or switch states based on complex decisions, which makes debugging difficult. Put another way, you can make a "state machine" degenerate into something else, and nullify its benefits, if you refuse to follow the rules.
A well-implemented state machine has an important characteristic: it is clear to see why you are where you are. This means that state transitions are checked (against unexpected events) and traced, so debugging the machine is literally a matter of reading a log that looks like this:
in state 0, received event A so went to state 2
in state 2, received event B so went to state 1
in state 1, received event C so went to state 5
in state 5, ignoring event A
in state 5, received unexpected event C
and so on. In this particular example, the question to answer is why we're not handling event C properly in state 5, or why we went to state 5 in the first place. Either should be pretty obvious when you consult the original design. The fix is likewise obvious. Figuring out state machines, in my experience, has always been easier than figuring out multi-threaded code.
This isn't to say that all programs should be implemented as a state machine. Simple Unix-style pipe programs, for instance, are generally unsuitable. If you don't know how to design a state machine properly, it's also going to be unsuitable.
This is bad, lame, faux cooperative threads.
Local variables are not preserved.
A protothread runs within a single C function and cannot span over other functions. A protothread may call normal C functions, but cannot block inside a called function.
Well,
I think it it is a simple idea but very well implemented and especially useful for microcontroller programs. Mr. Dunkels other project is uIP, which I do find very useful in this area, too.
I.e. this is neat if you don't have the resources at hand to run a even a small multitasking kernel! Maybe 2kB flash for program storage or similar.
As one of my hobby projects, I'm implementing a software DCC encoder (a protocol used for digital model railroads) that has to talk concurrently to the PC and to another bus with various connected gadgets. It has to do buffering and status displays and whatnot. Going the traditional way of
a chain of 'if's is hard and an explicit state-machine non-obvious. Protothreads work, are easy and small (few bytes).
Cheers,
Onno