Protothreads and Other Wicked C Tricks
lwb writes "For those of you interested in interesting hard-core C programming
tricks: Adam Dunkels' protothreads library
implements an unusually lightweight type of threads. Protothreads are
not real threads, but rather something in between an event-driven
state machine and regular threads. But they are implemented in 100%
portable ANSI C and with an interesting but quite unintuitive use of the switch/case
construct. The same trick has previously been used by Simon Tatham to implement
coroutines
in C. The trick was originally invented by Tom Duff and dubbed Duff's
device. You either love it or you hate it!"
Duff on Duff's Device:
http://www.lysator.liu.se/c/duffs-device.html
--- These are not words: wierd, genious, rediculous
This looks very similar to the implementation technique used for the Squeak programming language (not the Smalltalk Squeak). Squeak is a preprocessor for C that makes it very easy to use this technique.
http://citeseer.ist.psu.edu/cardelli85squeak.html
Doug Moen
I have written a truly remarkable program which this sig is too small to contain.
Duff's device was the first convoluted form of a switch() statement which became well known.
All these C "tricks" employ the same technique (though more elegantly) for different goals. Nonetheless, Duff's device can be said to have inspired such code.
LL
SGI had state threads library since long http://oss.sgi.com/state-threads
...not bound to any particular OS.
0 .x/ggAddTask.3.html
If that's what folks are looking for, another option is the tasks added to LibGG a while back. Tradeoffs either way -- LibGG's requires at least C signals (but will use pthreads or windows threads if detected during compile time), whereas this can be used in OS-less firmware. But on the positive side you can use switch() in LibGG tasks -- what
you can't use are a lot of non-MT-safe system calls. It's an OK abstraction but of course there are so very many ways to accidentally ruin portability that it is far from foolproof.
http://www.ggi-project.org/documentation/libgg/1.
Someone had to do it.
This is bad, lame, faux cooperative threads.
It's also not even particlarly new [1998].Unless memory is at an absolute premium, just use cooperative threading instead. If you try to use prototheads, you'll quickly discover how unlike "real" programming it is. Even just a 4K stack in your cooperative threads will get you way more than protothreads does.
It most certainly is the Duff's device, or at least is very close to it. Duff's device is, indeed, a way to unroll loops; specifically, a way to unroll loops that uses a peculiarity in switch statement syntax that allows case to point inside a loop body. Now, take a look at lc-switch.h in the Protothreads tarball. It contains macros that use the same peculiarity to jump inside functions instead of loops.
-- Stanislav Shalunov
It's too clever to be really useful unfortunately. The big issue is of course the no "local variables". Trouble is, if you are writing in C, the compiler may well be creating local variables for you behind your back. In C++ for example there are many cases where this will certainly happen, like
void DoSomething(const string&);
DoSomething("hollow, whirled");
where a local variable of type string will be temporarily created to pass to routine DoSomething.
You need to read the article.
It only says you can't use local variables across functions that block. Actually, it doesn't even say that you can't use them, it only says don't expect their value to be preserved.
In your example, even if the compiler does create a local variable to call DoSomething, and even if DoSomething does block, who cares if the value of that local variable is preserved, since it's impossible to reference it again after that statement?
But that was an awfully long time ago. Now it's hard to find memory chips below 1Mbit.
I can help you with this problem! Is 16 bytes small enough?
And since you can't use local variables, you can't use things like the C libraries or pretty well any library ever written, which is teh sux0r.
But you can use the C libraries. Just don't use local variables across functions that block. Only a very few C library functions block.
You can implement a simple memcpy function like this:So far, so good. Now Duff's problem was that this was too slow for his needs. He wanted to do loop unrolling, where each iteration in the loop does more operations, so that the entire loop has to iterate less. This means the 'is count > 0? if so, go back, otherwise go on' part of the loop has to execute fewer times.
Now, the obvious problem with this is that you don't know how much you can unwind this particular loop. If it has 2 elements, you can't unwind it to three elements, for instance.
This is where Duff's Device turns up:First, we check to see how much we can unroll the loop - for instance, if count is perfectly divisible by 5, but not 6, 7, or 8, in which case we can safely have 5 copies inside our loop without worry that the copy is going to move past the end of the array. Then - and here's the magic trick - we use switch to jump into a do loop. It's a perfectly ordinary do loop; the trick is entirely in the fact that if count==6, for instance, then C considers the do-loop to begin at 'case 6:', causing 6 copies of '*to++ = *from++' to be executed before the 'while' returns the loop position to the 'case 6:' point which is where, as far as C is concerned, the do-loop began.
Thus, the loop is unwound to a level that it can handle.
I think.
Feel free to correct/amplify/mock.
cheers,
Gaurav
[...]the 'while' returns the loop position to the 'case 6:' point which is where, as far as C is concerned, the do-loop began.
No, it returns to the 'case 0:' point where the 'do {' is. (Otherwise the loop wouldn't be executed count times, and somehow I think this Duff guy would have thought of that...)
I disagree with your assessment, although you were on the right track; The loop doesn't return back to the case label where the loop was entered, it always jumps back to the 'do' statement (synonomous with the case 0:).
:)
The way you describe it is that the loop is unrolled to a size that is safely divisible into the 'count' value, which is an interesting idea, but would not be as efficient (large prime number counts would not get unrolled, for example, and a more complex computed got would be required at the loop end).
My take is this: with loop unrolling, one always has to take care of the 'remainder'. In the above example, the loop is unrolled to be a fixed size (8 repeated copy instructions, instead of one), and any count not divisible by 8 has to handle the remainder of the count after dividing by 8. Conceptually, you could imagine handling this remainder with a separate case section after the unrolled loop. In Duff's device, the remainder is actually dealt with first, by intially jumping into the loop somewhere other than the beginning, then letting the fully unrolled loop finish up.
In answer to the previous poster's question, the 'do' could (probably) be put on it's own line, before case 0:, but that wouldn't look nearly as bizarre.
Of course, maybe I'm wrong too. I hope not.
"It's overkill, of course. But you can never have too much overkill." - Anonymous Slashdot Coward
So in a shorter description, what happens is that:
1) you determine how many groups of 8 you will need, rounding up to count the remainder block as well (if there is one)
2) code enters switch statement based on the remainder value, hits the correct case and falls through (note that if there was no remainder we start at the top of the cases and fall through, consuming an entire 8 block)
3) code hits the while, decrements the number of 8 blocks (as we just finished off the partial "remainder block")
4) return to do, fall through to finish this 8 group
5) loop back to 3
Took me a few minutes of staring at it (and I admit, some tme looking at above descriptions) to get over 4 years of no C in my diet, but now I have to admit that is beautiful.
Whee signature.