a thread on threads
by
Spootnik
·
· Score: -1, Redundant
Multitasking involves the computer doing more than one thing at the
same time. You can produce this situation in a variety of ways,
but they all amount to the same thing: multiple execution
contexts (PCs, SPs, & other registers) with some degree of sharing.
When you use
$kid = open(CHILD, "cmd args |");
your begotten child process shares a few file descriptors, umasks,
process group, and credentials with the original process. In fact,
nearly everything is shared. Only text, data, pid, and register
set vary.
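For readers more at home in C than Perl, the nearest everyday analogue of that piped open is popen(). What follows is only a rough sketch: the function name first_line_of is made up for illustration, and popen() always goes through /bin/sh, which Perl's open() does not necessarily do.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>

/* Run `cmd` through the shell and copy the first line of its output
 * into `line` (without the trailing newline).
 * Returns 0 on success, -1 on failure.
 * first_line_of is a hypothetical helper name, not a library call. */
int first_line_of(const char *cmd, char *line, size_t len)
{
    /* popen forks a child running "/bin/sh -c cmd" with a pipe on its stdout */
    FILE *child = popen(cmd, "r");
    if (child == NULL)
        return -1;

    char *got = fgets(line, (int)len, child);
    int status = pclose(child);          /* closes the pipe and waits for the child */
    if (got == NULL || status == -1)
        return -1;

    line[strcspn(line, "\n")] = '\0';    /* strip the trailing newline, if any */
    return 0;
}
```

Calling first_line_of("uname", buf, sizeof buf) would leave the system name in buf; the child runs with its own text and data, which is the point of the paragraph above.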
When you use
$kid = open(MYSELF, "|-");
or
$kid = fork();
you end up sharing a bit more -- the backing store text is now the same,
but data, pid, and register set are distinct. You can open particular
communications mechanisms between the entities, such as pipes and
shared memory, but the default on data space is separate but equal.
Text pages are still shared.
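The "separate" part is easy to check for yourself. Here is a minimal C sketch, assuming an ordinary POSIX system; the variable counter and the function name are invented for the illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int counter = 1;   /* ordinary data-segment variable */

/* Fork, let the child overwrite `counter`, then report the value the
 * parent still sees. Returns -1 if fork or waitpid fails.
 * The function name is hypothetical. */
int parent_view_after_child_write(void)
{
    pid_t kid = fork();
    if (kid < 0)
        return -1;

    if (kid == 0) {        /* child: writes to its own private copy */
        counter = 42;
        _exit(0);
    }

    if (waitpid(kid, NULL, 0) < 0)
        return -1;
    return counter;        /* parent's copy was never touched */
}
```

The parent gets back 1, not 42: the child's store landed in the child's own copy of the data page.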
People have been using fork() to write multithreaded (read: multitasking)
applications on Unix for about twenty-five years. It is a simple and
powerful model that has withstood the test of time. It is also highly
optimized, using such cleverness as copy-on-write for the data and
making use of special machine hardware designed to facilitate
context switches and page table manipulations.
Another scenario is the so-called "light-weight process". Various
implementations have taken different approaches to these. One that stands
out as different from many of the current crowd was the Convex C-series,
where FORK and JOIN were hardware instructions to create a new execution
thread without any operating system intervention whatsoever. This was
beneficial in cases where you in effect wanted highly parallelizable
vector processing distributed across several CPUs where everything else
was shared. Each thread had separate thread registers, and the operating
system could consult special thread state accumulators when the whole
process ended. This tightly-knit relationship between the hardware
designers, operating system designers, and compiler designers produced
some intriguing possibilities not seen on commodity systems.
The typical use of the word "thread" now differs greatly from its
original use, which was essentially synonymous with a fork()d process.
(And these new light-weight processes aren't always particularly light in
weight, either.) I assume you are using "thread" in the vernacular,
which most often seems to mean creating a new context where everything
is shared just as in a fork, except this time, all data space will also
be jointly accessible.
In short, with fork(), data space is protected unless you say otherwise
(such as with IPC::Shareable) whereas with Thread::->new, it's an
uncontrolled free-for-all. I can certainly see why you might wish to
have several active PCs running concurrently, but I am dubious that you
have an application for which uncontrolled, unmediated sharing amongst
these threads would be a natural solution not easily accomplishable
using the protected threads you get when you fork.
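To make "protected unless you say otherwise" concrete, here is a hedged C sketch in which the sharing is opted into explicitly. It uses an anonymous shared mmap() region rather than the SysV shared memory that sits behind IPC::Shareable, so treat it as a stand-in technique; the function name is invented.

```c
#define _DEFAULT_SOURCE          /* for MAP_ANONYMOUS on glibc */
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Create one explicitly shared int, fork, let the child write to it,
 * and return what the parent reads afterwards (-1 on failure).
 * The function name is hypothetical. */
int parent_view_of_shared_write(void)
{
    /* Only this region is shared; every other page stays private. */
    int *shared = mmap(NULL, sizeof *shared, PROT_READ | PROT_WRITE,
                       MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (shared == MAP_FAILED)
        return -1;
    *shared = 1;

    pid_t kid = fork();
    if (kid < 0) {
        munmap(shared, sizeof *shared);
        return -1;
    }
    if (kid == 0) {              /* child writes through the shared mapping */
        *shared = 42;
        _exit(0);
    }

    int result = -1;
    if (waitpid(kid, NULL, 0) == kid)   /* wait so the write is complete */
        result = *shared;
    munmap(shared, sizeof *shared);
    return result;
}
```

Here the parent does see 42, but only because that one region was deliberately set up as shared before the fork.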
Prisoners of Microsoft tend to leap to solutions that involve
(data-)unprotected threads rather than those that involve (data-)protected
ones. This is the wrong thing to do. But they do it because an
efficient implementation requires a sound underlying theoretical model,
and Microsoft has never shown itself adept at operating systems design.
Linux protected threads (read: "processes") context-switch many times
faster on the same hardware than do Microsoft's unprotected ones (read:
"threads").
This old statement of mine is applicable in many contexts, including
the current one:
"In short, just as the Multics mentality of careful access controls
shows up throughout Unix, the cretinous CP/M mentality of uncontrolled
havoc shows up in DOS and all its mutant children." --tchrist
I'll finish up by including an interesting article that turned up in
a thread on comp.unix.programmer. There's a good bit of noise there,
but also some interesting technical data as well.
I cannot speak for other BSDs, but in BSD/OS 4.0, the model (the
code is not yet fully implemented) is that the sfork() call --
essentially the same as Plan 9's rfork() and Linux's clone() --
is the "one true" process creator. (Process 0, which forks init
and does other similar work during bootstrapping, is never really
"created": it exists in the /bsd image as loaded, modulo some
hookups needed at boot time.) There are compatibility routines,
as in any other system-with-historical-baggage, of course.
Putting aside the fundamental misunderstanding that "Ed@eds.net"
apparently had when he started trol-- er, I mean began this
discussion, :-) one way to view the whole idea is this. Note that
the terminology differs from that used when describing traditional
Unix systems.
There is no such thing as a "process", there is just a "thread".
Here a "thread" is really just an ID, some optionally-shared
resources like memory and files, and maybe a kernel stack or
some such -- pretty much the same thing as a Unix "process".
"Fork" just creates a new thread; everything is shared. [Note
that the new thread *does* need a way to know that it is in fact
the new one. How this is achieved is not particularly important
in terms of the model.]
Any thread can request that some of its currently-shared
resources can become "unshared" (have a private copy made, so
that changes to them will not change someone else's copy).
[In Plan 9, Linux, and the various BSDs, these operations are
rolled into the argument to [rs]fork/clone; logically, they
should probably be separate. One can make efficiency
arguments, but see below.]
In addition:
Any thread can wipe itself out and run some other executable
image instead (Unix "exec"). This wipes out any parts of memory
not marked "preserve across exec"[%], partly just to match
tradition, partly for ease of implementation, and partly for
security when executing setuid binaries. (In other words, it
does not *have* to be done this way. Given that current hardware
tends to implement binaries that tend not to be entirely
position-independent, we make concessions to efficiency here.
We say: "We could not figure out how to make the old image
hang around while the new image is loaded, and then give the
old image a chance to remove itself if that is what it intended.
So even though it looks like this action is made of a bunch of
simpler atomic actions, we could not figure out a nice way to
break it down.")
[% Some systems have no way to mark any such sections, in which
case *all* memory "goes away" on exec. This is why exec takes
argv and envp: to pass *some* data across the operation. "Leave
memory region R alone" would suffice for the same purpose, and
could be more efficient -- perhaps substantially so.]
Now, one can argue that "make a thread-copy and immediately exec",
which is the pair of primitives that emulates other OS's "create
new process running program P" operation, ought to be less efficient
than simply having a "create new process running program P" operation
(whether that is spelled spawn() or CreateProcess() or SYS$CREPRC
or whatever -- let me use spawn() here for short). That argument
makes some sense to me. Given a thread+exec model, fundamentally,
spawn() is just "create thread, then exec". If there is any overhead
in returning to user code, then having the user code immediately
turn around and exec, all of that overhead can be eliminated.
But if you take a look at otherwise "reasonably comparable" systems
that *do* offer "spawn" instead of fork+exec, something strange
happens. The overhead for processes in those systems turns out to
be *greater* than it is in the Unix-like systems. (Examples: VMS
vs 4.1BSD on a VAX; NT vs Linux on an x86.) This overhead leads
people to write "all in one" packages on the "spawn" systems (e.g.,
the directory lister is usually built into the user interface),
while the relative lack of overhead in the fork+exec systems leads
people to write smaller "tool" programs (the directory lister is
usually a separate program, and hence works with pipes and "wc"
word counters and "lpr" printer spooling and so on[%]). Perhaps
the people who wrote the spawn() systems were just not as good as
the people who wrote the fork+exec systems, but in any case, it
seems odd.
[%] A friend of mine suggests that you try "telling your mom" (or
any suitable non-computer-geek type) how to print out a listing
of files in some particular directory. On Unix systems: run
"ls -l | lpr".
Now for many side points, even if one of them was the main point
of the original article. First, removing all the biasing wording:
There are two reasons for having a parent/child relationship in
fork().
First, we need to distinguish between the original thread (process)
and the new thread (process). We can arbitrarily label the original
the "parent" and the new clone the "child". Or we could label them
"old" and "new", or "Fred" and "Barney", or whatever, but "parent"
and "child" works.
Second, any time any program runs any other program, it *might*
care whether that other program succeeds. If there is *no*
relationship between old and new, as suggested above, there is also
no way for the creator to get information on the createe. So there
must be *some* relationship: even if it is not a tree-structured
parent/child thing, at the very least there is a "creator/createe"
relationship, from the very fact that one process spawned the other.
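In C terms, that creator/createe link is what lets the creator collect an exit status at all. A minimal sketch, assuming POSIX fork(), execvp() and waitpid(); the function name run_and_wait is made up.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run argv[0] as a child and return its exit status, or -1 if the
 * child could not be created or did not exit normally.
 * run_and_wait is a hypothetical name. */
int run_and_wait(char *const argv[])
{
    pid_t kid = fork();          /* the return value names the child */
    if (kid < 0)
        return -1;

    if (kid == 0) {              /* child: become the requested program */
        execvp(argv[0], argv);
        _exit(127);              /* only reached if exec failed */
    }

    int status;
    if (waitpid(kid, &status, 0) < 0)   /* only the creator can ask this */
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

An unrelated process holding the same pid number could not make that waitpid() call succeed; the kernel only answers the parent.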
Using a parent/child relationship, and tying it to the return value
from fork() (the thread/process id), is probably the simplest way
of expressing this. Even when you are not doing cloning -- even
when the only call you have is spawn() -- there is still a "creator"
(spawner) and "createe" (spawnee), and those remain "parent" and
"child", whether or not the OS tracks it.
"Natural" is one of those "marketing" type words that people use
to sell one shampoo vs another (ours contains 2,4-di-poly-... so
ours is More Natural!). But if you want to have a ProcessManager
do the creating, you can code that yourself on any modern Unix
system. Just write a program that listens for IPC requests to
create new processes. Run it once, then send those IPC requests.
The listening program is your ProcessManager and the IPC requests
invoke it.
Problem solved. Next? :-)
If you like the taste of ice cream more than that of vegetables,
nothing anyone can say about the health qualities of either is
going to change your mind about which tastes better.
In some systems, there are definite "right" vs "wrong" answers.
"Everyone knows" that objects are naturally at rest -- if I roll
a baseball on the ground, it always stops rolling eventually --
but in physics, Aristotle was wrong; Newton supplied a "righter"
model. In this case, however, you have set up a system in which
only you can possibly be right: "By `right' I mean something fits
my intuition. A process manager fits my intuition, so any other
model is wrong." There is nothing to prove or disprove here; you
are letting your intuition dictate the answer, and your intuition
is yours alone.
If so, you will have to come up with more objective criteria.
This is not a good mental model for Unix systems, and indeed, not
a good mental model for many other non-Unix-like systems. To stick
with traditional Unix systems (where fork() is "create new kernel
thread, and immediately unshare all user memory"), a "process" is
made up of:
an identifier (Process ID or "pid");
an address space (which might map to a .exe, or might not;
the exec() call always creates such a mapping, but once exec'ed,
a process can generate and/or load new code and data [a la a "DLL"
in Microsoft-ese], and/or toss out its existing code and data,
as long as it is careful);
file descriptors (references to open files/pipes/IPC);
credentials (~= privileges);
resource limits;
any other things I cannot think of off the top of my head.
Any process can change or rearrange any of these except its PID:
the PID is the outside world's "true name" for the process. (The
kernel might use a data pointer inside, if that seems more efficient
and/or appropriate, but the outside-world name is the PID.)
At the OS level, there is only fork() and exec(); system() is a C
library (non-OS-level) thing. Clone() is just fork() spelled
sideways, and the sideways spelling is just to work around the fact
that, historically, fork() implies "copy (do not share) the
resources", which tends to be slow. (Copy-on-write helps a lot,
but not so much that people are not still drawn to some other
mechanism, whether it is the limited vfork() or the flexible
sfork()/rfork()/clone(), to avoid the copy altogether.) If you
prefer, fork() is just clone() spelled sideways; as noted way at
the top, the true primitive is "make a new thread with everything
shared".
Now, once you have made the new, all-shared thread, you can do this
in the "new" thread (as a shell might, to do "ls | wc"):
1  split off the file descriptors
2  create a pipe
3  make another new, all-shared thread
   [new thread only, again]
   3a split off the file descriptors
   3b close the read side of the pipe
   3c move the write side of the pipe to STDOUT_FILENO
   3d exec "/bin/ls"
   [NOTREACHED]
4  close the write side of the pipe
5  move the read side of the pipe to STDIN_FILENO
6  exec "/usr/bin/wc"
The "split off the file descriptors" steps are required so that
the new thread does not change or close the old thread's (shared)
descriptors, which in step (2) would make the pipe available to
the shell itself (the shell does not want it), and in step (3b)
would remove "wc"'s access to the read side of the pipe (which
would be disastrous).
(Since rfork/clone take an argument, steps 1 and 3a are actually
implicit, via "pid = xyzzyfork(SHARE_ALL_BUT_DESCRIPTORS)". That
means the process above is actually 8 steps long -- 9 if you add
in the initial fork.)
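For the curious, the sequence above maps onto plain POSIX calls roughly as follows. This is a sketch, not anybody's actual shell source: it uses ordinary fork(), which copies the descriptor table, so steps 1 and 3a happen implicitly, and the function name run_pipeline is invented.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run "producer | consumer" the old way: the consumer ends up as the
 * parent of the producer. Returns the consumer's exit status, or -1.
 * run_pipeline is a hypothetical name. */
int run_pipeline(char *const producer[], char *const consumer[])
{
    pid_t kid = fork();                      /* the initial fork */
    if (kid < 0)
        return -1;

    if (kid == 0) {
        int fds[2];
        if (pipe(fds) < 0)                   /* step 2 */
            _exit(126);

        pid_t grandkid = fork();             /* step 3 */
        if (grandkid < 0)
            _exit(126);
        if (grandkid == 0) {
            close(fds[0]);                   /* step 3b */
            dup2(fds[1], STDOUT_FILENO);     /* step 3c */
            close(fds[1]);
            execvp(producer[0], producer);   /* step 3d */
            _exit(127);                      /* exec failed */
        }

        close(fds[1]);                       /* step 4 */
        dup2(fds[0], STDIN_FILENO);          /* step 5 */
        close(fds[0]);
        execvp(consumer[0], consumer);       /* step 6 */
        _exit(127);                          /* exec failed */
    }

    int status;
    if (waitpid(kid, &status, 0) < 0)        /* the "shell" waits for the consumer */
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Passing {"ls", NULL} and {"wc", NULL} as the two argument vectors gives the "ls | wc" case; the caller's own descriptors are never disturbed because all the rearranging happens after the fork.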
Note that a parent/child relationship occurs between "ls" and "wc"
here, even though neither program wants it. This is not a *problem*,
but it has no advantage either; it is just a side effect of the
order of user-level manipulations involved in connecting the output
of "ls" to the input of "wc". If the shell wants to be the parent
of both ls and wc, the "make a pipe" step has to occur earlier and
the sequence needs to be slightly different. (In fact, old versions
of the Bourne shell used the above sequence, while modern shells
*do* create the pipe earlier and remain the parent of each process
in the pipeline.)
The "traditional spawn()" model looks like this:
1 create a pipe
2 move the current STDOUT_FILENO out of the way
3 move the write side of the pipe to STDOUT_FILENO
4 spawn "/bin/ls"
5 close the write side of the pipe
6 move the current STDIN_FILENO out of the way
7 move the read side of the pipe to STDIN_FILENO
8 spawn "/usr/bin/wc"
9 close the read side of the pipe
10 move the saved STDIN_FILENO back
11 move the saved STDOUT_FILENO back
In other words, it takes 11 steps, not 9, with a "spawn" model.
Even though the fork and exec steps are combined (saving one step
per fork+exec pair on the Unix shell side), the number of steps
does not go down. The reason is that the spawner must do all the
manipulation up front: there is no chance to change things around
between the "create new process" step and the "take flying leap
into new object file" step.
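The "everything up front, then put it back" shape is easier to see with a single child. The sketch below is not the 11-step ls | wc sequence; it spawns one program with its stdout pointed at a pipe and reads the result. It assumes posix_spawnp() as a stand-in for a generic spawn(), and spawn_capture is an invented name.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <spawn.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Spawn argv[0] with stdout redirected into a pipe and copy up to
 * len-1 bytes of its output into buf (NUL-terminated).
 * Returns the number of bytes stored, or -1 on failure.
 * spawn_capture is a hypothetical name. */
long spawn_capture(char *const argv[], char *buf, size_t len)
{
    int fds[2];
    if (len == 0 || pipe(fds) < 0)
        return -1;

    fflush(stdout);                          /* keep our own buffered output out of the pipe */
    int saved_out = dup(STDOUT_FILENO);      /* move current stdout out of the way */
    if (saved_out < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    fcntl(fds[0], F_SETFD, FD_CLOEXEC);      /* child should not inherit these two */
    fcntl(saved_out, F_SETFD, FD_CLOEXEC);
    dup2(fds[1], STDOUT_FILENO);             /* move the write side to stdout */
    close(fds[1]);

    pid_t kid;
    int err = posix_spawnp(&kid, argv[0], NULL, NULL, argv, environ);

    dup2(saved_out, STDOUT_FILENO);          /* move the saved stdout back */
    close(saved_out);
    if (err != 0) {
        close(fds[0]);
        return -1;
    }

    size_t used = 0;
    ssize_t n;
    while (used < len - 1 &&
           (n = read(fds[0], buf + used, len - 1 - used)) > 0)
        used += (size_t)n;
    buf[used] = '\0';
    close(fds[0]);
    waitpid(kid, NULL, 0);
    return (long)used;
}
```

Note how the parent has to disturb and then repair its own stdout around the spawn call, which is exactly the bookkeeping the fork+exec version pushes into the child.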
There is really very little to draw. Whoever calls fork() is a
parent. The clone copy is a child.
Good Parts Version (TM)
by
PeterClark
·
· Score: 0, Redundant
It's way too late, and I should be in bed, but I thought I would c/p the relevant parts of the article for those at work tomorrow who find the site /.ed. Sorry that I didn't preserve the HTML formatting. The rest is amusing, too, so you might want to come back to it once the servers have recovered. (If you're the type of person who gets excited over perl modules, that is. The rest of you might just want to watch CSPAN. :)
:Peter
---
Moving swiftly on, we come to Date::Tolkien::Shire, a king amongst date modules. Most newspapers carry an ``on this day in history'' column -- where you find, for instance, that you were born on the same day as the man who invented chili-paste -- but no broadsheet will tell you what happened to Frodo and his valiant companions as they fought to free Middle Earth from the scourge of the Dark Lord. The undeceptively simple:
use Date::Tolkien::Shire;
print Date::Tolkien::Shire->new(time)->on_date, "\n";
outputs (well, output a few days ago):
Highday Winterfilth 30 7465
The four Hobbits arrive at the Brandywine Bridge in the dark, 1419.
What better task could there be for crontab but to run this in the wee hours and update /etc/motd for our later enjoyment? Implementing this is, as ever, left as an exercise for the interested reader.
There is a more useful side to Date::Tolkien::Shire or, at the very least, it does light the way for other modules. As well as the on_date() method it provides an overloaded interface to the dates it returns. This allows you to compare dates and times as if they were normal numbers, so that:
$date1 = Date::Tolkien::Shire->new(time);
$date2 = Date::Tolkien::Shire->new(time - 1e6);
print 'time is '.( $date1 > $date2 ? 'later':'earlier' ).
" than time -1e6\n";
prints "time is later than time -1e6". The more prosaic Date::Simple module provides a similar interface for real dates and ensures they stringify with ISO formatting.