Information for Managers - Understanding pthreads?
dnotj asks: "The boss (who is very technically astute) says: NO to using pthreads in any of our production applications. He wants us to do things the old-fashioned way (fork(), exec(), shared memory, etc). His reason for this is that he doesn't understand pthreads (by his own admission). Hence, he is limiting us to using methods and techniques that he understands. He is reasonable and would see our side (the developers) if presented with enough understanding in a satisfactory format. What I'm looking for are documents that are technically detailed yet directed more towards management. Not something on the level of 'pthreads for Dummies', but more along the lines of 'pthreads for Managers'. Any suggestions? URLs or Books are fine."
You share just what you need to share, and keep everything else private. With threads all memory is common. Didn't they teach you to use protection, or something?
For you managers out there, it's good to listen to your developers and let them show you things. Sometimes when they wish to use a technology or method, it's for VERY good reason.
There are a lot of managers out there who distance themselves... so be happy they are learning the technology vs. getting a two-minute spiel on why they should use something.
-
ping -f 255.255.255.255 # if only
By all accounts, if your perception is that:
- boss is technically astute
- boss hates pthreads cuz he doesn't understand 'em
Then there's an inherent contradiction. You need to learn how to talk to your boss more, listen more, and, after listening, patiently explain with about 2 viewgraphs of bulleted items the key features of threads and processes, with an even-handed listing of their respective pros and cons.
Then let him make a decision. Tell him your opinion is that pthreads is a better choice for this project, but you'll go with whatever he decides. In turn, express your appreciation for him standing behind whatever decision is made and supporting you whichever way it goes.
It wouldn't hurt your case if you explained that you've programmed in pthreads before, are familiar with the pitfalls and have encountered them previously, and think they are outweighed by the advantages. Tell him that only if it's true, though :)
"Provided by the management for your protection."
Ah, but the cost of fork is higher in the process creation.... process :) What's worse is the copying of memory whenever a variable changes in a fork'd process (copy-on-write). With threads, you know ahead of time what will be changing and can compensate for it.
-
ping -f 255.255.255.255 # if only
I'm thinking that if you can't give your manager a brief synopsis, then maybe he is right and you shouldn't be using pthreads...
Well I'm not an expert on anything but I studied them in my third year of college...
Pthreads stands for POSIX threads.
Threads are different "paths of execution" through a program that can be run in parallel. They have both advantages and disadvantages; without getting into a lot of technical details, they are generally regarded as "lighter" in terms of resource consumption and easier to code for.
The general idea is that when launching a new thread you launch a function in your program, so you can launch the music function in a game and then the graphics engine function and the keyboard reading function, etc, having them run simultaneously.
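As a rough illustration of "launching a function" as a thread, here is a minimal sketch. It uses Python's threading module as a stand-in for the pthreads C API (start() and join() play roughly the roles of pthread_create() and pthread_join()). The function names play_music, draw_graphics and read_keyboard are hypothetical placeholders for the game example, not anything from a real program.

```python
import threading

results = []                      # shared list: every thread sees the same object
results_lock = threading.Lock()   # protects appends to the shared list

def play_music():
    # Hypothetical stand-in for a game's music loop.
    with results_lock:
        results.append("music")

def draw_graphics():
    # Hypothetical stand-in for the graphics engine.
    with results_lock:
        results.append("graphics")

def read_keyboard():
    # Hypothetical stand-in for keyboard polling.
    with results_lock:
        results.append("keyboard")

def run_game_threads():
    """Start one thread per function and wait for all of them to finish."""
    results.clear()
    threads = [threading.Thread(target=func)
               for func in (play_music, draw_graphics, read_keyboard)]
    for t in threads:
        t.start()   # roughly the role of pthread_create() in C
    for t in threads:
        t.join()    # roughly the role of pthread_join() in C
    # The threads may finish in any order, so sort before returning.
    return sorted(results)

if __name__ == "__main__":
    print(run_game_threads())
```

Each function runs on its own path of execution, and all three write into the same list, which is why even this toy needs a lock.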
Could be your manager is making the right call? Obviously, the roles in your group aren't clear to me. In my team, I'm the lead programmer, and I'm pretty conservative about adopting tech.
I don't just need to understand a technology, I need a deep understanding of its strengths and weaknesses before I can start including it in our programs in intelligent ways. In practice, it means that I experiment with new things before I'll let the team use them.
I applaud your boss for his decision. There are many disadvantages of using threads, that I'm more than sure your boss knows about. See e.g. this discussion for more detail.
Now, granted, some applications will also benefit enough from threads to outweigh those disadvantages. It is rare, but it happens.
"What I'm looking for are documents that are technically detailed yet directed more towards management. Not something on the level of 'pthreads for Dummies', but more along the lines of 'pthreads for Managers'. Any suggestions? URLs or Books are fine."
If you have a boss that already knows enough technical stuff to micromanage your project in this way, I'm sure he is capable of googling up a pthread introduction himself. What you should be focusing on is explaining exactly why your project needs threads, and why fork()/exec() won't cut it. This may also result in you accepting his decision, but that is as it should be.
Threads are considered "lightweight" processes, in that they run within a process concurrently with each other and with the process that owns them. They share address space, making implementation of the alternative easier. Going with your 1-4 above:
(1) Depends on what platform and how the threads are implemented. In some cases, you do see multiple entries in ps for a process running threads. Typically you see this in network daemons, but often that's because they use fork().
(2) They've been around a very long time. Dijkstra (the guy who developed the shortest path algorithm for graphs) also developed a lot of stuff for threading in the 1960s. Though, fork() and exec() have probably been around longer.
(3) Correct. Threads share memory. The trick is to make sure you eliminate race conditions so that one thread doesn't blow away what another thread is working on while it's working on it. There are two major ways of doing this: semaphores and monitors. pthreads uses mutexes (essentially binary semaphores) plus condition variables because they're a little easier to work with in C, whereas monitors generally need support from the programming language (e.g., Java implements monitors, but you can add your own semaphores to Java code if you wish).
(4) POSIX has a threading specification. iirc, pthreads is the API it defines.
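To make point (3) concrete, here is a minimal sketch of guarding shared data with a lock. It uses Python's threading.Lock as a stand-in for a pthread_mutex_t; the function name count_with_lock and its parameters are made up for illustration.

```python
import threading

def count_with_lock(num_threads=4, increments=10000):
    """Have several threads increment one shared counter under a lock."""
    counter = 0
    lock = threading.Lock()   # stand-in for a pthread_mutex_t

    def worker():
        nonlocal counter
        for _ in range(increments):
            # Only one thread at a time may run the read-modify-write below,
            # so no update can overwrite another thread's update.
            with lock:
                counter += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter

if __name__ == "__main__":
    print(count_with_lock())
```

With the lock in place the result is always num_threads * increments. Without it, the increment is a read-modify-write that two threads can interleave, which is exactly the race condition described above.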
Uses: Within an OS Kernel, in GUIs, server apps, less overhead than using fork() et al. Useful in massively parallel computing applications where a lot of data gets shared. Like everything else in programming, it really depends on what's being developed.
The One Rule Of Chess You'll Ever Need: Don't play someone who carries a kit in their bookbag.
Managers aren't always wrong.
Laugh, but it seems to be a common attitude that a manager is never right, and never will be.
Your manager is not there to code, he's there to keep all of you developers on task. Apparently, for some reason he needs to understand your code, in order to do his job to his satisfaction.
Actually, he may not NEED to understand pthreads at all. But I believe his approach is pretty smart in this case. Why should he allow you to adopt a new method, when you cannot explain it to him?
I wouldn't let you use pthreads either, if my paycheck depended on you getting your job done.
...
May not be what you're looking for, but it gives a lot of insight into different ways of dealing with this (processes, threads, etc.) problem: The C10K problem, from Dan Kegel.
except the stack. That's pretty darn simple, and
probably all he needs to know.
However, he's probably right. Threads are horribly
abused in most applications. They create bloat
and spaghetti code opportunities that lead to
user dissatisfaction and unreasonable maintenance
costs.
Threads should be used if you need SMP scaling
for an I/O-intensive application, or shared memory
decomposition of a problem space dominates the
structure of the code. Otherwise, small
is beautiful, and asynchronous I/O is far more
efficient and elegant (which means maintainable).
-I like my women like I like my tea: green-
As someone who does know what pthreads are and uses them frequently in a professional setting, I would recommend the O'Reilly book mentioned above. It contains a decent introduction to threading in general, a healthy dose of usage scenarios with detailed explanations, and a programmers' reference. If I remember correctly, it was written by some of the folks at DEC who designed most of the pthreads standard and wrote the first implementation. However, although it does a nice job of explaining the difference between threads and forked processes, it does not attempt to make any bias-free recommendations as to when forked processes might be more appropriate for a task.
But my grandest creation, as history will tell,
Was Firefrorefiddle, the Fiend of the Fell.
...to your manager:
1) cost savings short term
2) cost savings long term
Anything else is just typical geek elitism. "Hey, let's use foobar technology because it's, well, COOL!"
If you are developing on Linux the new threads implementation shall kick ass anyway, so it should become sort of moot.
It's 10 PM. Do you know if you're un-American?
Processes get you a number of advantages above.
God that is stupid. OK, let me put it another way.
Computers work with machine code. High level languages are for people who cannot program in machine code.
OK, Alan Cox gets my hat off because he DOES program in machine code, and I have too, but the thing is, threads are easier to maintain than a state machine because usually the two things are separate, while in a state machine the bloody states have a complicated interaction.
Obviously the model where you fork 10000 threads off to anything is not viable. But for many apps threads make it easier to maintain the thing and keep separate things separate. A UI which has a separate thread for the UI and a thread for the program, for instance, is a VASTLY better model to program in than the stupid event-based crap that you see everywhere. See Java Swing vs. Java AWT for an example. And definitely see BeOS for an example.
Threads reduce latency because a single process with a state machine has to wait in blocking IO while a thread can go on. See Netscape, which hangs in all windows because a TCP connect blocks in one window, for a good and extremely irritating example. And Mozilla does the same. For that matter, IE too, but in IE you can switch it off!
In general, if things are logically different things that run on their own (the UI vs. the program that does something in the background) they belong in different threads. If you do not believe me, install BeOS one day on a 200 MHz machine and see how much snappier it is than Linux on your multi-GHz machine, because the UI things (in their own thread) respond immediately even when the background things that they control do not. State machines are simply more difficult to handle than separate threads, but because everyone uses them all our programs have godawful latency in the user interface.
Also, one important thing. ALL computer programs that have an active GUI have two processors in them. The computer AND the user's brain. The part that the user controls is the GUI. That belongs in another thread.
And one last thing. Thread spawning on Linux is very, very good. Mucho better than under most Unixes AFAIK and therefore the "too much overhead" is crap. On Linux at least. And most programs will get along just fine with 2-10 threads.
The dangers of excessive individualism are nothing compared to the oppressiveness of excessive collectivism
You know, there are a lot of practical disadvantages to threads.
Threads can be a lot harder to debug.
I don't think gdb really understands threads very well. Other application support is spotty as well. Actually, Solaris has pretty good thread-aware debugging and has good threading tools like deadlock tools.
People who write threaded applications also seem to think it makes life easier. They say - I'll just have a read thread and a write thread and a processing thread.
...except that the read thread is usually written with blocking reads and it's hard to recover when you want to kick that thread out of the read to terminate or restart cleanly.
So people end up using select() anyway. And then they still need interprocess communication to kick the read thread out of the select sometimes.
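One common way to "kick the read thread out of the select" is to have it wait on a second descriptor that exists only to wake it up. The sketch below shows that idea under some assumptions: it is Unix-only, it uses Python's select.select() and os.pipe() in place of the C calls, and the names reader and run_reader_demo are invented for the example.

```python
import os
import select
import threading

def reader(data_fd, wake_fd, received):
    """Collect data from data_fd until a byte arrives on wake_fd (or EOF)."""
    while True:
        # Block until either descriptor has something to read.
        readable, _, _ = select.select([data_fd, wake_fd], [], [])
        if data_fd in readable:
            chunk = os.read(data_fd, 4096)
            if chunk:
                received.append(chunk)
                continue          # drain pending data before checking wake_fd
            return                # EOF on the data pipe
        if wake_fd in readable:
            return                # asked to stop: leave the loop cleanly

def run_reader_demo():
    data_r, data_w = os.pipe()    # carries the "real" input
    wake_r, wake_w = os.pipe()    # exists only to interrupt the select()
    received = []
    t = threading.Thread(target=reader, args=(data_r, wake_r, received))
    t.start()
    os.write(data_w, b"hello")    # normal traffic
    os.write(wake_w, b"x")        # kick the reader out of select()
    t.join(timeout=5)
    still_running = t.is_alive()
    for fd in (data_r, data_w, wake_r, wake_w):
        os.close(fd)
    return b"".join(received), still_running

if __name__ == "__main__":
    print(run_reader_demo())
```

The point of the sketch is the complaint above: even with a dedicated read thread, a clean shutdown still needs select() plus an extra channel just for signalling.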
I think you have to be a better programmer to use threads - you have to be aware of the normal issues plus all the threading issues. Avoid cancelling threads. You have to be very clear how you do signal handling. You have to lock all your data. Don't use fork+exec AND threads. You have to keep track of multiple threads, especially during error conditions.
And you have to be a much better debugger to debug someone else's threaded application.
Not that there aren't good applications for threads, just be aware of the complications.
Linux is for people who hate Windows, BSD is for people who love UNIX
Just a short rant on my experience and conclusions regarding threads. Me, having been formed in the "new age" school of programming (trained on the Commodore Amiga), always used threads. Used to design all my software with threads in mind. But not anymore. Since my jump to Unix, some five years ago, I've experienced a rather deep change of perception on this issue.
IMO, threaded code is much, much harder to implement correctly, and to debug, than single-threaded code. I used to blame the crappy thread support of many Unix libraries and tools. But now that gdb supports threads, along with pretty much everything else (except gprof, but the workaround is simple), I've found that it still is a PIA to debug all but the simplest threaded apps.
After years of hard work trying to determine what library calls are not reentrant, how asynchronous signals are delivered on each platform you support (God! Never, ever, ever mix threads and signals when writing portable code, not if you can avoid it), and which objects should be synchronized to get the best compromise between sync overhead and avoiding trashing the heap, you just fall in love with a design in which no locking is necessary at all. No ambiguities in library calls, no undocumented whatever_r() calls, no weird signal handling. A design in which a single core dump can tell you EXACTLY what your program was doing and why it crashed. Really, the simplicity and elegance of Unix programming without threads is just beautiful. The best adjective I can come up with is "liberating".
Of course, you do have to switch to a state-machine kind of mentality when designing your applications. It is not easy (was not easy for me, at least). But it can be done very cleanly and elegantly (for an example, look for the "State" pattern, the one in which you derive classes for representing your state machine states). If portability is not an issue, it can also be done very efficiently, if you use IO signals instead of select().
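For readers who have not met the "State" pattern, here is a minimal sketch of the idea: each state is a class, and handling an event returns the next state. The states (Idle, Connecting, Transferring) and the event names are hypothetical, chosen only to resemble a small network client; they are not taken from any real program.

```python
class State:
    """Base class: each subclass represents one state of the machine."""
    def on_event(self, event):
        raise NotImplementedError

class Idle(State):
    def on_event(self, event):
        # Only a "connect" request moves us out of Idle.
        return Connecting() if event == "connect" else self

class Connecting(State):
    def on_event(self, event):
        if event == "connected":
            return Transferring()
        if event == "error":
            return Idle()
        return self

class Transferring(State):
    def on_event(self, event):
        # Finishing or failing both return the machine to Idle.
        if event in ("done", "error"):
            return Idle()
        return self

def run_machine(events):
    """Feed events to the machine and record the state after each one."""
    state = Idle()
    trace = [type(state).__name__]
    for event in events:
        state = state.on_event(event)
        trace.append(type(state).__name__)
    return trace

if __name__ == "__main__":
    print(run_machine(["connect", "connected", "done"]))
```

In a single-threaded design, the event loop (select() or IO signals) would call on_event() whenever a descriptor becomes ready, so all the program's state lives in one place and no locking is needed.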
Now, having said all that, I'll now point out that I still use threads, because my code usually has to run on WIN32 too, and programming without threads there is hell (not that programming with them is much better). But I avoid threads as much as I can. I still worry about writing reentrant functions, because I find that to be a very good practice, even if no threads are involved. But threads are used only when no other portable solution cuts it (like when checking asynchronously for activity on non-socket descriptors, on WIN32).
Not trying to flamebait here. But if your boss is really technically astute, he should be able to do the research, assess the pros and cons, and make a decision.
He should be able to pick up a book, google for some benchmarks, and talk to some people either face-to-face or on Usenet and then make a decision. A manager should be able to make this decision based on the facts over the weekend.
A very good manager would be able to make the decision in an afternoon. The real difference between good management and bad management is the rate at which you can make well-informed decisions.
"I understand your lack of support for pthreads comes from an admitted lack of understanding of them. But having used them, I can tell you the only differences are A, B, C. If you're sure it would be done better without pthreads, then ok, I'LL code up a testcase, and we'll see how readable the code is, and how well it runs."
Keep in mind, because someone has the title 'Manager' doesn't mean they don't know a damn thing (I'm a 'Manager', but I'm also the only IT person. For me, it includes everything from Network Admin, to PBX Admin, to Programmer, to Invoice authorizer). Rather, it means they can explain themselves without coming across as a total ass.
Keep that in mind. (I always have a problem explaining things in an easy-to-understand way without making my opponent look like a moron.)
"I can't give you a brain, so I'll give you a diploma" - The Great Oz (blatantly stolen sig)
Sun Tzu:
"Do not oppose those with their backs to the wall."
"Leave an escape route for a surrounded army."
:)
Maybe the boss really knows that pthreads aren't the way to solve the particular problems they have. I'm not that technically astute so I wouldn't use threads for a production app. But from the little I know, if I'm the boss I'll require someone _very_very_ good at pthreads to implement a production app.
So what if this guy doesn't seem good enough to the boss?
So he gives him a way to prove he is good enough (explain why pthreads is better to the boss), AND also a way to save face if he isn't.
Better than risking:
a) Misjudging this guy's capabilities and thus creating resentment and making him less useful.
OR
b) Correctly judging his current capabilities but in the process obliterating this guy's ego/confidence and thus potentially making him less useful to you.
The tradeoff is a little loss of respect on technical matters from claiming not to understand pthreads (do recall that the subordinate still does believe that the boss is very technically astute).
There might be a better way to handle such a scenario. But I'm going to sleep.
You say Netscape/etc hangs in all windows when a TCP connect blocks in one window. That only happens if you are using Netscape with threads. Not if Netscape is in a separate process.
Your point that 2.5 different threaded browsers have that "one window blocks others" problem seems to be more an example against using threads to me.
I think in most situations on Linux/FreeBSD it is best not to use threads. Use fork - default unshared memory, and then just explicitly share memory if necessary. Safer and easier to debug than "share everything".
And if one browser instance dies, it doesn't take the other 30 with it.
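A minimal sketch of "unshared by default, share explicitly" with fork, assuming a Unix system. It uses Python's os.fork() and an anonymous mmap (shared between parent and child by default on Unix) in place of the C calls; the function name fork_sharing_demo and the value 42 are made up for illustration.

```python
import mmap
import os
import struct

def fork_sharing_demo():
    """Show that a child's writes reach the parent only through shared memory."""
    private_value = [0]         # ordinary memory: the child gets its own copy
    shared = mmap.mmap(-1, 8)   # anonymous mapping, explicitly shared across fork

    pid = os.fork()
    if pid == 0:
        # Child process: change both the private copy and the shared mapping.
        private_value[0] = 42
        shared[:8] = struct.pack("q", 42)
        os._exit(0)             # exit immediately without running parent cleanup

    # Parent process: wait for the child, then look at what changed.
    os.waitpid(pid, 0)
    shared_value = struct.unpack("q", shared[:8])[0]
    shared.close()
    return private_value[0], shared_value

if __name__ == "__main__":
    print(fork_sharing_demo())
```

The parent sees the child's write only in the region it chose to share; the ordinary variable is untouched. That is the "share just what you need" property, and it is also why a crashing child cannot corrupt the rest of the parent's memory.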