Slashdot Mirror


Good Books On Programming With Threads?

uneek writes "I have been programming for several years now in a variety of languages, most recently C#, Java, and Python. I have never had to use threads for a solution in the past. Recently I have been incorporating them more in my solutions for clients. I understand the theory behind them. However I am looking for a good book on programming threads from an applied point of view. I am looking for one or more texts that provide thorough coverage and provide meaningful exercises. Anyone have any ideas?"

30 of 176 comments

  1. real world haskell by j1m+5n0w · · Score: 2, Informative

    Probably not what you're looking for, but Real World Haskell is soon to be released and has chapters on concurrent and multicore programming and software transactional memory. Even if you're not interested in Haskell per se, STM is kind of an interesting idea.

  2. Language/Environment specific by MikeRT · · Score: 3, Informative

    Pthreads, Java threads and .NET threads are implemented differently. If you need a good Java book, just pick up one of the "Core Java" books that covers threading in one of its chapters since Java threads aren't that complicated. That said, with Java applications (the platform I know pretty well), if you're doing "enterprise" development it's best to avoid using them and let the application server do its black magic for you.

    1. Re:Language/Environment specific by Cyberax · · Score: 2, Informative

      On the contrary, Pthreads, Java threads and .NET threads are mostly the same thing in different packages.

      There are _really_ different ways to implement multithreading: fork-join model, pi-calculus, STM, message-passing model, etc.

    2. Re:Language/Environment specific by TheRaven64 · · Score: 3, Informative

      There are _really_ different ways to implement multithreading: fork-join model, pi-calculus, STM, message-passing model, etc.

      No, there are different ways of implementing concurrency. Threading, in particular, means shared-memory concurrency with a private control stack. Pi-calculus, STM, Linda and CSP are all examples of other models for concurrency, not of multithreading. They differ in many respects (although pi-calculus and CSP have a lot in common), but share one feature - they are all easier to reason about (and therefore to debug) than multithreading. The only valid use for multithreading is to provide an efficient implementation of one of the other models.

      --
      I am TheRaven on Soylent News
  3. Free eBook on Threading in C# by Deffexor · · Score: 4, Informative

    I'm still getting the hang of threading in C# myself, but I found this eBook immensely helpful for understanding some of the difficult issues such as thread safety, cross-threading issues, race conditions, and event-delegate pairs.

    http://www.albahari.com/threading/

  4. Concurrent Programming in Java by progressnerd · · Score: 5, Informative

    Concurrent Programming in Java is more or less *the* book on good practices for multi-threaded programming for Java, with many lessons that apply to other languages as well.

    1. Re:Concurrent Programming in Java by K.B.Zod · · Score: 5, Informative

      I recommend Java Concurrency in Practice as well. It's an updated, in-depth look at Java threads. Doug Lea, author of Concurrent Programming in Java, is a co-author of the newer book. A great read.

    2. Re:Concurrent Programming in Java by Anonymous Coward · · Score: 1, Informative

      The problem is that this book is now dated, as it was written before the Java concurrency library was made part of the basic Java environment. Granted, most of the code for the library came from this book, but you have to keep that in mind when reading it.

    3. Re:Concurrent Programming in Java by Anonymous Coward · · Score: 2, Informative

      Don't forget Brian Goetz's "Java Concurrency in Practice", which covers the changes they made to the JVM memory model in Java 5. Also check out the Java Theory and Practice section on IBM's developerWorks site.

  5. Re:PThreads & Java Threads by Anonymous Coward · · Score: 2, Informative

    The Addison-Wesley book mentioned by the parent is "Programming with POSIX Threads" by David R. Butenhof. It's what I used when I needed to get up to speed on p-threads in a hurry: clear and easy to follow. P-threads are what's in Darwin (and so BSD), Linux, and, I'm guessing based on POSIX compliance, just about every commercial flavor of UNIX. (Presumably, OpenServer uses fraying threads.)

  6. Re:PThreads & Java Threads by Anonymous Coward · · Score: 3, Informative

    Most important rule of thumb of multi-threaded programming is to avoid it if possible. Maybe hardware (multi-core) will change that, maybe you feel the scheduler can't do its job as well as you can and maybe you feel it's more intuitive. But, often is the case, that you're just adding more complexity to your code resulting in more difficult bugs and harder maintenance for others. Keep it simple.

    Man, I have to disagree with you. That kind of dinosaur thinking will hold back progress. Multi-core is the future and multi-threaded apps are exactly what's needed to fully utilize its potential. I'm sorry if it's too hard for you to debug, but that's just the way the cookie crumbles.

  7. Re:PThreads & Java Threads by ByOhTek · · Score: 2, Informative

    For the morbidly curious, there's even a pthreads library for Windows, licensed under the LGPL.

    http://sourceware.org/pthreads-win32/

    --
    Self proclaimed typo king, and inventor of the bear destroying coffee table (patent not pending).
  8. not covered in books on threads by bugi · · Score: 3, Informative

    The thread model has some fundamental problems, but since threads seem to be here to stay, there are some things you should keep in mind, nicely summarized in this article (PDF).

    The article is also available in HTML if you click on the first computer.org link from Google. Hmm, why does it work from Google and not from Slashdot?

    1. Re:not covered in books on threads by TheRaven64 · · Score: 2, Informative

      Threads are a very good tool for building tools for building concurrent applications. They are not, themselves, a good tool for building concurrent applications and should not be treated as such. If you are building an application, stay away from using threads directly, and instead use a high-level concurrency API. If you are building a concurrency API, then by all means use threads (I have done, and so did the Erlang guys), but you probably don't want to be doing this just after reading a book on threads. In short, if you are the kind of person who needs a book on threads to understand them, you probably are not the kind of person who can safely use threads. You would be better off picking up a book on operating systems or distributed systems theory and reading the chapters on concurrency. This will give you a deeper understanding of the problems of concurrency and give you a much, much better overview of the tools which can be used to solve it (threads, message passing, transactional memory, and so on), and how they are implemented.

      --
      I am TheRaven on Soylent News
  9. Multithreading Applications in Win32 by GogglesPisano · · Score: 2, Informative

    Here's one I found useful: Multithreading Applications in Win32 by Jim Beveridge and Robert Wiener. It's a little dated (no coverage of .NET, for example - it's more focused on C/C++), but it still provides a good introduction to threading and synchronization on Windows.

    If you can find an inexpensive used copy, it's worth a read.

  10. oldie but a goodie by fred+fleenblat · · Score: 2, Informative

    Some background in parallelism is helpful for mastering threads.
    I learned from this book:

    http://www.lindaspaces.com/book/

    C-linda never caught on, but it's not hard to read the examples and apply them to pthreads, java, MPI or whatever framework you're using.

  11. I haven't found a decent book, but... by EWIPlayer · · Score: 2, Informative

    Herb Sutter has been doing a lot of work on this stuff over the last 10 years and his blog is full of stuff on what you should do... it's not too nitty gritty in terms of languages and stuff, but it's very informative in terms of understanding the issues and what not. Check out http://herbsutter.wordpress.com/.

    Some rules of thumb that I've found useful:

    • Hide mutexes and locks at (nearly) all costs. If you have a queue class, for example, that has a locking push() function, and someone needs to lock for a series of pushes, don't expose the lock to let them lock things for the series of pushes, but provide a push function that takes a list of items instead. Keep thinking of ways to hide your locking strategies. If your class is deadlock-free then you can be reasonably sure (I've always said "reasonably" but I've never seen it not work either) that you'll never see a deadlock in real life either. Race conditions are a different story, however.
    • Trying to figure out a solution where you never have to think about the concurrency of things is a scary place to go... Have a logical concurrent model instead. For example, if you work with users and users get events, rather than just letting them process any number of events in parallel, it may be reasonable to sequence events per user and let the users run in parallel.
    • If you do have to expose locks, use a locking hierarchy. Herb shows this here: http://herbsutter.wordpress.com/2007/12/11/effective-concurrency-use-lock-hierarchies-to-avoid-deadlock/
    • Avoid any concept of being impolite among your threads (i.e. forced interrupts or kills). Be polite. Herb has this here: http://herbsutter.wordpress.com/2008/04/10/effective-concurrency-interrupt-politely/
    • Locking sucks, but it's necessary. If you think you can get away without having to lock in a dubious situation, you're probably wrong.
    • Unit test, unit test, unit test. If your classes hide all of your locks, then unit tests cover a ton of cases.

    I believe that following strict OO guidelines is even more important when dealing with concurrency than when dealing with general ideas in software... and let's face it, it's extremely important even when not dealing with concurrency :)
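    The first rule of thumb above (hide the lock, offer a batch push instead) might look roughly like the sketch below. The class name BatchQueue and its methods are made up for illustration; they are not from any library or from Herb Sutter's articles.

```python
import threading
from collections import deque


class BatchQueue:
    """Hypothetical queue that never exposes its lock to callers."""

    def __init__(self):
        self._lock = threading.Lock()   # private: callers cannot hold it
        self._items = deque()

    def push(self, item):
        with self._lock:
            self._items.append(item)

    def push_many(self, items):
        # Materialize the batch before locking so the lock is held briefly.
        batch = list(items)
        with self._lock:
            # The whole batch lands contiguously; no other push can interleave.
            self._items.extend(batch)

    def pop_all(self):
        with self._lock:
            drained = list(self._items)
            self._items.clear()
        return drained
```

    A caller that needs a series of pushes to stay together calls push_many() once instead of asking for the lock, so all locking decisions stay inside the class.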

    --
    This sig used to be really funny...
    1. Re:I haven't found a decent book, but... by TheRaven64 · · Score: 2, Informative

      Locking sucks, but it's necessary. If you think you can get away without having to lock in a dubious situation, you're probably wrong.

      There are lots of good, reusable, lockless data structures around if you know where to look. Keir Fraser's PhD thesis contains a really nice lockless ring buffer design (which he implemented for Xen) and several other useful things (including a transactional list and some other shiny stuff). If you have implementations of these in a library somewhere, then you can often get away without locks. There is one rule you should always obey when writing parallel code though:

      No data may be aliased and mutable.

      As long as you remember that, then it's easy to write concurrent code. In Erlang, for example, this is enforced for you, since the only mutable data structure is the process dictionary, which is not ever shared. This rule actually applies in a lot of serial code too, but in parallel code failure to apply it is the cause of a great many bugs.

      --
      I am TheRaven on Soylent News
  12. Re:Python by Vornzog · · Score: 2, Informative

    Howsabout books or sites on Python threaded programming? I'm going to be working on a project in a short while which will require the use of GTK and twisted together in a sort of network scanner system with asynchronous results.

    As much as I love Python, it does have some weak points, and threading is one of them. From the python documentation:

    The Python interpreter is not fully thread safe. In order to support multi-threaded Python programs, there's a global lock that must be held by the current thread before it can safely access Python objects.

    Threading is there, and I'm sure some decent documentation exists somewhere. But the GIL (global interpreter lock) generally means that there are better ways to approach the problem in python, i.e. processes instead of threads.

    It's a point of contention in the community, and the GVR-BDFL point of view is that any attempt to remove it makes Python a lot slower, so he won't.

    While I don't use twisted, I am given to understand that it does most of its asynchronous stuff using callbacks - you may be able to leave most of the concurrency to it and avoid the process altogether...

    --

    -V-

    Who can decide a priori? Nobody.
    -Sartre

  13. Re:PThreads & Java Threads by discord5 · · Score: 3, Informative

    Multi-core is the future and multi-threaded apps are exactly what's needed to fully utilize its potential.

    For each application you name that is benefited by threading, someone else will be able to name one that isn't. Some processes simply are not parallelizable in a meaningful way, where meaningful is defined as in speed of execution not as in the interactive extravaganza of "looky how I can clicky the button while it's still doing hard maths".

    There's a good bit of reading about the subject, although much of it is boring and is often difficult to apply to real-world situations. Amdahl's law in many situations can predict if it's worth bothering with multithreading (or other forms of parallelizing) quite easily.
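    For anyone who wants to plug in numbers, Amdahl's law is short enough to write down directly. This is only a sketch: the function name and parameters are mine, and the formula ignores synchronization and I/O overhead entirely.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Upper bound on speedup when only part of the work can run in parallel.

    parallel_fraction: share of the single-threaded run time that parallelizes.
    workers: number of cores/threads working on the parallel part.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


if __name__ == "__main__":
    for fraction in (0.5, 0.9, 0.99):
        for workers in (2, 8):
            speedup = amdahl_speedup(fraction, workers)
            print(f"p={fraction:.2f} n={workers}: at most {speedup:.2f}x")
```

    If only half of the run time can be parallelized, two cores give at most about 1.33x, and no number of cores gets you past 2x.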

    A tool like cat or grep has no benefit of being threaded since it's a simple sequential task. Suppose you were to multithread "cat" into one thread that reads from disk, and another that displays a line of text on the screen. Thread 1 will spend most of its time waiting for I/O, and thread 2 will spend most of its time waiting for thread 1 to pass data. Except now, your multithreaded cat has a somewhat complicated synchronization mechanism on top of it that makes it a bit harder to debug and probably eats some extra cycles as well.

    While the previous example is overly simple, there are plenty of tasks that are a lot more complicated but simply have no benefit of being threaded, because they spend more time waiting for I/O than actually calculating or because the algorithm is simply not worth parallelizing because there is no benefit in speed.

    Another example would be an application divided in 3 steps. Step A and B can be executed at the same time independently of each other, while step C depends on step A and B. Both step A and B can be written to use two threads, and if they'd use two threads they'd run in half the time of their non-threaded equivalent. On a dual core machine (or 2 CPU machine) running step A multi-threaded and then step B multi-threaded takes 1 hour. In the other case, running step A and at the same time (on the other core/CPU) running step B single threaded also takes 1 hour. At this point you gain nothing by threading. Of course here I assume that I/O by both processes at the same time doesn't create some sort of delay. But if you're working with large enough data sets (more than you can keep in memory) this becomes less and less of an issue since the I/O overhead will already be there anyway.

    If you add to that the fact that threading (especially synchronization) is a subject that is not well understood by everyone (in the "find me out of 200 programmers fresh from school, 10 who can write a program that benefits from multi-threading and actually works" sense), threading suddenly becomes less appealing if there aren't any clear benefits for the application you're working on.

    The reason I mention that last part is that so many schools give kids the "make two threads count to 100 then exit" exercise but fail completely at getting across the fact that most of the time the threads actually need to synchronize with each other. They'll give this long lecture about the dining philosophers problem without actually SHOWING them what that means.
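    A slightly less useless version of the counting exercise has the threads share one counter, so they are forced to synchronize. Something like this sketch (all names are made up for illustration):

```python
import threading


class Counter:
    """Shared counter; the lock makes each read-modify-write step atomic."""

    def __init__(self):
        self._lock = threading.Lock()
        self._value = 0

    def increment(self):
        with self._lock:
            self._value += 1

    @property
    def value(self):
        with self._lock:
            return self._value


def count_in_parallel(thread_count, increments_per_thread):
    counter = Counter()

    def work():
        for _ in range(increments_per_thread):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(thread_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()   # wait for every worker before reading the total
    return counter.value
```

    Removing the lock from increment() turns this into the classic lost-update race; whether it actually shows up on a given run depends on the interpreter and the scheduler, which is exactly why such bugs are hard to find.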

    In conclusion: it depends on a lot of factors (size of your dataset, how well your algorithm can be split up in parallel tasks, ...) if your process benefits from threading or not, and you should evaluate at design time using Amdahl's law if there's an advantage or not. If your results in a multithreaded environment are only marginally better, the economical factor of cost of development time suddenly weighs in very heavily.

    Having said that: if you're a programmer, have fun with threads at least once. Write something silly in your spare time, it can be an amazing amount of fun and often offers an interesting way of approaching future problems.

  14. Re:PThreads & Java Threads by greenbird · · Score: 2, Informative

    Erm, the tenets of programming usually involve the general concept of "Eliminate the unnecessary." Therefore, the parent is correct: if multi-threaded processing is unnecessary, avoid it.

    Although unnecessary, threading usually simplifies a program rather than adding complexity. The only caveat is that you understand threading. In my experience I've used threading to greatly reduce the size and complexity of solutions that either were or could have been implemented without them.

    --
    Who is John Galt?
  15. Use Erlang by toby · · Score: 1, Informative
    --
    you had me at #!
  16. Re:Python by Anonymous Coward · · Score: 2, Informative

    There is a new multiprocessing module in Python 2.6 (and eventually 3.) that does allow for true multiprocessing. It uses an interface similar to, but not exactly like, threads, but it actually forks off new processes and uses pipes to communicate in the background. It is possible to use shared memory between processes to give you all the benefits (and pitfalls) of threads.
    On systems where spawning processes is cheap (like Linux) I think it could be pretty useful. While certainly not as low-overhead as pthreads, it should finally allow Python to utilize multicore CPUs if you code your program the right way.

  17. Re:PThreads & Java Threads by ELProphet · · Score: 2, Informative

    Most important rule of thumb of multi-threaded programming is to avoid it if possible. Maybe hardware (multi-core) will change that, maybe you feel the scheduler can't do its job as well as you can and maybe you feel it's more intuitive. But, often is the case, that you're just adding more complexity to your code resulting in more difficult bugs and harder maintenance for others. Keep it simple.

    I'm going to have to disagree with you on this one. Especially in Java client-side rich GUI apps, background threads are one of the most useful components to ensure a responsive interface when dealing with asynchronous requests. They really only need two and a half pieces to implement them easily and efficiently. The first component is the request itself, either an implementation of java.lang.Runnable or a subclass of javax.swing.SwingWorker. The second is a callback handler. The half piece is the shared data structure, and it's only a half piece because you'll want to use the synchronized collections wrapper to get a (you guessed it) synchronized collection.

    Brushing up on those pieces will give you the background you need to not block the UI whenever something needs to happen. Threads aren't hard, they just take a little thought.
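    The same request / callback / shared-structure split can be sketched outside Swing. The version below uses Python's concurrent.futures rather than Runnable or SwingWorker, and slow_lookup, submit_with_callback and fetch_all are hypothetical names. Note that the callback runs on a worker thread, so a real GUI toolkit would still need to hand the result back to its UI thread.

```python
import queue
import time
from concurrent.futures import ThreadPoolExecutor


def slow_lookup(key):
    """The request: a stand-in for a slow network or disk call."""
    time.sleep(0.05)
    return key.upper()


def submit_with_callback(executor, request, argument, on_done):
    """Run request(argument) in the background, then pass the result to on_done."""
    future = executor.submit(request, argument)
    # The callback fires on the worker thread once the result is ready.
    future.add_done_callback(lambda finished: on_done(finished.result()))
    return future


def fetch_all(keys):
    keys = list(keys)
    results = queue.Queue()   # the shared structure; Queue does its own locking
    with ThreadPoolExecutor(max_workers=4) as executor:
        for key in keys:
            submit_with_callback(executor, slow_lookup, key, results.put)
    # Leaving the with-block waits for the workers, so every result is queued.
    return sorted(results.get() for _ in keys)
```

    The caller never blocks while a lookup is in flight; it only drains the queue when it is ready to.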

  18. mods on crack by Anonymous Coward · · Score: 1, Informative

    Look, moderators, that's offtopic. Not flamebait, offtopic.

    Flamebait: being an asshole/provoking others into being an asshole. eg "Linux threads suck ass."

    Troll: posting shit in order to get a response by people who think you're serious. eg - "I've been a lunix users for years. I wish it had thread support. That's why I'm switching to vista. "

    Offtopic: anything unrelated to threading.

  19. Threads != bad. Badly programmed threads = bad. by Anonymous Coward · · Score: 1, Informative

    Those who just say "don't use threads" are just saying "I don't know how to use threads or I haven't been careful when using them, so stay away from them." The same could be said about certain computer languages.

    Specifically, you need to be VERY careful of what you do with threads:

    * ALWAYS Lock shared resources.
    * Be careful of what order you put the locks into, otherwise you can stumble upon deadlocks.

    * NEVER, EVER, update the GUI from a secondary thread unless you really know what you're doing (hint: You don't know). GUI programming doesn't mix well with multithreaded programming.

    * Dealing with detached threads is a headache. Use joinable threads instead.

    * If you're developing a high-performance, low-latency app (i.e. multimedia), you may need to research on lock-free and wait-free programming (not for the faint hearted, tho). But if you're just starting with thread programming, I don't recommend it.

    * Also, start making simple threaded programs as exercises. Change the order of the locks to see how you can screw up.
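    As a starting point for the lock-ordering exercise, here is a minimal sketch. Account and transfer are hypothetical names, and the ordering rule (sort by name) assumes every account has a unique name.

```python
import threading


class Account:
    """Hypothetical account; 'name' doubles as its position in the lock order."""

    def __init__(self, name, balance):
        self.name = name          # assumed unique across all accounts
        self.balance = balance
        self.lock = threading.Lock()


def transfer(source, target, amount):
    if source is target:
        return  # nothing to do, and locking the same lock twice would hang
    # Always take the locks in the same global order, whatever the direction
    # of the transfer, so two opposite transfers cannot deadlock.
    first, second = sorted((source, target), key=lambda account: account.name)
    with first.lock:
        with second.lock:
            source.balance -= amount
            target.balance += amount
```

    To see how to screw up, replace the sorted() line with "first, second = source, target" and run two threads transferring in opposite directions: sooner or later each one holds the lock the other needs.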

    For research, before purchasing a book, search Wikipedia for the following fundamental topics:
    - mutex
    - critical section
    - condition variable
    - semaphore (these are the four fundamental synchronization primitives)
    - read/write lock
    - memory barrier
    - RAII (How to make your threads exception-safe)
    - dining philosophers problem
    - Priority inversion (a typical problem with multithreading)
    - Deadlocks
    - Concurrent programming

    Also Google for:
    - joinable threads
    - detached threads
    - thread execution barrier

    Good luck!

  20. Take a look at Erlang by Anonymous Coward · · Score: 1, Informative

    I don't necessarily mean the language, but the principles therein, they can be used in any language. The main point is that it avoids low-level locking (mutexes) and ensuing contention by never sharing memory between threads ("processes" in the Erlang VM) which would require it. Instead, data is passed as messages between threads, which work like event handlers then. In languages like C, you can also model this non-shared data by simply using read-only shared data, when you make sure that data isn't modified, you don't need to copy it between the threads.
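    To show the principle outside Erlang, here is a rough sketch using Python's standard queue module. The worker owns no shared state; it only receives messages and sends replies. The names worker, square_all and _STOP are made up for illustration.

```python
import queue
import threading

_STOP = object()   # sentinel message telling the worker to shut down


def worker(inbox, outbox):
    """Acts like an event handler: one message in, one reply out."""
    while True:
        message = inbox.get()
        if message is _STOP:
            break
        outbox.put(message * message)


def square_all(numbers):
    inbox, outbox = queue.Queue(), queue.Queue()
    thread = threading.Thread(target=worker, args=(inbox, outbox))
    thread.start()
    sent = 0
    for number in numbers:
        inbox.put(number)   # data is handed over, never shared and mutated
        sent += 1
    inbox.put(_STOP)
    thread.join()
    # A single worker with FIFO queues replies in the order requests were sent.
    return [outbox.get() for _ in range(sent)]
```

    The only locking here is inside the Queue implementation; the application code never touches a mutex.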

    Other than that, in event-based environments (typically window-based programs) you can often replace threads with a timer, which fires an event with the desired frequency. If all you need is to do something repeatedly and sleep in between, that works pretty well and completely without locking. If you need a worker thread to do intensive computations in the background, that is not the way though.

    cheers

    Uli
    (Ulrich Eckhardt)

  21. Taming Java Threads by jed_reynolds · · Score: 2, Informative

    I took some classes taught by Allen Holub. Very smart guy, and I certainly enjoyed his book.

    He provides good solid explanation on functional models for queue design and listener patterns. He also discusses some pitfalls of threads in Java.

    http://www.holub.com/training/java.threads.html

    http://www.amazon.com/Taming-Java-Threads-Allen-Holub/dp/1893115100

    --
    # for x in `find '.' -name "*.c" -print`; # do perl -pie "s/==/=/ig" $x; done
  22. Python doesn't have threads by Secret+Rabbit · · Score: 3, Informative

    That might seem wrong given that Python lists threading modules, but just look at Python's GIL to know what I mean. As in, no matter what you do, Python will still be running on one core. So, if you just want a performance boost because of a lot of I/O, then threads can get you there. Unfortunately, if you want to take advantage of a multi-core CPU with Python, Python's threads won't get you there. There has actually been a lot of discussion on this topic, but Guido just refuses to do it. The interpreter has no threads and the lib is not thread safe.
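    The I/O case is easy to try for yourself. In this sketch time.sleep() stands in for a blocking read (an assumption, but like blocking I/O in CPython it releases the GIL while waiting); the function names are made up.

```python
import threading
import time


def fake_io(seconds):
    """Stand-in for a blocking read; the GIL is released while waiting."""
    time.sleep(seconds)


def timed_threads(count, seconds):
    """Run 'count' waiting threads at once and return the wall-clock time taken."""
    threads = [threading.Thread(target=fake_io, args=(seconds,))
               for _ in range(count)]
    started = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.perf_counter() - started
```

    Four threads that each wait 0.2 seconds should finish in roughly 0.2 seconds total rather than 0.8, because the waits overlap. Swap the sleep for a pure-Python number-crunching loop and that overlap disappears under the GIL.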

    If you want to do multi-processing with Python, look at its subprocess module.

    Guido's blog post on the GIL:
    http://www.artima.com/weblogs/viewpost.jsp?thread=214235

    The FAQ entry on a (fallacious) reason why they won't remove it:
    http://www.python.org/doc/faq/library/#can-t-we-get-rid-of-the-global-interpreter-lock

    1. Re:Python doesn't have threads by Secret+Rabbit · · Score: 2, Informative

      Actually, the conclusion is not supported by the reasoning. For those that don't like clicking links, Guido's reason is that there exists a patch that removed the GIL and replaced it with fine-grained locks. This failed miserably. BUT, when one thinks about it, this implementation would certainly be doomed to fail for obvious reasons.

      When one implements fine-grained locks, every time something is accessed, it is locked, accessed, and released. Clearly, this will impact performance on even a single-threaded application. Clearly, this impact will be more and more significant with an increasing number of threads. So, the only thing that can be said by Guido's reasoning is that:

      That SPECIFIC IMPLEMENTATION failed to remove the GIL.

      Now, if one put everything that's currently global into the specific interpreter, there would be no reason for locks and thus the performance wouldn't suffer. Then each thread could run independently without including (many) locks. Lua has the ability to do this. So, don't tell me that this is impossible WHEN ANOTHER LANGUAGE HAS ALREADY DONE IT.