Multi-threaded Programming Makes You Crazy?

Does this article have any useful info? by dennism · 2006-05-02 02:22 · Score: 1, Informative

It looks like it's just a link to a sourceforge project... was there supposed to be something else?

--
dennis

Link to the home page would be nice by technoextreme · 2006-05-02 02:28 · Score: 4, Informative

I was wondering what was this program was about. Fortunately, here is there website. http://nptltracetool.sourceforge.net/

--
Ooo man the floppy drive is broken. No wait. The computer is just upside down.

Re:Multi-threaded Programmation Makes Me Crazy? by zhiwenchong · 2006-05-02 02:33 · Score: 3, Informative

The poster is a French-speaking person. Programmation is the French word for "programming".
Notice also: take a deep *breathe*.

But I agree... multi-threaded programming can drive people crazy. Message passing-based programming is less prone to nastiness than shared-state concurrency. (Languages like Erlang come to mind).
http://en.wikipedia.org/wiki/Concurrent_programmin g_language
http://www2.info.ucl.ac.be/people/PVR/bookcc.html

You can also do Erlang-style message passing in Python using Candygram
http://candygram.sourceforge.net/

Java no panacea -- must know what you're doing by Chuck+Messenger · 2006-05-02 02:51 · Score: 5, Informative

Java's "builtin thread saftey" is simply a poor hack. The idea is to give _every_ structure a mutex. Any access to the structure requires a mutex lock.

First off, that in itself will not prevent deadlock. Secondly, it's damned inefficient.

Look: there's just no way around it. If you want to do effective (i.e. low bug, high performance) multithreaded programming, you simply have to understand what you're doing. Ultimately, the tools of your trade will be mutexes, condition variables, semaphores, etc -- the O/S primitives. Don't rely on your programming language to "automatically" use these for you, blasting out mutexes machinegun-style. Instead, figure out the logic of your program. You probably need only a small number of mutexes.

A key to effective multithreaded programming is to adhere rigidly to certain programming practices. It must _NEVER_ be the case that 2 threads have write access to a given item at the same time. Duh. But you can use fancy programming tricks to, in effect, automatically add run-time assertions to your code which assure that this practice is being adhered to. In production mode, you remove these runtime assertions.

Another good practice is, if you really need to have multiple mutexes, to arrange them into a hierarchy. When a top-level mutex is locked, no other mutex can be locked. When a second-level mutex is locked, only top-level mutexes can be locked. Etc. This hierarchy can be verified at runtime, in debug mode. Adhering to this regime will go a long way to removing the possibility for deadlocks.

Bottom line: you really have to know what you're doing in order to write good multi-threaded code. You should take the time to really study that problem space. An excellent book I've found for this purpose is "Concurrent Programming in ML". (I know -- nobody uses ML. So what? Learn the language just for the purpose of understanding the book. Then, you can apply your knowledge to any domain you're working in).

Re:Java no panacea -- must know what you're doing by 0xABADC0DA · 2006-05-02 03:16 · Score: 4, Informative

No, Java does not give every structure a mutex (as if Java even had structures). It only creates a mutex if synchronization is used on an object, so if you never use locking there are no locks. And no, it is actually perfectly fine for 2 or more threads to have read or write access to the same variable as long as it's atomic read/write... you just have to know what that means for your program.

Batman was right that after using Java's threading this NTPL trace looks pretty lame. Not only is the threading and locking in Java braindead simple, but the JVM actually tells you what is wrong. For instance it detects deadlocks and gives you the complete call trace of each deadlocked thread.

Other languages have good locking too (ruby for instance), so it's more that everything is difficult and crappy in C and its kind. I guess if you are stuck writing a threaded application in C in the first place this tracing library could be useful. Of course if you use the heap you're going to also want to replace malloc/free with a fast multithreaded version and then do a bunch of hacks so that it isn't ridiculously slow (locking on every free()... now *that's* inefficient).
Re:Java no panacea -- must know what you're doing by swillden · 2006-05-02 04:17 · Score: 4, Informative

It must _NEVER_ be the case that 2 threads have write access to a given item at the same time.
Two clarifications:
First, it's okay to allow multiple threads write access as long as the writes are guaranteed to be atomic and as long as the order of atomic writes doesn't matter. In practice, that second restriction usually means you need locking.
Second, it's often important that one thread not be writing an object while one or more threads are reading it. In other words, multiple writers aren't the only problem.

--
Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
Re:Java no panacea -- must know what you're doing by swillden · 2006-05-02 05:33 · Score: 2, Informative

Yes what you say it true, but these are all implications of atomic read/write... It follows logically from having caches
Umm, no, it has nothing to do with atomicity, nor is it because of caching.
Atomicity just guarantees that the atomic change is seen fully or not at all. It makes no guarantees about what other changes will be seen. And the problem doesn't have anything to do with caching, assuming the caches work properly (and they do). The caches maintain coherency as needed between each other and with main memory. The issue is all about what goes into main memory, and when.
The problem actually arises from operation *reordering*. There are at least three levels of reordering that may happen. First, the compiler may reorder things. In theory, given:
x=45; f=true;

The compiler may choose to set f before setting x. Even assuming both are atomic, that's a problem if another thread is doing something like:
while (!f) ; do_something(x);

Compilers take care not to reorder things in ways that might cause problems, though, so that's not generally an issue. Even if the compiler doesn't reorder like that, though, the processor might. Most processors these days do operation reordering. But even if the processor doesn't do it, the memory management unit *also* reorders writes. I'm speaking here of the unit that writes to main memory, not to any cache management logic (though I suppose it's entirely possible that they may reorder writes as well).
This stuff is taught in school no?
I think the better question is whether or not it is *learned* in school :-)
I really don't think it is the concept that is hard, it's the implementation in C.
I think it's the concept that's hard. The difficulties in correct implementation almost never come from the specific programming language APIs that are provided, and most bad multithreading designs in C are just as bad and just as tempting in any language, including Java.

--
Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
Re:Java no panacea -- must know what you're doing by the_hoser · 2006-05-03 21:23 · Score: 2, Informative

C++ is a nightmare to work with when multi-threading. After a while of mucking with posix threads I just gave up.

Multi-threading in Java is just too easy. Its not bullet-proof, but it has such a set of functionality that it makes it hard(er) to write deadlocking code.

And Id continue to write about it, except firefox just decided that an apostrophe means ctrl+f

*sigh*

Bah by grindcorefan · 2006-05-02 02:52 · Score: 3, Informative

Aha, so I can only do multithreaded programming on GNU/Linux with NPTL'ed glibc or what? Other programming langunguages than C/C++ don't exist or don't do threading. What about other operating systems? Specific solutions to general problems only apply to specific manifestations of the general problem and are therefore useless for most of us.

The only good general advice about learning how to develop software on distributed systems I can give is: Read some of Andrew S Tanenbaum's books about operating systems and distributed systems in particular. The books contain knowledge you'll be able to apply to almost every system you develop software for.

Re:Use the right tool by stienman · 2006-05-02 03:18 · Score: 4, Informative

Use the right tool

The correct tool is called a brain, but first the brain must be configured properly.

Deadlocks are one symptom of poor program logic, and are designed into the program due to lack of proper controls. They frequently occur when a program is not designed before it is written.

See "dining philosophers" for an explanation of this, and several methods to prevent this situation.

Tracing tools are all well and good, but if one starts out with correct logic in the first place then one won't spend more time debugging than programming.

Always remember that a digital computer is a logical computing device. If you give it a series of instructions which do not ALWAYS have a logical solution, then it will choke ... eventually.

-Adam

Multithreaded Haiku by pauljlucas · 2006-05-02 03:25 · Score: 2, Informative

There was a time when I was doing a lot more multithreaded programming (using the pthreads API, FYI). At the time, I was inspired to write some Haiku:

Threaded programming
is not for the faint of heart
or for the sober.

A small locked mutex
mistakenly left alone
results in deadlock.

--
If you reply, do so only to what I explicitly wrote. If I didn't write it, don't assume or infer it.

Re:Use the right tool by Anonymous+Brave+Guy · 2006-05-02 03:28 · Score: 2, Informative

There are some good tools for the job. Relatively speaking, Java isn't one of them.

While Java does include some built-in support for multithreading primitives, its underlying model (using locks on data to prevent simultaneous access) is the same as many other mainstream languages today. Thus it suffers from the same weaknesses, including deadlocking, and potentially also things like potential priority inversion, depending on how clever the implementation of the concurrency tools is.

If you want serious multithreaded programming, there are languages practically built around it (Erlang, for example), and for that matter whole approaches that lend themselves to it much better than those used by today's popular commercial languages (e.g., functional programming without side effects). But right now, you still have to look out of the mainstream to find them, and be prepared to accept a fundamental shift in how you look at program construction if you want to take advantage of them.

--
If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.

Re:Use the right tool by Anonymous Coward · 2006-05-02 03:39 · Score: 1, Informative

FYI, as of 1.5, there's a new batch of concurrency libraries. I particularly like the flexibility of ReentrantLock and it's support for non-blocking (or even time-limited blocking) lock attempts.

Re:valgrind used to be able to do this by wfeick · 2006-05-02 05:09 · Score: 2, Informative

At my previous company we built a system with on the order of 10 threads working on a combined dataset consisting of many hundreds of thousands of objects and occupying a couple hundred meg of memory in a large installation. There could be hundreds of thousands of instantiated locks in the system, although they fell into maybe 30 classes of lock. The large number of objects and locks was manageable because there were a small number of objects (say on the order of 25) that modeled something in the real world, but the system would then model a few hundred thousand of those real world things.

Anyway, suffice it to say it was a reasonably complicated realtime system with different threads having different responsibilities (receiving change data from the real world, work queues between different computation threads, pushing results back out to the real world).

To address the deadlock problem, we took the standard approach of defining an order in which locks must be taken to avoid deadlock. In our case, that really meant defining an ordering for the 30 or so distinct classes of locks and then defining a suborder for the few cases where we needed to lock multiple objects within a given class. Think of a class of lock/data as being a particular bit of subinformation for each of the real world things we were modeling.

Defining that order is hard, so I built a perl script that would extract partial orderings from comments in the code (e.g. "A : B C" would state that lock class A is taken while holding lock classes B and C) and then build a system wide ordering that obeys all the partial orderings. It's the same algorithm that make goes through to build things in the correct order.

Given that ordering, I then created a system to ensure we never took a lock out of order. First off, we wrapped the standard POSIX locking primitives with C++ classes, and created a smart lock handle we could instantiate on the stack and know that the lock would be released in its destructor no matter how you left the function (either through a return or an exception). We even went so far as to ensure that if you canceled a thread the locks would be released.

Given that wrapping, it was pretty straight forward to create a per-thread array to count how many locks were currently held at each level, and verify at lock time that no locks later in the hierarchy were currently held. In the event of an out of order lock attemt, it would throw a LockError exception. All of our exception classes included a stack trace, making it very easy to identify lock order violations just by looking at the logs.

Note that we weren't detecting actual deadlocks, we were detecting out of order locking that could potentially lead to deadlock at some point. This showed design errors very early on in development, rather than having the system deadlock on us at a customer site where things are difficult to debug.

I was going to open source the tools, but we were acquired by a large phone company and I ran head long into bureaucracy when I tried. I decided I'd rather just reimplement it all at some later point in time. I think I've given enough of a description here that you could do something similar if you want. I highly recommend it if you're building anything serious. It's very easy to accidentally take locks out of order if you're not careful, and that needs to be caught early on in development rather than at a customer.

Re:Here's your problem... by Kupek · 2006-05-02 05:14 · Score: 3, Informative

Yes, there is a real reason. Sometimes it's inherent in the algorithm that the amount of data that must be shared is impractical to send using messages. Parallelization does not come for free; there are communication costs. If the communication costs are greater than the benefit you get from doing the computation in parallel, then you get no benefit from parallelization. Message passing will always have more overhead than shared memory multithreading. Hence, shared memory multithreading allows you to exploit finer grain parallelism than message passing.

Your point that message passing is generally a cleaner design choice is valid, but it's not always a practical option.

Re:Use the right tool by fm6 · 2006-05-02 05:17 · Score: 2, Informative

If you're going to tout Java as a concurrent programming platform, the books you should link to should be about Java. Such as:

Concurrent Programming in Java: Design Principles and Pattern (2nd Edition), by Doug Lea.
Java Concurrency in Practice by Brian Goetz, Tim Peierls, Joshua Bloch, Joseph Bowbeer, David Holmes, and Doug Lea.
Effective Java Programming Language Guide, by Joshua Bloch. Though this is a general programming guide, its chapter on Threads contains essential "best practices" for concurrent programming.
Concurrency: State Models & Java Programs (2nd Edition), by Jeff Magee & Jeff Kramer.

I would also call attention the impending Fourth Edition of the Java Tutorial, which is supposed to be in print at the same time as the general release of Java 6 (Mustang), and will be available on the web slightly before. Its chapter on Concurrency is not nearly as thorough or authoritative as the above books, but it has a certain accessibility. Besides, it was written by (ahem) me.

You should also note that your complaints about the limitations of Java concurrency are addressed in JSR 166, which has been part of the Java platform since 5.0. This defines a new package with many nifty features. There's excellent coverage in this JavaOne talk.

Explanations... by gduranceau · 2006-05-02 06:18 · Score: 3, Informative

Firstly, I apologize for my English (I'm doing my best).

I perfectly agree with some of you: this article is a slashvertisment! The main reason for that is that I previously tried to submit something more descriptive, but it was rejected. That's why I tried again with a slightly different style.

This tool (PTT) inserts trace points into the NPTL to help you to analyze multithreaded applications behaviour. He's not designed for beginners, but for people facing complex multithreaded issues. I also agree with some of you: you can use Java or some others high level languages for programming. But some applications require performance and have to be written in C. That's why PTT can be useful for some developers.

PTT has been presented at the Ottawa Linux Symposium last summer. You can find the paper here (NPTL Stabilization Project, page 111).

Regards...

Slashdot Mirror

Multi-threaded Programming Makes You Crazy?

17 of 166 comments (clear)