Threads Considered Harmful
LBR9 writes "James Reinders compares native threads with the goto statement so famously denounced 40 years ago by Edsger Dijkstra. Paraphrasing Dijkstra, he says they both 'make a mess of a program,' and then argues in favor of a higher level of abstraction. A couple of people commenting on the post question whether or not we should even be treading into the 'swamp of parallelism,' echoing the view recently espoused by Donald Knuth."
Alright, then all responses to this article need to fall under this one post.
Now if only I could get rid of these handles.
Yikes!
Donkey dick considered harmful.
and some can't. My programs were readable with goto. If yours were not, that's your own fault.
Because really, multithreading doesn't have to be hard
Cretin - a powerful and flexible CD reencoder
Thread is bad? The Pernese could have told you that Long Intervals ago.
I use threads a fair amount, because they are there. But I kinda wish path expressions would catch on. Let the compiler sort out the scheduling given the constraints - that's the kind of scut work computers are good for anyway.
PHEM - party like it's 1997-2003!
I'm all for getting rid of threads, but what are you going to replace them with? Traditional functional languages may be the most obvious solution, but they're also among the most impractical of solutions. Is there anything else out there that can replace threading needs, without throwing out the book on programming? It seems like what we need hasn't been invented yet.
Threads are fine if you have clean abstractions around them. Sort of like gotos are fine when abstracted out as for/while/do/etc.
--
Feed Weed: Feed your web addiction
Gotos and global variables are not inherently wrong or evil. They are tools. Granted, they are tools that, if misused, will wreak havoc on your code's stability and maintainability. The same could be said, however, for pointers. Threads are dangerous, and require special care. This is not a reason to avoid them; it is only a reason to be incredibly careful with them.
Use the best tool for the job, regardless of whether your CS professors demonized it or not.
Threads have been considered a "bad idea" by the CompSci profession for a little while now. So there is definitely nothing new about the author's statements. That being said, there is a fundamental difference between Dijkstra's paper 40 years ago and this summary: Dijkstra started his paper by holding up examples of better practices. Only after establishing their existence did he go on to suggest that the GOTO keyword was "too primitive" to be of practical use in software development.
The author of this "article" (and I use the term loosely) doesn't really present such options. He hand waves a few work-in-progress solutions at the end, compares threads to GOTO statements, then asks the readers to fill in the (rather sizable) blanks.
Long story short, it's a good topic of discussion, but the comparison to Dijkstra's famous paper is just an advertising point. Nothing more, nothing less.
Javascript + Nintendo DSi = DSiCade
Get a better programming language.
And if you don't like the taste of that one (what? Dennis Ritchie & Brian Kernighan not good enough for you!) there are other CSP languages available (what? Sir Charles Hoare not good enough for you!)
Seriously, this problem has been solved for 30 years.
There are places where the networks are not touching,and there are places where they are-Boeing's Lori Gunter
I wonder how long this trend in this discussion will continue of most every post being its own post as opposed to replying to a previous post.
Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo!
The motivation for widespread parallel programming seems to be that there is this upcoming glut of multicore PC chips that will get wasted if we all don't start writing concurrent programs. But is that really true? Most programs don't get any speedup from parallelization; at best a UI/core split helps the responsiveness of an app. Chances are a SMP OS would be able to reap most of the available gain.
Tsunami -- You can't bring a good wave down!
The problem is not threads per se, but the way they are generally used in programming languages like C and C++. Although const correctness is understood by some C++ programmers, they appear to be a minority if I judge by the code I regularly review. There is also memory management, which is a much bigger issue in threaded C/C++ applications than in applications written in Java. The Java library provides good examples of immutable classes, most prominently the String class, that remove a number of problems often encountered with their mutable cousins like std::string. Unlike std::string, I don't have to remember to make it immutable by constifying it or wrapping it. The presence of immutable classes, and the more adequate coverage given them along with threading in Java textbooks, means that I disagree with the article's author, who lumps Java threads in with pthreads as a bad thing. What we need is more coverage of threading issues and how to alleviate them in intermediate level C/C++ textbooks, because despite the fact that threading is not built into those languages or their standard libraries, concurrency has become too important to ignore once you go beyond the basics.
You know the big difference between TFA and Edsger Dijkstra's paper?
The second one made an argument, showed alternatives that were at least summarily demonstrated to be better and used reasoning.
The first one just says "Edsger Dijkstra's paper said goto was harmful and he ended up being right, thus if I say threads are harmful, I'm also right. Oh and here are some threading libraries I've found in a quick google search, they might be better."
Well, for starters, there's processes, which were invented in the 1960s. These may not handle every case, but in my experience they'd cover 95+%...
"Not an actor, but he plays one on TV."
Lots of things in programming have the -potential- to be bad. Pointers and references, constructors and destructors can be horrible if not handled properly. So an alternative to manually handling those elements came about: automatic garbage collection to free up unused memory. The problem is, garbage collection itself can be a pig.
The same goes for threading. If not handled properly it can be a nightmare, but in a time where processors are growing cores faster than rabbits can breed it makes the most sense to take advantage of that with parallel programming. Whenever there's options that have pros and cons, the best alternative is chosen based on them.
If you have a good programmer that understands threading, an application that has a lot of concurrent connections and/or multiple, concurrent CPU intensive tasks, and a server that can take advantage of multi-threading, it makes the most sense to go that route.
If you have a rinky-dink, single user application and a developer that has trouble using negatives in conditional statements, by all means don't use threading.
Just don't claim that threading is unequivocally always the wrong option.
"Always forgive your enemies; nothing annoys them so much." - Oscar Wilde
Erlang is multi-process, not multi-thread. A particular runtime may use threads, but that's certainly not the logical model of the language, just an implementation detail.
Threads are good and bad. Same with Goto. I use both, but avoid using goto when it's not needed and avoid using threads when they don't make sense. Most programming languages that people use directly have several branching statements that hide the Gotos to make them less likely to be harmful when dealing with common types of their uses. The Erlang language was created just to deal with threads, and makes them a lot more difficult to hurt yourself with.
The article's author explicitly mentions Erlang as a potential solution to threading issues in other languages. In fact he's mainly concerned about POSIX pthreads, Boost threads, Java threads (and presumably Windows low level thread libraries). As I point out in another post below, I disagree with him lumping Java threads in with those used in most C/C++ libraries, as threading support is integrated into the language along with increasingly sophisticated locking support in the library which can be used if the simple object lock is insufficient. In my experience, most data shared across threads is immutable (read only), and the Java libraries encourage use of immutable types such as String. Once you appreciate the value of immutable types, then they can be used just as easily in C++ (with C it's a little harder). Writable shared data can be cleanly hidden behind a decent interface, with the locking within the getters and setters, but again, this approach is applicable in C++ as well as Java.
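As a rough illustration of the two ideas in this comment (immutable values shared freely, writable state hidden behind locked getters and setters), here is a minimal Python sketch. The names `Settings` and `SharedConfig` are made up for the example and do not come from the article or from any library.

```python
import threading
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Immutable value: any thread can read it without taking a lock."""
    host: str
    port: int


class SharedConfig:
    """Writable shared state hidden behind a small, locked interface."""

    def __init__(self, initial: Settings) -> None:
        self._lock = threading.Lock()
        self._current = initial

    def get(self) -> Settings:
        # The lock lives inside the getter, so callers cannot forget it.
        with self._lock:
            return self._current

    def set(self, new: Settings) -> None:
        # Writers replace the whole immutable value instead of mutating it.
        with self._lock:
            self._current = new


config = SharedConfig(Settings("localhost", 8080))


def writer(port: int) -> None:
    config.set(Settings("localhost", port))


threads = [threading.Thread(target=writer, args=(9000 + i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

snapshot = config.get()  # one of the four written values, never a half-written one
```

Which writer wins is up to the scheduler, but every reader sees a complete `Settings` value, and trying to assign to `snapshot.port` raises an error because the dataclass is frozen.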
Slowaris is behind threading. Because it was so slow to create new processes, the only way they could compete with Linux (which forks very quickly) was to create threading. Threading *is* faster than forking, but it also creates HUGE synchronization problems. You can overcome these problems, at the cost of more complicated, more fragile programs that take more time to write and more time to get right.
Linux doesn't need threading.
Don't piss off The Angry Economist
since you know anything entitled 'considered harmful', as they are the ones actually harmful.
...Thread are vicious.
The pain was excruciating and the scarring is likely permanent, but that just means it's working.
A combination of specific implementations of threading and developers who use programming methodologies that were outdated already when the PDP-11 actually reached the market is.
As a comparison, I learned to program properly on the Amiga, and its OS was natively threaded, and the architecture actually encouraged it. (Likewise, unlike so many people who grew up with PCs, I also feel at home with programming for something like the PS3 or similar.) I therefore inherently think about how things can be split up, suitable algorithms, etc. That's not to say that I always use threads, however.
By Edward Lee of the EECS department at Berkeley: http://www.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-1.html. Worth reading if you work with threads.
Those people who think they know everything are a great annoyance to those of us who do. (Isaac Asimov)
Exactly. I've written some very complicated threaded programs in Java. The java.util.concurrent package has some really nice abstractions that make it much easier.
I don't get all the dissing of threads around here (my 5 or so years of experience with threads are in the Java world). With a little planning, they are not that hard to program. It almost sounds like the folks that couldn't understand pointers no matter how many different ways you explained them.
PHP is the solution of choice for relaying mysql errors to web users.
Like Gotos, threads are a tool not a panacea.
If you have a task that requires parallel operations, you should pick the best tool for the task. This might be separate computers, separate virtual computers, separate processes, separate threads, or even a roll-your-own system for managing your task where the operating system sees your task as a single thread in a single process.
Of course, if you want to take advantage of multiple cores or multiple CPUs, the roll-your-own approach may not be an option in your OS.
Knowledge is how to play a game, intelligence is how to win, wisdom is knowing what game to play.
is that they may put threading problems beyond the control of the developer. What happens when your language environment has bugs in how it generates and manages threads? How much control would you have over what is being done behind the scenes? For that reason alone, I think it would be better to keep the use of threads purely optional, preferably as part of a library, rather than put into the core language itself as a series of features and keywords.
I've been working with Cocoa/Objective-C for a while, and I'm starting to develop some of its habits in C++ (immutable strings, smart pointers, Copy-on-Write objects). I don't even need TLS most of the time.
Of Code And Men
The problem is that programmers are generally untrained in them or trained very poorly.
Writing a safe threaded application is not a difficult task, but it is a different task than writing a single-threaded app. And unfortunately CS programs, books, tutorials, etc., still train people in the single-thread mindset, and yes, the programs they produce end up being buggy.
And I'm not sure these 'high abstraction' languages are really the 'answer'. I have found that often in higher level solutions the results become even less predictable and tracing what is actually happening when becomes either extremely difficult, extremely inefficient, or just back to the single-thread mentality.
I think the OP talking about how one might be next writing a parallel app shows the real flaw here... the author is going from one mentality, entering another without really thinking it through, and then complaining when old methods don't work well. Take a programmer who STARTED in parallel space and you don't run into these problems.
The problem with Java concurrency and threading is, all the locks are advisory. The synchronized statement is a nice bit of syntax, and making it apply to whole blocks of code was the Right Thing to do.
:)
The problem simply comes in that a program is not obligated to *use* synchronized, or any locking, when it accesses objects. Which means the code is totally unsuitable for integrating into a multithreaded program. And trying to backport thread-safety in is (currently) too difficult, as there are no tools to tell you when you've got it right.
I haven't studied Erlang yet, but threads (or more generally concurrency) done securely would require mandatory locking of all data, except when you know two threads must share data and sync between each other. Ada95 seems to understand this, although it probably needs refinement.
Which makes threads functionally equivalent to UNIX processes & shared memory
Practice Kind Randomness and Beautiful Acts of Nonsense.
That way you CSers won't take over my turf as a FPGA designer.
I'm sure a diet would help :P
One swallow does not a fellatrix make
The problem is not threads per se, but the way they are generally used in programming languages like C and C++
Right. C and C++ provide zero help in dealing with the isolation issues of threading. The languages have no concept of parallelism (there's "volatile", but that's about it.) There were 1980s languages that did offer some help, such as Modula I/II/III, Ada, and Occam. Java has some minimal concurrency support, although it's not well thought out.
There's nothing wrong with multithreaded programming, but some help from the language would be nice. Major issues with threads are "which thread owns what", "which locks lock what data", and "what can I safely call concurrently". C and C++ do not help here. They push the problem off to the operating system, which has no idea what to do about thread level data ownership.
At the operating system level, we're not doing too well either. One good way to write concurrent programs is with multiple intercommunicating processes. Unfortunately, the Unix/POSIX/Linux mechanisms for interprocess communication are awful. You have byte-oriented pipes, sockets, and the seldom used "System V IPC" mechanism. None of these let you do something like an inter-process subroutine call. Subroutine call mechanisms built on top of these stream-like channels tend to be slow, clunky things like CORBA and SOAP. Windows does a bit better, but their fast approach is a legacy from OLE, which was designed for Windows 3.1 on DOS, and their slower approaches are more like SOAP. QNX has usable interprocess messaging, but few non-real-time systems are designed for QNX.
I've been writing heavily concurrent programs with threads since the 1970s. It's possible to do it well, but the tools have not improved much.
Agreed. Threading isn't especially hard if you think about it first. If you stumble around like a donut then you're going to break thread-safety all over the place. This guy seems very alarmist to me. What's next? Space shuttles considered harmful?
Yeah, why write programs that can actually take advantage of multicore processors. That is just crazy talk.
Just like BeOS was a crappy, low performance O/S.
There is no "-1 offended" or "-1 you don't agree with me" mod options for a reason.
I may have misunderstood (I'm not exactly an expert in threading), but I believe that Erlang handles this in a scarily elegant manner... once assigned, a variable can not be changed.
The = operator in Erlang should be looked at in the mathematical sense, so the following (pseudo) code would fail:
a = 2
a = 1 + 3
Because 1+3 != 2
(Disclaimer: I've briefly dabbled in Erlang, but anything I say about it should be taken with several rocks of salt)
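For readers who don't know Erlang, the single-assignment behaviour described above can be imitated in a few lines of Python. This is only a sketch of the idea: the `Binding` class is invented for the illustration, and real Erlang does this in the language itself (variable names there start with an uppercase letter, and a failed match raises a badmatch error).

```python
class Binding:
    """A name that can be bound once; later 'assignments' act as equality checks."""

    _UNBOUND = object()  # sentinel meaning "no value yet"

    def __init__(self, name):
        self.name = name
        self._value = Binding._UNBOUND

    def match(self, value):
        # First use binds the name to the value.
        if self._value is Binding._UNBOUND:
            self._value = value
            return value
        # Later uses only succeed if the value agrees with the binding.
        if self._value != value:
            raise ValueError(
                f"no match: {self.name} is {self._value!r}, got {value!r}"
            )
        return value


a = Binding("a")
a.match(2)       # binds a to 2
a.match(1 + 1)   # fine, 1 + 1 equals the bound value

try:
    a.match(1 + 3)   # 4 does not match 2
    failed = False
except ValueError:
    failed = True
```

Because a bound value can never change, two threads holding the same binding cannot race on it, which is the property the comment is pointing at.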
I don't understand why threads are useful, except if you have multiple processors (and then, I don't see why you would need more threads than how many processors you have). OK, maybe for GUI, it is useful if the work of the program and the GUI are handled by separate threads. It always seemed to me that, on single processor, it is easier and faster to do the work required in sequential fashion than to use multiple threads (for online applications, you would split the work into time-bounded atomic operations, so one long transaction won't block the others from finishing).
Unix's select/poll mechanism avoids all that. See, e.g., here.
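A minimal sketch of the select/poll style in Python, using the standard `selectors` module (which wraps select, poll, epoll or kqueue depending on the platform). One thread waits on several connections at once; the two `socketpair()` connections and the `conn0`/`conn1` labels are just stand-ins for real clients.

```python
import selectors
import socket

sel = selectors.DefaultSelector()
received = []

# Two connected socket pairs stand in for two client connections.
pairs = [socket.socketpair() for _ in range(2)]
for i, (client, server) in enumerate(pairs):
    server.setblocking(False)
    sel.register(server, selectors.EVENT_READ, data=f"conn{i}")

pairs[0][0].sendall(b"hello")
pairs[1][0].sendall(b"world")

pending = len(pairs)
while pending:
    events = sel.select(timeout=1.0)
    if not events:
        break  # nothing arrived within the timeout; give up
    for key, _mask in events:
        # The socket is known to be readable, so recv() will not block.
        chunk = key.fileobj.recv(1024)
        received.append((key.data, chunk))
        sel.unregister(key.fileobj)
        pending -= 1

for client, server in pairs:
    client.close()
    server.close()
sel.close()
```

No locks are needed because only one thread ever touches `received`; the price is that each handler has to return quickly so the loop can get back to waiting.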
Telling girls their threads are harmful is the fastest way to get their clothes off
I use them routinely on MS platforms. Background threads for write-behind mechanisms, for self-tuning caches, for animation. The sharing between threads is the more precise problem, not threads themselves. If one knows how to examine the context of a thread, one can see all shared points and code accordingly. This is no different than knowing what pieces of data are eventually exposed as public data of a component.
That said, there is a clumsy set of constructs around threading still. Most modern languages do not have the atomic test-and-flip operation around an object as you wish. For example, in the C# realm, I see this routinely: ...I'd much more appreciate the OS supporting a thread-level operation that allowed for ..where above clause was skipped if (sharedMemInitialized==true), and if not, it waited for the "sem" semaphore concept to be unlatched.
In Erlang, variables aren't variable -- they're single assignment. (There is a process dictionary that is mutable, but it isn't usually used and other threads don't have access to it). Inter-thread communication is done via message passing (which may be local or over tcp/ip).
Do you even lift?
These aren't the 'roids you're looking for.
I know nobody born in the last 30 years has bothered to read his memo, but he doesn't pretend gotos are "evil". Just that people should adopt structured control flow structures instead. Meaning, design and use languages with such advanced features as "if/else" statements, and "while" loops, and "functions". Goto considered harmful was written in a time when most people were not using the fancy new languages that offered these features, and he was suggesting that they do so, in order to improve the quality of their code.
Unless you seriously think people should use gotos instead of loops and if/else statements, then you don't disagree with Dijkstra.
Every web server that can handle more than one client simultaneously is basically multi-threaded. It's painfully clear that threads have been an enormously successful programming model and are here to stay. Concurrency is difficult to understand but that doesn't mean it's not necessary.
I've already said this about a dozen times on /., but here goes anyway ;)
In 2001 I worked at CERN, writing simulation and analysis software on a dual P3 machine. The language was Fortran 90, and the compiler made use of SIMD (MMX/SSE) on both processors to parallelize matrix algebra.
The parallelism was abstracted away quite nicely, just as the article suggested. There was probably some thread/process creation under the hood to make use of both CPUs, but the calculations were basically SIMD in nature. F90 handles matrix math natively, so there's no guesswork involved with the "parallelizing" compiler. (I put that in quotes, because the code is already parallel, unlike loops in C.)
There's a lot of processing that has this SIMD-like parallelism, for example sound and video work, and with the correct tools it's easy to expand SIMD from one CPU to many.
Escher was the first MC and Giger invented the HR department.
If we are to have any hope of using computers to solve the most interesting problems then we will eventually be using parallelism on a massive scale. This is how the physical world works. All those atoms and particles swirling around you are not a serial batch process.
The ratio of people to cake is too big
CreateProcess() in Windows is very heavy weight. exec() in *NIX is heavy weight.
fork() in *NIX is light weight in any modern (post 1980) OS. Doesn't exist in Windows.
CreateThread() in Windows is light weight. In *NIX it is a special case of fork()
My point is that a thread is a thread. If using multiple concurrent threads is harmful, so is using a single thread. Single threading is less harmful than multithreading but harmful nonetheless. The thread is the reason for every ill that ails computing, from the reliability crisis to the parallel programming crisis. There is a way to design and program computers that does not involve threads at all. It's called the non-algorithmic software model. This is the way we should have been doing it in the first place. To find out why algorithmic software (threading) is the work of the devil, read the articles at the links below:
Parallel Programming, Math and the Curse of the Algorithm
Why Software Is Bad and What We Can Do to Fix It
Nightmare on Core Street
150 years after Babbage and Lady Ada introduced the algorithmic computing model, it is time to change. The longer we wait to realize the folly of our ways, the worse our problems are going to get.
The phrase "considered harmful" implies that there is a large community consensus that something is considered harmful.
But it's never used like that; it's always used by one opinionated loudmouth who is the only one in the world who considers the practice harmful. You want another example? Look at that crock of an essay "Reply-to munging considered harmful"; the only one who considers reply-to munging harmful is the opinionated loudmouth who wrote that essay, yet the title falsely implies that the community as a whole has decided reply-to munging is harmful.
It's a phrase filled with sound and fury, an attempt to deceive people into thinking there is a consensus when there is none.
I support the Center for Consumer Freedom
But you had to have "lightweight processes" aka threads, because apparently even copy-on-write forking was "too expensive". Now what's expensive, you dickbags? Huh? Programmer time or time spent debugging weird implicit sharing bugs?
Of course threads are still quite nice. Problem is, most "programmers" today are little more than monkeys who hurl feces at the computer. These people cannot be expected to understand threads. Most of them do not understand database transactions, or any kind of concurrency!
However, the technology should not be blamed for its misuse by idiots.
Threads on Linux (using NPTL) are actually a result of clone().
I used a "goto" in a pthread. A Time-space bubble formed and transported me back in time, just to read this article. So now, thanks to the author I can now avoid my own "Ground Hog Day".
Damn. Thanks for solving all the bugs in my current project ;-)
The root of the problem is shared state: operations on shared state need to have ACI properties - atomic, consistent, and independent. Some languages / environments, like Erlang and QNX, solve this problem by basically getting rid of shared state and making all threads communicate with each other over socket-like abstractions. With common programming languages the solution is mutually exclusive locks. You lock up the memory you're working on, then unlock it when you're done.
Locks have problems. In order to get good concurrency, you need fine grained locking. Once you have multiple resources protected with multiple locks you run into gotchas like deadlocks and race conditions. There are also issues with exceptions and programmer mistakes causing locks to remain locked or unlocked. All of this can make it very difficult to reason about what is happening in your program and you can end up with bugs that are very difficult to reproduce and/or fix. Locking is also pessimistic - you pay for the cost of locking even if nobody else is looking at the same pieces of memory while you are. Finally, you can't necessarily compose two operations that use their own locks.
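To make the deadlock gotcha concrete, here is a small Python sketch of one common workaround: always acquire multiple locks in a fixed global order. The `Account` class and `transfer` function are hypothetical names for the example. The `with` statement also covers the exception problem mentioned above, since it releases the locks even if the body raises.

```python
import threading


class Account:
    _next_id = 0

    def __init__(self, balance):
        self.id = Account._next_id   # used only to order lock acquisition
        Account._next_id += 1
        self.balance = balance
        self.lock = threading.Lock()


def transfer(src, dst, amount):
    # Always lock the account with the lower id first. Without this rule,
    # transfer(a, b) and transfer(b, a) running at the same time could each
    # hold one lock while waiting forever for the other.
    first, second = sorted((src, dst), key=lambda acct: acct.id)
    with first.lock, second.lock:
        src.balance -= amount
        dst.balance += amount


a = Account(100)
b = Account(100)


def worker(src, dst):
    for _ in range(1000):
        transfer(src, dst, 1)


threads = [
    threading.Thread(target=worker, args=(a, b)),
    threading.Thread(target=worker, args=(b, a)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The ordering rule only works if every piece of code that takes both locks follows it, which is exactly the composability problem the comment describes.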
A relatively new approach is (Software) Transactional Memory. This is an optimistic method that doesn't use locks. Instead it's more like an in-memory database transaction that provides atomic, independent, and consistent access to a set of shared variables. Shared variables have versions associated with them. When you start a transaction the runtime records the versions of the variables. Then after reading and/or writing all the variables in the transaction the runtime compares the current versions with the originally recorded versions. If they are the same the transaction commits and you continue, otherwise the runtime "rolls back" the changes and retries (or doesn't retry, you get control over that if you want.) The cost for uninterrupted reads and writes is very low, you only pay a higher recovery cost when there is an actual conflict.
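The version-check-and-retry scheme described above can be sketched in Python as a toy. This is not how any real STM library is implemented: `TVar`, `Transaction` and `atomically` are names invented here, and the sketch does use one internal lock for the short commit step, so only the user's transaction code is lock-free.

```python
import threading

_commit_lock = threading.Lock()  # guards only the short validate-and-commit step


class TVar:
    """A shared variable stored as one (version, value) tuple."""

    def __init__(self, value):
        self.state = (0, value)


class Transaction:
    def __init__(self):
        self.read_versions = {}  # TVar -> version seen on first read
        self.writes = {}         # TVar -> value to install on commit

    def read(self, tvar):
        if tvar in self.writes:          # see our own pending write
            return self.writes[tvar]
        version, value = tvar.state      # single attribute load: a consistent pair
        self.read_versions.setdefault(tvar, version)
        return value

    def write(self, tvar, value):
        self.writes[tvar] = value        # buffered, nothing is shared yet


def atomically(fn):
    """Run fn(tx) until it commits without a conflict."""
    while True:
        tx = Transaction()
        result = fn(tx)
        with _commit_lock:
            unchanged = all(
                tvar.state[0] == seen for tvar, seen in tx.read_versions.items()
            )
            if unchanged:
                for tvar, value in tx.writes.items():
                    tvar.state = (tvar.state[0] + 1, value)
                return result
        # Another transaction committed first: drop the buffered writes and retry.


counter = TVar(0)


def increment(tx):
    tx.write(counter, tx.read(counter) + 1)


def worker():
    for _ in range(500):
        atomically(increment)


threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

final_value = counter.state[1]
```

"Rolling back" here is just throwing away the buffered writes, which is why the body of a transaction must not have side effects outside its `TVar`s.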
There are STM libraries available for many popular languages (C#, C++, Java, Ruby) but the most mature and elegant support seems to be in Haskell. Since Haskell is a pure functional language, all side-effects are encapsulated in monads. Because of this encapsulation enforced by the language, it's impossible to actually read or write any transactional variables outside of a transaction. Reading or writing a transactional variable "infects" that function and every calling function until the data is safely brought out of shared memory locations by a transaction. The rest of your code is already thread safe because it doesn't have shared state. The compiler keeps you safe and type checks everything.
There are a bunch of papers you can check out as a starting point. I think this is definitely the future of multithreaded programming.
JoCaml is an extension of OCaml that supports concurrent and distributed programming through the use of the Join Calculus. Unlike other, more well known, process calculi such as CCS and Pi-Calculus, Join Calculus was designed to allow efficient implementations.
The basic model is message passing (à la Erlang), but the killer feature is that you can specify "joins" -- think of them as functions that get called when a specific combination of messages is received. I managed to convert a single-threaded simulation into a parallel version that distributes over an arbitrary number of cores/computers in less than three hours. I'm not saying it's a silver bullet or that it's ideal for your next project, but it worked damn well for me. The underlying calculus could be implemented in other languages, although it probably works best in languages with good match statements, such as OCaml and Haskell.
The PowerPC includes for this purpose two instructions called SYNC and EIEIO.
The Actor model, where each object is a separate thread, is the way to the future. When an actor sends a message to another actor, the message is stored in the target actor's message queue and the thread that represents the target actor is woken up to process the message. Results are delivered with future values.
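A bare-bones Python sketch of the mechanism described here: one thread per actor, a message queue as the mailbox, and results handed back as future values. The `Actor` and `Counter` classes are invented for the illustration; a real actor runtime would not spend a full OS thread on every object.

```python
import queue
import threading
from concurrent.futures import Future


class Actor:
    """One thread per actor; all state is touched only by that thread."""

    def __init__(self):
        self._mailbox = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def send(self, message):
        # Enqueue the message and return immediately with a future result.
        future = Future()
        self._mailbox.put((message, future))
        return future

    def stop(self):
        self._mailbox.put(None)  # sentinel: shut the actor down
        self._thread.join()

    def _run(self):
        while True:
            item = self._mailbox.get()  # sleeps until a message arrives
            if item is None:
                break
            message, future = item
            try:
                future.set_result(self.receive(message))
            except Exception as exc:
                future.set_exception(exc)

    def receive(self, message):
        raise NotImplementedError


class Counter(Actor):
    def __init__(self):
        self.total = 0       # set before the actor thread starts
        super().__init__()

    def receive(self, message):
        self.total += message  # no lock: only the actor's thread runs this
        return self.total


counter = Counter()
futures = [counter.send(n) for n in (1, 2, 3)]
results = [f.result(timeout=5) for f in futures]
counter.stop()
```

The sender never blocks on `send`; it only waits when it actually asks a future for its value.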
With the Actor model, whatever data parallelization is there in a program is automatically exposed.
Functional programming is hard, non intuitive and even plain distasteful to me. Now I know I'm an idiot, but the problem is most programmers are idiots. The language has to make parallelism easy for us, and if it starts out being functional it's already lost that battle.
There are circumstances where threads are completely inappropriate. Let's say that you were hoping to build an app, that eventually would scale across a single-image cluster farm (for those not in-the-know, this isn't a beowulf cluster, but rather a cluster that you would add a new "node" to, that would then be treated as part of the collective resources of the single "machine". See SSI). Unlike on your single machines, a thread can not practically be migrated to a new "processor" on a different node, because the entire process space would need to be copied when the migration happens. Not to mention, think about a thread changing information in its parent process's space. All of a sudden, you would need to stop the processing of any other threads with the same parent process, on different nodes, sync the memory, then signal them to continue. It would practically make the whole idea of performance clustering useless.
"When life gives you lemons, don't make lemonade. Make life take the lemons back!" -- Cave Johnson
Shared memory leaves too much control in the hands of the developer needlessly. A lot of the time there really is no need for 2 processes to be able to write to the same data at the same time, yet shared memory makes this situation default and leaves the developer to make sure it doesn't happen.
... not merely forget to check a condition.
It should be the other way around, the developer should have to work harder to be able to get into that situation
Indeed, "X Considered Harmful" is such a common title that the Jargon File has a whole entry for it. And the entry, of course, cites Dijkstra as the inspiration for the meme. (Others have disputed this, and claim it was common in mainstream journalism even earlier, but Dijkstra's famous essay clearly put the phrase on the map in the CS/IT world.) Merely using the phrase hardly indicates that a comparison to Dijkstra's classic work is necessary or justified.
:)
There has been, at least according to the aforementioned Jargon File, an essay in CACM called "'GOTO Considered Harmful' Considered Harmful". I have to wonder if we don't need an essay simply titled "'Considered Harmful' Considered Harmful".
Furthermore, the Slashdot summary is (as usual) badly flawed: Knuth didn't say that threads were harmful, he simply said that he didn't find them useful. Not that I want to defend threads--I've pulled out enough of my own hair because of them--but, as long as we're on the topic of inappropriate comparisons.... I don't know who this James Nobody is, but I'm pretty sure that he's not in the same league as Dijkstra or Knuth, and the inappropriate and misleading attempts to convince me that he is make me less, not more, interested in reading TFA.
The problem isn't threads, it's threading models that depend on shared state to communicate between threads. If you explicitly pass data between threads using a message paradigm you get most of the performance advantages of threads with the ease of programming in independent processes. Design the code around a model like message passing (as in the Amiga Exec, which was really a threaded share-everything environment, or QNX) or a database-style access method (which is what I've been doing in speedtables) and you can implement it using threads or processes, SMP or not, single-host or networked, depending on the requirements.
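A small Python sketch of the message-passing style described above: each thread owns its own data and the only thing that crosses thread boundaries is a message on a queue. The stage names (`producer`, `squarer`) are made up for the example, and `None` is used as an end-of-stream marker by assumption.

```python
import queue
import threading


def producer(out_q, items):
    # Stage 1: emit work items, then an end-of-stream marker.
    for item in items:
        out_q.put(item)
    out_q.put(None)


def squarer(in_q, out_q):
    # Stage 2: owns no shared state; everything arrives and leaves as messages.
    while True:
        item = in_q.get()
        if item is None:
            out_q.put(None)  # pass the marker downstream and exit
            return
        out_q.put(item * item)


raw = queue.Queue()
squared = queue.Queue()

threads = [
    threading.Thread(target=producer, args=(raw, range(5))),
    threading.Thread(target=squarer, args=(raw, squared)),
]
for t in threads:
    t.start()

results = []
while True:
    item = squared.get(timeout=5)
    if item is None:
        break
    results.append(item)

for t in threads:
    t.join()
```

Nothing in the two stage functions depends on the other stage living in the same address space, which is what makes it plausible to move a stage to another process or host later.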
You don't necessarily need fine-grained locking to do this, depending on the design.
Sometimes shared state is desirable, just as sometimes gotos are desirable, but only in the context of a structured framework (structured programming, message-based intertask communication) where they're the exception rather than the rule.
Do they not teach people how to write state machines any more? Any application that needs to deal with a fixed set of discrete events can easily be modeled as a finite state machine and implemented in a single process.
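For anyone who hasn't seen one lately, a table-driven state machine can be very small. This is only a sketch with invented states and events (`idle`, `connected`, `connect`, `data`, `close`), not a claim about any particular application:

```python
# Transition table: (current state, event) -> next state.
# The states and events here are hypothetical examples.
TRANSITIONS = {
    ("idle", "connect"): "connected",
    ("connected", "data"): "connected",
    ("connected", "close"): "idle",
}

def run(events, state="idle"):
    """Feed events through the table one at a time, in a single thread."""
    for event in events:
        key = (state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"event {event!r} not allowed in state {state!r}")
        state = TRANSITIONS[key]
    return state
```

Every event is handled to completion before the next one is looked at, so there is nothing to lock.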
that programmers are heir to, I would suggest that most people make hash out of single thread programming...we were doing that for years before we started working in the kernel. So, what's a little more complexity when we already HAVE to have ways to combat it in design and test tools?
I suppose our confidence in driving toward parallelism rests on intuition, such as analogizing that the brains of all higher animals are parallel processors therefore some solution to the problems of parallel computation must exist. It is certainly a commonplace even among non-CS-literate folk that the brain is massively parallel.
SLASHDOT: news for people who can't concentrate on work or have no life at all and got tired of yelling back at the TV.
The main problem I have with Java threads, vs Erlang, is that Java threads are still using locks. They're locks with nice syntactic sugar on them, but locks nonetheless.
Don't thank God, thank a doctor!
Sounds like a case designed for Erlang.
Erlang makes multithreading on the same machine easy, using a message-passing model. If you need to connect to a remote machine, you can do that pretty much as easily. It would work well on a Beowulf cluster.
I'm not sure how well it would work on a "single-image cluster farm", but I don't see why it would be any harder to create multiple Erlang processes on the same "machine" and have them pretend to be remote.
Don't thank God, thank a doctor!
James Reinders, the article's author, is the Intel marketing director for the company's Developer Products Division. The Developer Products Division is responsible for the Threading Building Blocks (TBB) API, which is mentioned in the article as a solution to the "problem" presented. To his credit, Mr. Reinders does also mention OpenMP.
With Erlang you've got no shared data between threads (processes in Erlang terminology), along with immutable data. Both of these greatly simplify concurrency as you don't need locks. If you need a shared resource (such as a file or network socket), a trivial way to implement it in Erlang is to have another process that simply acts as an interface between the resource and any other process(es). To communicate between processes in Erlang you use asynchronous message passing, and by design you can block or wait for a limited time to receive messages. Have a look at the Actor model which Erlang implements: http://en.wikipedia.org/wiki/Actor_model
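The "one process owns the resource" idea can be imitated outside Erlang. Below is a rough Python sketch, not Erlang and not how Erlang is implemented: a single owner thread is the only code that touches the resource (a plain list standing in for a file or socket), writers send it messages asynchronously, and the owner uses a receive with a timeout. The names `log_owner`, `mailbox` and `demo` are invented for the example.

```python
import queue
import threading

def log_owner(mailbox, lines):
    """The only thread that ever touches `lines` (a stand-in for a file or socket)."""
    while True:
        try:
            msg = mailbox.get(timeout=1.0)   # receive, waiting a limited time
        except queue.Empty:
            continue                          # nothing arrived; wait again
        if msg is None:                       # assumed shutdown message
            return
        lines.append(msg)

def demo(n_writers=4, per_writer=100):
    mailbox = queue.Queue()
    lines = []
    owner = threading.Thread(target=log_owner, args=(mailbox, lines))
    owner.start()

    def writer(wid):
        for i in range(per_writer):
            mailbox.put((wid, i))             # asynchronous send, no reply awaited

    writers = [threading.Thread(target=writer, args=(w,)) for w in range(n_writers)]
    for t in writers:
        t.start()
    for t in writers:
        t.join()
    mailbox.put(None)                         # all writers are done; stop the owner
    owner.join()
    return lines
```

The writers never take a lock themselves; whatever synchronization exists is hidden inside the queue.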
Clone is the fundamental op in Linux. fork is a special case of clone. Creating a POSIX thread is another special case of clone. Meanwhile, clone itself has a series of flags to allow for a broad spectrum from shared nothing (including namespace and, experimentally, pid space) to shared everything except CPU context. POSIX threads are at one end, while fork falls near (but not at) the other end; with flags like CLONE_NEWNS, you can share even less than a fork call would.
Just use OpenMP. Makes optimizing a program for parallel execution very easy.
I got nothin'
Welcome to Slashdot!
There's too many headlines concerning "X Considered Harmful"
Q) Why did the multithreaded chicken cross the road?
A) to To other the side. get the
Education is the silver bullet.
Let me just say that I miss reading posts like yours on Slashdot. Well done, sir.
Threads aren't the problem. It's the retarded programmer that hasn't a clue as to how to use them that is the problem. It's like blaming guns for people shooting each other.
Want threads to be safe? How about having enough grey matter to protect shared vars, etc with a mutex/semaphore/etc. Or how about making a thread-only variable (man pthread_key_create, pthread_[get,set]specific). Hell, how about the crazy idea of keeping track of one's data flow.
This sort of ridiculous thinking is only keeping us from getting the power out of our machines that we could. It's time people suck it up and learn to deal with REALITY. Multi-core/Multi-processor machines are only getting more so. To ignore that is just plain profound stupidity. Stop complaining and just learn how to do it.
Well, at least there are some people out there that know how to think:
http://cag.csail.mit.edu/ps3/
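For readers who want to see the two techniques named above (a mutex around a shared variable, and a thread-only variable) side by side, here is a small Python sketch. It uses `threading.Lock` and `threading.local` as rough analogues of a pthread mutex and `pthread_key_create`; it is not the pthreads API itself, and `work`, `run` and `seen` are invented names.

```python
import threading

counter = 0
counter_lock = threading.Lock()    # the mutex guarding the shared counter
per_thread = threading.local()     # each thread sees its own attributes

def work(n, seen):
    global counter
    per_thread.count = 0           # thread-only variable: no lock needed
    for _ in range(n):
        per_thread.count += 1
        with counter_lock:         # shared variable: take the mutex first
            counter += 1
    with counter_lock:             # `seen` is shared too, so guard it as well
        seen.append(per_thread.count)

def run(n_threads=4, n=1000):
    global counter
    counter = 0
    seen = []
    threads = [threading.Thread(target=work, args=(n, seen)) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter, seen
```

Each thread's private count comes out as `n`, and the shared total as `n_threads * n`, because every update to shared data happens under the lock.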
Exactly. A single-threaded program is like a craftsman. A concurrent program is like an assembly line. Humanity has been doing assembly lines for over a century now, e.g. Henry Ford's car factory. It sure would be nice if we could bring computer programming up to the beginning of last century.
Every other engineering discipline seems to be able to handle the concept. The only thing holding up the software world from embracing concurrency is our own collective lameness.
"Once we've identified and embraced our sickness, we'll have strength...and that's when we get dangerous." - John Waters
AFAIunderstand, if all your shared objects are immutable, then you don't need to use locks at all. Only mutable shared objects need to be synchronised.
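A small Python sketch of that point, under the assumption that the shared object really is immutable all the way down (a frozen dataclass holding an int and a tuple). `Config`, `apply` and `run_all` are made-up names for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass

@dataclass(frozen=True)            # assignment to fields raises after construction
class Config:
    scale: int
    offsets: tuple                 # a tuple, so the contents are immutable too

def apply(cfg, x):
    """Read-only use of the shared object; nothing here can modify it."""
    return [cfg.scale * x + o for o in cfg.offsets]

def run_all(cfg, xs):
    """Share one Config across several worker threads without any lock."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(lambda x: apply(cfg, x), xs))
```

Since no thread can change `cfg`, there is no interleaving of reads and writes to protect against. The caveat is that a frozen dataclass holding a mutable member (a list, say) would not qualify.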
True confidence comes not from realising you are as good as your peers, but that your peers are as bad as you are.
Erlang concurrency works through lightweight processes that don't share state and can communicate only through special dedicated communications channels (or at least that's the high-level view; on a low level, I assume, though I haven't looked at the implementation, that in fact the communication channels have to be implemented as shared state with locking enforced by the language implementation.) Consequently, "thread-safety" is rendered into a non-issue.
This is true, but TFA is still marginally useful in that the comments present a link to a more useful article which reflects the same general viewpoint as TFA but does make an argument and present alternatives:
http://www.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-1.pdf
I'm tired of reading replies to this article that evangelize some fancy-schmancy high-level solution. I wonder if these advocates have ever tried writing production code in such an environment.
Let me give you a wonderful example of when theory simply doesn't meet reality.
Recently, I wrote a bunch of multi-threaded code for a next-generation asymmetric-multiprocessing game console that shall remain nameless. Its operating system has a wonderful complement of synchronization features. There's the usual mutex lock/unlock, and the usual condition signal/wait, but there are also event queues (queues of generic events that can be passed between threads running on different types of processors), lightweight mutexes/conditions, spinlocks, semaphores, reader/writer locks, and so on and so on. Truly a rich palette from which one can paint a wonderfully synchronized multi-threaded application! I then proceeded to try to rewrite a key section of our code in a very multi-threaded way.
The problem was, the first version of this code added NINETY milliseconds per frame to our main thread. A profile showed that nearly all of the extra time was spent in the operating system's synchronization features.
After much rewriting and much pain, I stopped using all of the operating system's synchronization features, and used processor-level atomic operations instead, and finally, the extra code accounted for only FOUR milliseconds per frame in our main thread (with the rest of the time successfully farmed out to separate threads).
I challenge anyone with a fancy-schmancy automatic concurrency solution to demonstrate that it doesn't have this problem.
"Once we've identified and embraced our sickness, we'll have strength...and that's when we get dangerous." - John Waters
In before Pern!
And of course, should your program ever run on a multi-processor system, it will fail in all sorts of subtle, unpredictable ways. E.g., in all cases where a variable has been loaded into a register, but likely there are also cases involving caching to some degree.
Religion is regarded by the common people as true, by the wise as false, and by rulers as useful.
how about writing that up as an article for gamedev.net :)
The truth is that threading is a professional power tool. Like any pro power tool, you have to be careful and know how to use it, or it will rip your arm off (well, metaphorically) but it does do what you tell it to and can lead to great results.
Shared memory threading is basically just having two (or more) CPUs let rip on the same piece of memory. It can do great stuff, but it can also go wrong horribly easily and it is infamous for heisenbugs. Message-passing threading is the alternative, and while it has higher overheads for smaller configurations, it is easier to debug and it scales up much larger (since it's not hard to pass messages to other processes, other computers or even other clusters); indeed, the largest message-passing system in the world is massively parallel and goes by the name of The Internet. Shared memory systems simply can't scale up that large. (As a side note, the apotheosis of a shared memory system came in some of the supercomputers in the '90s, which had very complex memory management hardware to make the model work; these days, supercomputers use message passing because it's far easier to scale that up.)
"Little does he know, but there is no 'I' in 'Idiot'!"
For example, it has been benchmarked by Ulf Wiger (iirc) with up to 20 million scheduled lightweight concurrent entities ('processes') on one modest box. But that might be because it encourages a lock-free design. :)
you had me at #!
I suppose that would work... Erlang isn't quite that, though. That's how a purely-functional language would work, which Erlang isn't.
Variables may not be changed once bound, but they can be either bound or unbound, and that is a mutable state -- they can change from unbound to bound. If variables were globally accessible in Erlang, I could see race conditions happening.
The way Erlang avoids locks is by having a shared-nothing architecture. It's all done with message-passing.
Don't thank God, thank a doctor!
Why Threads Are A Bad Idea (for most purposes) http://home.pacbell.net/ouster/threads.pdf
On the aforementioned unnamed game console, that's called a "lightweight sync", and is a processor instruction.
Don't confuse your lack of knowledge with the lack of an answer.
"Once we've identified and embraced our sickness, we'll have strength...and that's when we get dangerous." - John Waters
I'm sure it does. But unless that mystery instruction syncs all the processors' caches (and registers, if your compiler decided to put your variable in a register), you will still have these errors. Of course, if I understood you correctly, and the system in question only had one processor, this is not an issue and atomic operations work just fine.
Threading is always easier on one processor, likewise on one computer unit.
Don't confuse your lack of knowledge with the lack of an answer.
No need to get all uptight, it was not a criticism of you.
Religion is regarded by the common people as true, by the wise as false, and by rulers as useful.
Mystery instruction? It's well documented.
Also, anything you need communicated to other threads needs to be put into what C++ would call a "volatile" variable. You can't expect randomly-architected code to just magically work in a multithreaded context.
Where did you get that impression? I clearly called it "asymmetric-multiprocessing".
At the risk of sounding "uptight", you sound like a very lazy programmer. You don't understand the subject of multithreading, you don't even read what I write, and yet you act as if your opinion has merit.
"Once we've identified and embraced our sickness, we'll have strength...and that's when we get dangerous." - John Waters
Mystery instruction? It's well documented.
I was hinting at the fact that we are talking about an unnamed system with unnamed processors. It makes it harder to give a qualified answer.
Also, anything you need communicated to other threads needs to be put into what C++ would call a "volatile" variable. You can't expect randomly-architected code to just magically work in a multithreaded context.
Volatile won't help you. Volatile just disables some optimisations, and does not sync the caches in the processors - which I admit your mystery instruction might, though I find it unlikely. Try reading "volatile considered harmful" - I know it is aimed at Linux kernel developers, but it does a fair job of explaining the issues at hand. As an extract:
[...]one must protect shared data structures against unwanted concurrent access, which is very much a different task
In short, you almost never want to use volatile, unless you are manipulating memory mapped IO through a pointer or similar strange tasks. The key point is that volatile does *not* sync caches in any way.
At the risk of sounding "uptight", you sound like a very lazy programmer. You don't understand the subject of multithreading, you don't even read what I write, and yet you act as if your opinion has merit.
lol. You are cute.
Religion is regarded by the common people as true, by the wise as false, and by rulers as useful.
Actually, I was referring to OS processes. Within-process "process-like" threads would also be interesting, and a huge improvement over threading as currently practiced.
"Not an actor, but he plays one on TV."
Hmmm, let's see. There are exactly two next-generation game consoles in existence. One is symmetric-multiprocessing, the other is asymmetric-multiprocessing. And you're willing to admit, in public, that you're confused by this?
I guess it's a stroke of luck that you're too dense to understand how humiliated you should feel.
"Once we've identified and embraced our sickness, we'll have strength...and that's when we get dangerous." - John Waters
Hmmm, let's see. There are exactly two next-generation game consoles in existence. One is symmetric-multiprocessing, the other is asymmetric-multiprocessing. And you're willing to admit, in public, that you're confused by this?
Confused? No. But I refuse to guess.
I guess it's a stroke of luck that you're too dense to understand how humiliated you should feel.
You are still funny. Do you understand how volatile works now? :p
Seriously, I am sorry I exposed your ignorance, I know that rubs some people the wrong way. It is just that what you did is --- in general --- such a grave error I thought a warning was appropriate. I am still not speaking of the specific case --- the Cell architecture is not something I have studied in detail, but my understanding was that there was one power-something processor and several non-ram-sharing co-processors, so I think you luck out in your case. It only applies if two cores that do not share cache (typically because they reside on two different chips) attempt to modify the same, cached data.
Religion is regarded by the common people as true, by the wise as false, and by rulers as useful.
Seems unnecessarily idiosyncratic.
Oh really. So if I write to my thread-communication structure through a pointer to volatile memory, do a lightweight sync, then write the flag that tells other threads that the structure is ready for use, then lightweight-sync again, you're telling me that's not valid?
I hope you know about lightweight-sync; it's part of the PowerPC architecture. That's only been around for nearly twenty years. The hot burning feeling is the realization that your obsolescence is measured in decades.
"Once we've identified and embraced our sickness, we'll have strength...and that's when we get dangerous." - John Waters
Although I'm skeptical, I am reading about Erlang now.
One problem I see so far is the way messages are passed between processes. Since Erlang is defined to send the message and continue without stopping (and presumably without the possibility of losing the message), that implies dynamic allocations for message queues. That's unacceptable in a limited-memory embedded platform like a video game console. It may be fine in one of Ericsson's telephone switches -- presumably they have more memory and no hard response-time limits (e.g. making framerate).
"Once we've identified and embraced our sickness, we'll have strength...and that's when we get dangerous." - John Waters
Seems unnecessarily idiosyncratic.
I really don't care what it seems to be to you.
Oh really. So if I write to my thread-communication structure through a pointer to volatile memory, do a lightweight sync, then write the flag that tells other threads that the structure is ready for use, then lightweight-sync again, you're telling me that's not valid?
On the condition I gave, no, that is not valid. The sync only works on the specific core to provide a barrier to some internal state. But if the memory is cached in two different processors, both will be able to take that "lock" and not discover it. There is nothing magic about this, really. What should prevent this from happening?
I hope you know about lightweight-sync; it's part of the PowerPC architecture. That's only been around for nearly twenty years. The hot burning feeling is the realization that your obsolescence is measured in decades.
I have never had to work with that particular processor, so I really don't know why I should have read up on it. My criticism was about your code's portability, not whether it would work on some specific platform.
Religion is regarded by the common people as true, by the wise as false, and by rulers as useful.
I'm like so wounded.
Because only one thread can write to an instance of that structure at a time? Otherwise I'd need a real semaphore. It's called a "lock-free design", which is a lot more efficient than requiring an operating-system semaphore. Your multithreaded code must thrash the hell out of the operating system. Have you ever profiled any of it?
General education? To improve the breadth of your knowledge? So that the principles you learn can be applied to other situations? Intellectual cross-training? These things really didn't occur to you?
You seem to me to be a perfectly normal computer programmer. Problem is, no one ever got anywhere by being normal.
"Once we've identified and embraced our sickness, we'll have strength...and that's when we get dangerous." - John Waters
Because only one thread can write to an instance of that structure at a time?
Except this is unlikely to be true. The structure could reside in both CPUs' caches, and thus be corrupted. Of course, you would have to be very unlucky on several counts.
Otherwise I'd need a real semaphore.
Or similar, sure. That is the only way to be sure, more's the pity. One reason why multi-threaded code is sometimes slower than single-threaded.
It's called a "lock-free design", which is a lot more efficient than requiring an operating-system semaphore. Your multithreaded code must thrash the hell out of the operating system. Have you ever profiled any of it?
I prefer to avoid locks, which works fine for me. But I suspect that my code is very different from what you do, so my methods might not work for you :)
General education? To improve the breadth of your knowledge? So that the principles you learn can be applied to other situations? Intellectual cross-training? These things really didn't occur to you?
Learning new things is something I do continually, but obscure processors are quite far down my list.
You seem to me to be a perfectly normal computer programmer. Problem is, no one ever got anywhere by being normal.
Looks like your opinion of me has improved somewhat ;) Anyway, I am a mathematician first, programming is just something I do.
Religion is regarded by the common people as true, by the wise as false, and by rulers as useful.
Except that this is exactly what lightweight-sync prevents...!
I don't think I've ever heard the PowerPC line of processors described as "obscure".
If you don't claim to be a professional computer programmer, I can forgive you for not being this hardcore. The vast majority of my co-workers over the years, though, have no such excuse. ;-)
"Once we've identified and embraced our sickness, we'll have strength...and that's when we get dangerous." - John Waters