Multithreading - What's it Mean to Developers?
sysadmn writes "Yet another reason not to count Sun out: Chip Multithreading. CMT, as Sun calls it, is the use of hardware to assist in the execution of multiple simultaneous tasks - even on a single processor. This excellent tutorial on Sun's Developer site explains the technology, and why throughput has become more important than absolute speed in the enterprise.
From the intro: Chip multi-threading (CMT) brings to hardware the concept of multi-threading, similar to software multi-threading. ... A CMT-enabled processor, similar to software multi-threading, executes many software threads simultaneously within a processor on cores. So in a system with CMT processors, software threads can be executed simultaneously within one processor or across many processors. Executing software threads simultaneously within a single processor increases a processor's efficiency as wait latencies are minimized. "
How long has hyperthreading been available on Intel CPUs?
It means we're going to have to learn to program in parallel. We're going to have to parallelize our data processing and we're going to have to learn synchronization and locking methods.
This is nothing new. The diminishing returns and impending limits of single-threaded processing have been coming for a long time now.
Well, if your data conversions are independent, multithreading might be of benefit to you if you have a hyperthreading processor.
And are you sure you are maxing the processor? Surely you have to wait for disk or network, at least some of the time. If more than 10% or so (number pulled from ass but based on empirical observations) of your time is spent waiting for latent devices, you can benefit from multithreading even on a plain vanilla single CPU system with no hyperthreading.
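To make the point about waiting concrete, here is a minimal Java sketch. The 200 ms sleep is only a stand-in for a disk or network wait, and the names OverlapWaits, slowFetch and fetchAll are made up for illustration. With four threads the four waits overlap, so the whole thing should take roughly one wait instead of four, even on one plain CPU.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class OverlapWaits {
    // Hypothetical stand-in for a disk or network read:
    // the thread is blocked here, not using the CPU.
    static int slowFetch(int id) throws InterruptedException {
        Thread.sleep(200);
        return id * 2;
    }

    // Submits every fetch to a fixed pool so the waits overlap,
    // then collects the results and returns their sum.
    static int fetchAll(int tasks, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> pending = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                final int id = i;
                pending.add(pool.submit(() -> slowFetch(id)));
            }
            int sum = 0;
            for (Future<Integer> f : pending) {
                sum += f.get(); // blocks until that fetch is done
            }
            return sum;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        long start = System.nanoTime();
        int sum = fetchAll(4, 4);
        long ms = (System.nanoTime() - start) / 1_000_000;
        System.out.println("sum=" + sum + " elapsed ms=" + ms);
    }
}
```

Calling fetchAll(4, 1) instead runs the same work on one thread, which is a quick way to compare the two timings on your own box.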
Well, if your data conversions are independent, multithreading might be of benefit to you if you have a hyperthreading processor.
Unless the two execution states overflow your L1 cache, in which case a HT CPU could run slower.
"We returned the General to El Salvador, or maybe Guatemala, it's difficult to tell from 10,000 feet"
It means "Difficult to reproduce bugs".
It worries me how many people just say "it means faster programs and doesn't take much more work". That mindset leads to lazy programmers who A - Can't optimize to save their jobs; and B - Don't actually understand what multithreading really does.
If you consider it easy, you've either just thrown great big global locks on most of your code, in which case your code doesn't actually parallelize well; or you've written what I refer to in my first sentence - bugs that take an immense effort just to reproduce, never mind track down and fix.
I know how to deal with them. It may seem easy at first, but it's actually very hard. Your program can run for days before a thread synchronization bug surfaces and it finally deadlocks. And since it's timing dependent, you can't reproduce it.
In principle there are rules to follow to avoid deadlocks and race conditions, but since they need to be manually enforced, there's always potential for error. At least with memory access bugs the hardware often shows you a segfault; with synchronization problems you usually don't even get that.
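One of those rules is a fixed lock order: if every thread takes its locks in the same global order, a cycle of waiters can't form. A rough Java sketch, where the Account class, its id field and the transfer method are all invented for the example, and ids are assumed to be unique:

```java
class OrderedLocks {
    static final class Account {
        final int id;   // assumed unique; used only to define the lock order
        long balance;
        Account(int id, long balance) {
            this.id = id;
            this.balance = balance;
        }
    }

    // Always lock the lower id first, whatever the transfer direction.
    // Locking "from" then "to" instead would let two opposite transfers
    // each hold one lock and wait forever for the other.
    static void transfer(Account from, Account to, long amount) {
        if (from == to) return;
        Account first = from.id < to.id ? from : to;
        Account second = from.id < to.id ? to : from;
        synchronized (first) {
            synchronized (second) {
                from.balance -= amount;
                to.balance += amount;
            }
        }
    }

    // Two threads transfer in opposite directions; returns the combined balance.
    static long run() throws InterruptedException {
        Account a = new Account(1, 1000);
        Account b = new Account(2, 1000);
        Thread t1 = new Thread(() -> {
            for (int i = 0; i < 10000; i++) transfer(a, b, 1);
        });
        Thread t2 = new Thread(() -> {
            for (int i = 0; i < 10000; i++) transfer(b, a, 1);
        });
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        return a.balance + b.balance;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("total=" + run());
    }
}
```

Note that nothing in the language enforces the order. One code path that forgets the rule brings the deadlock back, which is the manual-enforcement problem described above.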
I've learned over the years that preemptive multithreading should be used only as a last resort, and even then, it's best to put exactly one synchronization point in the entire app. Self-contained tasks should be dispatched from that point and deliver their results back with little or no interaction with the other threads.
The worst thing you can do is randomly sprinkle a bunch of semaphores, mutexes, etc. all over your app.
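For what the single-synchronization-point style can look like, here is a hedged Java sketch using java.util.concurrent (Java 5 and later). The class and method names are illustrative only. Each task works on its own range and shares no mutable state, and invokeAll is the only place where threads hand anything back.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class SingleDispatch {
    // Self-contained task: reads only its own bounds, writes nothing shared.
    static long sumRange(int from, int to) {
        long s = 0;
        for (int i = from; i < to; i++) s += i;
        return s;
    }

    // Splits 0..n-1 into chunks, dispatches them from one place,
    // and combines the results on the calling thread. Assumes chunks > 0.
    static long parallelSum(int n, int chunks) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(chunks);
        try {
            List<Callable<Long>> tasks = new ArrayList<>();
            int step = (n + chunks - 1) / chunks;
            for (int start = 0; start < n; start += step) {
                final int from = start;
                final int to = Math.min(n, start + step);
                tasks.add(() -> sumRange(from, to));
            }
            long total = 0;
            // The one synchronization point: wait for all tasks, then merge.
            for (Future<Long> f : pool.invokeAll(tasks)) {
                total += f.get();
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(parallelSum(1000, 4));
    }
}
```

There isn't a single mutex or semaphore in the application code here. The locking that does exist lives inside the executor.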
Since I mostly work on J2EE stuff, I let the container take care of the threading for me. The one exception is J2EE Connector Architecture (JCA) bits that use the work manager. Even there, however, most of my work is simply putting a thin JCA layer in place between the outside world and the J2EE stack.
For me, these new chips simply mean increased performance for deployed apps, without any modification to the app code.
Beauty!
668: Neighbour of the Beast
As many others have already pointed out, Intel has had Hyperthreading available in Pentium 4 and Xeon CPUs for a couple of years now, which does exactly what the article is talking about.
As many others know, you know exactly nothing about what you are talking about. HT has basically two sets of registers so that during a cache miss which would cause a bubble the chip switches to the other set so it doesn't sit idle. Sun's chips on the other hand actually have multiple cores physically doing work at the same time. In fact were it not for Intel's hideously flawed NetBurst architecture the hideous hack that is HyperThreading would not provide any performance increase at all (in fact it doesn't so much provide an increase as negate a decrease...). For evidence consider how many Pentium Ms have HT on them... Now I may not be fully correct but I didn't volunteer a comment; I only posted to prevent the misinformation of others. You'll find more on ArsTechnica. I'd link to the article but I can't find anything on their redesigned site.
Your CPU is not doing anything else, at least do something.
Sun are putting in hardware to ensure that context switches are fast (possibly even one or two cycles); hopefully, this will result in the context switches costing less than waiting for memory accesses, and speed up the throughput of the system as a whole. So, benchmarking one thread of execution will show a slow system, whereas a group will hopefully show a big speedup.
I appear to have a blog. Odd.
As far as threading is concerned, one of the few languages I've dealt with that makes mutexes, semaphores, etc. easy to deal with is Java. Most other languages bury the stuff too deep into the proprietary APIs to make them useful. Consider multithreading in win32. We need better programming languages before we can ever start reaping the benefits of good multithreading hardware.
Furthermore, we need to get rid of lazy programming. I'm tired of watching people write slow, lazy, inefficient (in terms of both memory space AND speed) code, and justify its existence with "it'll run fast on the new über-hyper-monkey-quadruple-bucky processors." Too many times, the problem is that you've got slow code running in every thread. If the code wasn't so damned lazy, programmers would care more about nifty new hardware. We're not even coming close to using our current hardware to capacity. I've got a 1.2GHz processor with 1024MB of RAM, and my box chugs opening an M$ Word doc?! WTF?!
<soapbox>
Most programming in the world is very similar to the universal statu$ symbol in the U.S.A. - a big gas-guzzling SUV. It's not like Jane the Soccer Mom really needs 300hp to haul her kids and groceries around town. Similarly, we have lots of lazy code out there that doesn't do much of anything but consume resources and pollute the environment. A nifty new processor feature won't be noticed in the computing world because it won't get used anyway, just like Jane the Soccer Mom wouldn't notice 100 more horsepower. </soapbox>
I pity the foo that isn't metasyntactic
When a person says something, the intended meaning is not ambiguous (unless you are a poet), although the words used to describe that meaning may be.
In this case it was intended to mean "What does it mean" and absolutely nothing else, your grammatical writhings notwithstanding.
Since the Pentium 4 according to Intel, but it's not a good question as that's Intel's trademarked term for their two-thread implementation of simultaneous multithreading:
By contrast, Niagara is implementing Chip-level multiprocessing:
In other words, Niagara implements in hardware, at greater scale, what Pentium 4 offers as an emulation feature. In theory one could SMP on top of CMP chipsets for even greater throughput. If you find the Sun article too hard, the Wikipedia references I have cited will probably prove much easier to understand.
Pure bullshit.
"As far as threading is concerned, one of the few languages I've dealt with that makes mutexes, semaphores, etc. easy to deal with is Java."
So can Erlang.
Wings3D is written in Erlang.
Oracle's been talking about reworking their licensing for a long time, and I agree licensing by core is sub-optimal. However, Oracle is being forthright that they charge by core, while Sun is _hiding_ the fact the USIV _is_ a multi-core processor.
Sure, Oracle are the ones charging per processor core, but Sun is the company that is selling this upgrade as a painless, cost-effective way to upgrade their infrastructure. I firmly believe they are being negligent in not warning customers that this is a multi-core architecture - if you go to Sun's site and look at how it's sold, they pitch it as one processor, one core.
Imagine you're a customer - you spend $100k on Sun's new processors as a "painless" 1-1 upgrade, and suddenly find out that the first 100k has put you on the hook for 150k in new licenses. Wouldn't you feel like you'd been misled?
Thanks,
Matt
me@mzi.to
Make a comment and ask a question and get marked as troll.
Go figure.
Hexy - a strategy game for iPhone/iPod Touch
Ah, but when you have one physical 'chip' that actually consists of four processor cores, you *can* do four simultaneous tasks on one processor.
The advantage over good old fashioned SMP? Well, probably the interconnect is way faster, and if the cores all share some cache or something, sibling threads should see some benefit.
Vintage computer games and RPG books available. Email me if you're interested.
Another problem with multi-threading is that nothing is a black box anymore (not like anything really is, anyway). Once you start worrying about sharing statics and globals, you need to consider all the accesses done by objects you bring in from other libraries, which means you need to check the source to see if, for example, it uses a static cache (with no locking). Then, you need to dig in, find out why you're getting seg faults or corrupted memory, track down where else you're using this class (could be, for example, inside another object entirely, which could again be from a different library), synchronize with that other thread that you thought was completely unrelated (i.e., used the same class, but had its own instance of it), rinse, and repeat.
If you are using third party or homegrown (but not yours) libraries inside your multithreaded program, pretty soon you'll realize that not only do you need to know what and how they are accessing, but you also need to keep close tabs on what changes are done in future releases (again, keeping track of implementation details in those releases). Your quick and highly parallelized threading program just became a maintenance nightmare.
I'm quite impressed that you managed to convince yourself that Java multithreading is in any way better than win32's multithreading API. I mean, have you ever even tried to write a program with nontrivial multithreading in Java?
Just one example: in win32, a thread can block to wait on multiple locks or events and wake when any of them is signalled (e.g. waiting on multiple sockets, or on multiple client threads, or until one of a set of locks is open). This is just a single WaitForMultipleObjectsEx call. This is impossible to do in Java. You have to have every lock run its own thread, to grab a lock and do a signal. At least now Java has a socket.select, so you don't need to keep a thread per socket (as you did in 1.0 and 1.1).
Try writing something to handle the reader/writer problem with multiple readers and writers. You require some really contrived code using something like five locks in order for it to work correctly. Even the reader/writer streams provided by the Java API only work with a single pair of threads.
The Windows API is ugly in parts, especially the GUI section, but the Windows base API is quite nice. I like their synchronization stuff a lot. It's even object oriented, of sorts. For example, you always use the same calls to wait for any object, regardless of the type of object (mutex, lock, semaphore, event, file handle, thread, etc), a sort of dynamic dispatch.
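For comparison, the java.util.concurrent.locks package that arrived with Java 1.5 includes a ReentrantReadWriteLock, which handles multiple readers and writers without the hand-rolled five-lock dance. A minimal sketch, assuming a Java 5 or later runtime (the lambdas need Java 8); the SharedCounter class and its demo method are made up for illustration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class SharedCounter {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private long value; // guarded by lock

    // Any number of readers may hold the read lock at once.
    long read() {
        lock.readLock().lock();
        try {
            return value;
        } finally {
            lock.readLock().unlock();
        }
    }

    // A writer excludes all readers and all other writers.
    void increment() {
        lock.writeLock().lock();
        try {
            value++;
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Runs several writers and readers concurrently; returns the final count.
    static long demo(int writers, int readers, int perThread) throws InterruptedException {
        SharedCounter c = new SharedCounter();
        List<Thread> threads = new ArrayList<>();
        for (int w = 0; w < writers; w++) {
            threads.add(new Thread(() -> {
                for (int i = 0; i < perThread; i++) c.increment();
            }));
        }
        for (int r = 0; r < readers; r++) {
            threads.add(new Thread(() -> {
                for (int i = 0; i < perThread; i++) c.read();
            }));
        }
        for (Thread t : threads) t.start();
        for (Thread t : threads) t.join();
        return c.read();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(demo(4, 4, 1000));
    }
}
```

None of this existed in 1.4 and earlier, so the complaint stands for the versions it was aimed at.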
I'm told Java 1.5 fixed up a lot of these problems, but unless you started coding a few months ago, I don't understand how you can consider Java's threading API anything but crippled and lousy.