Slashdot Mirror


Multithreading - What's it Mean to Developers?

sysadmn writes "Yet another reason not to count Sun out: Chip Multithreading. CMT, as Sun calls it, is the use of hardware to assist in the execution of multiple simultaneous tasks - even on a single processor. This excellent tutorial on Sun's Developer site explains the technology, and why throughput has become more important than absolute speed in the enterprise. From the intro: Chip multi-threading (CMT) brings to hardware the concept of multi-threading, similar to software multi-threading. ... A CMT-enabled processor, similar to software multi-threading, executes many software threads simultaneously within a processor on cores. So in a system with CMT processors, software threads can be executed simultaneously within one processor or across many processors. Executing software threads simultaneously within a single processor increases a processor's efficiency as wait latencies are minimized. "

24 of 357 comments

  1. -1, Redundant: Hyperthreading. by Anonymous Coward · · Score: 2, Insightful

    How long has hyperthreading been available on Intel CPUs?

  2. Same thing SMP and such has meant by Soong · · Score: 4, Insightful

    It means we're going to have to learn to program in parallel. We're going to have to parallelize our data processing and we're going to have to learn synchronization and locking methods.

    This is nothing new. The diminishing returns and impending limits of single-threaded processing have been coming for a long time now.

    --
    Start Running Better Polls
    1. Re:Same thing SMP and such has meant by SunFan · · Score: 2, Insightful

      It means we're going to have to learn to program in parallel.

      Not really. If you've been using SMP servers, what's different about SMP on a chip? Even if you only have a few dozen Apache processes running, Solaris will schedule them onto Niagara just like if you had lots of separate CPUs.

      I don't think this is as big a change as people think. The main advantage will be a super-efficient CPU (50 to 60 watts, IIRC) but with the performance of many regular CPUs (hundreds of watts).

      --
      -- Microsoft is the most expensive commodity operating system and office suite vendor in the marketplace.
    2. Re:Same thing SMP and such has meant by Bastian · · Score: 2, Insightful

      I imagine that multithreading is a situation where OOP finally begins to really shine, as the amount of code factoring involved would make it much easier to keep track of when and where you need to be frotzing with synchronization and locking.

      I also imagine that if you can try to line up thread boundaries with object boundaries, the task of avoiding race conditions becomes almost trivial.

      But then, I haven't done much serious multithreaded programming, so maybe I am missing the point. Someone set me straight.

    3. Re:Same thing SMP and such has meant by Homology · · Score: 2, Insightful
      Solaris treats each process as a single thread in the kernel. It makes little difference from a scheduling point of view whether you have a 32-thread application or a 32-process application, except the latter might consume more memory.

      With threads you have to synchronize access to common data that resides in the same memory address space. With processes you don't have to do this as they have their own copy of the data at fork.
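
      A minimal Python sketch of the thread half of this point (not from the original poster): four threads update one counter that lives in their shared address space, and a lock serializes each read-modify-write. The names `counter`, `counter_lock`, and `add_many` are made up for illustration.

```python
import threading

counter = 0                      # shared by every thread in the process
counter_lock = threading.Lock()  # guards all updates to counter

def add_many(times):
    """Increment the shared counter, taking the lock for each update."""
    global counter
    for _ in range(times):
        with counter_lock:
            counter += 1

threads = [threading.Thread(target=add_many, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 40000
```

      A forked process would instead get its own copy of `counter`, so it would need no lock, but it also could not see the other processes' updates without some explicit channel between them.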

  3. Re:i dont use multithreading by pclminion · · Score: 4, Insightful
    anything i write usually maxes out the processor at 100% for days at a time (i deal with huge data conversions) so yeah i'd also like to know: what does it mean to me?

    Well, if your data conversions are independent, multithreading might be of benefit to you if you have a hyperthreading processor.

    And are you sure you are maxing the processor? Surely you have to wait for disk or network, at least some of the time. If more than 10% or so (number pulled from ass but based on empirical observations) of your time is spent waiting for latent devices, you can benefit from multithreading even on a plain vanilla single CPU system with no hyperthreading.
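
    A rough Python sketch of that claim, under the assumption that each conversion spends most of its time waiting on a device. `convert` is a hypothetical stand-in, and `time.sleep` plays the role of the disk or network wait; nothing here is measured from a real workload.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def convert(chunk):
    """Pretend conversion: most of the time is spent waiting on a device."""
    time.sleep(0.1)          # stands in for a disk or network wait
    return chunk * 2         # stands in for the actual CPU work

chunks = list(range(8))

# One chunk at a time: every wait is paid in full.
start = time.perf_counter()
sequential = [convert(c) for c in chunks]
sequential_secs = time.perf_counter() - start

# Eight threads: the waits overlap, even on a single CPU.
start = time.perf_counter()
with ThreadPoolExecutor(max_workers=8) as pool:
    threaded = list(pool.map(convert, chunks))
threaded_secs = time.perf_counter() - start

print(f"sequential: {sequential_secs:.2f}s, threaded: {threaded_secs:.2f}s")
```

    The gain comes entirely from overlapping the waits. If `convert` were pure computation, threads on a single non-hyperthreaded CPU would have nothing to overlap.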

  4. Re:i dont use multithreading by Fulcrum+of+Evil · · Score: 2, Insightful

    Well, if your data conversions are independent, multithreading might be of benefit to you if you have a hyperthreading processor.

    Unless the two execution states overflow your L1 cache, in which case a HT CPU could run slower.

    --
    "We returned the General to El Salvador, or maybe Guatemala, it's difficult to tell from 10,000 feet"
  5. marketing handwave by klossner · · Score: 2, Insightful
    "throughput has become more important than absolute speed in the enterprise"
    I've been seeing this quote in press releases for three decades. It has always meant "we can't compete on performance so we're going to explain why performance isn't important anymore." The few times my management bought that story, they came to regret it.
  6. What DOES it mean to me? by pla · · Score: 5, Insightful

    It means "Difficult to reproduce bugs".

    It worries me how many people just say "it means faster programs and doesn't take much more work". That mindset leads to lazy programmers who A - Can't optimize to save their jobs; and B - Don't actually understand what multithreading really does.

    If you consider it easy, you've either just thrown great big global locks on most of your code, in which case your code doesn't actually parallelize well; or you've written what I refer to in my first sentence - bugs that take an immense effort just to reproduce, never mind track down and fix.

  7. Re:it means a lot by Waffle+Iron · · Score: 5, Insightful
    Multi-threading comes with synchronization, semaphores, mutexes, etc. Once you know how to deal with them, it's easy.

    I know how to deal with them. It may seem easy at first, but it's actually very hard. Your program can run for days before a thread synchronization bug surfaces and it finally deadlocks. And since it's timing dependent, you can't reproduce it.

    In principle there are rules to follow to avoid deadlocks and race conditions, but since they need to be manually enforced, there's always potential for error. At least with memory access bugs the hardware often shows you a segfault; with synchronization problems you usually don't even get that.
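
    One such rule, sketched in Python as an illustration rather than as the poster's own method: always acquire locks in one fixed global order. The `Account` class, its `rank` field, and `transfer` are hypothetical names invented for this example.

```python
import threading

class Account:
    _next_rank = 0  # class-wide counter used to give each lock a fixed rank

    def __init__(self, balance):
        self.balance = balance
        self.lock = threading.Lock()
        self.rank = Account._next_rank
        Account._next_rank += 1

def transfer(src, dst, amount):
    """Move money, always locking the lower-ranked account first.

    Assumes src and dst are different accounts.
    """
    first, second = sorted((src, dst), key=lambda acct: acct.rank)
    with first.lock:
        with second.lock:
            src.balance -= amount
            dst.balance += amount

a, b = Account(1000), Account(1000)

def worker(src, dst):
    for _ in range(1000):
        transfer(src, dst, 1)

# Opposite directions: a naive "lock src, then dst" could deadlock here.
threads = [threading.Thread(target=worker, args=(a, b)),
           threading.Thread(target=worker, args=(b, a))]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(a.balance, b.balance)  # 1000 1000
```

    Nothing in the language enforces the ordering. A single call site that takes the locks in the other order brings the deadlock back, which is the manual-enforcement problem described above.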

    I've learned over the years that preemptive multithreading should be used only as a last resort, and even then, it's best to put exactly one synchronization point in the entire app. Self-contained tasks should be dispatched from that point and deliver their results back with little or no interaction with the other threads.

    The worst thing you can do is randomly sprinkle a bunch of semaphores, mutexes, etc. all over your app.
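
    A small Python sketch of the "dispatch self-contained tasks from one point" design, assuming a pair of queues as the only place where threads interact. `word_count` and the sample `documents` are invented for the example.

```python
import queue
import threading

def word_count(text):
    """A self-contained task: touches only its argument, no shared state."""
    return len(text.split())

def worker(tasks, results):
    """Pull jobs until a sentinel arrives; hand each result back via a queue."""
    while True:
        item = tasks.get()
        if item is None:          # sentinel: no more work
            break
        index, text = item
        results.put((index, word_count(text)))

documents = ["one two three", "four five", "six", "seven eight nine ten"]
tasks, results = queue.Queue(), queue.Queue()

workers = [threading.Thread(target=worker, args=(tasks, results)) for _ in range(3)]
for w in workers:
    w.start()
for job in enumerate(documents):
    tasks.put(job)
for _ in workers:
    tasks.put(None)               # one sentinel per worker
for w in workers:
    w.join()

# All workers have exited, so the results queue is complete.
counts = [0] * len(documents)
while not results.empty():
    index, n = results.get()
    counts[index] = n

print(counts)  # [3, 2, 1, 4]
```

    The worker code contains no locks of its own. All of the synchronization lives inside the queues, so there is only one place to audit.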

  8. My J2EE Application will FLY by PHAEDRU5 · · Score: 3, Insightful

    Since I mostly work on J2EE stuff, I let the container take care of the threading for me. The one exception is J2EE Connector Architecture (JCA) bits that use the work manager. Even there, however, most of my work is simply putting a thin JCA layer in place between the outside world and the J2EE stack.

    For me, these new chips simply mean increased performance for deployed apps, without any modification to the app code.

    Beauty!

    --
    668: Neighbour of the Beast
  9. way to get it wrong by CaptainPinko · · Score: 5, Insightful

    As many others have already pointed out, Intel has had Hyperthreading available in Pentium 4 and Xeon CPUs for a couple of years now, which does exactly what the article is talking about.

    As many others know, you know exactly nothing about what you are talking about. HT has basically two sets of registers so that during a cache miss, which would cause a bubble, the chip switches to the other set so it doesn't sit idle. Sun's chips, on the other hand, actually have multiple cores physically doing work at the same time. In fact, were it not for Intel's hideously flawed NetBurst architecture, the hideous hack that is HyperThreading would not provide any performance increase at all (in fact it doesn't so much provide an increase as negate a decrease...). For evidence consider how many Pentium Ms have HT on them... Now I may not be fully correct but I didn't volunteer a comment; I only posted to prevent the misinformation of others. You'll find more on ArsTechnica. I'd link to the article but I can't find anything on their redesigned site.

    --
    Your CPU is not doing anything else, at least do something.
  10. Re:Efficiency and latency are mutual tradeoffs by farnz · · Score: 2, Insightful
    A processor's wait latency is the time it spends doing absolutely nothing while it waits for an external device to catch up. If your RAM latency is around 100 cycles, and context switching costs you 100 cycles, you're right in saying that efficiency goes down. On the other hand, if each context switch costs you 10 cycles, you can context switch nine times before you've started to lose efficiency.

    Sun are putting in hardware to ensure that context switches are fast (possibly even one or two cycles); hopefully, this will result in the context switches costing less than waiting for memory accesses, and speed up the throughput of the system as a whole. So, benchmarking one thread of execution will show a slow system, whereas a group will hopefully show a big speedup.
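
    A back-of-the-envelope Python model of that trade-off. It is a deliberate simplification and not Sun's figures: each thread does `WORK` cycles of useful work between memory accesses (an assumed number), a stall costs the 100 cycles mentioned above, and there are always enough ready threads to cover every stall.

```python
def efficiency_single_thread(work_cycles, stall_cycles):
    """Fraction of cycles doing useful work when the core idles during each stall."""
    return work_cycles / (work_cycles + stall_cycles)

def efficiency_with_switching(work_cycles, switch_cycles):
    """Fraction of useful cycles when every stall is covered by switching threads.

    Assumes enough ready threads that the stall is always fully hidden.
    """
    return work_cycles / (work_cycles + switch_cycles)

WORK = 100    # assumed cycles of useful work between memory accesses
STALL = 100   # RAM latency from the comment above

for switch_cost in (100, 10, 1):
    print(switch_cost,
          round(efficiency_single_thread(WORK, STALL), 3),
          round(efficiency_with_switching(WORK, switch_cost), 3))
```

    In this model a 100-cycle switch is no better than simply waiting, while a 10-cycle switch lifts the useful fraction from 0.5 to about 0.91 and a 1-cycle switch to about 0.99. Each individual thread still waits out its own stall, which is why a single-threaded benchmark would not show the gain.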

  11. Re:it means a lot by guitaristx · · Score: 5, Insightful

    As far as threading is concerned, one of the few languages I've dealt with that makes mutexes, semaphores, etc. easy to deal with is Java. Most other languages bury the stuff too deep into the proprietary APIs to make them useful. Consider multithreading in win32. We need better programming languages before we can ever start reaping the benefits of good multithreading hardware.

    Furthermore, we need to get rid of lazy programming. I'm tired of watching people write slow, lazy, inefficient (in terms of both memory space AND speed) code, and justify its existence with "it'll run fast on the new über-hyper-monkey-quadruple-bucky processors." Too many times, the problem is that you've got slow code running in every thread. If the code wasn't so damned lazy, programmers would care more about nifty new hardware. We're not even coming close to using our current hardware to capacity. I've got a 1.2GHz processor with 1024MB of RAM, and my box chugs opening an M$ Word doc?! WTF?!

    <soapbox>
    Most programming in the world is very similar to the universal statu$ symbol in the U.S.A. - a big gas-guzzling SUV. It's not like Jane the Soccer Mom really needs 300hp to haul her kids and groceries around town. Similarly, we have lots of lazy code out there that doesn't do much of anything but consume resources and pollute the environment. A nifty new processor feature won't be noticed in the computing world because it won't get used anyway, just like Jane the Soccer Mom wouldn't notice 100 more horsepower. </soapbox>

    --
    I pity the foo that isn't metasyntactic
  12. Re:Wrong. by pclminion · · Score: 2, Insightful
    Are you being purposefully dense?

    When a person says something, the intended meaning is not ambiguous (unless you are a poet), although the words used to describe that meaning may be.

    In this case it was intended to mean "What does it mean" and absolutely nothing else, your grammatical writhings notwithstanding.

  13. Complementary concepts by WebMink · · Score: 2, Insightful

    Since the Pentium 4 according to Intel, but it's not a good question as that's Intel's trademarked term for their two-thread implementation of simultaneous multithreading:

    Simultaneous multithreading allows multiple threads to execute different instructions in the same clock cycle, using the execution units that the first thread left spare.

    By contrast, Niagara is implementing Chip-level multiprocessing:

    CMP is SMP implemented on a single VLSI integrated circuit. Multiple processor cores (multicore) typically share a common second- or third-level cache and interconnect.

    In other words, Niagara implements in hardware, at greater scale, what Pentium 4 offers as an emulation feature. In theory one could SMP on top of CMP chipsets for even greater throughput. If you find the Sun article too hard, the Wikipedia references I have cited will probably prove much easier to understand.

    1. Re:Complementary concepts by WebMink · · Score: 2, Insightful
      Actually the wikipedia article doesn't say that. It says:
      Sun Microsystems, in contrast, considers its UltraSPARC IV to be a multi-threaded rather than multi-processor chip. Intel agrees with Sun. This is not an idle debate, because software is often more expensive when licensed for more processors.

      Sun refers to the architecture as Chip-level Multi-Threading (CMT) and according to the white paper, while there are indeed multiple cores, each can also multi-thread:

      Sun's CMT processors will also have multiple cores on a single piece of silicon, with each core being able to process multiple threads, as shown in Figure 1.5. As a result, a single CMT processor will be able to process tens of threads simultaneously, exponentially increasing the amount of data processed each second.

      Cache also seems to have been considered:

      Shared chip resources such as large amounts of cache are designed to speed communications between cores to streamline parallel processing of threads.

      So while breathless enthusiasm may not be in order, a certain level of optimism seems warranted :-)

  14. Re:it means a lot by Homology · · Score: 2, Insightful
    As far as threading is concerned, one of the few languages I've dealt with that makes mutexes, semaphores, etc. easy to deal with is Java. Most other languages bury the stuff too deep into the proprietary APIs to make them useful. Consider multithreading in win32 [microsoft.com]. We need better programming languages before we can ever start reaping the benefits of good multithreading hardware.

    Pure bullshit.

  15. it means a lot-Erlang. by Anonymous Coward · · Score: 1, Insightful

    "As far as threading is concerned, one of the few languages I've dealt with that makes mutexes, semaphores, etc. easy to deal with is Java."

    So can Erlang.

    Wings3D is written in Erlang.

  16. Re:This is just Multi-core processing... by mzito · · Score: 2, Insightful


    Oracle's been talking about reworking their licensing for a long time, and I agree licensing by core is sub-optimal. However, Oracle is being forthright that they charge by core, while Sun is _hiding_ the fact the USIV _is_ a multi-core processor.

    Sure, Oracle are the ones charging per processor core, but Sun is the company that is selling this upgrade as a painless, cost-effective way to upgrade their infrastructure. I firmly believe they are being negligent in not warning customers that this is a multi-core architecture - if you go to Sun's site and look at how it's sold, they pitch it as one processor, one core.

    Imagine you're a customer - you spend $100k on Sun's new processors as a "painless" 1-1 upgrade, and suddenly find out that the first 100k has put you on the hook for 150k in new licenses. Wouldn't you feel like you'd been misled?

    Thanks,
    Matt

    --
    me@mzi.to
  17. Re:Hyperthreading by BigZaphod · · Score: 2, Insightful

    Make a comment and ask a question and get marked as troll.

    Go figure.

  18. Re:FTA/RAIABSF instead of FT/RAID? by SuiteSisterMary · · Score: 2, Insightful

    Ah, but when you have one physical 'chip' that actually consists of four processor cores, you *can* do four simultaneous tasks on one processor.

    The advantage over good old fashioned SMP? Well, probably the interconnect is way faster, and if the cores all share some cache or something, sibling threads should see some benefit.

    --
    Vintage computer games and RPG books available. Email me if you're interested.
  19. Re:it means a lot by Gauchito · · Score: 2, Insightful

    Another problem with multi-threading is that nothing is a black box anymore (not like anything really is, anyway). Once you start worrying about sharing statics and globals, you need to consider all the accesses done by objects you bring in from other libraries, which means you need to check the source to see if, for example, it uses a static cache (with no locking). Then, you need to dig in, find out why you're getting seg faults or corrupted memory, track down where else you're using this class (could be, for example, inside another object entirely, which could again be from a different library), synchronize with that other thread that you thought was completely unrelated (i.e., used the same class, but had its own instance of it), rinse, and repeat.

    If you are using third party or homegrown (but not yours) libraries inside your multithreaded program, pretty soon you'll realize that not only do you need to know what and how they are accessing, but you also need to keep close tabs on what changes are done in future releases (again, keeping track of implementation details in those releases). Your quick and highly parallelized threading program just became a maintenance nightmare.

  20. Re:it means a lot by Anonymous Coward · · Score: 1, Insightful

    I'm quite impressed that you managed to convince yourself that Java multithreading is in any way better than win32's multithreading API. I mean, have you ever even tried to write a program with nontrivial multithreading in Java?

    Just an example, in win32, a thread can block to wait on multiple locks or events and wake when any of them is signalled (e.g. waiting on multiple sockets, or on multiple client threads, or until one of a set of locks is open). This is just a single WaitForMultipleObjectsEx call. This is impossible to do in Java. You have to have every lock run its own thread, to grab a lock and do a signal. At least now Java has a socket.select, so you don't need to keep a thread per socket (as you did in 1.0 and 1.1).
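
    For readers who have not used a wait-for-any primitive, here is a loose analogue in Python. It is neither the Win32 call nor Java: `concurrent.futures.wait` with `return_when=FIRST_COMPLETED` blocks until any one of several pending results is ready. `fast_source`, `slow_source`, and the `release` event are invented for the example.

```python
import threading
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED

release = threading.Event()   # held back so the slow sources cannot finish early

def fast_source():
    return "fast"

def slow_source():
    release.wait()            # blocks until the main thread releases it
    return "slow"

with ThreadPoolExecutor(max_workers=3) as pool:
    futures = [pool.submit(slow_source),
               pool.submit(fast_source),
               pool.submit(slow_source)]

    # Wake as soon as any one of the three has a result.
    done, pending = wait(futures, return_when=FIRST_COMPLETED)
    first_results = sorted(f.result() for f in done)
    still_pending = len(pending)

    release.set()             # let the slow sources finish so the pool can exit

print(first_results, still_pending)  # ['fast'] 2
```

    Note that the pool still dedicates one thread to each source being waited on, which is the per-object-thread workaround described above rather than a single kernel-level wait.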

    Try writing something to handle the reader/writer problem with multiple readers and writers. You require some really contrived code using something like five locks in order for it to work correctly. Even the reader/writer streams provided by the Java API only work with a single pair of threads.
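
    To make the reader/writer problem concrete, here is a minimal sketch in Python rather than Java, built on a single condition variable. It is reader-preferring, so a steady stream of readers could starve writers; a production lock would need a fairness policy. The `ReadWriteLock` class is written for this example and is not a standard library type.

```python
import threading

class ReadWriteLock:
    """Many concurrent readers or one exclusive writer (reader-preferring)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0        # number of threads currently reading
        self._writing = False    # True while one writer holds the lock

    def acquire_read(self):
        with self._cond:
            while self._writing:
                self._cond.wait()
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()   # a waiting writer may proceed

    def acquire_write(self):
        with self._cond:
            while self._writing or self._readers > 0:
                self._cond.wait()
            self._writing = True

    def release_write(self):
        with self._cond:
            self._writing = False
            self._cond.notify_all()       # wake readers and writers alike

state = {"value": 0}
rw = ReadWriteLock()
last_seen = [None] * 4                    # one slot per reader thread

def writer():
    for _ in range(1000):
        rw.acquire_write()
        try:
            state["value"] += 1
        finally:
            rw.release_write()

def reader(slot):
    for _ in range(1000):
        rw.acquire_read()
        try:
            last_seen[slot] = state["value"]
        finally:
            rw.release_read()

threads = [threading.Thread(target=writer) for _ in range(3)]
threads += [threading.Thread(target=reader, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(state["value"])  # 3000
```

    The try/finally around every critical section matters as much as the lock itself: a reader that raises without releasing would block all writers indefinitely.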

    The Windows API is ugly in parts, especially the GUI section, but the Windows base API is quite nice. I like their synchronization stuff a lot. It's even object oriented, of sorts. For example, you always use the same calls to wait for any object, regardless of the type of object (mutex, lock, semaphore, event, file handle, thread, etc), a sort of dynamic dispatch.

    I'm told Java 1.5 fixed up a lot of these problems, but unless you started coding a few months ago, I don't understand how you can consider Java's threading API anything but crippled and lousy.