Slashdot Mirror


Intel Releases Threading Library Under GPL 2

littlefoo writes "Intel Software Dispatch have announced the availability of the Threading Building Blocks (TBB) template library under the GPL v2 with the run-time exception, so this previously commercial-only package is now open for all to use, whether for open-source projects or commercial offerings (although they are explicitly encouraging open source use). The interface is more task-based then thread-based, but with a somewhat different view of things than, e.g., OpenMP. From the Intel release: 'Intel® Threading Building Blocks (TBB) offers a rich and complete approach to expressing parallelism in a C++ program. It is a library that helps you leverage multi-core processor performance without having to be a threading expert. Threading Building Blocks is not just a threads-replacement library. It represents a higher-level, task-based parallelism that abstracts platform details and threading mechanism for performance and scalability.'"

16 of 158 comments

  1. I'm glad to hear it by ookabooka · · Score: 5, Informative

    I attended a seminar about this at GDC (Game Developers Conference) this year. It is really nifty stuff: it automatically parallelizes things for you and helps take the load off of the OS scheduler. It is also trivial to implement in many cases. For instance, there are parallel loops that execute things in parallel; all you have to do is write it like a normal loop but use a different keyword (OK, so it is a wee bit more involved, but you get the idea). If I recall correctly, it is basically a thread pool that manages scheduling itself better than the OS because it knows ahead of time the needs of the code. You also don't have to know the number of cores or anything, as it handles that transparently. It also isn't limited to Intel processors; I'm pretty sure at GDC it was actually being demoed on some SPARC machines. If I had the time and/or a reason to use it I would definitely investigate further.

    --
    If you are about to mod me down, keep in mind that this post was most likely sarcastic.
  2. Looks good, but a little hampered by C++ by TomorrowPlusX · · Score: 4, Insightful

    I looked at some of the tutorials yesterday, and I believe I'm going to dip my toes in this.

    But. As much as I love C++ ( and I do ) the real weakness is the lack of usable closures/lambda. The parallel_for example requires you to pass a functor to execute on ranges, which is fine, it makes sense, but since you can't define the closure in the calling-scope in C++ you end up filling your namespace with one-off function objects.

    This is not a critique of TBB, but rather of C++. In Java I can make an anonymous subclass within function scope. In Python and, hell, even JavaScript I can make anonymous functions to pass around. But in C++ I can't, and this means that my code will be ugly.

    Not that this is new news. I use Boost.thread for threading right now, and most of my functors are defined privately in class scope ( which is, at the very least, not polluting my namespace ) but it's too bad that I don't have a more elegant option in C++.

    That being said, Boost.lambda makes my brain hurt a little, so my complaints are really just a tempest in a teacup. If I were smarter and could really grok C++ I could probably use Boost.Lambda and this would be a non-issue.
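    To make the functor complaint concrete, here is a minimal sketch. It does not use TBB: `RangeSketch` and `for_range_sketch` are hypothetical stand-ins for a blocked range and a parallel_for-style call, and the stand-in simply runs the body serially. The first style is the namespace-scope function object described above; the second uses a lambda, which only became available with C++11, after this thread was written.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a blocked range: the half-open interval [begin, end).
struct RangeSketch {
    std::size_t begin;
    std::size_t end;
};

// Hypothetical, serial stand-in for a parallel_for-style call. It hands the
// whole range to the body; a real scheduler would split it across threads.
template <class Body>
void for_range_sketch(RangeSketch r, const Body& body) {
    assert(r.begin <= r.end);
    body(r);
}

// Style 1: a one-off function object declared outside the calling function.
struct ScaleBody {
    std::vector<double>* data;
    double factor;
    void operator()(const RangeSketch& r) const {
        for (std::size_t i = r.begin; i != r.end; ++i) (*data)[i] *= factor;
    }
};

void scale_with_functor(std::vector<double>& v, double factor) {
    ScaleBody body = {&v, factor};
    for_range_sketch(RangeSketch{0, v.size()}, body);
}

// Style 2: a C++11 lambda defined in the calling scope, capturing what it needs.
void scale_with_lambda(std::vector<double>& v, double factor) {
    for_range_sketch(RangeSketch{0, v.size()}, [&v, factor](const RangeSketch& r) {
        for (std::size_t i = r.begin; i != r.end; ++i) v[i] *= factor;
    });
}
```

    Both functions do the same work; the difference is only where the loop body has to be declared.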

    --

    lorem ipsum, dolor sit amet
    1. Re:Looks good, but a little hampered by C++ by ray-auch · · Score: 3, Informative

      Erm, yes, C++ has local classes, however there is a "BUT" and it's a big one:

      Local classes / structs do not have external linkage and therefore can't be used as template arguments. So, for functors etc., which is precisely where you'd want something like a local class (ie. because you really want a closure), they are useless.
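      A small sketch of the restriction, using only the standard library. The function name is hypothetical. Under C++03 rules a compiler should reject the `std::for_each` call because a local type is used as a template argument; C++11 lifted that restriction, so the same code compiles with a C++11 or later compiler.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

int sum_with_local_functor(const std::vector<int>& values) {
    // Local struct: defined inside the function, right where it is used.
    struct Accumulate {
        int* total;
        void operator()(int x) const { *total += x; }
    };

    int total = 0;
    // std::for_each deduces Accumulate as a template type argument.
    // C++03 forbids a local type here; C++11 and later accept it.
    std::for_each(values.begin(), values.end(), Accumulate{&total});
    assert(values.empty() ? total == 0 : true);
    return total;
}
```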

      Hence why we have Boost lambda. Except, and I agree with the GP, the syntax ends up so horrible (due to the constraints of C++, not in any way the fault of the Boost devs) that you end up not using it. Not a lot of point in trying to do something because it is technically cleaner and neater if it ends up unreadable and therefore unmaintainable (for that, there is always Perl).

  3. Re:As if enough people weren't already confused... by ookabooka · · Score: 3, Informative

    That's the thing: it makes programming easier by making the whole parallel thing a bit more transparent. Basically picture a foreach loop. This thing allows you to do the same thing, but it can run multiple iterations of the loop at once and automatically uses the "optimal" number of threads based on the cores available; you just have to call parallel_for. It's not quite as simple as that but it certainly does take the grunt work out of parallelizing things.
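    The idea of sizing the work to the available cores can be sketched with the standard library alone. This is not TBB's parallel_for and makes no claim about how TBB schedules work: `parallel_for_sketch` is a hypothetical helper that splits an index range into one contiguous chunk per hardware thread and starts a fresh `std::thread` for each chunk.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper: calls body(i) for every i in [begin, end),
// using one thread per chunk. The body must be safe to call concurrently.
template <class Body>
void parallel_for_sketch(std::size_t begin, std::size_t end, const Body& body) {
    assert(begin <= end);
    const std::size_t n = end - begin;
    if (n == 0) return;

    // hardware_concurrency() may return 0 when the count is unknown.
    std::size_t workers = std::thread::hardware_concurrency();
    if (workers == 0) workers = 1;
    workers = std::min(workers, n);

    const std::size_t chunk = (n + workers - 1) / workers;  // ceiling division
    std::vector<std::thread> threads;
    threads.reserve(workers);
    for (std::size_t lo = begin; lo < end; lo += chunk) {
        const std::size_t hi = std::min(lo + chunk, end);
        threads.emplace_back([lo, hi, &body] {
            for (std::size_t i = lo; i != hi; ++i) body(i);
        });
    }
    for (std::thread& t : threads) t.join();
}
```

    The caller never mentions a core count, e.g. `parallel_for_sketch(0, v.size(), [&v](std::size_t i) { v[i] *= 2; });`. Creating threads per call and using fixed chunks is a simplification chosen to keep the sketch short.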

    --
    If you are about to mod me down, keep in mind that this post was most likely sarcastic.
  4. Re:task based then thread based by Holi · · Score: 5, Funny

    >There are 11 types of people in the world, those who know binaries and those who don't.

    Obviously you are in the "those who don't" group.

    --
    Sorry, teleporters just kill you and then make a copy. A perfect, soul-less copy.
  5. Re:I'm thinking by hrieke · · Score: 5, Informative

    The AMD question was raised on their forums, and there are no issues with TBB running on AMD CPUs.
    And, if there was, well it's under the GPL now, and I'm sure someone would have added / corrected that mistake.

    --
    III.IIVIVIXIIVIVIIIVVIIIIXVIIIXIIIIIIIIVIIIIVVIIIV IIVIIIIIIVIII...
  6. Re:GPL 2 by Aladrin · · Score: 3, Informative

    It depends on which version of the GPL you use. There's a 'runtime exception' version (which Intel chose for this project) that allows you MORE freedom than the LGPL in the case of libraries.

    Simply put, you can link in the code as a library without worrying about LGPL's library requirements. (Namely the need to be able to replace the library with an upgraded version.) Intel notes that this is necessary for C++ libraries because of the way they have to be linked.

    As for the parent's code, I doubt he included this clause in the version of the GPL he chose, so it wouldn't be possible with his.

    --
    "If you make people think they're thinking, they'll love you; But if you really make them think, they'll hate you." - DM
  7. Re:As if enough people weren't already confused... by Doctor+Memory · · Score: 3, Informative

    >it makes programming easier by making the whole parallel thing a bit more transparent

    I'd argue that it makes things more opaque, by abstracting away the need to explicitly deal with threads. Instead, you just define "tasks" that can run concurrently, and the toolkit takes care of mapping the tasks to actual threads.

    Agreed, it does look to take a lot of the grunt work out of writing parallel-processing code. There are supposedly Java and .NET versions under development; it'll be interesting to see if they're able to implement the concepts as cleanly as in C++. My guess is both implementations will be a little "clunky" (cumbersome and less efficient).
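    As a rough illustration of "define tasks, let the toolkit map them to threads", here is a minimal task pool built on the standard library. `TaskPoolSketch` is a hypothetical class, not part of TBB, and it uses a single locked queue, which is an assumption made for brevity rather than a description of TBB's scheduler.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical minimal task pool: callers submit tasks and never touch threads.
class TaskPoolSketch {
public:
    explicit TaskPoolSketch(std::size_t workers) {
        assert(workers > 0);
        for (std::size_t i = 0; i < workers; ++i)
            threads_.emplace_back([this] { run(); });
    }

    // Lets the workers drain the queue, then joins them.
    ~TaskPoolSketch() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        ready_.notify_all();
        for (std::thread& t : threads_) t.join();
    }

    TaskPoolSketch(const TaskPoolSketch&) = delete;
    TaskPoolSketch& operator=(const TaskPoolSketch&) = delete;

    void submit(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            tasks_.push(std::move(task));
        }
        ready_.notify_one();
    }

private:
    // Worker loop: take the next task, run it outside the lock, repeat.
    void run() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                ready_.wait(lock, [this] { return stopping_ || !tasks_.empty(); });
                if (tasks_.empty()) return;  // stopping and nothing left to do
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task();
        }
    }

    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<std::function<void()>> tasks_;
    bool stopping_ = false;
    std::vector<std::thread> threads_;
};
```

    Which thread runs which task is decided entirely inside the pool, which is the opacity being argued about above.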
    --
    Just junk food for thought...
  8. Memory requirements - bummer by ohell · · Score: 3, Interesting

    I read on their FAQ that TBB requires 512MB to run, though they recommend 1GB. This appears to be very high, especially when compared to Boost.Threads etc. I can't think of a reason why they need to allocate this much - and it would probably be a problem for consumer applications.

    Also from the FAQ, the so-called concurrent containers still need to be locked before access. So no change from normal STL containers there.

    But I will download it just for the memory allocator they supply, since it can be plugged into STL, and claims to hand out cache-aligned memory. It can apparently be built independently of the rest of TBB.
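    For readers wondering what "plugged into STL" looks like, here is a minimal sketch of a cache-aligned allocator. It is not TBB's allocator: `CacheAlignedAllocatorSketch` is a hypothetical name, the 64-byte line size is an assumption, and the code relies on C++17 aligned `operator new`, which did not exist when this thread was written.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>
#include <vector>

// Assumed cache line size; real hardware may differ.
constexpr std::size_t kAssumedCacheLine = 64;

inline bool is_cache_aligned(const void* p) noexcept {
    return reinterpret_cast<std::uintptr_t>(p) % kAssumedCacheLine == 0;
}

// Hypothetical allocator: every block starts on an assumed cache-line boundary.
template <class T>
struct CacheAlignedAllocatorSketch {
    using value_type = T;

    CacheAlignedAllocatorSketch() noexcept = default;
    template <class U>
    CacheAlignedAllocatorSketch(const CacheAlignedAllocatorSketch<U>&) noexcept {}

    T* allocate(std::size_t n) {
        if (n > static_cast<std::size_t>(-1) / sizeof(T)) throw std::bad_alloc();
        void* p = ::operator new(n * sizeof(T),
                                 static_cast<std::align_val_t>(kAssumedCacheLine));
        return static_cast<T*>(p);
    }

    void deallocate(T* p, std::size_t) noexcept {
        assert(is_cache_aligned(p));  // must have come from allocate()
        ::operator delete(p, static_cast<std::align_val_t>(kAssumedCacheLine));
    }
};

// The allocator is stateless, so all instances are interchangeable.
template <class T, class U>
bool operator==(const CacheAlignedAllocatorSketch<T>&,
                const CacheAlignedAllocatorSketch<U>&) noexcept { return true; }
template <class T, class U>
bool operator!=(const CacheAlignedAllocatorSketch<T>&,
                const CacheAlignedAllocatorSketch<U>&) noexcept { return false; }
```

    Plugging it in is a matter of naming it as the container's second template argument, e.g. `std::vector<int, CacheAlignedAllocatorSketch<int>> v(100, 7);`.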

    --
    Three o'clock is always too late or too early for anything you want to do. - Jean-Paul Sartre
  9. Re:task based then thread based by dubbreak · · Score: 3, Funny
    I fixed it for him:

    There are 11 types of people in the world: those who know binaries, those who don't and those who don't.


    The then/than mixup is kind of funny though. Reminds me of something I read in the engineering faculty on a white board (I assume a first year engineer):
    "I'd rather be retarded then do my engineering homework.."

    Looks like he had the pre-requisite fulfilled and should have just got on with the homework.
    --
    "If you are going through hell, keep going." - Winston Churchill
  10. Re:GPL 2 only by networkBoy · · Score: 4, Interesting

    Which is perfectly fine. I have a friend at Intel and based on what I've heard of the corporate culture, open ended licenses are a no-go. That doesn't mean they won't later release under GPL v3, just that they want their lawyers to have a chance to review any license they release under and don't want to be beholden to the unknown. Frankly I think that's a good thing. In theory GPLv4 could say "this can be used in closed source proprietary DRM schemes," and if they had the "or later" clause they would have to allow it.
    -nB

    --
    whois gawk date unzip strip find touch finger mount join nice man top fsck grep eject more yes exit umount sleep dump
  11. PS3? by LinuxGeek · · Score: 3, Interesting

    I checked the site and forum, but no search results on PS3. Having just bought a shiny new 60gig PS3, this release makes me wonder just how easy it could be to take fairly good advantage of all the cores.

    Hmmm, it may be one of my first projects; six cores running @ 3.2GHz and an easy method of putting them to use. It would be interesting to parallelize pi calculation and see how long it would take to get one million digits.

    --

    Kindness is the language which the deaf can hear and the blind can see. - Mark Twain
    1. Re:PS3? by Doctor+Memory · · Score: 3, Informative

      >Having just bought a shiny new 60gig PS3, this release makes me wonder just how easy it could be to take fairly good advantage of all the cores.

      That should be interesting, since the Cell is a non-orthogonal multi-core CPU (sort of like a PPC core with multiple AltiVec units). Opcodes for the main core (the PPE) are Power/PowerPC, while the satellite processors (the SPEs) run a vector (similar to the AltiVec or VMX) instruction set. I believe the PPE can also execute the vector instructions, so maybe it would be possible to just target that. I'm not sure how general-purpose those opcodes are, though, and since I don't believe the PPE has the SPE's complement of 128 registers, you might wind up just supporting whatever register set the PPE has.
      --
      Just junk food for thought...
  12. But the thing is by Sycraft-fu · · Score: 3, Informative

    C++ (or C) is where all the fast code is still written. Thus it is the most relevant place for this kind of thing. If you look at Intel's page, you'll see they sell compilers, but only for two languages: C/C++ and Fortran. The reason is that their compilers exist specifically to get as much performance as possible out of an x86/x64 chip. So they target the languages people use when they are performance oriented. There are lots of other great languages out there, but face it, you aren't (or at least shouldn't be) using a managed language like Java when every last clock cycle counts.

    You'll find that this is rather evident in most games. While it is increasingly common to write large portions of the game in a scripting language, since that makes it easier to write and, perhaps more importantly, easier to mod, you'll find that the high speed stuff is still C++. Take Civ 4 for example. They wrote almost the whole damn game in XML and Python. All data (like unit definitions, technology tree, etc.) is stored in XML files, and all the scripting necessary to make them work is Python. That makes the game extremely easy to mod. However, the AI code, which they also released to end users, is in C++. The reason is that the AI is highly intensive and would have run too slowly in Python. Also, the core engine of the game (not released to users) is C++ as well.

    So it isn't surprising this is where Intel is targeting their optimisations. Also, I'd argue that to a large degree any of this kind of thing for a managed language is the responsibility of the runtime itself. If Java is to have better support for automatically threading things, the JRE is probably where that should be done.

  13. Re:Compatibility kinda sucks by GooberToo · · Score: 3, Informative

    I compiled and ran the examples on my AMD system. They run without issue.

  14. Re:Compatibility kinda sucks by James_Intel · · Score: 4, Informative

    We've been supporting Linux, Windows and Mac OS X for x86, x86-64 and Itanium processors in the commercial product for a year. And, yes, those include Intel and AMD processors. The commercial product information only lists those.

    The commercial product information quoted does not include some ports which were completed for the open source project only days before the open source release.

    Preparing for open source, we were able to get G5 for Mac OS X as well as support for Solaris and FreeBSD (both x86 and x86-64) working before releasing on Tuesday. It was tight - but they made it. I wasn't sure until the week before what we would have - but the team got them working. I think it will be easier now that the project is started - and we can let others join in to help us.

    I should also say we got a bunch more Linux distributions working for builds too. We have tested them enough to see no issues - but we haven't enough experience to call them supported on the product pages (commercial product). Please look for the latest ports on the open source project threadingbuildingblocks.org. We'll work with anyone who has processors/system expertise and needs any advice we can offer. Understandably, we don't have a lot of non-Intel hardware inside Intel to test upon and we are hoping others can help a bit with that.

    For compilers - we have gcc, Intel, Microsoft and Apple (gcc in Xcode environment) compilers all working with the builds. It seems like we may have something to do to get Sun's compilers and/or environment working - some Sun engineers are in touch and helping us double check this. No schedule - just working together - which I have faith will get results to put out in an updated open source copy in the not too distant future - non-binding wish - this is not a promise ;-) We're talking about what to do together to add SPARC support too - which shouldn't be too hard but will take some work.

    The biggest issue from processor to processor is knowing how best to implement a few key locks and atomic operations in assembly language. Since we have support for processors with both weak and strong memory consistency models - we know TBB is up to the task.
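    To illustrate the kind of primitive being described, here is a minimal spin lock written with C++11 `std::atomic_flag`. It is a sketch under stated assumptions, not TBB's implementation: `SpinLockSketch` and `count_with_spinlock` are hypothetical names, and the standard atomics used here postdate this thread. The acquire/release orderings are what keep the lock correct on processors with weak memory consistency; the compiler turns them into whatever instructions each target needs.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical spin lock built on a single atomic flag.
class SpinLockSketch {
public:
    void lock() noexcept {
        // test_and_set is an atomic read-modify-write. Acquire ordering keeps
        // the critical section from being reordered before the lock is taken.
        while (flag_.test_and_set(std::memory_order_acquire)) {
            std::this_thread::yield();  // let the current holder make progress
        }
    }

    void unlock() noexcept {
        // Release ordering publishes the critical section's writes
        // to the next thread that acquires the lock.
        flag_.clear(std::memory_order_release);
    }

private:
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
};

// Demo helper: several threads increment a plain int under the lock.
int count_with_spinlock(int threads, int increments_per_thread) {
    assert(threads > 0 && increments_per_thread >= 0);
    SpinLockSketch lock;
    int counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&lock, &counter, increments_per_thread] {
            for (int i = 0; i < increments_per_thread; ++i) {
                lock.lock();
                ++counter;
                lock.unlock();
            }
        });
    }
    for (std::thread& th : pool) th.join();
    return counter;
}
```

    Without the lock, the unsynchronized `++counter` would be a data race; with it, the helper returns exactly `threads * increments_per_thread`.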

    TBB is very strongly tied to shared memory, and so a port to a Cell processor (or a GPU) would be a bit more challenging - but might be doable for the Cell. We've had only a few discussions/thoughts - no progress I know of figuring out a good approach there. That will almost certainly take someone with more Cell experience than we have at this time. I'm open to learning - but I'd need a teacher for sure.