Intel Releases Threading Library Under GPL 2
littlefoo writes "Intel Software Dispatch have announced the availability of the Threading Building Blocks (TBB) template library under the GPL v2 with the run-time exception, so this previously commercial-only package is now open for all to use, whether for open-source projects or commercial offerings (although they are explicitly encouraging open source use). The interface is more task-based then thread-based, but with a somewhat different view of things than, e.g. OpenMP.
From the Intel release: 'Intel® Threading Building Blocks (TBB) offers a rich and complete approach to expressing parallelism in a C++ program. It is a library that helps you leverage multi-core processor performance without having to be a threading expert. Threading Building Blocks is not just a threads-replacement library. It represents a higher-level, task-based parallelism that abstracts platform details and threading mechanism for performance and scalability.'"
If it's as smooth as the Intel C compilers this ought to be a treat. Now if only they'd release icc under a similar license.
As the GPL 2 they link to says:
"Each version is given a distinguishing version number. If the Program specifies a version number of this License which applies to it and "any later version", you have the option of following the terms and conditions either of that version or of any later version published by the Free Software Foundation"
You can of course get it as GPL 3....
I find it interesting that the original poster took the trouble to differentiate between open source and commercial offerings as if there has to be a difference.
"You can now flame me, I am full of love,"
I attended a seminar about this at GDC (Game Developers Conference) this year. It is really nifty stuff: it automatically parallelizes things for you and helps take the load off of the OS scheduler. It is also trivial to implement in many cases. For instance, there are parallel loops that execute things in parallel; all you have to do is write it like a normal loop but use a different keyword (ok, so it is a wee bit more involved, but you get the idea). If I recall correctly, it is basically a thread pool that manages scheduling itself better than the OS because it knows ahead of time the needs of the code. Also, you don't have to know the # of cores or anything, as it handles that transparently. It also isn't limited to Intel processors; I'm pretty sure at GDC it was actually being demoed on some SPARC machines. If I had the time and/or a reason to use it I would definitely investigate further.
If you are about to mod me down, keep in mind that this post was most likely sarcastic.
I looked at some of the tutorials yesterday, and I believe I'm going to dip my toes in this.
But. As much as I love C++ ( and I do ) the real weakness is the lack of usable closures/lambda. The parallel_for example requires you to pass a functor to execute on ranges, which is fine, it makes sense, but since you can't define the closure in the calling-scope in C++ you end up filling your namespace with one-off function objects.
This is not a critique of TBB, but rather of C++. In Java I can make an anonymous subclass within function scope. In Python and, hell, even JavaScript I can make anonymous functions to pass around. But in C++ I can't, and this means that my code will be ugly.
Not that this is new news. I use Boost.thread for threading right now, and most of my functors are defined privately in class scope ( which is, at the very least, not polluting my namespace ) but it's too bad that I don't have a more elegant option in C++.
That being said, Boost.lambda makes my brain hurt a little, so my complaints are really just a tempest in a teacup. If I were smarter and could really grok C++ I could probably use Boost.Lambda and this would be a non-issue.
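For anyone who hasn't run into the complaint above, here is a minimal sketch of the body-object pattern it describes. Everything in it is made up for illustration: `IndexRange`, `ScaleBody`, and `for_each_chunk` are hypothetical names, not TBB types or functions, and the driver runs the chunks one after another instead of in parallel. The point is only how much scaffolding a one-line loop body needs when it has to live in a named function object outside the calling function.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical half-open index range [begin, end); not a TBB type.
struct IndexRange {
    std::size_t begin;
    std::size_t end;
};

// The one-off function object: it has to be declared outside the calling
// function, and every captured variable becomes an explicit member.
struct ScaleBody {
    std::vector<double>* data;
    double factor;

    void operator()(const IndexRange& r) const {
        for (std::size_t i = r.begin; i < r.end; ++i) {
            (*data)[i] *= factor;
        }
    }
};

// Hypothetical driver: cuts [0, n) into chunks of at most `grain` elements
// and hands each chunk to the body. It runs the chunks sequentially; a real
// parallel library would distribute them across worker threads.
template <typename Body>
void for_each_chunk(std::size_t n, std::size_t grain, const Body& body) {
    if (grain == 0) grain = 1;  // avoid an endless loop on a zero grain size
    for (std::size_t lo = 0; lo < n; lo += grain) {
        IndexRange r = { lo, (lo + grain < n) ? lo + grain : n };
        body(r);
    }
}
```

Calling it looks like `ScaleBody body = { &values, 3.0 }; for_each_chunk(values.size(), 4, body);`, which is the part that would shrink to one line with a usable closure.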
lorem ipsum, dolor sit amet
That's the thing, it makes programming easier by making the whole parallel thing a bit more transparent. Basically picture a foreach loop. This thing allows you to do the same thing, but it can run multiple iterations of the loop at once and automatically uses the "optimal" number of threads based on the cores available; you just have to call parallel_for. It's not quite as simple as that but it certainly does take the grunt work out of parallelizing things.
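To make the "foreach loop split across cores" picture concrete, here is a rough sketch using only the standard library. `naive_parallel_for` is a hypothetical helper, not TBB's `parallel_for`: it spawns one thread per hardware core and gives each a contiguous slice of the index range, whereas TBB (as far as I understand it) schedules tasks on a shared pool of workers. It assumes the loop iterations are independent of each other.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper: calls body(i) for every i in [0, n), splitting the
// index range into one contiguous slice per hardware thread.
template <typename Body>
void naive_parallel_for(std::size_t n, const Body& body) {
    if (n == 0) return;

    std::size_t workers = std::thread::hardware_concurrency();
    if (workers == 0) workers = 1;    // the standard allows 0 for "unknown"
    workers = std::min(workers, n);   // never more threads than iterations

    const std::size_t chunk = (n + workers - 1) / workers;  // ceiling division
    std::vector<std::thread> threads;

    for (std::size_t w = 0; w < workers; ++w) {
        const std::size_t lo = w * chunk;
        const std::size_t hi = std::min(n, lo + chunk);
        if (lo >= hi) break;
        // Each thread runs its own slice; slices never overlap.
        threads.emplace_back([lo, hi, &body] {
            for (std::size_t i = lo; i < hi; ++i) body(i);
        });
    }
    for (std::thread& t : threads) t.join();
}
```

The caller never mentions a core count, which is the convenience being described; the price in this naive version is that threads are created and destroyed on every call.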
Hopefully their compiler will follow suit. This sounds like a great move for Intel, especially since the lion's share of its income is from processors and semiconductors; this will encourage more people to use their tools.
Like most things in CS, I think it's important to understand the theory of writing multi-threaded applications before letting software do it for you.
;-)
That said, I'm sure most CS courses teach at least the basics of memory management, but people are still happy to rely on the Java garbage collector
>There are 11 types of people in the world, those who know binaries and those who don't.
Obviously you are in the "those who don't" group.
Sorry, teleporters just kill you and then make a copy. A perfect, soul-less copy.
The AMD question was raised on their forums, and there are no issues with TBB running on AMD CPUs.
And, if there were, well, it's under the GPL now, and I'm sure someone would have added / corrected that mistake.
III.IIVIVIXIIVIVIIIVVIIIIXVIIIXIIIIIIIIVIIIIVVIII
Agreed it does look to take a lot of the grunt work out of writing parallel-processing code. There are supposedly Java and
Just junk food for thought...
From the Development download src/tbb/Makefile:
# Copyright 2005-2007 Intel Corporation. All Rights Reserved.
#
# This file is part of Threading Building Blocks.
#
# Threading Building Blocks is free software; you can redistribute it
# and/or modify it under the terms of the GNU General Public License
# version 2 as published by the Free Software Foundation.
There's no "Or Later" in there. This is GPL v2 only.
So.. it has come to this
I've got a program that does benefit enormously from using multiple cores. I looked into the TBB first, and I have to say my head hurt for an hour after looking at their examples. It would have required a serious rewrite of my core numerical routines, and not in a pretty way. I've found the OpenMP pragmas to be the easiest way to maintain the structure of existing code while leveraging the multiple cores. Now, there are very few examples of OpenMP that do anything useful on the web, but after a couple of hours of reading, I was able to very easily integrate it with maybe an extra couple of lines of code and some very minor reworking of the existing code.
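For the curious, this is roughly what "an extra couple of lines" looks like. It is a generic sketch, not the numerical code from my program: `saxpy` and `dot` are illustrative function names, while `#pragma omp parallel for` and the `reduction(+:sum)` clause are standard OpenMP. If OpenMP is not enabled at compile time (for example, no `-fopenmp` with GCC), the pragmas are ignored and the loops run serially with the same results.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// y = a * x + y, element by element. Assumes y has at least as many
// elements as x. The pragma is the only OpenMP-specific line.
void saxpy(double a, const std::vector<double>& x, std::vector<double>& y) {
    const long n = static_cast<long>(x.size());
    #pragma omp parallel for
    for (long i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}

// Dot product. The reduction clause gives every thread a private partial
// sum and combines the partial sums when the loop finishes.
double dot(const std::vector<double>& x, const std::vector<double>& y) {
    double sum = 0.0;
    const long n = static_cast<long>(x.size());
    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < n; ++i) {
        sum += x[i] * y[i];
    }
    return sum;
}
```

The loops keep their original shape, which is why this approach suits existing code better than rewriting each loop body as a separate object.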
I know this comes as a great surprise, but the OSes and processors this runs on are limited. If you want your programs to run on non-Intel platforms, or on any of the BSDs, I suggest you skip it and use something else.
Processors:
OSes:
Compilers:
P.S. Slashdot pulled out all the trademark symbols, and doesn't support the sup tag, so you'll just have to picture them in all the appropriate spots. :P
GLaDOS for President 2016! "Well here we are again. It's always such a pleasure." -- GLaDOS, 2011
As near as I can tell, this is GPLv2 ONLY (without the "or any later version" clause). Checking a random source file in the distribution, there is no "later version" language present.
This doesn't surprise me much, actually - I imagine Intel wouldn't want to commit their code to an unknown future license, and I expect they're still evaluating GPLv3. Even if they were done with that evaluation, the process for releasing this under v2 probably took a LONG time to complete - Intel is after all a large corporation. Restarting with GPLv3 probably would have just delayed it, although I suppose the only ones who would actually know that work for Intel.
"I object to doing things that computers can do." -- Olin Shivers, lispers.org
I read on their FAQ that TBB requires 512MB to run, though they recommend 1GB. This appears to be very high, especially when compared to Boost.Threads etc. I can't think of a reason why they need to allocate this much - and it would probably be a problem for consumer applications.
Also from the FAQ, the so-called concurrent containers still need to be locked before access. So no change from normal STL containers there.
But I will download it just for the memory allocator they supply, since it can be plugged into STL, and claims to hand out cache-aligned memory. It can apparently be built independently of the rest of TBB.
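To show what "plugged into STL" means here, below is a minimal sketch of a cache-aligned allocator. It is not TBB's allocator and makes no claim about how theirs works: `AlignedAllocator` is a hypothetical name, the 64-byte line size is an assumption about the target CPU, and it simply over-allocates with `std::malloc` and rounds the pointer up.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <new>
#include <vector>

// Hypothetical allocator that returns memory aligned to `Alignment` bytes
// (64 is an assumed cache-line size, not a value taken from TBB).
template <typename T, std::size_t Alignment = 64>
struct AlignedAllocator {
    static_assert(Alignment % sizeof(void*) == 0,
                  "Alignment must be a multiple of the pointer size");

    using value_type = T;

    AlignedAllocator() noexcept = default;
    template <typename U>
    AlignedAllocator(const AlignedAllocator<U, Alignment>&) noexcept {}

    // Needed explicitly because of the non-type template parameter.
    template <typename U>
    struct rebind { using other = AlignedAllocator<U, Alignment>; };

    T* allocate(std::size_t n) {
        // Room for the payload, worst-case padding, and the saved raw pointer.
        const std::size_t bytes = n * sizeof(T) + Alignment + sizeof(void*);
        void* raw = std::malloc(bytes);
        if (!raw) throw std::bad_alloc();
        const std::uintptr_t start =
            reinterpret_cast<std::uintptr_t>(raw) + sizeof(void*);
        const std::uintptr_t aligned =
            (start + Alignment - 1) / Alignment * Alignment;
        // Stash the original pointer just below the aligned block.
        reinterpret_cast<void**>(aligned)[-1] = raw;
        return reinterpret_cast<T*>(aligned);
    }

    void deallocate(T* p, std::size_t) noexcept {
        std::free(reinterpret_cast<void**>(p)[-1]);
    }
};

template <typename T, typename U, std::size_t A>
bool operator==(const AlignedAllocator<T, A>&, const AlignedAllocator<U, A>&) noexcept {
    return true;
}

template <typename T, typename U, std::size_t A>
bool operator!=(const AlignedAllocator<T, A>&, const AlignedAllocator<U, A>&) noexcept {
    return false;
}
```

Usage is just a second template argument, e.g. `std::vector<int, AlignedAllocator<int> > v(100, 7);`, after which `v.data()` starts on a 64-byte boundary.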
Three o'clock is always too late or too early for anything you want to do. - Jean-Paul Sartre
http://softwarecommunity.intel.com/tbbWiki/FAQ/60
The then/than mixup is kind of funny though. Reminds me of something I read in the engineering faculty on a white board (I assume a first year engineer):
"I'd rather be retarded then do my engineering homework.."
Looks like he had the pre-requisite fulfilled and should have just got on with the homework.
"If you are going through hell, keep going." - Winston Churchill
Well... C++ abstracts away from ASM, so is it bad too? Abstraction isn't a problem really, especially when it handles a bunch of grunt work correctly and efficiently. Yeah, some programmers might not understand exactly what they are doing, but tools that add a layer of abstraction are OK in my book so long as they don't make things more complicated or grossly inefficient. Besides, if you really wanted to do it differently you could either modify the GPL code or write it from scratch. Hopefully, handling threads manually will become like inline assembly: there for people who need that low-level access, while an easier and more abstract way of doing things is readily available (regular C/C++ code). Honestly I think libraries like this are going to be more and more common; multi-core is definitely the way of the future and it will take a whole new set of tools and programming paradigms to really harness it. Most programming languages weren't designed with the notion of parallelizing everything.
Variation: There are 1 types of people in the world, those who program in C, and those who don't.
That Intel figured out that 5 percent market share matters a whole lot when it's only a two-player game and it's running close. Obviously, if Intel can control the entire *NIX world, AMD is in for some hurt.
It is neither Linux nor Intel specific
http://threadingbuildingblocks.org/
Cross platform support:
* Provides a single solution for Windows*, Linux*, and Mac OS* on 32-bit and 64-bit platforms using Intel®, Microsoft, and GNU compilers.
* Supports industry-leading compilers from Intel, Microsoft and GNU.
Threading Building Blocks supports the following processors:
* Non Intel processors compatible with the above processors
Due to this limitation, virtual machines on x86 used one of two work-arounds:
- Binary rewriting, where the instruction stream is scanned for privileged instructions, and these are replaced by jumps to the emulated versions (and a lot of other tricks to get around side-effects of doing this). This is what VMware does.
- Paravirtualisation, where you replace all of the occurrences of privileged instructions with something like a system call (a hypercall), which performs the operation on behalf of the guest. This is what Xen does.
Paravirtualisation is fast, and less error-prone than binary rewriting (which has a huge number of irritating corner cases you have to cover), but it has the disadvantage that it requires fairly considerable modification to the running guest, on a source-code level. You could, in theory, write a scanner that would read a binary and replace all privileged operations with jumps to a library that performed hypercalls, but no one has done this. This means you can't run an operating system on something like Xen without access to the source code.
This changed somewhat recently. Both Intel and AMD added extra modes to their latest chips which can be used to trap all privileged instructions, allowing pure trap-and-emulate virtualisation. By using this, Xen can run unmodified guests, although they are slower than paravirtualised ones. Since this feature is highly dependent on hardware support, it will only work on chips with the correct hardware assistance mode.
None of this has anything to do with a threading library, however. I don't know quite where you got that idea from.
I am TheRaven on Soylent News
C# has something called the CCR - Concurrency and Coordination Runtime.
As the developers themselves are well aware, gluing "true" concurrency onto procedural languages such as C/C++/C#/Java will always be "ugly".
There is actually a Microsoft labs-developed "fork" of C# called COmega which tries to integrate concurrent programming more tightly into the language.
Just to point out:
1) C# is actually further along in some ways toward realizing true and easy-to-use concurrent programming (also ref C# 3.5).
2) Modern C++ could hardly be considered clean or simple -- it's a huge and complicated language, ever changing and with arguably the most dense syntax this side of Perl. Not that there's anything wrong with that, but C++ is fast approaching a Lisp-like state of unapproachability imho.
Intel wants TBB to be ubiquitous. Not only can you run it on AMD, you can run it on PPC. However, they did say that they don't have very many G5 Macs at Intel, so the engineers say the PPC port is "alpha quality".
$x='S24;r)>63/* h@<5+oZ)32"5cz';$me='phroggy'x$];
$x=~y+ -xz+\0-Tx+;print$_^chop$me for split'',$x;
I checked the site and forum, but no search results on PS3. Having just bought a shiny new 60gig PS3, this release makes me wonder just how easy it could be to take fairly good advantage of all the cores.
Hmmm, it may be one of my first projects; six cores running @ 3.2GHz and an easy method of putting them to use. It would be interesting to parallelize pi calculation and see how long it would take to get one million digits.
Kindness is the language which the deaf can hear and the blind can see. - Mark Twain
C++ (or C) is where all the fast code is still written. Thus it is the most relevant place for this kind of thing. If you look at Intel's page, you'll see they sell compilers, but only for two languages: C/C++ and Fortran. The reason is that their compilers are specifically meant to get as much performance as possible on an x86/x64 chip. So they target the languages people use when they are performance oriented. There are lots of other great languages out there, but face it, you aren't (or at least shouldn't be) using a managed language like Java when every last clock cycle counts.
You'll find that this is rather evident in most games. While it is increasingly common to write large portions of the game in a scripting language, since that makes it easier to write and perhaps more importantly easier to mod, you'll find that the high speed stuff is still C++. Take Civ 4 for example. They wrote almost the whole damn game in XML and Python. All data (like unit definitions, technology tree, etc) is stored in XML files, and all the scripting necessary to make them work is Python. That makes the game extremely easy to mod. However, the AI code, which they also released to end users, is in C++. The reason is that the AI is highly intensive and would have run too slowly in Python. Also, the core engine of the game (not released to users) is C++ as well.
So it isn't surprising this is where Intel is targeting their optimisations. Also, I'd argue that to a large degree any of this kind of thing for a managed language is the responsibility of the runtime itself. If Java is to have better support for automatically threading things, the JRE is probably where that should be done.
No, not FORTRAN IV, or even 77 . . .
Fortran 90 and later already have the structures for this (Forall, etc).
*sigh*
hawk, who hasn't written a line in over two years