Auto-threading Compiler Could Restore Moore's Law Gains
New submitter Nemo the Magnificent writes "Develop in the Cloud has news about what might be a breakthrough out of Microsoft Research. A team there wrote a paper (PDF), now accepted for publication at OOPSLA, that describes how to teach a compiler to auto-thread a program that was written single-threaded in a conventional language like C#. This is the holy grail to take advantage of multiple cores — to get Moore's Law improvements back on track, after they essentially ran aground in the last decade. (Functional programming, the other great hope, just isn't happening.) About 2004 was when Intel et al. ran into a wall and started packing multiple cores into chips instead of cranking the clock speed. The Microsoft team modified a C# compiler to use the new technique, and claim a 'large project at Microsoft' have written 'several million lines of code' testing out the resulting 'safe parallelism.'"
The paper is a good read if you're into compilers and functional programming. The key to operation is adding permissions to reference types allowing you to declare normal references, read-only references to mutable objects, references to globally immutable objects, and references to isolated clusters of objects. With that information, the compiler is able to prove that chunks of code can safely be run in parallel. Unlike many other approaches, it doesn't require that your program be purely functional either.
Moore's law has nothing to do with performance, numbnuts. Apparently the real geeks left Slashdot ages ago.
Moore's law is only about the number of transistors on integrated circuits.
Or, gawd forbid.. we could teach programmers how to use threading? I am a casual developer, with no formal training beyond writing practical code and reading as much as I can from the keyboards of real developers. I've run into my fair share of "real", "professional" developers who've been tasked to work on my code and thrown their hands up in defeat - not, as I feared, because of the awfulness of my coding style, but "This uses threading, I don't know this!".. Of course, looking at their resumes, a long list of single threaded webapps where the actual threading is handled for them by the webserver itself.. I give up. Even some basic native GUI client development should teach developers simple threading and asynchronous callbacks? alas, yet another talent abandoned in the age of the webapp. Is it any wonder the security issues (my actual realm of supposed 'expertise') in their code often make the architectural issues look trivial in comparison?
An interesting development, and much needed I fear, but yet another layer of abstraction to allow lazy developers to not have to really bother about knowing what their code is actually doing (that's for the poor SoB who has to maintain it is for...)
I'm wondering how this works. Does it scan for loops to remake them as event driven processes? Does it splice off multiple function calls and then throw the results into some sort of stack for retrieval? It's a cool idea, but man, for any kind of non-trivial program it must be some monumental piece of work.
The world's burning. Moped Jesus spotted on I50. Details at 11.
OK, so now we shall have another way to produce parallel programs.
Running things safely in parallel is a very good thing, but it does not by itself guarantee significant improvements in speed. The hard bit is actually to optimize the parallel code.
Having the compiler identify things which could run in parallel and thread work to run as A B&C D would be a huge step forward as long as it doesn't introduce bugs.
Compilers already try to do this for example with auto-vectorization. The problem is that they are usually quite terrible at it. Even Intel's compiler which is probably the best at it usually misses out on most of the obvious places that should be vectorized. This is why pretty much all code dealing with multimedia content (audio/video/image codecs, games, etc.) still rely on tons of had written SIMD to be optimized to their fullest.
And it must be hell trying to debug without digging deeply into the generated code.
Mainstream language have mutable state all over the code. Functional programming's big change on state issues is to careful isolate state. The Microsoft approach means that state needs to be tracked carefully so that it could be isolated by the compiler even if it isn't isolated by the code. Which is likely just as much work as isolating state. And the nice thing about isolating state is once you do it you can make use of all sorts of incredibly powerful functional paradigms like like first class functions (closures, partial execution...) and lazyness (infinite data structures, no need to figure out proper order of evaluation..)
The solution to parallelism is functional programming. And no it is not too hard. Excel is a functional programming language that lots of people know that does a great job isolating state.
About time.
Or, gawd forbid.. we could teach programmers how to use threading?
That's easy: "Don't."
From everything I've read about threading, there's no general way a hand-coded multithreaded program can ever be proven to be correct. Threads introduce all sorts of extremely subtle timing-based logic and security bugs which even the smartest programmers in the world think they can handle but in practice don't. And most programmers are not the smartest programmers in the world (they only think they are).
The correct solution is to switch from C-based imperative languages to pure-functional implicitly parallelised languages, but that's not likely to happen before the heat death of the universe.
You are not a brain: http://books.google.com/books?id=2oV61CeDx-YC
Any code that doesn't access the same resources can be run simultaneously. Alternately, if no thread is writing to a resource, it can be shared without locking.
Those two observations allow you to determine when code can be run together in different threads. If a method only uses 'final' or 'const' things, then you can run it. If your method only uses one isolated cluster of objects, then it can run simultaneously with another method that only uses a different isolated cluster of objects.
Honestly, any programmer worth $120k could make something more efficient just using threads, if they thought about it. This will get you 20% of the efficiency gains without thinking of it.
Indeed. silicon is sort of a shiny black color!
OP probably once read or heard the phrase,
misunderstood the context and thought it
would embiggen the post's language.
Were that I say, pancakes?
all the magic compiler is going to do is increase the speed of the System Idle Thread.
Have you seen how much processing power that thing consumes? Usually 90% and up. On all cores. It really needs some serious optimization.
This is totally true.
So long as you define trivial programs to be a subset of programs that can be proven correct.
Or, gawd forbid.. we could teach programmers how to use threading?
No, we want our compilers to do these things for us, because things that compilers can do they usually do better than humans can. Once your programs become large enough and complex enough, compilers outperform humans every time -- it's not about laziness, it is about the limits of human programming ability.
Compilers surpassed humans when it comes to optimization a very long time ago, except for very small programs or very short inner loop bodies.
Palm trees and 8
Bingo.. and as many other folks have pointed out, 90% of of threading is purely to decrease latency and bypass blocking operations - very few applications out there today are heavily dependent on concurrency to achieve any kind of raw horsepower... This does just seem like, if it were to become mainstream, would just become a crutch for developers to ignore threading and blocking I/O issues entirely, because "The compiler will sort that out".. In theory, this isn't necessarily a bad thing. In theory, all things work in practice too.
Imagine you have a for-loop that calls the method 'hop' on every object 'bunny' in the list:
for every bunny in list {
bunny.hop()
}
This is a simple, sequential process - bunny.hop() is called in order, for every bunny in the list, one after the other.
Now suppose you have defined the data in your program in such a way that the compiler knows that method 'bunny.hop()' only ever accesses read-only data, or modifies local data that is unaccessible from anywhere else. The compiler now knows that the order of execution of the bunny hops doesn't really matter, as every call of bunny.hop() is independent from anything else. This frees the compiler to spawn threads or processes to call as many bunny.hop()'s as he likes at the same time, thereby processing through the list alot faster.
Another method, bunny.eat() actually performs write access on a shared object 'carrot' when called. If two bunnies eat the same carrot, the compiler can not perform automatic parallelization, as running two bunny.eat() methods could lead to invalid state (only one piece of carrot remaining, two bunnies eating 1 piece of carrot at the same time, results in -1 pieces of carrot). In this case, the compiler will take care to run two bunnies eating away at the same carrot sequentially. However if there are 2 bunnies eating the green carrot and another 2 bunnies eating the yellow carrot, these are again independent from each other and can again be paralleled.
The requirement to make this possible is to provide the compiler with information on what kind of data something is - is it an isolated hopping? Or a shared carrot? or a global Bunnygod that affects all bunnies?
How about we write new code in a functional language, instead of clinging to imperative languages? Put C, C++, and Java in the category they belong in: maintenance and legacy code only.
Palm trees and 8
An interesting development, and much needed I fear, but yet another layer of abstraction to allow lazy developers to not have to really bother about knowing what their code is actually doing (that's for the poor SoB who has to maintain it is for...)
Developing software is all about managing complexity. Abstractions are the primary tool used to do so. They are neither inherently good or bad. A large part of writing good software is finding the appropriate abstractions and eliminating inappropriate ones. If an abstraction allows a program to be written more easily with an acceptible level of performance and correctness, it is an appropriate abstraction.
To know what code is "actually doing" is relative. No programmer knows what his code is doing at the level of gates on the chip. It's rarely necessary or even helpful to know what the code is doing at the level of CPU instructions or microinstructions. This is especially true if the code runs on multiple CPUs.
Honestly, you know what this says? This just says some programmers still need to go practice in the threading space.
Threading is not that hard at all for a VERY large class of problems. There are issues -- like, the memory bandwidth will eventually choke the ability of processors to get at nearby memory regions -- and you need some kind of sane plan to deal with cacheing in one core as compared to the next core when you share flags or data coming in during execution from elsewhere -- and you need to figure out when thread overhead starts to chew into your efficiencies -- but really, it's entirely doable, and isn't that difficult a problem space as compared to the actual problems *serious* programmers have to solve. It may just be that your "thing" isn't really all that threadable. Some things aren't. But it should also be very obvious when they aren't.
Then again, some problem spaces lend themselves very well to multiple core approaches -- I just finished a library that slices up images and then applies various separable algorithms to the slices in a variable number of threads (the user of the library can specify.) I wrote it in pure C, used posix threads, no sweat at all, in my 8-core machine, you get just about exactly the gains you'd think (x the number of cores) when the process is CPU cycle intensive, less as the ops are simpler and memory I/O rates rise.
Seriously. If threading seems that hard, you need to go do some studying. It's not the threading. It's you. You really should get on this bus.
I've fallen off your lawn, and I can't get up.
If two bunnies eat the same carrot...the compiler will take care to run two bunnies eating away at the same carrot sequentially. However if there are 2 bunnies eating the green carrot and another 2 bunnies eating the yellow carrot
Am I the only one who is totally horny after reading that?
This is my pay day baby! My code is going to run super fast in multicore machines without any (more) extra work from me! Wooo Hooo!
Please take pity on my and let me enjoy my day dream for a few hours, without rudely inserting reality into my reverie! Thanks slashdotters, I know I can count on you!
sed -e 's/Chuck Norris/Rajnikant/g' joke > fact
The holy grail of parallel processing is teaching programmers how to handle parallel processing (and what domains can benefit and where).
A compiler may be able to thread your program, but it will be a long time before it understands your intent well enough to do it well. Also I can think of many situations under which such functionality may not even be a good idea at all. I would assume such a system would use a pool of threads to avoid constant thread construction overhead and if you get a few IO-busy actions threaded out in an ignorant fashion I think you will find your program blocking a lot more, and producing a lot less throughput than it should.
Also, the OP blithely stated that functional programming isn't happening -- yet features of the functional paradigm, like anonymous functions, have made their way into nearly every language of consequence and many new languages proudly tout functional programming features (see f#, scala, rust, clojure, and go). Perhaps pure functional programming is moving pretty slow, but it's features are not.
I was crazy back when being crazy really meant something. (Charles Manson)
This is nothing new. As in decades-old. Back when I was at DEC in the 1980s we had auto-threading compilers that worked very well on standard application code. Today, Intel has been doing auto-threading in its C++ and Fortran compilers for more than a decade and it is very effective for a large class of applications. Intel's compiler also has features that help you tweak the code for better auto-parallelism, notably the Guided Auto Parallel feature introduced in 2010. Other compiler vendors also have auto-parallelizing compilers.
I've been in the industry for decades, and every couple of years someone comes out with yet another programming language or technique that is supposed to revolutionize application parallelization. This is just another one, and many years behind the rest of the industry from what I can tell.
Actually you don't need functional. Java works fine with multi-threading (well, it does for my daily use). In fact, that is the real strength of the garbage collection model of Java, it can track and manage efficient release of resources in a multi-threaded system. Also Java has direct language support for synchronization mechanisms and extensive library support for multi-thread friendly data structures (Doug Leah is a legend!). Lots and lots of multi-threaded programs get written in Java, without needing functional languages, it's just that we we do write them they do work (after we test for various possible thread-related problems), work well, and we don't have to bitch about needing to change paradigms to get things going (eg. needing to change to functional languages when they are not always the best fit for many classes of problem).
Of course you'll hear those still stuck on legacy languages designed without multi-threading in mind whinging. That doesn't mean there isn't a straightforward, easy to learn language and development platform out there that isn't strong on multi-threading. There is - and it is cross-platform and general purpose too.
Imagine you have a for-loop that calls the method 'hop' on every object 'bunny' in the list:
for every bunny in list {
bunny.hop()
}
And if the original intent is to make the bunnies hop in sequence like performing "the wave", making a new thread for each hop might produce a visual effect of them all hopping at once if the sleep occurred in the hop() function instead of the for loop calling the hops.
Automatically paralleling compilers aren't new. SGI had one for C ten years ago. Intel has it for C++ and Fortran now. It's been done for Matlab. There's plenty of theory.
Outside of the supercomputing community, it's never gone mainstream. Microsoft is at least trying to do it for C#, which is a reasonable language. An interesting point is that this Microsoft approach exploits immutable objects. Immutable objects can be freely shared without locking, so wide use of immutable objects makes automatic extraction of parallelism easier.
I'd looked at doing something similar for Python, to get rid of the Global Interpreter Lock, Python's boat-anchor. Python already makes wide use of immutable objects, but doesn't gain any performance from them. If everything is thread-local, immutable, or synchronized in the Java sense, you don't need global locking. But the politics of the Python community do not favor performance. (Potentially, Python could be up to at least the LISP level of performance, within a factor of 2 of C, if the language was restricted in certain ways.)
Another part of the problem is the pernicious heritage of POSIX/Linux locking primitives. Many programmers think that locking is an library or OS-level problem, not a language level problem. Locking via library primitives means language doesn't know what data is locked by which lock. This makes optimization and race checking very tough.
The political and social problems here are tougher than the technical ones. So the open source community can't solve them. It takes someone with a big hammer, like Microsoft, to do it.
"Who tells whom what to do?" - V. Lenin
There is great potential here if this sort of thing can be made to work cleanly and safely. I suspect a lot of programmers (myself included) think better in a single-threaded manner. People are mostly used to dealing with A B C D and, as such, I think we usually write code to execute steps the same way.
If you've ever done a dish with more than one casserole or making a side dish while the main dish is cooking you've done parallelization in real life, the problem is that expressing it in a computer language is hard. Functional programming works okay if you're doing lots of similar things at the same time, but not if you're doing lots of different - but parallelizable - tasks. I want to call oven.roast( meat ), stove.boil( potatoes ) and workbench.slice( salad ) but I don't particularly care about the order and I know they're side effect free relative to each other, but how do I express that? Either I write them in sequence ABC and they happen in sequence ABC or I have to write threads and schedule them which is honestly overkill - I want it to use the resources efficiently not micromanage the schedule myself.
Particularly if these come in random and unpredictable amounts like the orders at a restaurant I need the computer to work it out itself. It's also very important if you have small resource conflicts like the hot casserole and hot pot roast both requiring mittens to move. I know quite a lot about mutexes and locks and semaphores to make it explicit but now it feels like I'm explicitly describing the opportunities for parallelism not the dependencies preventing it. Imagine if you were a project manager, do you start with every task depending on every other task? No, you start with the tasks then make dependencies where they're needed.
I wonder if you could do something more with "scoped const-ness", for lack of a better term. That is to say a function is non-const in a certain scope and const outside it, and thus can be executed in parallel with anything else outside that scope. To continue the example above, let's say I could define a function oven.raiseTemperature() that's non-const in the oven scope and const in the global scope and a function stove.stirPot() that is the same for the stove. If I then call them the compiler would know it's free to reorder them or run them in parallel, it picks the optimal execution based on the hardware available or even dynamically at run time. Functions that depend on multiple objects would automatically become fences to "sync up", though not necessarily on a global scale. Like a function on both the oven and stove may sync the kitchen, but not the living room.
It might not give you massive parallelization but it'd at least let you parallelize many mixed tasks that I don't feel you manage to do easily in any language today. It certainly sounds like a safer way than starting off a bunch of worker threads and praying you didn't create any deadlocks or live locks or race conditions or whatnot. Much like the global const the compiler should be able to validate that it doesn't in fact have any side effects outside the scope it claims. I'm sure there's lots of reasons why this wouldn't be as useful as I imagine it would be, but it certainly seems easier for those of us used to imperative programming.
Live today, because you never know what tomorrow brings
If it has a visual effect, then it's not read-only.
Even the best of C or C++ compilers are terrible at vectorization of code.
Yeah, and the best humans are terrible at allocating registers -- so bad, in fact, that the best C compilers ignore the register keyword. What do you think is more relevant to the general case: vectorizing multimedia operations, or allocating registers? Compilers are also better than humans at:
To put it another way, look at the Orbitz story, which is over a decade old now:
http://www.paulgraham.com/carl.html
On the one hand, you have hand-tuned assembly language. On the other, you have a program written in Lisp (a high level, compiled language) with some C++ mixed in (for managing memory). Orbitz was able to compete on speed, but more importantly, it was returning better results. It's not that the people who wrote that mainframe assembly language were idiots -- they were taking advantage of all sorts of special hardware features, they knew how to hack their machines better than anyone else -- it is just that the Orbitz algorithm was far too complex for efficient hand-rolled assembly code, at which point compilers are really the only choice. The mainframe guys were busy thinking about how to make use of special machine features in assembly language; the ITA team was busy solving the higher-level problem, and relying on their compiler to generate good assembly language code. This is a particularly telling line:
We disassemble most every Lisp function looking for inefficiencies and have had both CMUCL and Franz enhanced to compile our code better.
[emphasis mine]. They disassembled their code...and then improved their compiler when they saw problems. They did not hand-roll the code, they made the compiler do a better job of generating code. These are not lazy programmers, nor are they programmers who do not know how to use assembly language; they are programmers who understand that they have a tool that is far better at generating assembly language than they are, and that they have more important things to do with their time.
I deal with quite a bit of crypto code in my work. I have seen lots of hand-tuned assembly language, I dealt with code that took advantage of the AESNI instructions to perform very fast encryption. I am well aware that in small, highly specialized functions (like AES), humans are better able to utilize special instructions to improve performance. Those are niche cases, and the techniques used in those cases have very limited applicability (even SSE is fairly limited in its applicability, by comparison with the sort of code programmers write and maintain every day), and the techniques scale very poorly.
Palm trees and 8
One would assume that "world modification" (monad; in this case, visual I/O) does not fit the restriction "only ever accesses read-only data, or modifies local data that is unaccessible from anywhere else" and as soon as you are showing something to the user, that blocks the auto-threading.
Then again, if you are drawing map tiles, the order you draw them in doesn't matter - so this goes to show that in some situations, there's no way around the programmer actually having to stop, think and declare whether the program's contract allows it to do some user-visible operation in parallel.
It's not the fall that kills you. It's the sudden stop at the end. -Douglas Adams
Compilers already try to do this for example with auto-vectorization. The problem is that they are usually quite terrible at it.
I suspect one reason they are so bad at it is they have to be very conservative in how they optimize, due to the relaxed nature of the C language. For example, if the C optimizer cannot prove beyond a shadow of a doubt that a particular memory location isn't being aliased, it can't make any assumptions about the location's value not changing at any step of the process. In practice, that means no optimizations for you.
Given that, it would seem that the Microsoft approach (using not only the higher-level language C#, but a specially customized version of C#) gives their optimizer much greater latitude. Because the language forces the programmer to annotate his objects with readable/writable/immutable tags (think C's "const" tag, but with teeth), and because the language (presumably) doesn't allow the programmer to do sneaky low-level tricks like casting away const or aliasing pointers or pointer math, the optimizer can safely make assumptions about the code that a C or C++ optimizer could never get away with. That may allow it to be more effective than you might anticipate (or maybe not, we'll see).
I don't care if it's 90,000 hectares. That lake was not my doing.
One problem, and the great attraction of this way of doing things, is that most software companies don't have too many programmers worth $120k. Lots of $50k-$80k, but not so much on the higher end of the pay scale until you get into research divisions at big companies like Microsoft or Google or IBM.
Just like quicksort(3) is far faster than bubblesort so too is a highly threadable code faster than non-threadble code
First, just to be pedantic, I'll point out that quicksort is as bad as bubblesort in the worst case, to a constant factor (you should have picked heapsort or mergesort). That aside, it is worth noting (and I am somewhat bothered by this when it comes to TFA) that we still do not know if it is even possible to optimize any program by parallelizing it; see the NC-vs-P question:
https://en.wikipedia.org/wiki/P-complete
Multithreading is not a magic bullet, and in all likelihood it is not generally applicable.
Languages do not, contrary to belief, express intent, the provide a strict set of instructions that the computer MUST respect
Wrong on all counts. Imperative languages are a way to convey instructions to a computer; declarative languages do not convey instructions, and programming in a declarative language requires an entirely different mode of thinking (it is closer to asking a question that giving instructions). It is also not strictly necessary for the computer to do exactly what a program expresses; there has been some work on compiler optimizations that have a (tunable) chance of not maintaining soundness.
In the end a good algorithm with no compiler help will beat optimized "dumb" code in all cases larger than "toy" (say, a few dozen "n" in Big-O notation)
If you are convinced of this, try implementing something more complex than the algorithms you see in Knuth's books; say, this:
http://eurocrypt2010rump.cr.yp.to/9854ad3cab48983f7c2c5a2258e27717.pdf
Then ask yourself this: could the constant factors in your implementation be better? At the end of the day, big constant factors will hurt your performance so badly that you might as well have used an asymptotically worse algorithm; indeed, consider fast integer multiplication:
https://en.wikipedia.org/wiki/Sch%C3%B6nhage%E2%80%93Strassen_algorithm
10000 digits are needed before that algorithm actually outperforms the asymptotically worse Toom-Cook family of algorithms. Here is an even more extreme example:
https://en.wikipedia.org/wiki/Coppersmith%E2%80%93Winograd_algorithm
Sure, that's a better matrix multiplication algorithm in the asymptotic sense...but only for matrices that are so large that you cannot even store them on today's computers.
So really, while you are right that asymptotic improvements will always beat constant factor improvements (which is what compilers are mostly going to do for you), you are wrong to ignore the importance of constant factor improvements. In the real world, constant factors matter. In the real world, quicksort and mergesort will use asymptotically worse algorithms below a certain problem size because of constant factors. In the real world, large integer multiplication is done using Karatsuba or Toom-Cook methods, not FFT methods, because of constant factors. In the real world, if you are not using a compiler to optimize your code, your code is going to be slower than it needs to be, even if you spent hours hand-tuning it, unless you are only dealing with toy problems.
Palm trees and 8
Some threading problems are hard, really hard. They aren't that common.
Which is why compilers should be handling as many threading problems as they can: so that rather than spending time and mental effort dealing with problems that are not so hard, we can spend time and mental effort on problems that compilers cannot solve for us. There is a reason that we use higher level languages, and that reason is not "laziness."
Palm trees and 8
How frequently do you spend time thinking about which registers are being used to store partial results in the programs you write? When was the last time you spent time thinking about which registers are being used by which variables, or when it would be a good time to store register contents in memory?
What about memory alignment and packing? Calling conventions? Near pointers and far pointers (OK maybe that one is a bit dated)? Most programmers do not think about these problems, because most programmers are trying to solve bigger problems and need to spend time thinking about the bigger picture. Getting mired down in low level details makes programmers less productive -- that is why we use programming languages.
So if we can have compilers that can automatically utilize multithreading, even if it is in a somewhat limited context, we should be happy. If you are happy to not be spending time thinking about padding data or allocating registers, you should be happy about automatic multithreading.
Palm trees and 8
What you describe sounds a lot like "make".
$ cat Makefile
dinner: roast_meat bake lasagna boil_potatoes chop_salad
turn_off stove oven
wipe table
clean_table:
wipe table
prepare_meat: clean_table
cut meat
salt meat
set_meat_temp:
# make it fast
chtemp 800
roast_meat: prepare_meat set_temp_meat
ovenize meat
sleep 900
put_on_table meat
prepare_lasagna: clean_table chop_salad prepare_meat
remove_box lasagna
set_lasagna_temp: roast_meat
chtemp 500
bake_lasagna: prepare_lasagna set_lasagna_temp roast_meat
ovenize_lasagna
sleep 3600
put_on_table lasagna
prepare_potatoes:
wash potatoes
boil_potatoes: prepare_potatoes turn_on_burner1
put_on_burner1 pot water potatoes
sleep 1800
put_on_table potatoes
chop_salad: clean_table prepare_meat prepare_potatoes
put_on_table salad
$ make dinner
/casts detect troll 10 post radius. Multiprocessor technology was already established when Bill Gates was still dumpster diving for code samples, and a lot of the work that gave us modern high performance multiprocessing was done by companies like IBM, Intel (and Intel shootoffs like Sequent) back when Windows was just the latest new DOS application. At the extreme end of the spectrum, this is why all of the top 10 supercomputers run on the UNIX-alike known as Linux, and none run on a Microsoft based OS.
I'm afraid regardless of what Microsoft would now like to present as a PR image, they've consistently been embarrassingly late to the party. Let's not belittle their contribution though; they certainly throw more money at problems than any other tech company.
L4 isn't exactly trivial, but it's proof exceeds the length of it's code by whole factors.
"MS R&D is the largest computer tech R&D in the world. Combine IBM, Intel, and AMD, and you get an idea of their size."
Citation need. Not disputing the first part. Just the second part about the relative size of Miscorsoft Research.
Sounds like user provides list of Structural/Control/Data Hazards. Compiler Pipelines code into blocks that can be ran in parallel. :)
Sounds familiar
Who logs in to gdm? Not I, said the duck.
These and other applications written in the source language are performance-competitive with established implementations on standard benchmarks
Translation: we didn't speed them up any, or at least not by enough that we care to share any number.
Amdahl's Law is difficult to overcome in auto-parallelising systems that rely on anything other than loop optimizations. Basically, in straight-line code, if you make 50% of the code run on multiple cores, you're only going to get at most a 2x improvement in speed. In practice, you won't even get that much (except in loops) due to added overhead. Bottom line: you can write papers about your wonderful auto-parallelizing technique, but when the rubber hits the road this is unlikely to lead to much performance improvement.
Have you read my blog lately?
Have you ever considered writing a text book on programming?
I am anarch of all I survey.
I know nobody knows how to write low level code anymore but displaying something involves writing to memory.
someObj.someMethod(if(x == 2) { 5; } else { 10; });
Aarrrrghhh!!!!!111 Surely you would never *teach* students such monstrosities. The idea is to keep it simple, one statement or side effect per line of code. This helps you when you wave your debugger over the code, since it will automagically highlight the result of the statement. This also helps the people reading your program (and source code is meant for people firstly). You are trying to write Java in non-Java different style, no wonder that code you gave is so hideous. Of course, the Java compiler is so damn efficient that extra assignments for debugging and readability are no problem at all - let the machine mash the code, not you. Trying to mash everything into one line is for C and LISP coders (which I was at one time so I recognize the same style), not for someone used to multiple 1080p displays. Newlines really aren't that expensive, dontcha know?
As a language, Java is pretty awful; you are stuck with a single programming paradigm even in cases where that paradigm makes no sense, you are writing enormous amounts of code to express what you are trying to do, and features that programmers in other languages take for granted are just not there (or are very limited). Meanwhile, you have Scala and Clojure on the JVM -- so why not just use these where functional approaches make sense, like for parallel algorithms?
I would argue that Java is more general purpose than Scala or Clojure, at least for the kind of problems I'm often trying to solve (eg. device control, 3D graphics, medical applications, numerical modelling etc). Perhaps Scala and Clojure fit your/people's problem domains, if so, then I have no problem with people using them. Don't mistake that for being useful in the outside (non-teaching) world. Note also that functional constructs do have some overheads that iterative-style code does not (where the latter is not exclusive to, but more in Java's 'style', yeah?). For my problems that are massively iterative (eg. some parts of graphics, numericals) this does make a difference, and the functional constructs are unnatural and used so infrequently to be useless. The usefulness of threading is above this (eg. the multi-threading is coarse grained and limited to a few dozen threads, again functional programming doesn't offer enough of an advantage in this regard).
Back to your teaching. Surely new students follow the deterministic procedural code better at the start than having lots of anonymous closures popping up all over the place? It is more advanced users that like the constructs that bring them home to the Source, that is, LISP.
Yes, they should have used the ternary operator.
No they should not have. The ternary operator is for C programmers making the jump. Again, newlines are not expensive. There is nothing wrong with making an if statement and assigning to a variable. Then give the output of that variable to your function call. One statement or side-effect per line. The compiler can make this fast. Someone who is not you can read the code without thinking about it (unless they are snob, and think the code is too simple and should be slightly obfuscated in the way you have shown). With lines split out in this way you get to put a debugger on the statement you want because multiple statements are not mashed together, and again, the debugger will automagically show you the value of the variable or expression as you move you mouse over it (less clicking is good, yeah?).
If aiming for simplicity is good enough for Einstein and Steve Jobs, then who am I (or you?) to argue? As they say, "You can write bad FORTRAN in any language" and your example proves it. Just because Java (and all other languages) allow obfuscated constructs because people think terse code is better (it's not!) and those that write terse code are better coders (not true, such things are only held in esteem by those lacking several decades of experienc
Of those, IBM have around 439,999 project managers.
I'm sure that they have produced all the worlds great literature the last 30 years as well. Just their size doesn't make them the source of everything good.
I haven't seen any GPU or CPU design used in modern computers come from MicroSoft. Sure, they have worked with it to implement it into their own software. They probably have been able to express their preferences to hardware vendors, in order to make them support the hardware and make it perform good in their software. High Performance multi-core kernel design in Windows was, if I call correctly, taken from VMS and Compaq design when they designed Windows NT. MicroSoft has been struggling to get over 16 cores effectively supported by their OS and most definitively by their applications. Linux has been getting great support from SGI when they ported their NUMA architecture from IRIX to Linux and building on that, Linux has been expanding and rewriting their multi core stuff more efficient and larger than MicroSoft has done with Windows.
Now what technologies has Linux taken from MicroSoft (free) research exactly? What MicroSoft patents were violated by Linux again? What's that eerie silence all of a sudden?
I was promised a flying car. Where is my flying car?
If you talk about Nobel prizes, could you please remind a forgetful Swede how many are awarded in mathematics or software engineering? Or is research strictly a speaking only something that happens in physics, chemistry, medicine.... or literature... or (snrk) in (giggle) economics (bwa-ha-ha-ha-ha!)?
And the last one cleans the office and makes coffee.
Nobel Prizes? None. Because there is no Nobel Prize for computer science. The closest equivalent is a Turing Award. I haven't checked the whole list, however the 2009 winner works at Microsoft Research (and has since 1997), the 1998 winner is now dead, but was working at Microsoft Research between 1995 and his death. You can check the list yourself for IBM.
I am TheRaven on Soylent News
as can any code that does access the same memory location as long as that access is all read-only
Not entirely true. If reads on that block of memory are atomic it will work every time, otherwise you might get some strange results as it may be possible for values to change while reading part of it. This mostly happens with more complex data structures but it can happen with primitive ones depending on their size.
Time to offend someone
https://en.wikipedia.org/wiki/Microsoft_Research
http://researcher.ibm.com/researcher/search.php?sn=1
The idea is to keep it simple, one statement or side effect per line of code
Except that you are then going to either (a) duplicate code or (b) have to introduce and keep track of more variables. In what way is that simple?
You are trying to write Java in non-Java different style
Perhaps I was not clear: that was a mistake that one of my students made. They did not know Java, they were still trying to learn it, and the point is that the Java style of programming is very unnatural. "Call f with 5 if x is 2 or else 10" is how the student was thinking about the program; they were forced to think about it instead as "if x is 2 call f with 5, otherwise call f with 10."
Note also that functional constructs do have some overheads that iterative-style code does not
Yes, there are limits to functional programming. There are limits to all programming paradigms, which is why languages should not force programmers to use a particular paradigm in all cases (and as it so happens, Clojure does have a loop construct). One of Java's major problems is that it really lacks support for anything other than object oriented programming, with only limited support for other paradigms or styles.
For my problems that are massively iterative (eg. some parts of graphics, numericals) this does make a difference, and the functional constructs are unnatural and used so infrequently to be useless
Perhaps so, although I suspect that in both cases object oriented programming is not universally applicable, and that you would appreciate the ability to just write a function without having to dress it up with class definitions. When I was an undergrad, I knew a lot of professors in the EE department who used Matlab to do computationally intense signal processing, and I know people in the CS department here who use Matlab for image processing work (and they would rule out both Java and any functional language for that work).
Surely new students follow the deterministic procedural code better at the start than having lots of anonymous closures popping up all over the place?
They do, but what I was saying was the Java is not an easy language to learn. Let's put it this way:
If you have never programmed before, that is an awful lot to deal with in a first program. It is certainly more to deal with than this:
In the Java case, most of the questions students were asking were about public, static, void, main, class, String, and then questions about function definition syntax, what "args" even is, etc. The Lisp case is far simpler and far more straightforward. I would say that it is acceptable to teach Java if all that complexity would eventually lead to simpler programs, if it was just a matter of scale, but unfortunately that is not the case; Java's syntax does little to keep complexity down, and the "everything is object oriented" approach adds a lot of complexity.
I find it interesting that you would mention closures here; in introductory Java courses, it is almost guaranteed that students will have to write at least one class definition. Do you think students have an easier time dealing with classes and objects than closures? Why even bring up anonymous closures; do you think new programmers have an easy time dealing with anonymous classes and objects (do you think we even get that far in a single semester)? My experience with teaching students how to write classes was this: it's hard for people who have never programmed before, especially when they are writing programs that are tiny and benefit little from classes. I would say that object oriented programming
Palm trees and 8
I've spent a lot of time in the earlier part of my career writing multithreaded applications (mostly client/server stuff) and I got very good at it. That said, it's a complete waste of my time as a developer. Developers should focus on domain problems, not how to tweak the microprocessor. In my experience, most developers don't seem to ever fully get how to write multithread apps and threading problems are a royal pain in the ass to debug. I can't see a better candidate for automation than taking this out of developer's hands.
I used to write my own linked lists and hash tables but I don't do that anymore either. It's a new era and it's all about the domain.
Are agnostics skeptical of unicorns too?
i++ is atomic if the CPU can handle working at the size of i. On a 32bit machine, if "i" is 64bit, then it won't be atomic, but on a 64bit machine, i++ is atomic. Newer x86 CPUs do support 128bit and soon 256bit atomic memory reads and writes via SIMD registers.
... i++), "i" may be stored in a register and never pushed back out to memory. You need to declare i as volatile or use a memory fence to force i to be pushed.
Now, it is not guaranteed to be pushed out to memory immediately. If you're doing some like for(i = 0;
Volatile doesn't fix everything either. All it does is say the update will be pushed out to memory, but because most CPUs use Out-of-Order execution, the read/write may get re-ordered.
eg.
shared volatile int i = 0
shared volatile bool b = false
threaded method()
{
i = 3
b = true
}
Even if the compiler doesn't re-order these two assignments, the CPU might because "i" doesn't have a detectable dependency.