Faster Chips Are Leaving Programmers in Their Dust
mlimber writes "The New York Times is running a story about multicore computing and the efforts of Microsoft et al. to try to switch to the new paradigm: "The challenges [of parallel programming] have not dented the enthusiasm for the potential of the new parallel chips at Microsoft, where executives are betting that the arrival of manycore chips — processors with more than eight cores, possible as soon as 2010 — will transform the world of personal computing.... Engineers and computer scientists acknowledge that despite advances in recent decades, the computer industry is still lagging in its ability to write parallel programs." It mirrors what C++ guru and now Microsoft architect Herb Sutter has been saying in articles such as his "The Free Lunch Is Over: A Fundamental Turn Toward Concurrency in Software." Sutter is part of the C++ standards committee that is working hard to make multithreading standard in C++."
Some algorithms are inherently not amenable to parallelization. If you have eight cores instead of one, the speedup you get can be anywhere from eightfold to none at all, depending on how much of the work can actually run in parallel.
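To put rough numbers on that range: the usual back-of-the-envelope model is Amdahl's law (my framing, it isn't named in the story). If a fraction p of the work can be spread across n cores and the rest is serial, the best-case speedup is 1 / ((1 - p) + p / n). A minimal sketch, where the function name is made up for illustration and all scheduling and communication overhead is ignored:

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Best-case speedup under Amdahl's law (ignores all overhead)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part runs at full cost; the parallel part is divided by the core count.
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    for p in (0.0, 0.5, 0.9, 1.0):
        print(f"parallel fraction {p:.1f}: {amdahl_speedup(p, 8):.2f}x on 8 cores")
```

Under this model, a program that is only half parallelizable gets less than a 2x speedup on eight cores, and only a fully parallel one reaches 8x.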
So far, multiple cores have boosted performance mostly because the typical user has multiple applications running at a time. But as the number of cores increases, the beneficial effects diminish dramatically.
In addition, most applications these days are not CPU bound. Having eight cores doesn't help you much when three are waiting on socket calls, four are waiting on disk access calls and the last is waiting for the graphics card.
The cake is a pie
It's not just a matter of making your app multithreaded; it's completely changing your algorithms so that they take advantage of multiple processors. I've only taken one parallel programming course at university, so I'm by no means an expert, but I'll give what insight I have. You can't just take a standard sort algorithm and run it multithreaded. You have to change the entire algorithm. In the end, you end up with something whose running time beats the n log(n) of a single-processor sort, even though the total work across all processors doesn't shrink. However, this type of programming, where you break up the dataset, sort each piece, and then gather the results, can be very difficult. Many debuggers don't deal well with multiple threads, which adds an extra layer of difficulty to the whole problem. Granted, I don't think that we really need this level of multithreadedness, but I think that's what the article is referring to. I think that 10+ core CPUs will only really help those of us who like to do multiple things at the same time. I think it would even be beneficial to keep most apps tied to a single CPU so that a runaway app couldn't take over the entire computer.
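For what it's worth, the break-up, sort, gather shape described above looks roughly like this. It's a sketch under assumptions, not a tuned implementation: the function name and the default worker count are made up, and because CPython's global interpreter lock keeps threads from running pure-Python code in parallel, this shows the structure rather than a real speedup (a process pool would be needed for that).

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def parallel_sort(data, workers=4):
    """Split a list into chunks, sort each chunk in a pool, then merge."""
    if workers < 1:
        raise ValueError("workers must be at least 1")
    if len(data) < 2:
        return list(data)

    # Scatter: cut the input into at most `workers` contiguous chunks.
    chunk_size = -(-len(data) // workers)  # ceiling division
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]

    # Sort each chunk independently; the chunks share no state.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        sorted_chunks = list(pool.map(sorted, chunks))

    # Gather: a k-way merge of the already-sorted chunks.
    return list(heapq.merge(*sorted_chunks))
```

Note that the final merge is still done by one thread, which is exactly the kind of serial step that limits how much the extra cores can help.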
Anthropic principle: We see the universe the way it is because if it were different we would not be here to see it.
processors with more than eight cores, possible as soon as 2010 -- will transform the world of personal computing....
Translation:
Code will get even more inefficient and bloated, and will require faster hardware to do the same thing you are doing now. While I'm all for better and faster computer hardware, most if not all Jane and Joe Sixpack users will never need supercomputer power to surf the net, read e-mail and watch videos.
"I bow to no man" - Riddick
A guy who's on the C++ standards committee AND works for Microsoft.
Actually, according to the latest Dr. Dobb's, Herb is the *chair* of the ISO C++ Standards committee. (He had an article on using lock hierarchies to avoid deadlock.)
He's really going to know what he's talking about, then.
As chair of the committee, I'd say there's a pretty fair chance that he *does*.
I really love people who bash things just because Microsoft is involved. Contrary to what seems to be a popular belief here, they have some incredibly intelligent people there who are very good at what they do.
Everything I need to know I learned by killing smart people and eating their brains.
It's not quite like that.
On modern systems, threads are themselves first-class constructs, and it runs somewhat like this:
A process has things like memory-tables for virtual memory, handles for objects, files, socket connections, etc. A process always contains at least one thread (this isn't always true while a process is being set up or torn down, but it's true when most anyone's code is running).
A thread lives in a process and generally has a stack (in the host process's virtual address space, so every thread in the process can read it) and some thread-local storage to make life easier for some APIs (you don't need to care about this in most cases). This means that threads can use virtual addresses for memory interchangeably with other threads in the same process.
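A small sketch of that split between shared memory and thread-local storage, using Python's `threading` module as a stand-in for the OS-level picture (the `run_demo` function is made up for the example):

```python
import threading


def run_demo(thread_count=4):
    shared = []                # one object, reachable from every thread in the process
    local = threading.local()  # each thread gets its own private attributes on this
    lock = threading.Lock()

    def worker(n):
        local.value = n        # visible only to the thread that set it
        with lock:
            shared.append(local.value)  # every thread appends to the same list

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(thread_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # The main thread never set local.value, so it does not see the workers' values.
    main_sees_local = hasattr(local, "value")
    return sorted(shared), main_sees_local
```

Every worker's write lands in the one shared list, while the value each worker stored in thread-local storage stays invisible to the main thread.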
Additionally, some operating systems support fibers. A fiber is like a thread except that it is scheduled explicitly or cooperatively (not quite the same thing) by the program itself rather than preemptively by the OS. Fibers use even less memory than threads, and you really don't have to care about them.
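If cooperative scheduling sounds abstract, generators give a rough feel for it. This is only an analogy, not an OS fiber API, and the `task` and `run_round_robin` names are invented for the example: each task runs until it explicitly yields, and nothing ever interrupts it in between.

```python
from collections import deque


def task(name, steps, log):
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield  # explicit hand-off back to the scheduler


def run_round_robin(tasks):
    """Run generator-based tasks in turn until all of them have finished."""
    queue = deque(tasks)
    while queue:
        current = queue.popleft()
        try:
            next(current)      # run the task up to its next yield
        except StopIteration:
            continue           # task finished; drop it from the rotation
        queue.append(current)  # task yielded; put it at the back of the line
```

A task that never yields would hog the scheduler forever, which is the main trade-off of cooperative multitasking compared with preemptive threads.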
When you're in, say, Visual Studio, there's a "threads" window for all of the threads of the process that you are debugging. You can end up stepping through code on one thread while other threads are running.
Modern hardware designs lead to interesting performance side effects from cache and memory locality. It's not quite as hard as on systems that have asymmetric access to resources (e.g. PlayStation 2), but it makes for fun work.