Intel's Single Thread Acceleration
SlinkySausage writes "Even though Intel is probably the industry's biggest proponent of multi-core computing and threaded programming, it today announced a single thread acceleration technology at IDF Beijing. Mobility chief Mooly Eden revealed a type of single-core overclocking built in to its upcoming Santa Rosa platform. It seems like a tacit admission from Intel that multi-threaded apps haven't caught up with the availability of multi-core CPUs. Intel also foreshadowed a major announcement tomorrow around Universal Extensible Firmware Interface (UEFI) — the replacement for BIOS that has so far only been used in Intel Macs. "We have been working with Microsoft," Intel hinted."
For a moment, I hoped Intel had come out with something like AMD's rumored reverse-Hyperthreading. That would be a real revolution!
Nuffsaid
________
Don't know about his cat, but Schroedinger is definitely dead.
It makes perfect sense that you'd still try to speed up single-threaded applications. After all, if you have 4 cores, then any speedup to one core is a speedup to all of them. I realize that's not what this article is about. In this case, they are speeding up one at the expense of the other, but the article's blurb makes it sound like Intel shouldn't be interested in per-core speedups when that is clearly false.
EFI is used by more than just Apple. For example, HP Itanium systems use EFI. By virtue of being "extensible", EFI is vastly better than the BIOS which has frankly failed to evolve since Compaq reverse engineered it in the early 1980s.
It is well past time that BIOS went to the grave.
Really. I know Google is hard to use, but even Wikipedia would have given some detail on EFI history. (Hint: Itanium only ever used EFI). And it turns out that Macs are not even the first x86 machines to use it, either:
Intel's "Enhanced Dynamic Acceleration Technology" is a triumph of marketing. Notice how the focus is on the transition where one core becomes inactive and the other one speeds up. This is the good transition. The other transition, where the chip workload increases & voltage/frequency are limited to keep within a power envelope, is called "throttling" and is much disliked in the user community.
Don't get me wrong, this is valuable technology. It is important that microprocessors efficiently use the power available to them. Having a choice on a single chip between a high-performance, high-power single-thread engine & a set of lower-performance, lower-power engines has great promise. But, the way this is presented is a big victory for marketing.
The article suggests that this technology makes 1 core run twice as fast by basically disabling the second core for a while. They go on to 'prove' how effective it is by running a photo processing thing that they don't explain. It runs twice as fast this way.
So... If they can have 2 cores at full speed, or 1 core at double speed... WHY THE FUCK do they have 2 cores in the first place?
"If you make people think they're thinking, they'll love you; But if you really make them think, they'll hate you." - DM
While I am all for having something a bit more intelligent than BIOS to init a computer, I can't help but wonder... Does this UEFI integrate DRM functions? Is this the Trojan Horse that will make all computers DRM-enabled?
Inquiring minds want to know!
The right to offend is far more important than the right not to be offended. (Rowan Atkinson)
It seems like a tacit admission from Intel that multi-threaded apps haven't caught up with the availability of multi-core CPUs.
Or maybe Intel, unlike the story submitter, knows that many apps simply do not lend themselves to multithreading and parallelism. It's not about "catching up".
Multi-core for multithreaded apps? Check.
Trying to get each core as fast as possible for when it's only used by one single-threaded app? Check.
Makes sense to me.
ClutterMe.com - easiest site creation on the Net. Just click and type.
Ahhh, journalism at its finest: "The new chips will be able to overclock one of the cores if the other core is not being used." Then two paragraphs later: "This is not overclocking. Overclocking is when you take a chip and increase its clock speed and run it out of spec. This is not out of spec."
That said, this seems to make perfect sense to me. If they're able to pump all that power into a single core while the other one is asleep/idle, all while keeping it within its operating parameters, then I'm all for it.
This guy's the limit!
Why should they? The advent of multicore CPUs won't actually hurt single-threaded apps. They just won't get any faster. For most things, that's fine. Legacy apps that aren't changing are most likely already fast enough. Besides, not everything can be parallelized properly, anyway. Multithreaded applications will become more popular, but I think this trend will affect new applications much more than old ones because it's just not that important. Even new apps don't necessarily need parallelization because many things are "fast enough" on a single core.
By the way, I actually hope that many things never become multithreaded. In my experience, most coders simply aren't capable of thinking threading through clearly. For many people, the concept is just too complex. Hopefully, compilers will improve to the point where many things can be parallelized without the coder having to know very much, if anything, about the threading involved, but, today, we're nowhere near that. We desperately need higher-level threading primitives in computer science.
"We have been working with Microsoft," Intel hinted."
Now I know to avoid it.
I doubt it. My reading of the article is that the CPU detects when only one core is in use and does everything itself. But, even if it does require some level of OS support, I wouldn't worry about Linux's support of it (or of UEFI, for that matter, as Linux runs quite well on Macs and Intel does a good job of supporting Linux, anyway). Linux even has support for hotplugging CPUs, so, even if it comes to that (and I doubt it will), then it should still work.
Any change in a CPU's implementation should not be observable to anyone unless the observer knows to look for it (e.g. with the CPUID instruction). Intel won't release a chip that breaks existing apps. Besides, if you think about it, if apps work on a single-core CPU, why shouldn't they work on a dual-core CPU with one core disabled?
As many slashdotters are in software development or something related, we should all be grateful that multi-core processors are becoming so prevalent, because it will mean more jobs for hard-core code-cutters.
The paradigm for using many types of software is pretty well established now, and many new software projects can be put together by bolting together existing tools. As a result of this, there has been a lot of hype about the use of high level application development like Ruby on Rails, where you don't need to have a lot of programming expertise to chuck together a web-facing database application.
However, all the layers of software beneath Ruby on Rails are based on single-threaded languages and libraries. To benefit from the advances of multi-core technology, all that stuff will have to be brought up to date, and of course making a piece of code use a number of processors well is often a non-trivial exercise. In theory, it should mean many more jobs for us old-schoolers, who were building web/database apps when it took much more than 10 lines of code to do it...
Peter
With all this talk of multi-threading on multi-core CPUs, Slashdotters appear to have forgotten that we all run multi-tasking operating systems. An OS isn't forced to schedule all of the threads of a single application between cores: it's perfectly capable of spreading several different single-threaded applications between cores, too.
And no, EFI didn't appear first on Intel Macs. Intel Macs weren't even the first x86-based machines to employ it.
In my experience, most coders simply aren't capable of thinking threading through clearly
;-)
I agree completely, though you can expect to catch some flak for that one, from the hordes of poor coders who think nothing (or rather, who don't think about the implications) of splitting off another thread to boost performance (even in a single core environment).
Personally, I consider myself a damned good coder - And I avoid multithreading wherever possible. If I really need the raw CPU power, I'll usually try to model it as a full slave process before resorting to messy threading.
We desperately need higher-level threading primitives in computer science.
We've had them for decades - Just look for multiprocessor support, and you have implicit multithreaded support automatically.
As one "mature" implementation, we could all start coding in HPF. I'd personally rather gnaw my own right leg off, but, to each their own.
Well, yes and no. I think the easiest model for multithreading today is message passing, but it doesn't suit all needs and requires you to design your app to support it from the start. Most mainstream languages (read C/C++, Java, and .NET) don't really support much beyond your basic mutex, semaphore, and monitor. There are a few other things out there that provide various ways of doing things, but none are universal and none seem to have really caught on.
What we really need is either a language that can express things in such a way that the compiler can easily make good decisions about what can be parallelized, or a compiler that can do that with existing languages. I think that the latter approach may prove impossible. To make informed decisions about threading, a compiler really needs to know things about the data, and most procedural languages just don't cope with that very well.
It seems that HPF may provide some of these things already. I did a few quick Google searches and it seems interesting, but I wonder how much better it is than current work that is being done on auto-vectorization of loops and such in modern compilers. I'll have to look into that language more closely before I can really draw any conclusions. I believe that IBM has been trying to do some interesting work in this area with the Cell processor, too, and I suspect that's why Sony makes interesting statements about how the true power of the Cell will never be fully realized.
Regardless, the next decade is going to be an interesting one for compiler writers, I suspect.
Good lord, let me sell all my web, application, and DB servers then!!!! I've overpaid for 32 CPU systems!!!! ACK!!!
The cesspool just got a check and balance.
Seriously, several constructs in Fortran are designed specifically for parallel execution. The language itself makes it hard to write code that the compiler can't heavily optimise. There's a reason why variable aliasing is strongly controlled in Fortran and why function parameters have an 'intent' attribute. Then there are constructs such as WHERE, which is by its very nature implicitly a parallel set of operations.
Concurrent applications needn't be so difficult to program. Take a look at the actors model and STM.
What's unfortunate is that we're stuck on this idea that concurrency == multiple threads w/shared state. With that approach, sure, apps will never scale. You're right, we do need higher-level threading primitives. I'm just not so sure they're all at the compiler level.
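To make the actor idea above a little more concrete, here is a minimal sketch in Python. It is not a real actor library: the CounterActor class, its message names ("add", "get", "stop"), and run_demo are all made up for illustration. The point is only that the counter is touched by a single thread, so the senders never need a lock around it.

```python
import queue
import threading


class CounterActor:
    """A toy actor (hypothetical): owns its state and handles one message at a time."""

    def __init__(self):
        self._inbox = queue.Queue()
        self._count = 0  # only ever touched by the actor's own thread
        self._thread = threading.Thread(target=self._run)
        self._thread.start()

    def send(self, message):
        """Post a (kind, payload) tuple to the actor's inbox."""
        self._inbox.put(message)

    def _run(self):
        while True:
            kind, payload = self._inbox.get()
            if kind == "add":
                self._count += payload
            elif kind == "get":
                payload.put(self._count)  # reply on the caller's private queue
            elif kind == "stop":
                break

    def ask_count(self):
        """Send a 'get' request and block until the actor replies."""
        reply = queue.Queue(maxsize=1)
        self.send(("get", reply))
        return reply.get()

    def stop(self):
        self.send(("stop", None))
        self._thread.join()


def run_demo(senders=4, per_sender=1000):
    """Several threads send 'add' messages; none of them shares the counter."""
    actor = CounterActor()

    def worker():
        for _ in range(per_sender):
            actor.send(("add", 1))

    threads = [threading.Thread(target=worker) for _ in range(senders)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    total = actor.ask_count()  # queued after every 'add', so it sees them all
    actor.stop()
    return total


if __name__ == "__main__":
    print(run_demo())
```

The inbox is the only shared object, and queue.Queue does its own locking. Whether this scales better than shared state depends heavily on the workload; the sketch only shows the shape of the model.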
In my experience, most coders simply aren't capable of thinking threading through clearly
I agree completely, though you can expect to catch some flak for that one, from the hordes of poor coders who think nothing (or rather, who don't think about the implications) of splitting off another thread to boost performance (even in a single core environment).
Personally, I consider myself a damned good coder - And I avoid multithreading wherever possible. If I really need the raw CPU power, I'll usually try to model it as a full slave process before resorting to messy threading.
You may be a good coder, but you apparently fall into the majority camp by your own admission. Not that there's anything wrong with that though. You at least realize that multi-threading isn't your thing.
We desperately need higher-level threading primitives in computer science.
As one "mature" implementation, we could all start coding in HPF. I'd personally rather gnaw my own right leg off, but, to each their own.
As many folks pointed out to me previously, Eiffel seems to be pretty nice in this arena. I've never seen a production use of it, but who's to say it's not the next big thing? (Perhaps 3+M Java coders?)
The major issue with multi-threading remains though, and that's identifying the parallel processes. Take a series of sequential code blocks that involve retrieving pieces of information from several sources. If those retrievals are independent of each other, you can retrieve all the pieces concurrently (in parallel) and then sequence them together when all retrievals are done. Now the process takes the time of the longest retrieval plus assembly, versus the sum of all retrievals plus assembly. This type of process is quite common in enterprise systems working off of several DBs. Putting such code in a slave process requires inefficiently messaging results back to the calling process, and adds unnecessary overhead. This is but one case where multi-threading helps performance significantly. I'm not sure that something like Eiffel would make this code any easier to write since the bulk of the multi-threaded work is in the design itself.
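The "longest retrieval versus sum of retrievals" point above can be sketched in a few lines of Python. The source names and latencies below are invented stand-ins for independent database queries, and time.sleep simulates the blocking call; nothing here comes from a real system.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical sources and latencies in seconds; stand-ins for independent DB queries.
SOURCES = {"customers": 0.3, "orders": 0.2, "inventory": 0.1}


def fetch(name):
    """Simulate a blocking retrieval from one source."""
    time.sleep(SOURCES[name])
    return f"{name}-data"


def fetch_sequential():
    """Retrieve one source after another: total time is roughly the sum."""
    return [fetch(name) for name in SOURCES]


def fetch_concurrent():
    """Retrieve all sources at once: total time is roughly the slowest one."""
    # map() yields results in input order, so the assembly step sees the
    # same ordering as the sequential version.
    with ThreadPoolExecutor(max_workers=len(SOURCES)) as pool:
        return list(pool.map(fetch, SOURCES))


def timed(fn):
    """Return (result, elapsed_seconds) for a zero-argument callable."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


if __name__ == "__main__":
    _, t_seq = timed(fetch_sequential)
    _, t_con = timed(fetch_concurrent)
    print(f"sequential: {t_seq:.2f}s  concurrent: {t_con:.2f}s")
```

The hard part is still the design step the comment describes: deciding that the retrievals really are independent. The thread pool only helps once that is settled.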
The cesspool just got a check and balance.
What this amounts to is taking a part that is qualified to run at, say, 2.8GHz, and selling it with a default clock of 2.2GHz in order to meet TDP. Then, when one core is disabled, you crank up the other core's clock to 2.8GHz and stay within TDP. This sounds like a good idea for mobile computing, since power (i.e. battery life) is by far the most important thing. But for servers, I think you'd want to sell as many chips as you can with the highest rated clock freq, since those are higher margin.
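The arithmetic behind that can be written down as a toy model. This is a deliberately crude sketch: it assumes power grows linearly with clock frequency and that an idle core draws nothing, neither of which holds on real silicon (voltage and leakage matter a lot). The 2.2GHz and 2.8GHz figures are the "say" numbers from the comment above; the function names and watts_per_ghz are made up.

```python
def package_power(active_cores, freq_ghz, watts_per_ghz=1.0):
    """Toy model (assumption): power is linear in frequency, idle cores draw zero."""
    return active_cores * freq_ghz * watts_per_ghz


def max_single_core_clock(budget_watts, qualified_max_ghz, watts_per_ghz=1.0):
    """Highest clock one core may run at: capped by both the power budget
    and the frequency the part was qualified for."""
    return min(qualified_max_ghz, budget_watts / watts_per_ghz)


# The budget is whatever both cores draw at the default clock.
budget = package_power(active_cores=2, freq_ghz=2.2)
# One core boosted to its qualified clock, the other asleep.
boosted = package_power(active_cores=1, freq_ghz=2.8)

if __name__ == "__main__":
    print(f"budget: {budget:.1f} units, one core boosted: {boosted:.1f} units")
    print(f"single-core clock allowed: {max_single_core_clock(budget, 2.8):.1f} GHz")
```

Under these assumptions the boosted core stays well inside the budget, and the qualified frequency, not the power envelope, is what limits the boost.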
Some people didn't get it. Here:
This chip has to throttle itself when you use all the cores. (probably a power/heat issue)
People hate throttling. Throttling is not marketable.
Intel marketing turned things around, saying that the chip speeds up (a.k.a. "stops throttling") when running single-threaded apps. Speeding up is good! It's like the old turbo buttons.
It's a sane idea. I'd been expecting to see chips that can't run at full speed continuously because of heat issues; this is pretty much the same thing. I should've patented it...
Just for the Fortran-ignorant reader, note that there have been two Fortran standards since F90: 95 and 2003.
Also, the array operations nicked from APL in modern Fortran enable a lot of implicit parallelism, as does idiomatic Fortran's referential transparency.
DON'T base your opinion of Fortran on GNU Fortran - it'd be like taking Emacs Lisp to be the state-of-the-art in Lisp. The Intel Fortran compiler can do magic things.
I understand that it doesn't work at this point, sorta like "don't cross the streams" from Ghostbusters... But really, we're talking about a long series of math problems at this point, why not interleave? I understand the math is hard, that's why Intel has all of those PhDs. Getterdun. I wants me some Quake 9 at 4.2 billion frames per second. Plus, programming multithreaded is all superhardish!
And besides, most modern OSes basically relegate the BIOS to the back burner. It's not like we're still calling BIOS interrupts from DOS anymore.
It's not as good as you hope. I have three new machines all with BIOS bugs that are a real problem: a SiS mobo that doesn't set up my MTRR registers correctly and so causes the machine to run murderously slow unless I tell the kernel to map out the last bit of RAM or set up my own MTRR registers by hand; an Asus mobo that causes all kinds of problems and kernel panics on the IDE CD-ROM device unless it's jumpered slave, on the secondary bus, and on the end of a single-headed cable (yeah, that was easy to figure out); and an nVidia chipset with BIOS bugs that cause the third and fourth SATA drives in a server to drop dead if they're heavily used (Tyan has sent us new BIOS flashes to try to fix this 'known problem').
My only success (that is, the gear actually works without crazy bugs) in the past couple years has been with all-Intel Mobos and HP Athlon boards.
My God, it's Full of Source!
OUTSIDE_IP=$(dig +short my.ip @outsideip.net)