Will Pervasive Multithreading Make a Comeback?
exigentsky writes "Having looked at BeOS technology, it is clear that, like NeXTSTEP, it was ahead of its time. Most remarkable to me is the incredible responsiveness of the whole OS. On relatively slow hardware, BeOS could run eight movies simultaneously while still being responsive in all of its GUI controls, and launching programs almost instantaneously. Today, more than ten years after BeOS's introduction, its legendary responsiveness is still unmatched. There is simply no other major OS that has pervasive multithreading from the lowest level up (requiring no programmer tricks). Is it likely, or at least possible, that future versions of Windows or OS X could become pervasively multithreaded without creating an entirely new OS?"
I still hate that BeOS went belly up. It was a great operating system but was crushed before it ever got very far. The hardware support was also amazing: it would run winmodems and other Windows-only hardware. I've never tried writing an operating system, but I hope some of the features from BeOS make it into Linux/OS X. One interesting thing to note is that Be was originally a Mac alternative and was only later moved to x86.
Another cool operating system to check out is MenuetOS... it is written entirely in assembly! Very fast boot times, and the GUI and everything fits easily on a floppy!
It isn't really the pervasive multithreading that does the job on responsiveness for BeOS, nor does the "two threads per window" thing (which I think is what the poster means by "pervasive multithreading") avoid "programmer's tricks". In fact, you have to be just as careful as if you were developing on Windows and had spun up a background thread. One issue for BeOS developers was the amount of hard thinking you had to do to perform simple tasks in a pervasively multithreaded environment, when you still have to deal with all the pitfalls of lock-based programming.
However, taking only a few cycles to spin up or kill a thread (rather than the 10,000 plus it takes Windows), or perform a context switch, is a significant help. (There used to be an interesting article benchmarking those things on the Be website, but I can't find it any more).
MS have also added some more interesting stuff to the scheduler in Vista, which helps with uninterrupted sound or movie playback, so at least some of that stuff is possible without a complete redesign.
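For readers who haven't met the "spin up a background thread" pattern mentioned above, here is a minimal sketch in Python. It is an illustration only, not BeOS or Windows API code: `slow_task` and `run_event_loop` are hypothetical names, and the "UI loop" is just a loop that counts how often it was free to handle input while the worker was busy.

```python
import queue
import threading
import time


def slow_task(n):
    """Stand-in for long-running work (hypothetical), e.g. decoding a file."""
    time.sleep(0.05)
    return n * n


def run_event_loop(jobs):
    """Toy 'UI' loop: hands jobs to a worker thread and keeps ticking meanwhile."""
    results = queue.Queue()

    def worker():
        for job in jobs:
            results.put(slow_task(job))
        results.put(None)  # sentinel: no more results

    threading.Thread(target=worker, daemon=True).start()

    ticks = 0        # times the loop was free to handle input
    finished = []
    while True:
        try:
            item = results.get(timeout=0.01)
        except queue.Empty:
            ticks += 1  # nothing ready yet; a real loop would process events here
            continue
        if item is None:
            break
        finished.append(item)
    return finished, ticks
```

Calling `run_event_loop([1, 2, 3])` returns the squared values in order along with a tick count. The point is only that the loop never blocks on the slow work. The shared queue is the one place the two threads meet, which is the kind of care the comment above is talking about.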
A few years ago, on a dual Celeron 366 MHz with 256MB of RAM, I went out of my way to try to crash BeOS. I opened about 120 OpenGL demos with only a minor decrease in performance. After I inherited that mainboard, processors and RAM from my uncle and increased the RAM to 512MB, the same test ground both FreeBSD and Linux to a halt.
Well, it's not really an OS issue. Sure, the OS has to provide some underpinnings so that programmers can take advantage of it. But I think most of it is already mature enough for applications to use. Why don't they use what is already there? I mean, everybody whines about how unresponsive X is. Until X is rewritten to be multithreaded, you won't see the UI responsiveness that you see in BeOS. On your typical Linux box, X is the real bottleneck. There is no point in rewriting Qt or any other UI toolkit until X is fixed. You won't be able to replicate the multiple videos trick unless X is fixed first and then the applications are modified to use multithreading to its fullest.
Too true. The Linux kernel beats the BeOS kernel in threading benchmarks, but the entire BeOS GUI stack (kernel, display, windowing, controls) was designed with multithreading in mind. X/KDE/GTK et al. are relics based on 1986-era computing.
Recall that this was the effect of Intel's NSP (the ill-named "Native Signal Processing"), a real-time multithreaded scheduler inserted at the device-driver level of Windows. Combined with something called VDI (Video Direct Interface), which allowed applications to bypass the Microsoft GDI graphics layer in certain ways, this allowed multiple video, graphics, and audio streams, mixed and synchronized, on circa-1993 computers, something largely not even possible today. While NSP was intended primarily for media streams, its technology was broadly applicable to more responsive and vivid interfaces. The result was Microsoft's threat to cut off Intel from future Windows development and specifically to withhold 64-bit support from Itanium, to more publicly support AMD (which they did, for a while), and to threaten any OEMs using the code with withdrawal of Microsoft software support. Much of this was detailed in the Microsoft anti-trust trial and the accompanying discovery documents. Under this pressure, Intel abandoned the software, transferring VDI to Microsoft (it formed the core of what was later called DirectX), and outright killing NSP. Andy Grove admitted to Fortune magazine, "We caved." (http://cyber.law.harvard.edu/msdoj/transcript/summaries2.html)
This is not to suggest that this was the best or only way to do this, or that others haven't done it and done it well. But despite the best efforts of Linus and friends, Windows remains the dominant desktop OS, and Windows continues to be built on a base of 1970s-era operating system principles. Microsoft has built, and continues to build, substantial barriers to anyone trying to substantially modify the behaviour of Windows at the HAL/device layer. Whether VMware and equivalent virtualization technologies are finally a camel's nose under the tent edge remains to be seen. But as long as Windows remains the dominant desktop OS, you can expect the desktop to lag 10-15 years (at best) behind the state of the art in OS, GUI, and real-time developments.
I'm a CS grad student at the University of North Carolina. I've never used BeOS, but I'm confident that responsiveness will increase, because the work I'm doing right now is intended to address this very issue.
The thing that makes multithreaded programming so difficult is concurrency control: it's extremely easy for programmers to screw up lock-based methods, deadlocking the entire system. There are newer methods of concurrency control that have been proposed, and the most promising method (in my opinion) is 'Software Transactional Memory', which makes it almost trivial to convert correct sequential code to code that is thread-safe. Currently, there are several 'High Performance Computing Languages' in development, and to my knowledge, they all include transactional memory.
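To give a rough flavour of the transactional idea, here is a toy sketch in Python. It is not a real STM: it handles a single variable, and it still uses a short internal lock for the commit step. A real implementation tracks read and write sets across many variables. `TVar`, `atomically` and `demo_counter` are hypothetical names made up for this sketch. What it does show is the optimistic read, compute, commit-or-retry cycle that lets the caller write plain sequential update code.

```python
import threading


class TVar:
    """A single transactional variable: a value plus a version counter."""

    def __init__(self, value):
        self._value = value
        self._version = 0
        self._commit_lock = threading.Lock()  # held only for brief reads/commits

    def read(self):
        with self._commit_lock:
            return self._value, self._version

    def try_commit(self, new_value, seen_version):
        with self._commit_lock:
            if self._version != seen_version:
                return False  # someone else committed first; caller retries
            self._value = new_value
            self._version += 1
            return True


def atomically(tvar, update):
    """Run `update` optimistically, retrying until it commits cleanly."""
    while True:
        value, version = tvar.read()
        if tvar.try_commit(update(value), version):
            return


def demo_counter(threads=4, increments=500):
    """Several threads bump one counter; no increment is lost."""
    counter = TVar(0)

    def work():
        for _ in range(increments):
            atomically(counter, lambda v: v + 1)

    pool = [threading.Thread(target=work) for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return counter.read()[0]
```

The update function (`lambda v: v + 1`) is ordinary sequential code with no locking in sight. Conflicts are detected by the version check and handled by retrying, so the caller never holds a lock while computing and cannot deadlock on one.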
The incredible difficulties involved in making chips faster are precipitating a shift to multicore machines. The widespread prevalence of these machines, coupled with newer concurrency control techniques, will undoubtedly lead to an increase in responsiveness.
If/when the CPU designers currently screaming "more threads, more threads!" at us coders get around to implementing efficient hardware transactional memory, painless fine-grained parallelism may become a reality. Until then, STM may be fine for very large applications on systems with huge memories and lots of cores, but probably isn't an option for the average desktop.
But STM does present some intriguing possibilities for distributed parallel environments (think STM + DSM).
What you say may have been true some years ago. Nowadays Linux is far more advanced technically than Windows with respect to multithreading, and even more so with respect to multi-processor / multi-core support.
E.g. gcc does thread-safe initialization of local static variables -- Visual C++ does not. Linux runs on up to 4096 processor machines -- Windows does not. Linux can be run tickless (to some extent) -- Windows cannot. Linux has support for the SUSv3 realtime API with support for nanosecond resolution timers -- Windows has nothing comparable. Linux will shortly have the new completely fair scheduler (CFS), where a user reported that the system is still quite usable with 32k busy threads running in parallel -- Windows would not be.
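The guarantee behind "thread-safe initialization of local static variables" is that the initializer runs exactly once even if several threads reach it at the same time. The sketch below shows that guarantee by analogy in Python, using a lock-guarded, double-checked one-time initializer. It is not what gcc emits, and `Once` and `demo_once` are hypothetical names for this illustration.

```python
import threading


class Once:
    """Runs `factory` at most once, even when many threads call get()."""

    def __init__(self, factory):
        self._factory = factory
        self._lock = threading.Lock()
        self._done = False
        self._value = None

    def get(self):
        if not self._done:              # fast path once initialized
            with self._lock:
                if not self._done:      # re-check: another thread may have won
                    self._value = self._factory()
                    self._done = True
        return self._value


def demo_once(thread_count=8):
    """Many threads race to get(); the initializer is invoked only once."""
    calls = []

    def expensive_init():
        calls.append(1)                 # record each invocation
        return {"config": "loaded"}

    once = Once(expensive_init)
    results = []
    threads = [threading.Thread(target=lambda: results.append(once.get()))
               for _ in range(thread_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(calls), results
```

Without the lock and the second check, two threads could both see "not done" and both run the initializer.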
Sure, but most desktops don't run more than one or two apps at a time. So, 2-4 cores is all that you get "for free" without new apps. Sure, if I'm building a web server application, it'll scale much more gracefully, but it already scales rather gracefully.
Are you serious? The idea is to have all your programs running all the time, and interact with them whenever you want with instantaneous response. Not to mention that most apps people run nowadays either are servers (P2P, LAN Shares, etc), clients that sit around listening to servers (IM) or querying them with frequent regularity (Email Client). And the progression is towards having personal servers that you can connect to using either a local or remote client.
The next generation of computing is going to come from the vast multitude of developers who are accustomed to writing client-server applications applying what they know to computers that behave like a server cluster. They are better equipped to approach the problems and rewards of this architectural progression than the guy who has been working in the traditional application space. Now, that's a generalization that's full of exceptions, but it'll still be proven true on the wider scale.
"I was nearly crucified when I suggested to my boss that we recode a piece of an application in C so it scales better than the current shitty VB COM version. He just looked through me and said: add another server! A lot of today's code is written by people who don't even understand how the code is getting executed"
Was it more cost effective to have a programmer recode it in C (which includes the required maintenance) or use the less optimal but easier to maintain VB COM? I'm all for using C over C#, Java, and VB, but sometimes you need to look at the situation from a business standpoint.
This is good, I like this political stuff:
MS-DOS 1.0 was Herbert Hoover, aloof to the problems of the common man but friend of the engineer in all of us. Also discovered Transformers.
Mac OS 7-8-9, all Franklin Roosevelt, very competent, led us through difficult times, but left a legacy of programs which have become quite a mixed bag.
Windows 3.1, Dwight Eisenhower, amiable enough, competent, but leaving historians (and many contemporaries) very wanting.
Windows 95 thru ME, Lyndon Johnson, one of the boys, very able at getting things done, but in the end a disaster, rightfully ceding his throne.
Windows NT, Richard Nixon, the archetypal back-room politician, ruthless, and ultimately brought down by little faults, but many believe he was a great president and did much to modernize the Republican Party.
Windows XP, Ronald Reagan, everybody who hates him never met him, he could charm anyone, the Great Communicator. Bought Iranian weapons for contras with drug money.
Mac OS X, Bill Clinton, cheerful and smart, if not the most productive. Known for his speeches.
I think you misunderstand the ways in which STM is relevant to this sort of issue. Sure, you can do full-blown STM with crazy commits and rollbacks that are large and complex, but that isn't what causes the problems with most threading issues. Really, the primary benefit of STM is just to give an understandable and intuitive means to manage simple things that programmers now do with locks, e.g., making sure the other thread doesn't update part of the object while your thread is making some small change to it.
As far as performance goes, the key here is compiler design. Sure, in the fully general case STM may be fairly resource intensive, but most cases aren't the general case. The hope is that compilers can be improved to natively support STM and recognize where simplifying assumptions can be made.
In other words, practical STM is a way to get the compiler to meet the programmer halfway. Compilers can't do auto-parallelization and won't be able to anytime in the foreseeable future, but having programmers deal with very low-level constructs like locks and semaphores is confusing and a waste of time. This is a nice compromise that meets in the middle, at least as long as it is used correctly.
It's ok. Too bad there is no recursive directory support in inotify. Software has to add a watch for every subdirectory of a tree it wants to monitor.
So what? Why do you want to put more functionality into the kernel than necessary? You can write user-mode code around inotify for recursive watches--Beagle does just that. If enough people wanted it, it could be wrapped up as a library.
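A rough sketch of what such a user-mode wrapper might look like, in Python. It makes no claim about how Beagle actually does it. The `add_watch` callback is a hypothetical stand-in for a real `inotify_add_watch()` binding and is assumed to return a watch descriptor. The function names are also made up for this illustration.

```python
import os


def add_recursive_watches(root, add_watch):
    """Call add_watch(path) for `root` and every directory beneath it.

    `add_watch` is a hypothetical callback standing in for a real
    inotify_add_watch() wrapper; it should return a watch descriptor.
    Returns a dict mapping watch descriptor -> directory path.
    """
    watches = {}
    for dirpath, _dirnames, _filenames in os.walk(root):
        watches[add_watch(dirpath)] = dirpath
    return watches


def on_directory_created(parent, name, watches, add_watch):
    """Extend the watch set when an event reports a new subdirectory.

    A real program would call this for IN_CREATE events that have the
    IN_ISDIR flag set. The new directory is walked as well, in case it
    was populated before its watch could be added.
    """
    new_root = os.path.join(parent, name)
    watches.update(add_recursive_watches(new_root, add_watch))
```

The second function is the part that is easy to forget. Directories created after the initial walk need their own watches, and because there is a gap between the create event and the new watch being added, the new subtree has to be walked again.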
That's a pet peeve of mine. Ha-i-ku is three syllables.
There's a sign hanging in the restroom here at work, and I just realized it was a haiku.
Isogutomo
kokoro shizukani
te wo soete
soto ni kobosuna
-Matsutake no Tsuyu
Even when hurried
Quiet your heart
Steady with your hand
And don't spill any on the outside
-Mushroom Dew
Beautiful, isn't it? The English version just says, "We aim to please, so please aim."
Back when BeOS was still cool, and Rhapsody was hot, and NT was still counting by numbers instead of names, I installed BeOS, Rhapsody DR1, and NT 4 on the same hardware... a Pentium with 16MB of RAM... not exactly state of the art but not ridiculous for the time either.
BeOS showed no exceptional capabilities. Both Rhapsody and NT were easily able to run multiple concurrent applications without slowdown, and BeOS was at least as often bottlenecked on I/O.
BeOS was certainly a competent OS design, but the "remarkable" performance was only remarkable when it was compared with the classic Mac OS and mainstream Windows 9x. With those as the "competition", the legend of BeOS has grown over the years, but any contemporary preemptive multitasking OS could do as well.