Will Pervasive Multithreading Make a Comeback?
exigentsky writes "Having looked at BeOS technology, it is clear that, like NeXTSTEP, it was ahead of its time. Most remarkable to me is the incredible responsiveness of the whole OS. On relatively slow hardware, BeOS could run eight movies simultaneously while still being responsive in all of its GUI controls, and launching programs almost instantaneously. Today, more than ten years after BeOS's introduction, its legendary responsiveness is still unmatched. There is simply no other major OS that has pervasive multithreading from the lowest level up (requiring no programmer tricks). Is it likely, or at least possible, that future versions of Windows or OS X could become pervasively multithreaded without creating an entirely new OS?"
I still hate that BeOS went belly up. It was a great operating system but was crushed before it ever got very far. The hardware support was also amazing: it would run winmodems and other windows only hardware. I've never tried writing an operating system, but I hope some of the features from BeOS make it into linux/OSX. One interesting thing to note is Be was originally a mac alternative and was only later moved to x86.
Another cool operating system to check out is MenuetOS... it is written entirely in Assembly! Very fast boot times and the GUI and eevrything fits easily on a floppy!
Get a web developer
It isn't really the pervasive multithreading that does the job on responsiveness for BeOS, and nor does having the "two threads per window" thing (which I think is what the poster is referring to in terms of "pervasive multithreading) avoid "programmer's tricks" - in fact, you have to be just as careful as if you were developing with Windows, and span up a background thread. One issue for BeOS developers was the amount of hard thinking you had to do to perform simple tasks in a pervasively multi-threaded environment, when you're still having to deal with all the pitfalls of lock-based programming.
However, taking only a few cycles to spin up or kill a thread (rather than the 10,000 plus it takes Windows), or perform a context switch, is a significant help. (There used to be an interesting article benchmarking those things on the Be website, but I can't find it any more).
MS have also added some more interesting stuff to the scheduler in Vista, which helps with uninterrupted sound or movie playback, so at least some of that stuff is possible without a complete redesign.
...QuickTime and the AppKit become threadsafe. Which might be a priority for Apple, but then again might not, given that they've had multicore machines for a long time. Cocoa doesn't lend itself well to the UNIX approach of multiple processes, so if we really want to take advantage of multiple cores, Apple's going to need to seriously step up their multithreading support.
The AppKit docs are riddled with notes like "on the main thread" or "some thread will receive this notification". Maybe Leopard will change that.
A few years ago, on a Dual Celeron 366Mhz with 256MB of RAM, I went out of my way to attempt to crash it. I opened about 120 OpenGL demos with only minor decrease in performance. After inherriting that mainboard, processors and RAM from my uncle and then increasing it to 512MB, the same test ground both FreeBSD and Linux to a halt.
Well, it's not really an OS issue. Sure the OS has to provide some underpinnings so that the programmers can take advantage of it. But I think most of it is already mature enough for applications to use it. Why don't they use what is already there? I mean everybody whines about how unresponsive X is. Until X is rewritten to be multi-threaded, you won't see the UI responsiveness that you see in BeOS. On your typical Linux box, X is the real bottleneck. There is no point in rewriting QT or any UI toolkit until X is fixed. You won't be able to replicate the multiple videos trick unless X is fixed first and then the applications are modified to use multi-threading to its fullest.
too true. The linux kernel beats the beos kernel in threading benchmarks, but the entire Be OS GUI stack (kernel, display, windowing, controls) were designed with multithreading in mind. X/KDE/GTK et al are relics based on 1986 era computing.
Recall that this was the effet of Intel's NSP (the ill-named "Native Signal Processing"), a real-time multui-thread scheduler inserted at the device-driver level of Windows. Combined with something called VDI (Video Direct Interface), which allowed applications to bypass the Microsoft GDI graphics layer in certain ways, this allowed multiple video, graphics, and audio streams, mixed and synchronized, on circa-1993 computers, something largely not even possible today. While NSP was intended primarily for media streams, its technology was broadly applicable to more responsive and vivid interfaces. The result was Microsoft's threat to cut off Intel from future Windows development and specifically to withhold 64-bit support from Itanium, to more publically support AMD (which they did, for a while), and to threaten any OEMs using the code with withdrawal of Microsoft software support. Much of this was detailed in the Microsoft anti-trust trial and the accompanying discovery documents. Under this pressure, Intel abandoned the software, transferring VDI to Microsoft (it formed the core of what was later called DirectX), and outright killing NSP. Andy Grove admitted to Fortune magazine "We caved." (http://cyber.law.harvard.edu/msdoj/transcript/sum maries2.html)
This is not to suggest that this was the best or only way to do this, or that others haven't done it and done it well. But despite the best efforts of Linus and friends, Windows remains the dominant desktop OS, and Windows continues to be built on a base of 1970s-era operating system principles. Microsoft has, and continues to, build substantial barriers to anyone trying to substantially modify the behaviour of Windows at the HAL/device layer. Whether VMWare and equivalent virtualization technologies are finally a camel's nose under the tent edge remains to be seen. But as long as Windows remains the dominant desktop OS, you can expect the desktop to lag 10-15 years (at best) behind the state of the art in OS, GUI, and real-time developments.
I'm a CS grad student at the University of North Carolina. I've never used BeOS, but I'm confident that responsiveness will increase, because the work I'm doing right now is attended to address this very issue.
The thing that makes multi threaded programming so difficult is concurrency control - it's extremely easy for programmers to screw up lock-based methods, deadlocking the entire system. The are newer methods of concurrency control that have been proposed, and the most promising method (in my opinion) is 'Software Transactional Memory' which makes it almost trivial to convert correct sequential code to code that is thread-safe. Currently, there are several 'High Performance Computing Languages' in development, and to my knowledge, they all include transactional memory.
The incredible difficulties involved in making chips faster are precipitating a shift to multicore machines. The widespread prevalence of these machines, coupled with newer concurrency control techniques will undoubtedly lead to an increase of responsiveness.
My blog
It was not. The five hours I spent trying to get a simple modem to work in BeOS, with no OS diagnostics to guide me, and very poor support from BeOS the company, was all the proof I needed that BeOS wasn't all it was hyped up to be. I can understand that Be did not have the resources to support every piece of hardware under ths sun, but I found it inexcusable that their support for diagnostics was so incredibly rudimentary. At the time, with Linux (this was 1999 or so), if I had a problem with some hardware I could either read the source (OK Be could never match this since it was proprietary and closed-source so that's not quite fair), or look at the copious amount of system logging that would generally point to the problem (stuff in dmesg, kernel logs, /var/log/messages, lots of tools and documentation to help me out). With BeOS, I was getting pop-up dialogs that just said stuff like "Error 0xFFFFFFFF occurred", with absolutely no useful information whatsoever. It was impossible for me to diagnose the problem no matter how hard I might try because the operating system just wasn't going to give me enough information to go on.
Also BeOS the company didn't respond at all to my requests for help with this. They provided zero technical support to me. Emails went unanswered.
Maybe BeOS had some nice architecture, but there is more to an OS than its handling of threads - much, much more, and I think that BeOS was not even close to ready for prime time. And the developers clearly had glossed over many aspects of an operating system (such as the aforementioned error diagnostics) to get to the pretty demos that the OS was capable of.
If/when the CPU designers currently screaming "more threads, more threads!" at us coders get around to implementing efficient h/w transactional memory, painless fine grain parallelism may become a reality. Until then, STM may be fine for very large applications on systems with huge memories and lots of cores, but probably isn't an option for the average desktop.
But STM does present some intriguing possibilities for distributed parallel environments (think STM + DSM).
007: "Who are you?"
Pussy: "My name is Pussy Galore."
007: "I must be dreaming..."
What you say may have been true some years ago. Nowadays Linux is far more advanced technically than Windows with respect to multi-threading and even more multi-processor / multi-core support.
E.g. gcc does thread-safe initialization of local static variables -- Visual C++ does not. Linux runs on up to 4096 processor machines -- Windows does not. Linux can be run tickless (to some extend) -- Windows can be not. Linux has support for the SUSv3 realtime API with support for nanosecond resolution timers -- Windows has nothing comparable. Linux will shortly have the new completely fair scheduler (CFS) were a user reported that the system is still quite usable with 32k busy threads running in parallel -- Windows would be not.
Sure, but most desktops don't run more than one or two apps at a time. So, 2-4 cores is all that you get "for free" without new apps. Sure, if I'm building a web server application, it'll scale much more gracefully, but it already scales rather gracefully.
Are you serious? The idea is to have all your programs running all the time, and interact with them whenever you want with instantaneous response. Not to mention that most apps people run nowadays either are servers (P2P, LAN Shares, etc), clients that sit around listening to servers (IM) or querying them with frequent regularity (Email Client). And the progression is towards having personal servers that you can connect to using either a local or remote client.
The next generation of computing is going to come from the vast multitude of developers who are accustomed to writing client-server applications applying what they know to computers that behave like a server cluster. They are better equipped to approach the problems and rewards of this architectural progression than the guy who has been working in the traditional application space. Now, that's a generalization that's full of exceptions, but it'll be still be proven true on the wider scale.
-1 Uncomfortable Truth
- GCC.
- Final Cut Express.
iTunes used to be on that list when ripping CDs, but since my last upgrade the CD drive has become the bottleneck. GCC doesn't need to be multithreaded, because I can always add a -j option to my make command and run one instance of it on each code (and a floating spare one or two for when one CPU is waiting for I/O). Final Cut can consume pretty much as much CPU power as it's possible to throw at it, but anything involving video is an embarrassingly parallel problem (decompose along the time access or into macroblocks as you wish).There is no reason to add support for SMP machines to any program that only uses a fraction of a single core's power. If you're doing something in the background then it might be worth spawning off a worker thread to keep the UI responsive, but most other things are better handled with co-routines, which are much easier to reason about (hence the fact that pretty much every GUI toolkit uses some form of them).
When you are not performing embarrassingly parallel computations, threads aren't such a good idea, since you end up with a lot of synchronisation issues that can be avoided by moving to an asynchronous model such as that used by Erlang.
I am TheRaven on Soylent News
"I was nearly crucified when I suggested my boss to recode a piece of an application in C so it scales better than the current shitty VB COM version. He just looked through me and said: add another server! Lot of today's code is written by people who don't even understand how the code is getting executed"
Was it more cost effective to have a programmer recode it in C (which includes the required maintenance) or use the less optimal but easier to maintain VB COM? I'm all for using C over C#, Java, and VB, but sometimes you need to look at the situation from a business standpoint.
This is good, I like this political stuff:
MS-DOS 1.0 was Herbert Hoover, aloof to the problems of the common man but friend of the engineer in all of us. Also discovered Transformers.
Mac OS 7-8-9, all Franklin Roosevelt, very competent, lead us through difficult times, but left a legacy of programs which have become quite a mixed bag.
Windows 3.1, Dwight Eisenhower, amiable enough, competent, but leaving historians (and many contemporaries) very wanting.
Windows 95 thru ME, Lyndon Johnson, one of the boys, very able at getting things done, but in the end a disaster, rightfully ceding his throne.
Windows NT, Richard Nixon, the archetypal back-room politician, ruthless, and ultimately brought down by little faults, but many believe he was a great president and did much to modernize the Republican Party.
Windows XP, Ronald Reagan, everybody who hates him never met him, he could charm anyone, the Great Communicator. Bought Iranian weapons for contras with drug money.
Mac OS X, Bill Clinton, cheerful and smart, if not the most productive. Known for his speeches.
Don't blame me, I voted for Baltar.
>> Let me know when a modern linux system can asynchronously notify a process that a file/directory has been modified.
> You mean--like every one? Beagle uses it, and so does Nautilus. It's been there for a number of years.
It's ok. Too bad there is no recursive directory support in inotify. Software has to add a watch for every subdirectory of a tree it wants to monitor.
As one commenter mentioned many programmers are bad at writing reliable multithreaded code.
Microsoft realized this early on and put a bunch of barriers into windows (and more so into MFC) that are designed to prevent programmers from even writing multithreaded GUIs.
Let me make this clear. If you call an MFC gui routine from a thread other than the main GUI thread it will return without executing. If you "send" or "post" a message to a control from the wrong thread using the MFC versions of send() and post(), it will return without doing anything. Microsoft used thread local memory to prevent programmers from being able to write multi-threaded GUIs.
I've programmed work-arounds many times with user-messages and created programs that were as responsive as BEOS programs. Understand that the fact that most programs' guis lock up for a few seconds when opening files, etc. is the result of MS' decision to not trust programmers with multithreading the GUI.
I've also worked on multiprocessor, high performance server apps, and I know how obscure multithreaded techniques are, and how small mistakes can make software unreliable. I don't entirely blame MS for preferring reliability to responsiveness, but they are in a position where they could educate rather than restrict, and they chose restricting.
> When I was a university, I had classes in.. assembler, Pascal, Concurrent Euclid, Simula, Prolog and C
... eye-opening ... computer-related experience I have ever had. It was the first time I REALLY "got it" -- and understood EVERYTHING that was happening under the hood as a system of disjoint events, acting together in concert.
Concurrent Euclid? Dude, where did you go to school? U of T? In the 80s?
I had the distinct pleasure of having Jim Cordy as a prof when I was an undergrad in the 90s. In particular, studying compilers with the man was the single most
Actually, thinking back to the days reminds me of a funny story I haven't thought about in about a decade. I was taking first year computer science. There was a fellow in my class, smart guy, good C coder.. couldn't see the forest for the trees. In fact, he still owes me a pair of Sony headphones he borrowed about a thousand years ago. Anyhow. He stood up in class one day and asked Cordy something like "What kind of an IDIOT would design a language like TURING?".. "Well, Mr xxx... that idiot would be me".
Haha hahaha
I kind of miss being in school.
But I don't miss stats.
I do miss forging usenet control messages.
Too bad you can't do that any more. Kids are missing so much nowadays!
Do daemons dream of electric sleep()?
I think you misunderstand the ways in which STM are relevant to this sort of issue. Sure you can do full blown STM with crazy commits and rollbacks that are large and complex but that isn't what causes the problems with most threading issues. Really the primary benefit of STM is just to give an understandable and intuitive means to manage simple things that programmers now do with locks, e.g., making sure the other thread doesn't update part of the object while your thread is making some small change to it.
As far as performance the key here is compiler design. Sure in the fully general case STM may be fairly resource intensive but most cases aren't the general case. The hope is that compilers can be improved to natively support STM and recognize where simplifying assumptions can be made.
In other words practical STM is a way to get the compiler to meet the programmer halfway. Compilers can't do auto-parrililization and won't be able to anytime in the foreseeable future but having programmers deal with very low level constructs like locks and semaphores is confusing and a waste of time. This is a nice comprimise to meet in the middle. At least as long as it is used correctly.
If you liked this thought maybe you would find my blog nice too:
It's ok. Too bad there is no recursive directory support in inotify. Software has to add a watch for every subdirectory of a tree it wants to monitor.
So what? Why do you want to put more functionality into the kernel than necessary? You can write user-mode code around inotify for recursive watches--Beagle does just that. If enough people wanted it, it could be wrapped up as a library.
Hmm ... It depends what code would be easier to rewrite. I think that a new GUI to replace X would be a good idea, then yet-another Qt version, so KDE would work with next to no modification. Is it possible to rewrite a multi-threaded X? What would be needed to be replaced? I really don't know : do apps depend on the memory management of X and the fact that it only has one process? Or is it possible to write a fully multi-threaded X-compatible server? So there would just be that one package to rewrite... KDE/Qt has that nice signal/slot thing, it must be easy to write that in a way that makes use of multi-cores.
Making laws based on opinions that stem up from false informations leads to witch hunts.
Utimately it's a version of #1. The Gui blocks if you make the mistake of doing work in the Gui thread!
I've made work around classes to fix that so that it's only a few key strokes in C++ to chain (ie call) a routine that doesn't run in the current (GUI) thread, it runs in a work thread, and to chain back to a completion routine in the gui when its finished.
It works like a charm, and programs can be completely responsive 100% of the time, like BEOS, I suppose. You can be loading a file, and still the menu works and you can move around the subwindows and edit them... And if a window is recalculating, it's still responsive during that - and it redraws the new data, when done.
You have to specifically decide what the program can do and not do while it's calculating, and code that. So there is more work in keeping a program responsive. You have to code for responsiveness.
That's a pet peeve of mine. Ha-i-ku is three syllables.
There's a sign hanging in the restroom here at work, and I just realized it was a haiku.
Isogutomo
kokoro shizukani
te wo soete
soto ni kobosuna
-Matsutake no Tsuyu
Even when hurried
Quiet your heart
Steady with your hand
And don't spill any on the outside
-Mushroom Dew
Beautiful, isn't it? The English version just says, "We aim to please, so please aim."
Use of the words "good", "bad" or "evil" is almost invariably the result of oversimplification.
Back when BeOS was still cool, and Rhapsody was hot, and NT was still counting by numbers instead of names, I installed BeOS, Rhapsody DR1, and NT 4 on the same hardware... a Pentium with 16MB of RAM... not exactly state of the art but not ridiculous for the time either.
BeOS showed no exceptional capabilities. Both Rhapsody and NT were easily able to run multiple concurrent applications without slowdown, and BeOS was at least as often bottlenecked on I/O.
BeOS was certainly a competent OS design, but the "remarkable" performance was only remarkable when it was compared with the classic Mac OS and mainstream Windows 9x. With those as the "competition", the legend of BeOS has grown over the years, but any contemporary preemptive multitasking OS could do as well.
I assume those 8 movies are all small so they all fit in memory and don't let the hard drive become the bottleneck, and low-resolution so they don't engage the tilt bits? Vista may be a bit faster than XP, but that doesn't make it a useful operating system for people who want to go where they want to today, rather than go to whichever sandbox Microsoft has approved today.
That being said, I've had multiple HD resolution videos running on my Linux laptop and desktop, flawlessly, on multiple Beryl cube sides. Vista isn't faster than Linux in any meaningful measure, and is slower in many instances because of it's insistence on DRM and encryption over INTERNAL BUSES.
My blog. Good stuff (when I remember to update it). Read it.
>>Well, you didn't look close enough to the demo: he launched 5 application simultaneously and had them running in a snap and whatever he did, the OS stayed responsive and very fast.
>So what? I can launch 20 applications simultaneously in Linux and have them running in a snap;
Well, I'm using Linux everyday at work, and I tell you: it doesn't feel responsive.
I played with BeOS (quite some time ago now) and it felt smooth, quick, reactive (on a PC ten time less powerful).
> that just isn't a big deal. Whether the OS stays responsive and fast depends on the apps.
Sure, if you ported FF to BeOS, it would suck as much on BeOS as it sucks on the other OS, that's true, but it's also true that using BeOS and its application felt responsive because they designed the applications which came with the OS, the toolkits, the programming guides this way, whereas using Linux or Windows don't feel responsive, and IMHO *this is* a big deal.
>If you launch Firefox, [..] simultaneously on BeOS, I guarantee you it would also bring it to its knees.
The thing is, you can bring an OS to 'its knees' while still making it 'responsive', and the video showed it quite well.. I remember another video where they overloaded the computer so that video rendering was stuttering, but the interface was still smooth, that's a priority issue and BeOS was nicely tuned for desktop load.
>But everybody uses Firefox because, in the end, it's still fast enough.
Well, I started using Opera because I don't think that FF is responsive enough. I don't understand why FF has a bigger marketshare than Opera: I think that Opera is better than FF.
Currently I'm using 50% FF and 50% Opera because I cannot stand Opera's weird tab management scheme (which cannot fully emulate FF tab management), but as soon as they fix this, I'll gladly drop FF until it becomes responsive (I'm not holding my breath)..
Nonsense. A single-threaded program doesn't magically become multi-threaded just because you're running it on a multi-core system. The programmer still needs to do "tricks" (or, hopefully, use a solid concurrency library) in order to create threads. A multithreaded program will run faster if there's more than once core, but even then it tops out if there are more cores than threads.
Oh yeah, and if there are lots of wait states in your program so that most of your threads are idle most of the time, it doesn't matter how many cores you have.
Take the time. It's worth it. It's not like they will become useless in the meantime. While they are learning the new system they can apply it where possible (simpler projects) and continue using the old one to do real work. Then the next generation will come along, having been immersed in it from the beginning, and we can all move on for great justice.
Even Linux, as a new take on Unix, has avoided many of the classic UNIX mistakes and achieved great things that none of them ever touched; it's running on damned near everyone's hardware - they're putting it there themselves! And it's running on cellphones, cameras, etc etc. (Some guys who had designed a digital camera that I met at a job fair told me that they had based it on a uSparc processor, but they didn't tell me what OS it ran...) :)
Anyway, yes, we should just get the hell over it and move on. It's past time. Easy to say, hard to do, absolutely necessary.
"You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"