Distributed Operating Systems?

Posted by Cliff on Monday July 31, 2000 @06:50AM from the pipe-dream-or-useful-technology dept.

ayejay asks: "Are there any models/designs for a totally distributed operating system, possibly utilizing AI to learn patterns of use, resource need, and anything else that might be relevant? What -would- be relevant to such a thing? Given Napster and all the load balancing kernel enhancements and SETI@home type programs out there, it seems the idea is ready to be developed into a feasible paradigm. What do you think some of the major concerns/design issues are? I'm talking about nuts and bolts..." Now I'm all for distributed applications, but applying such paradigms to something as critical as the operating system seems to be taking the issue a bit too far. Would creating a 'distributed' operating system gain us any advantage over what we are currently familiar with?

15 of 204 comments

Suns... Plan 9 by Dungeon+Dweller · 2000-07-31 01:54 · Score: 4

Sun Microsystems products are designed around a network paradigm. A lot of the distributed stuff we have today comes out of their work. "Distributed" here is meant in a broader sense than just clustering processor power.

Plan 9 is designed with distribution in mind. Check it out!

--
Eh...
Mosix by 1010011010 · 2000-07-31 01:57 · Score: 5

Mosix is pretty cool, and will be even nicer when they have Distributed Shared Memory, Migratable Sockets and Direct Filesystem Access issues worked out (currently Mosix does I/O remotely through the home node, which makes it slower and loads the home node; DFSA allows remote nodes to access files locally rather than via remote I/O).

It provides preemptive process migration among cluster members. If you log into your "home node" and start a process, it will get migrated around the cluster according to its memory and CPU needs. Take a look at their remote monitor.
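The "start a process and let the cluster place it" model can be sketched in a few lines. This is only an illustration in Python, assuming a Unix-like system with `os.fork`; the names `burn` and `fork_and_forget` are made up for the example, and nothing here is Mosix-specific. The point is that the program never names a node or a CPU, so on a stock kernel the children are scheduled locally, while a migrating kernel would be free to move them.

```python
import os

def burn(n):
    """CPU-bound work with no shared state: sum of squares below n."""
    return sum(i * i for i in range(n))

def fork_and_forget(sizes):
    """Fork one child per task and collect each result through a pipe."""
    children = []
    for n in sizes:
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child: do the work, report the result, and exit immediately.
            try:
                os.close(read_fd)
                with os.fdopen(write_fd, "w") as out:
                    out.write(str(burn(n)))
            finally:
                os._exit(0)
        # Parent: keep only the read end and move on to the next task.
        os.close(write_fd)
        children.append((pid, read_fd))

    results = []
    for pid, read_fd in children:
        with os.fdopen(read_fd) as inp:
            results.append(int(inp.read()))
        os.waitpid(pid, 0)  # reap the child
    return results

if __name__ == "__main__":
    print(fork_and_forget([4, 10]))
```

Where a child runs is entirely the kernel's business, which is what makes this style a natural fit for transparent migration.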

Currently it's Intel-only, but a mixed-architecture version would be sweet. Imagine a cluster of Intel, Alpha, PPC and SPARC CPUs such that you log into any of them, run any Linux binary, and the loader cranks it up on the appropriate machine for you, transparently...

From the website:
MOSIX is a software package that enhance the Linux kernel with cluster computing capabilities. The enhanced kernel allows any size cluster of X86/Pentium based workstations and servers to work cooperatively as if part of a single system.

To run in a MOSIX cluster, there is no need to modify applications or to link with any library, or even to assign processes to different nodes. MOSIX does it automatically and transparently, like an execution in an SMP - just "fork and forget". For example, you can create many processes in your (login) node and let MOSIX assign these processes to other nodes. If you type "ps", then you will see all your processes, as if they run in your node.

The core of MOSIX are adaptive resource management algorithms that monitor and respond (on-line) to uneven work distribution among the nodes in order to improve the overall performance of all the processes. These algorithms use preemptive process migration to assign and reassign the processes among the nodes, to continuously take advantage of the best available resources. The MOSIX algorithms are geared for maximal performance, overhead-free scalability and ease-of-use.

Because MOSIX is implemented in the Linux kernel, its operations are completely transparent to the applications. It can be used to define different cluster types, even a cluster with different machine or LAN speeds, like our 100 processors cluster:

--
Napster-to-go says "Fill and refill your compatible MP3 player", which is a lie. It's not MP3. It's WMA with DRM.
1. Re:Mosix by Amoeba+Protozoa · 2000-07-31 02:28 · Score: 3
  
  Mosix does rule, but I think it is based on the fork() and forget principle. I think it would be even cooler to have something that, given enough bandwidth, would transparently divide up processor time for a single thread/task. Why? Because I want to see ridiculous speed for applications I cannot, myself, easily parallelize such as seti@home or commercial codecs.
  
  -AP
Google is your friend. by Carnage4Life · 2000-07-31 01:58 · Score: 4

From a quick search on Google.

A listing of the major OS research projects involving distributed operating systems
Of course there would be an advantage by joshamania · 2000-07-31 02:03 · Score: 3

Having a distributed OS would take a great load off of distributed application developers. Currently, a distributed application has to be able to handle all the tasks that a normal operating system currently does. Not having a distributed operating system for distributed apps is like not having an OS for normal client apps.

Seti@Home has to be able to route all its necessary functions and information around its network. Why is that necessary? A distributed operating system should be able to handle the tasks of distribution for the applications. It's almost as if every distributed app developer has to re-invent the wheel every time he/she wants to create such an app. Why do you think there aren't many distributed apps out there? They're too bloody hard to code. Joe Schmoe VB developer cannot create distributed apps because like as not, he knows very little about networking. Most developers know squat about networking (keep in mind that most developers don't read /., so I'm not referring to YOU).

Soon, every appliance in your abode is going to have a processor in it. That processor may be much more powerful than what is really necessary to operate the appliance. Especially if a web browser is built into your fridge. The processor has to be able to run the browser, so let's say it's Pentium class. Do you really need a Pentium to measure the temperature of the fridge and turn on the compressor? No. So every time the browser is not being used, clock cycles are wasted.

I see no reason for future homes to have the standard PC. They could use the collective power of all the processors in all of the appliances in the home to make a PC-type of interface for a user. It would also lend a certain amount of fault tolerance. Many functions would be duplicated on the home network, and data loss and downtime would be minimal, if any.
Programming for distributed systems. by Christopher+Thomas · 2000-07-31 02:21 · Score: 3

Having a distributed OS would take a great load off of distributed application developers. Currently, a distributed application has to be able to handle all the tasks that a normal operating system currently does. Not having a distributed operating system for distributed apps is like not having an OS for normal client apps.

Seti@Home has to be able to route all its necessary functions and information around its network. Why is that necessary? A distributed operating system should be able to handle the tasks of distribution for the applications. It's almost as if every distributed app developer has to re-invent the wheel every time he/she wants to create such an app.

You are already running a distributed application whenever you run a threaded application on an SMP box. Writing applications for a distributed operating system is no easier and no harder than this.

You _will_ have some programming overhead no matter what - by nature, a distributed application needs to have multiple pieces running concurrently, and so has to manage synchronization and communication between these parts.

The good news is that everyone already understands multiple processes and threads, so we already have a well-established programming model for it.
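As a rough illustration of that programming model, here is a minimal Python sketch of the two chores mentioned above: communication (a queue that hands work to the pieces) and synchronization (a lock around shared results). The function name `run_workers` and the squaring workload are placeholders invented for the example.

```python
import queue
import threading

def run_workers(items, n_workers=3):
    """Square each item using a pool of threads fed from a shared queue."""
    tasks = queue.Queue()      # communication: hands work to the workers
    results = []
    lock = threading.Lock()    # synchronization: guards the shared list

    def worker():
        while True:
            item = tasks.get()
            if item is None:   # sentinel: no more work for this worker
                break
            with lock:
                results.append(item * item)

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for item in items:
        tasks.put(item)
    for _ in threads:
        tasks.put(None)        # one sentinel per worker
    for t in threads:
        t.join()
    return sorted(results)     # completion order is not deterministic

if __name__ == "__main__":
    print(run_workers([1, 2, 3, 4]))
```

Swap the in-process queue for a network channel and the structure stays much the same, which is roughly the point being made here.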

Now, in the real world, client/server computing will always tend to have an advantage for wide deployment, as you can run those on heterogeneous platforms (a la SETI-at-home). For small deployment... you're looking at either a high-processor-count SMP machine or a cluster, depending on the degree of coupling, and those are already well-understood.

So, I'm a bit puzzled as to what you think needs to be developed. It looks like we have distributed computing already.
OS info, including distributed ones by JohnZed · 2000-07-31 02:26 · Score: 4

There's a huge list of various operating system projects here: http://www.cs.arizona.edu/people/bridges/os/full.html.
I find all the "pure" distributed OS stuff (systems built from the ground up to do distributed processing and not much else) relatively uninteresting on its own, but a lot of good ideas from those projects can filter into general purpose operating systems, especially when you start talking about clustering or even NUMA. You might want to see MOSIX for a cool, distributed/clustered Linux version.
--JRZ
Several Options... by Christopher+B.+Brown · 2000-07-31 02:28 · Score: 5
- Mach was the "granddaddy" of distributed OS work, with most of the recent efforts going into GNU Hurd.
- There's Mosix that builds a NOW atop Linux
- The MIT Parallel and Distributed OS Group should be mentioned; efforts include the Exokernel
- Plan 9 has an interesting model for splitting work across "compute servers" and "file servers" and "display servers."
- Distributed Operating Systems lists lots of them...
- Sun's Spring was the basis for much of what is in CORBA;
- Sprite provided a Unix-like distributed OS that provided much of what is being used now to build journalling filesystems
- Amoeba was Tanenbaum's successor to Minix; note that Python was one of the side-effects of the Amoeba project...
Each has some somewhat different insights to bring to the table; there is no unambiguous way of saying "this is all vastly superior."
--
If you're not part of the solution, you're part of the precipitate.
Success Depends on Application by tarsi210 · 2000-07-31 02:29 · Score: 4

From the What-do-you-mean-the-coffee-maker-stopped-responding? dept.

The true success of a distributed OS will be in the applications in which it is applied. Obviously, if you don't have need for the advantages that a distOS brings to your computing, then you don't need a distOS, however cool it might be. My mother (who finally checks her email every night, bless her technologically-crippled heart) does not need the problems associated with attempting a distOS. What she does would not benefit from the extra resources.

Of course, supporters of this idea (and I'm not saying I'm not one) would state that you don't think you need the distOS because we haven't actually made a reason yet to need it. Kind of like how everyone didn't NEED the Internet until, of course, we had it. Now there are sites like /. full of caffeine-enhanced techno-addicts. The presence created the need.

This is true, I think, in many ways. However, when implementing such an OS, careful thought needs to go into exactly what is gained by distributing it. I can see mainframe-like systems benefiting greatly from such a system. A game system could really benefit from the extra horsepower, given that the connections were strong enough. Playing music, DVDs, etc...all very high CPU and memory applications could see some interesting benefits.

How about stability and redundancy? How would you like an OS that ran even if a bomb knocked out part of its system? Rewrote and/or re-routed itself to account for the damage and still get the job done? Wow! What a disaster-safe way to compute! Of course, you have one of these OSes inside your head right now......

End fact is: Good idea, needs lots of consideration into the practical application of such a thing so that we aren't playing solitaire with a distOS.

--
Blog,Twitter
Some Reasons for a Distributed OS by hopping+yak · 2000-07-31 02:37 · Score: 3

1) Fault Tolerance: programs can resume execution even though some of the processors and memory that they reside on cease to function.
2) Performance Benefits from Parallelism: distribute threads of execution across the global computational grid.
3) Share Resources Efficiently: don't waste those idle CPU cycles. Don't waste that extra main memory. This may be the least valid reason, as CPU cycles and memory have a big head start over bandwidth on the value vs. time scale. Moore's law has all of them getting exponentially cheaper over time, but right now bandwidth is the most valuable of the three.
4) Support a New Generation of Applications: Distributed operating systems can offer unique support for things like shared virtual environments, or widely distributed databases. It is a classic point of contention whether the distributed system services should be implemented on the application layer, or on some lower layer. However, I don't think anyone can argue that in terms of ease of application development, it is often very nice to have a really nice abstraction available on which to base your app.
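To make point 1 concrete, here is a toy Python sketch of failover. Everything in it is hypothetical: "nodes" are plain callables, `NodeDown` stands in for whatever failure signal a real system would give, and a real distributed OS would also have to checkpoint and restore state, which this ignores.

```python
class NodeDown(Exception):
    """Hypothetical signal that a node failed before finishing the task."""

def run_with_failover(task, nodes):
    """Try the task on each node in turn until one completes it."""
    for node in nodes:
        try:
            return node(task)
        except NodeDown:
            continue  # this node is gone; move on to the next one
    raise RuntimeError("no surviving node could run the task")

def dead_node(task):
    # Stand-in for a machine that has stopped functioning.
    raise NodeDown()

def live_node(task):
    # Stand-in for a healthy machine; the "work" is just doubling.
    return task * 2

if __name__ == "__main__":
    print(run_with_failover(21, [dead_node, live_node]))
```

The caller never learns which node did the work, which is the kind of abstraction point 4 argues for.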
"A distributed system is one in which the failure of a computer you didn't even know existed can render your own computer unusable." -- Leslie Lamport
Mozart by Baldrson · 2000-07-31 02:59 · Score: 3

What do you think some of the major concerns/design issues are? I'm talking about nuts and bolts...
Many of the important theoretic issues have been addressed at the nuts-and-bolts level by the Mozart Programming System. Specifically, if you read Distributed Programming in Mozart - A Tutorial Introduction you'll have an idea of the kind of distributed programming power provided by a network of Mozart systems.
The key to Mozart's power is its use of ultra-light-weight threads that can share single-assignment distributed variables within hierarchical computation spaces. What this means is you can have unlimited "processes" that are waiting on all sorts of things all over the network -- and failures are easily confined to the minimum logical spaces.
By "ultra-light-weight threads" I mean a virtual unification of process structure with data structure.
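For readers who have not met single-assignment variables, the idea can be approximated in Python. This is not Oz or the Mozart API, only a local sketch under simplifying assumptions: the class name `SingleAssignment` and its `bind`/`read` methods are invented here, it uses ordinary OS threads rather than ultra-light ones, and it is not distributed. A reader blocks until some other thread binds the variable, and a second binding is rejected.

```python
import threading

class SingleAssignment:
    """A variable that can be bound exactly once; readers wait for it."""

    def __init__(self):
        self._bound = threading.Event()
        self._lock = threading.Lock()
        self._value = None

    def bind(self, value):
        with self._lock:
            if self._bound.is_set():
                raise ValueError("variable is already bound")
            self._value = value
            self._bound.set()  # wake every thread waiting in read()

    def read(self, timeout=None):
        # Block until the variable is bound (or the timeout expires).
        if not self._bound.wait(timeout):
            raise TimeoutError("variable was never bound")
        return self._value

if __name__ == "__main__":
    x = SingleAssignment()
    seen = []
    waiter = threading.Thread(target=lambda: seen.append(x.read() + 1))
    waiter.start()   # starts waiting on x before it has a value
    x.bind(41)
    waiter.join()
    print(seen)
```

Because a bound value never changes, readers need no further locking once `read` returns, which is a large part of why the model suits waiting on things across a network.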

--
Seastead this.
"divide processor time for a single task"? by Christopher+Thomas · 2000-07-31 03:02 · Score: 3

I think it would be even cooler to have something that, given enough bandwidth, would transparently divide up processor time for a single thread/task.

How exactly do you propose that the operating system do this?

Unless the programmer or compiler parallelizes the code, you're out of luck for running it on more than one processor at a time. What is the OS supposed to do? Recompile it on the fly, adding all of the MT-safing, rebuild it, and hope that it's faster?

Unless an application is designed from the start to be parallel, it can't be run as a parallel program.
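A small Python sketch of why this is so. The two functions below are made up for illustration: in `chained`, every step needs the previous step's output, so no scheduler can spread the loop over several processors; in `independent`, the blocks do not depend on each other, so the work can be handed to as many workers as are available. Threads are used only to keep the example short.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

def chained(seed, rounds):
    """Inherently serial: round i cannot start until round i-1 is done."""
    digest = seed
    for _ in range(rounds):
        digest = hashlib.sha256(digest).digest()
    return digest

def independent(blocks):
    """Trivially parallel: each block is hashed on its own."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda b: hashlib.sha256(b).digest(), blocks))

if __name__ == "__main__":
    print(chained(b"seed", 1000).hex())
    print([d.hex()[:8] for d in independent([b"a", b"b", b"c"])])
```

Nothing the OS does at run time can turn the first shape into the second; that restructuring has to come from the programmer or the compiler.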
Distributed OSes are here by Greg+Lindahl · 2000-07-31 03:20 · Score: 4

There are several real, full-featured distributed operating systems out there. One good example is Legion. It gives you the illusion of running programs on your desktop, while they are actually running lord-knows-where. Yes, you often need a lot of network bandwidth to get good results. Depending on the exact details, you can run programs on other machines with either no or small modifications.
Lest you think this has nothing to do with today's operating systems, the Linux desktop folks have started using Corba quite a bit to link things together. Well, Legion provides much more powerful, secure, and reliable ways to do the same thing, in a much more consistent fashion.
Distributed, but not too connected by Animats · 2000-07-31 03:29 · Score: 4
There have been all too many "distributed operating systems" out of academia. Few if any have gone mainstream, for a number of good reasons.
- There aren't many problems that really need one. SETI@Home and crypto problems need so little coordination that E-mail would be enough.
- Clusters are easier to do. Read In Search of Clusters, a philosophical book on why clustering beats tightly-connected systems. This was written in 1995, before clusters took over the web server industry, but it's more relevant today than it was then. And it's out in paperback now.
- There seem to be no useful stops between shared-memory multiprocessors and clusters. Many efforts have been made to build machines with lots of processors and exotic schemes for interconnecting them. From the Illiac IV to the Ncube to the Transputer to the Monarch to the Connection Machine, they've all lost out to more vanilla architecture.
- Writing tightly-coupled distributed applications is both hard and weird. There have been many attempts to make it easier via language design, from T/TAL for Tandems to LINDA to Occam to single-assignment languages. Nobody uses that stuff. (Arguably some should; one big lack of C/C++ is a total lack of language support for concurrency.)
- Networking bandwidth is high enough for clusters. So ordinary techniques suffice.
It's one of those things that's hard to do and has a low payoff.
Fair 'nuff... by Christopher+B.+Brown · 2000-07-31 06:20 · Score: 3

My bad; yes, Hydra should be on the list, perhaps as the great-grand-daddy. I gather that the IBM AS/400 platform is based on Hydra, albeit with the advanced stuff hidden far from view so as not to frighten the accountants.
The interesting part is that Legion provides tools that resemble some parts of CORBA, whilst Spring provided tools that grew into CORBA, whilst Sprite provided journalling and cache tools that are essentially what journalling and cache servers provide today.
In a sense, what has happened is that an OS of the 1970s, Unix, has been shown sufficiently malleable that it could integrate in concepts from the research projects of the 1970s and 1980s.
Unfortunately, the 1990s were not a terribly good time for OS research; sort of like The Very Long Night of Londo Mollari of the OS world. There was this minor problem of Microsoft "buying away" whatever serious OS researchers that they could...

--
If you're not part of the solution, you're part of the precipitate.