Towards an Internet-Scale Operating System
gschoder writes: "Two Berkeley computer scientists (including David P. Anderson of SETI@home) envision an Internet-scale operating system to harness the processing power, networking efficiency, and storage capacity of everyone's computers. Scientific American has their proposal."
There are still no simple ways to use a pair of computers on the same desk efficiently. Why not start there?
This is basically SetiAtHome on a massive scale. I wonder how many work units this cluster could do in an hour ;-)
Cruise TT
And you will of course let other people freely benefit from your bandwidth / CPU power / etc., will you? No, I didn't think so either.
"When Mary gets home from work and goes to her PC to check e-mail, the PC isn't just sitting there. It's working for a biotech company, matching gene sequences to a library of protein molecules. Its DSL connection is busy downloading a block of radio telescope data to be analyzed later. Its disk contains, in addition to Mary's own files, encrypted fragments of thousands of other files. Occasionally one of these fragments is read and transmitted; it's part of a movie that someone is watching in Helsinki. Then Mary moves the mouse, and this activity abruptly stops. Now the PC and its network connection are all hers."
Nope. Cause some l33t h4x0r will have own3d her already.
This is scary as hell. I hope it doesn't get implemented. This is far different from Seti...
Sent from your iPad.
I'm not so sure how i feel about something i own being used for something i don't. I use seti, but i downloaded it myself and agree with its purpose. But who's to say what my computer will be used for, who's to say what files will fill up my hd, etc. Luckily we still have a choice of the OS we want to run.
Carpe meam simiam!
Once the geek value wears off, this is just turning my office into a community resource.
This is all great, but let's face it. People don't leave their computers on all of the time. In fact, here in California, they run ads on television telling you to turn _off_ your computer when you're "out of the room."
Liquid cooling for PCs is still out of the reach of many, so noise is a factor. And I can only assume that this work will require your computer to be awake, so power management goes out the window.
Even if these were overcome, there's still the obstacle of just getting people to go along with this. It doesn't sound to me like these "pennies trickling into a virtual bank account" are going to pay for that broadband connection or the increased electricity bill.
Like most other things, it sounds great on paper...
Guess there is nothing new under the sun.
However, the proposed ISOS is big, powerful, and likely to be sought after by the most powerful corporations and institutions on the planet. How much lobbying would a large drug company need to do to get more than its share of distributed processing power? How much money would the U.S. Government need to give to them to use the system for cracking "terrorist" messages from the "evil ones" like Kevin Mitnick and Bernie G? How much money would the Government need to give to them to use the system for spying on individual users? Remember, this is the same government who pays Hollywood to put anti-drug themes in their sit-coms, so what would they not be willing to try?
The end result of this, then, is that ordinary computer users will be forced to subsidize (through the use of CPU cycles, electricity, wear and tear on hardware, and memory use) the efforts of large companies and governments who are working against their best interests. So, tell me again... what would we gain from this?
Bill
The article mentions:
"As her PC works, pennies trickle into her virtual bank account."
However, it doesn't mention the other side: as her files are backed up elsewhere, pennies trickle out. In addition, assuming an equal amount of "work", the outflow needs to be greater than the inflow. Take, for example, the pay-per-view movie. It has a set cost to purchase. Everyone storing the movie gets a bite. But a single copy of it won't work - a single system off (or back under control of the user) means that part of the real-time delivery of the movie is delayed. So the movie has to be stored in such a way that dozens of systems can be inaccessible and yet it still plays in real time. As such, you need to have a large number of copies.
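To put a rough number on "a large number of copies", here is a minimal Python sketch. It assumes each host is online independently with the same probability, which is a simplification the article does not state; the function names and the 50% / 99.9% figures are made up for illustration.

```python
import math

def availability(p_online: float, copies: int) -> float:
    """Chance that at least one of `copies` independent hosts is reachable."""
    return 1.0 - (1.0 - p_online) ** copies

def replicas_needed(p_online: float, target: float) -> int:
    """Smallest copy count whose availability meets `target`.

    Assumes hosts fail independently, which is optimistic for home PCs
    that tend to be switched off at the same hours.
    """
    if not (0.0 < p_online < 1.0) or not (0.0 < target < 1.0):
        raise ValueError("probabilities must be strictly between 0 and 1")
    # All copies offline has probability (1 - p_online) ** k; we need
    # that to be at most 1 - target, then solve for k.
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p_online))

# Hypothetical figures: hosts online half the time, 99.9% availability wanted.
print(replicas_needed(0.5, 0.999))
```

Under those assumed figures every fragment needs about ten holders, and each of them expects a bite of the purchase price.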
Now think about this for data backup. If Mary gets paid "X" to hold some data, she can't be the sole recipient of it. Say she's one of 3 people with a copy of it (a rather low number). So the total cost is 3X. Now, she's going to be having her own data backed up, which is the same size. She's paying out 3X to back up the same amount of storage she's only getting paid X to provide - it's much more economical to back it up herself, say a copy on her laptop and her home computer, or work and home so they never share geographical space.
Same goes for processing power - you can't assume that a unit will finish the task given it, so that you need to run it multiple times if it is time sensitive, leading to the same inflation on what you pay out over what you are paid for your unused resources.
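The 3X-out versus X-in arithmetic can be written down as a small Python sketch. It assumes a single flat price per unit of storage for both hosting and backup, as in the example above; `net_income` and its parameters are hypothetical names, not anything from the article.

```python
def net_income(price_per_unit: float, units_hosted: float,
               units_backed_up: float, replicas: int) -> float:
    """Pennies in minus pennies out for one billing period.

    Assumes the same price applies to hosting and to buying backup,
    and that every backed-up unit is stored `replicas` times.
    """
    earned = price_per_unit * units_hosted
    paid = price_per_unit * units_backed_up * replicas
    return earned - paid

# Mary hosts one unit for others and backs up one unit of her own, 3 copies.
print(net_income(1.0, units_hosted=1, units_backed_up=1, replicas=3))  # -2.0
```

To break even under these assumptions she would have to host as many units as she backs up times the replication factor, i.e. three units of other people's data for each unit of her own.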
=Blue(23)
LITTLE GIRL: But which cookie will you eat FIRST? C. MONSTER: Me think you have misconception of cookie-eating process.
- Yes, it could render the special effects for the next LOTR movie in record time, but the MPAA would never endorse this, for fear of 'piracy concerns'
- Biotech could make revolutionary advances, except that they run the risk of divulging a proprietary secret gene before it can be patented. A distributed network like this is practically begging for industrial espionage.
- It's not likely that banks will use it, as an accidental disclosure, or worse, alteration of the data could result in the corruption of account information and costly litigation.
Yes, scientists could very well use a general-purpose, distributed network. But with all the concern about privacy and IP rights, I doubt that any largely profitable business would be able to utilize such a system.
The society for a thought-free internet welcomes you.
For technical computing jobs, this makes great sense.
For commercial computing jobs, as a business with economic incentives for participation, a distributed operating system unfortunately makes little or no sense due to the types of applications that are currently server-limited.
Commercial computing jobs which need "big servers" are typically very database-dependent. You can't distribute the application very well unless you can distribute the database. (And hopefully you aren't crunching terabyte data warehouses, right? That takes a while to send down the pipes...) Besides the inherent difficulty of distributing your database across many nodes, you have the typical basket of problems the IOS must overcome with a very high degree of assurance: security of your highly-proprietary information, reliability, backup, etc.
Most of the P2P plays a year or two ago discovered this the hard way. The most promising sales approaches ended up being things like distributed caching for search engine companies, which is a niche, not a mainstream business.
--LP
...none of which were designed to tolerate the high latencies and frequent failures that a truly Internet-scale OS would face. Legion and similar projects are much nearer the mark, but this is still nowhere near being the sort of "solved problem" you claim it is.
Slashdot - News for Herds. Stuff that Splatters.
The utopian future that dreamers always look forward to will never happen. It hasn't happened before, it won't happen in the future. However, this type of computer for the desktop that shares its 'computing' power with the entire network makes LOTS of sense for businesses. I go to lunch, break, and then go home for the day. All the while, my computer could be donating its computing power to handling webserver requests, processing internal jobs for the mainframe, or even helping run massive load and regression tests on the system to anticipate 'kinks' in the armor of the system from a scalability standpoint.
Sure, it would just be "so neato!" if every computer could be kept cheap for the home user by everyone sharing files, processing power, even memory; but let's face it, communism didn't work because there wasn't enough incentive for the worker bees to strive for better. There's always a fine balance between greed and sharing. Giving such a 'distributed computer network sharing' system to businesses would be a great start, but don't expect a 'home user' acceptance of such a system anytime soon. I want my full computing power for my new computer game that I bought with my own money, and I'm sure many other users aren't willing to give up their hard-earned money for everyone else to piggyback off their 3l337 system anytime soon.
The article looks more like an excuse for implementing a micropayment system (Creates a direct connection between your wallet and our bank account!). Enthusiasm for micropayment systems seems to come from people who want to collect the payments, not from the people expected to pay them. It's very clear that what consumers want are flat-rate services; competitively, flat-rate wins over pay-per-use as soon as the prices get close.
If you want vast amounts of CPU time and are willing to pay, you'd probably be better off cutting a deal for off-peak time on hosting server farms. You get a uniform environment, good interconnect bandwidth, and a single organization to deal with.
From: Greg Broiles
Subject: Re: Pricing spare resources and options?
At 01:44 PM 11/18/2001 -0500, dmolnar wrote:
>The recent comments on Mojo Nation prompted me to look at their site
>again. I don't see much guidance on how to set prices for network
>services. There's a mention someplace that business customers will build
>pricing schemes on top of Mojo Nation, but not much indication of what
>these schemes might be.
>
>So what is the "right" way to price resources? (Preferably beyond the
>obvious "supply and demand.")
Unfortunately, one of the evolutionary steps in Mojo Nation's development has been their abandonment, for the most part, of user-visible and user-configurable economics; they deliberately made it difficult to see how many Mojo are held by the local broker, and relatively unlikely that a broker will be able to earn significant Mojo by careful pricing - recent clients are configured such that the economic brakes on resource usage are sharply curtailed or removed entirely.
It's my impression that, given the changes in the venture capital and software markets, they've refocused their efforts away from P2P filesharing and towards speedy realtime content delivery, whereby people with limited net connections can maximize their incoming bandwidth by pulling (or getting pushes) from multiple other parties simultaneously, somewhat similar to what Morpheus/Kazaa are doing, or what Bram Cohen (a Mojo Nation alumnus) is doing with BitTorrent.
The economics seemed to attract people who wanted to experiment with pricing, etc., but that wasn't necessarily a market or constituency which is interesting to investors or businesspeople.
>A related question - I ran into a friend of mine who had just finished an
>internship in options trading. He suggested it might be worth looking at
>options on spare disk space or other resources, as a means of figuring out
>how to make Mojo-type systems eventually profitable in the real world. Now
>I have a copy of Natenberg's _Option Volatility and Pricing_ to look at...
It seems like there ought to be an interesting market here, but I know and worked with several people (with good financial backgrounds) who flogged this for a while and never got anywhere. I guess a big part of the problem is that there's such a big difference in the perceived value of a megabyte/month of online storage .. if you're on the provider side, you think that's pretty expensive, as you've got the investment & etc required in building a data center, providing bandwidth to reach customers, paying staff, etc - but if you're on the customer side, you look at an 80 GB drive at Fry's in the Sunday newspaper for $160 and think about a $500 1.5 Mb/s frame relay connection, and wonder why the service guys want $3 per MB/month ..
and then the Mojo guys come along and make it sound like the people with the cheap frame relay connections and commodity PC hardware ought to be able to set up data centers in their back bedrooms or on their old laptops, but so far all of the business models proposed involve paying those guys up front for an indefinite period of storage, so there's no strong incentive to actually store the data for long, especially not if you can resell that same disk space 3 or 4 or 50 times.
Seems like the guys who really have hard data about options for bandwidth and disk usage are the disaster recovery guys. And that market hasn't been so great lately either; Comdisco declared bankruptcy and their disaster recovery unit is getting swallowed up by Sungard, I think.
Anyway, yeah, the Enron guys thought there was something interesting to be done in bandwidth futures, too, but I don't know if they ever really got anything done before their demise beyond some demonstration projects.
--
Greg Broiles -- gbroiles@parrhesia.com -- PGP 0x26E4488c or 0x94245961
5000 dead in NYC? National tragedy.
1000 detained incommunicado without trial, expanded surveillance? National disgrace.
How long before you have to provide the government with compute cycles, as a cyber-tax?
I like the idea, but consent must remain with the owner of each computer. Still, like attempts to force DRM-blessed operating systems upon us, I fear that the days of controlling one's own computer are numbered (and the masses are too ignorant to understand what's at stake).
Oh, FWIW, I'm starting to keep a slashdot journal.
You could've hired me.
Frankly, "high latencies and frequent failures" are why such an idea is impractical, regardless of whether or not the theoretical problems can be solved (and i argue that they already have been solved).
Massive distribution should not and will not be done just because it's techno-cool... it has to produce real value. What sort of real value can it produce? That depends on what sort of problems it can solve.
First, let's look at constraints. The three obvious ones are CPU power, disk space, and network bandwidth. All three of these have been growing relatively in proportion to Moore's Law for the last couple of decades. Their relative proportions have not shifted much... the CPU is by far the fastest, followed by local disk, and then network bandwidth.
Now, let's look at the problems we want to solve. How about data storage ("Jane's computer has an encrypted fragment of someone else's movie")? Local disk space is far, far cheaper and more robust than network storage! Bandwidth is the most expensive part of the equation. I can buy another few dozen gig of disk space for $100. How long will it take to transmit a few dozen gig via DSL? Sure, network speed will scale up, but so will disk space. Unless something changes, the balance of the equation remains the same... local storage is cheaper than network, as well as more reliable.
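For a rough answer to the "how long via DSL" question, here is a back-of-the-envelope Python sketch. The 36 GB size and the 1.5 Mb/s down / 128 kb/s up line rates are assumptions picked for illustration, not figures from the article, and protocol overhead is ignored.

```python
def transfer_hours(gigabytes: float, megabits_per_second: float) -> float:
    """Hours to move `gigabytes` over a link running flat out.

    Uses decimal units (1 GB = 8e9 bits, 1 Mb/s = 1e6 bits/s) and
    ignores protocol overhead, so real transfers would be slower.
    """
    bits = gigabytes * 8 * 10**9
    seconds = bits / (megabits_per_second * 10**6)
    return seconds / 3600

# Assumed: 36 GB ("a few dozen gig") over a 1.5 Mb/s downstream link.
print(round(transfer_hours(36, 1.5), 1))    # 53.3 hours
# Assumed: the same data pushed out over a 128 kb/s upstream link.
print(round(transfer_hours(36, 0.128), 1))  # 625.0 hours, about 26 days
```

So under these assumptions the $100 disk fills in a couple of days of saturated downloading, and serving it back out to someone else takes weeks.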
Of course, not all files you want will be on your computer, hence peer-to-peer file sharing, which is what Microsoft is trying to solve. But in this case, local disk storage is far slower than CPU, and far faster than network... in other words, there is no reason to not use a user-level process to manage the data exchange. No OS support is necessary beyond TCP/IP and disk I/O, right? This problem has already been solved in numerous real-world ways.
Now let's look at CPU-bound problems. There are computations we may want to make that can't be done in a fraction of a second locally. These are generally math problems, sometimes with large datasets. Some of these problems can be parallelized, and some cannot. Of those that can be parallelized, some have coarse granularity, and some have fine granularity. Coarse problems, like keyspace searches for brute-force encryption cracking or SETI pattern searches, don't need OS-level support - data is most efficiently shared at the process level, which is what distributed.net and SETI do already. Others optimize at finer granularity. In those cases, data sharing and communication requirements between threads are so intense that using a slow, unreliable network is impractical! That's what big parallel supercomputers are for. So there's no need for OS-level support for parallelized number crunching that is practical in the current CPU/bandwidth ratio.
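The coarse-grained case is easy to sketch at the process level. The Python below splits a keyspace into independent work units, much as described above; it is a toy, not how distributed.net or SETI actually do it. The SHA-256 comparison stands in for a real cipher check, and all function names are invented for this example.

```python
import hashlib
from typing import Iterator, Optional, Tuple

def work_units(keyspace_size: int, unit_size: int) -> Iterator[Tuple[int, int]]:
    """Yield half-open [start, end) ranges covering the whole keyspace."""
    for start in range(0, keyspace_size, unit_size):
        yield start, min(start + unit_size, keyspace_size)

def search_unit(start: int, end: int, target_digest: str) -> Optional[int]:
    """Check every key in one work unit; needs no data from other units."""
    for key in range(start, end):
        # Toy stand-in for "try this key against the ciphertext".
        if hashlib.sha256(str(key).encode()).hexdigest() == target_digest:
            return key
    return None

def search(keyspace_size: int, unit_size: int, target_digest: str) -> Optional[int]:
    """Run the units one after another; each could go to a different PC."""
    for start, end in work_units(keyspace_size, unit_size):
        found = search_unit(start, end, target_digest)
        if found is not None:
            return found
    return None

target = hashlib.sha256(b"4242").hexdigest()
print(search(10_000, 1_000, target))  # 4242
```

Each unit needs only its range and the target, and reports back a single value, which is why an ordinary user-level client is enough for this class of problem.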
So what problem are we trying to solve that is distributed (or distributable) efficiently across multiple computers, and requires OS-level support for optimum efficiency? I don't see it.
Now, i should revise my previous statement that no one uses OS-level distributed computing. Fault tolerant databases, clusters, and massively parallel supercomputers all use it - at the local level. And even those are butting up against the network bandwidth problem. If it can't be done with gigabit connections on the backplane, how will it be done over a modem?
Hand me that airplane glue and I'll tell you another story.