OS research at MIT happens primarily in the
PDOS (Parallel and Distributed Operating Systems) research group these days.
I'm a grad student in the PDOS group; I certainly
haven't heard of this project, nor have my
colleagues with whom I've checked. This
story could use a bit more background checking;
I strongly suspect that it's completely bogus.
If you want to see the real research
going on in operating systems at MIT,
check out the PDOS web page, the Networks and Mobile Systems page, and the Advanced Network Architectures sites.
In honor of Pi Day (thanks for reminding me. :-)
I've upgraded
the Pi searcher to 100 million
digits, so folks who couldn't find themselves
last year have a better chance this year:
On Turing-completeness? Any textbook on computability and/or complexity should cover the topic. Sipser has a pretty good one that I've used personally.
Wanna show them off some time? :) (Actually, a more serious question - does the media lab have times when people can drop in and peer around? Even if those people might, say, happen to work in a different lab which happens to be located in tech square? :)
Lie. Or do what a few other privacy groups have done in the past - pick a username and password (like, say, "slashdot" and "slashdot") and have all of your friends use it.
The NYT will probably have the guys in black show up at my apartment for suggesting this, of course. :)
-Dave, who uses something similar for his own NYT access...
When I was finishing my undergrad CS curriculum, we had a final "software engineering" course. The final project that we (a group of four) chose to do was building a scanner with Legos. The scanner was a "pre-designed" thing that the Lego folks had come up with, but even then, it was absolutely incredible to see what you could do with them -- especially to those of us who had grown up building hand-moved spaceships out of Legos. (The best part was that we were scanning decent images at 150dpi b&w -- with LEGOS. It was boggling.)
The moral of this post is really that you shouldn't underestimate the power of the lego philosophy; in computer science, one turing-complete language can do just as much as any other turing-complete language (with different levels of human pain!). I suspect that there's a vague analogue to legos - with the right subset of actuators, sensors, and infrastructure pieces, you can build just about anything with legos.
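To make the "one Turing-complete language can do as much as any other" point a bit more concrete, here's a rough sketch of a Turing machine simulated in a few lines of Python. The machine itself (a binary incrementer), the state names, and the `run` helper are all my own illustration, not anything from a textbook or from the Lego project; the only point is that a general-purpose language can host the machine model directly.

```python
from collections import defaultdict

BLANK = "_"

# Transition table: (state, symbol read) -> (symbol to write, head move, next state).
# This hypothetical machine adds 1 to a binary number written on the tape.
INCREMENT = {
    ("right", "0"): ("0", +1, "right"),    # scan right to the end of the number
    ("right", "1"): ("1", +1, "right"),
    ("right", BLANK): (BLANK, -1, "carry"),
    ("carry", "1"): ("0", -1, "carry"),    # 1 + carry = 0, keep carrying left
    ("carry", "0"): ("1", 0, "halt"),      # 0 + carry = 1, done
    ("carry", BLANK): ("1", 0, "halt"),    # ran off the left edge: new top digit
}


def run(program, tape_str, state="right", max_steps=10000):
    """Run the machine until it halts and return the non-blank tape contents."""
    tape = defaultdict(lambda: BLANK, enumerate(tape_str))
    head = 0
    steps = 0
    while state != "halt":
        if steps >= max_steps:
            raise RuntimeError("step limit reached without halting")
        write, move, state = program[(state, tape[head])]
        tape[head] = write
        head += move
        steps += 1
    cells = [tape[i] for i in range(min(tape), max(tape) + 1)]
    return "".join(cells).strip(BLANK)


print(run(INCREMENT, "1011"))  # binary 11 plus 1
print(run(INCREMENT, "111"))   # carry ripples all the way to a new digit
```

Swap in a different transition table and the same `run` loop executes a different machine, which is roughly the Lego idea: a small fixed set of pieces, arbitrarily many things built from them.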
If you visit the google.stanford.edu site, you can find links to several papers which describe the technology and architecture which powered the original Google. They have a list of their research papers available. Some of them are moderately technical, but some are quite readable, including the www-based architecture overview paper.
So, the short answer to your question is, "I don't know." However, when Google had only indexed about 24M webpages, their database was 53G compressed (at about 3:1, or 140G uncompressed).
It seems that the Federal government is taking an increasingly hands-off position with respect to Internet-style regulation, in an attempt to maximize the growth possibilities of the Net. It's interesting to watch how they're doing this - first prohibiting ISP surcharges by phone companies, then mandating competitive DSL access, then letting things develop as they will in the Broadband market. It seems to me that their decisions are based primarily upon the relative maturity of the markets - the telephone market is old and entrenched, but other markets, like cable/broadband, are newer.
I believe the FCC is taking the right approach here, but I'm completely biased, having operated an ISP and being involved in network research (and a general net-head). I'd be interested in hearing counterarguments to this - what are the reasons the government should regulate these new industries? I can think of a few:
Standards and interoperability
Fair access
Monopoly prevention
Right now it seems like the markets aren't mature enough to determine what regulation is needed. However, I think that just like issues of pornography and illegal materials on the net, the way to keep the government OUT in the long term is by being good children from the get-go and sharing the sandbox. Hope some of the telcos and cable people are listening.
The author of the "IPv6 Privacy Threat" article failed to consider a few things. As several people have already pointed out, MAC addresses are spoofable and changeable in many circumstances.
More importantly, the IPv6 spec suggests (not mandates) the use of the 48-bit MAC address as part of a local-use address. The local-use address as defined has only local routability scope - it will not trickle out onto the greater Internet. This was designed to provide an easy bootstrapping mechanism, and to let sites not connected to the Internet configure their computers easily. However, the use of the 48-bit MAC address is completely optional; it's not an automatically assigned address.
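For the curious, here's a rough Python sketch of how a 48-bit MAC address can be folded into a local-use address. I'm assuming the "modified EUI-64" scheme commonly used for this (insert ff:fe in the middle of the MAC, flip the universal/local bit, and put the result under the fe80::/64 link-local prefix); the post above doesn't spell out the exact mapping, and the function name and sample MAC are made up.

```python
import ipaddress


def link_local_from_mac(mac):
    """Build an fe80::/64 address from a MAC, assuming modified EUI-64."""
    octets = bytearray(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC like 00:11:22:33:44:55")
    octets[0] ^= 0x02  # flip the universal/local bit
    # Stretch 48 bits to 64 by inserting ff:fe between the two halves.
    interface_id = bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])
    prefix = bytes.fromhex("fe80") + bytes(6)  # fe80:0000:0000:0000
    return ipaddress.IPv6Address(prefix + interface_id)


addr = link_local_from_mac("00:11:22:33:44:55")
print(addr)                # fe80::211:22ff:fe33:4455
print(addr.is_link_local)  # True
```

The `is_link_local` check is the part that matters for the privacy argument: an address in that range is only meaningful on the local link.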
Third, people who connect to the Internet via a DSL or modem connection don't need to worry. In the DSL case, their IP address is the IP address of the DSL modem. Since their IP address is provider assigned, and their DSL modem is provider assigned, there's no difference! A user who dials up via a modem will have an IP address assigned by their provider, just like they do now, and it will have no correlation to the hardware address of anything they own.
For more information, Robert M. Hinden has a great article, "IP Next Generation Overview". Alternately, the story posted in the Times a few weeks ago provided a cogent introduction to the reality, not the hype, of IPv6. If you're an RFC type, check out:
Since an AC brought up Windows, I figured I'd point out one interesting technology that does a bit of what you're interested in doing, though it's targeted towards the LAN environment: Mango's Medley virtual file server software for 95/98.
It caches at the file block level instead of at the file level (like coda), so it's not (as) affected by things like tailing a large file. It has some other features which could be good or bad (redundancy, file block migration, etc.) depending on your particular application.
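Here's a toy Python sketch of why block-level caching helps with something like tailing a large file. To be clear, this is my own illustration and has nothing to do with Mango's actual (unpublished) protocols; `BlockCache`, `fetch`, the 4 KB block size, and the pretend 1 MB "remote" file are all hypothetical.

```python
BLOCK_SIZE = 4096


class BlockCache:
    """Cache fixed-size blocks of a remote file, fetching only what is read."""

    def __init__(self, fetch_block):
        self._fetch_block = fetch_block  # callable: block index -> bytes
        self._blocks = {}
        self.fetches = 0                 # how many blocks crossed the "network"

    def read(self, offset, length):
        if length <= 0:
            return b""
        first = offset // BLOCK_SIZE
        last = (offset + length - 1) // BLOCK_SIZE
        data = bytearray()
        for index in range(first, last + 1):
            if index not in self._blocks:
                self._blocks[index] = self._fetch_block(index)
                self.fetches += 1
            data += self._blocks[index]
        start = offset - first * BLOCK_SIZE
        return bytes(data[start:start + length])


# Stand-in for a 1 MB file living on some other machine (256 blocks).
remote = bytes(range(256)) * 4096


def fetch(index):
    return remote[index * BLOCK_SIZE:(index + 1) * BLOCK_SIZE]


cache = BlockCache(fetch)
tail = cache.read(len(remote) - 100, 100)  # "tail" the last 100 bytes
print(cache.fetches)                       # 1 block fetched, not all 256
```

A whole-file cache would have to pull (or re-pull, after every append) the entire file to serve that same read.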
The downside? It only runs on Windows, and I suspect that they keep their cache coherence and access protocols under fairly heavy wraps. Now, if someone could come up with a good argument for how they could make money developing a similar product for Linux/*BSD... *drool*. Their basic technology is applicable most places, but their implementation right now is as a Windows disk driver device.
The only difference between "normal" FTP and "passive" mode FTP is that in passive mode, the client opens the data connection to the server, instead of vice versa. The data connection is still a separate stream, and happens between random ports. Passive is good for things like NAT firewalls, though, because it allows all connections to remain outbound instead of requiring an inbound connection. But it will still bypass your port forwarding.
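To make the passive-mode mechanics concrete, here's a rough Python sketch of how a client turns the server's PASV reply into the address it then connects out to. The helper name `parse_pasv_reply` and the sample reply are made up for illustration; the (h1,h2,h3,h4,p1,p2) layout of the 227 reply is the standard one.

```python
import re


def parse_pasv_reply(reply):
    """Return (host, port) from a '227 Entering Passive Mode (...)' reply."""
    match = re.search(r"\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)", reply)
    if not reply.startswith("227") or match is None:
        raise ValueError("not a valid PASV reply: %r" % (reply,))
    h1, h2, h3, h4, p1, p2 = (int(x) for x in match.groups())
    host = "%d.%d.%d.%d" % (h1, h2, h3, h4)
    port = p1 * 256 + p2  # port arrives as two bytes, high byte first
    return host, port


host, port = parse_pasv_reply("227 Entering Passive Mode (192,0,2,10,195,80)")
print(host, port)  # 192.0.2.10 50000
```

The client then opens a second, outbound TCP connection to that host and port for the data. Since the port is picked by the server per transfer, forwarding only the control connection (port 21) doesn't cover it. In Python's standard `ftplib`, `FTP.set_pasv(True)` selects this mode.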
As it points out, this will leave the data connection open to sniffing/hijacking. If you only care about the integrity of the files you transfer, then verifying against (securely obtained) md5 checksums should do the trick. If you want to encrypt the datastream, you'll need to be a bit more fancy.
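If checksums are all you need, the verification step is only a few lines with Python's standard `hashlib`. This is a minimal sketch: the function names are mine, and it assumes you got the expected checksum over some channel you actually trust (not the same unprotected FTP session).

```python
import hashlib


def md5_of_file(path, chunk_size=65536):
    """Compute the MD5 hex digest of a file without loading it all at once."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_hex):
    """True if the file's MD5 equals the (securely obtained) expected value."""
    return md5_of_file(path) == expected_hex.strip().lower()
```

Usage would look something like `matches_checksum("download.tar.gz", "<checksum from a trusted source>")`, where both arguments are placeholders. A mismatch tells you the file was altered or corrupted in transit; it does nothing to keep anyone from reading it.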
If it's possible, consider the use of 'scp' instead of ftp; you'll get protection of both control and data, since it's built into ssh.
Another option (if you control the clients as well) is to use ssh2's "sftp" client. Beware the licensing issues with ssh2, however.
If you really trust the clients, it's also quite easy to set up a VPN between the client and server, and then FTP directly. The ways to go about this depend on the OS you're using, so I'll leave it as an exercise to the reader.
It's not too surprising in the case of VMWare, considering its origins. It's important to distinguish between something like a word processor (which today is an engineering and design effort) and VMWare, which comes out of modern research at Stanford (the Disco project). Except where fed by research projects, it's fairly unlikely that the OSS community is going to engage in a lot of potentially fruitless work to develop new technologies. Most OSS tends towards the "build a better mousetrap" line, because it yields more predictable - and often more useful - results. For every project like Disco which results in something neat like VMWare, there are many which go *plop*.
http://www.angio.net/pi/piquery
-Dave
You should also check out the AMS's website explaining Turing machines.
-Dave
Drooling to go play with some legos,
-Dave
It's pretty fascinating stuff.
-Dave
> Fair access: I don't know what you mean here
I mean a situation analogous to the current use of telephone fees to subsidize telephone access in extremely rural areas. I don't know if this is a good idea or not - but it does mean that people on farms don't have to pay $200/month for their telephone lines. It's a mixed bag, but I don't think it'll happen without some kind of enforcement, because there isn't a big financial incentive for it. It goes back to that, "Is access a right?" debate.
> Monopoly prevention: Like the phone companies?
Well, kind of - the problem is that many of the broadband carriers already have a monopoly in their area. Does that mean they should be able to extend their monopoly in physical access to a monopoly in network access? Again, I think the answer is "it depends", which is why I think the current FCC laissez-faire approach is a good one. In Utah, where I used to live, US West provided the only DSL access, but you could choose your ISP. As a result, US West had to make their own ISP services competitive with the other ISPs to get any subscribers - this was a *good* thing.
www.mango.com
http://www.uni-karlsruhe.de/~ig25/ssh-faq/