Joel thinks that rewriting things is a bad idea because it loses information embedded in the old code (original anti-rewrite essay, search for "Nancy" to find a good example) and then says in the interview:
Half the time when I go into a function to fix a little bug, I figure out a cleaner way to rewrite the whole function
This is the same guy who wrote Yet Another Bug Tracking System while observing that such things were a dime a dozen, and then went on to write Yet Another Content Management System without defining its target market, even as he criticized others for such undirected development. Apparently, Joel's quite comfortable stating commandments for others while living by different rules himself. His articles are unfailingly interesting, but should by no means be accepted as authoritative (as is true for anyone who spends more time on the pundit circuit than actually programming).
do you make this newfangled FS to sit on top of tried-and-true NTFS, or do you implement it at the kernel level and make NTFS a layer on top of that?
MS already has this functionality implemented on top of NTFS. The entire novelty here is to turn that upside down and have an NTFS compatibility layer on top (but still in kernel). Otherwise it wouldn't be interesting at all.
After all Linux already has ReiserFS and Microsoft have just started development of their 'database' FS for Win32.
Microsoft has been working on this since '92; Hans came along in about '97 and ReiserFS as it is currently constituted implements very little of his pie-in-the-sky musings. While it's true that MS has not yet released a product based on this technology, the rest of your claim is utterly untrue. Microsoft didn't "just start" development in this area, and they could yet produce something that truly qualifies as a functional fusion of databases and filesystems before Herr Reiser does.
A lot of people seem to've totally missed the point of what would be different about a database-oriented filesystem. File extensions? Not bloody well relevant! Let's consider the issue of searching. A database-oriented filesystem might allow you to create directories that are basically "views" of your filesystem, perhaps including all files that meet certain name, attribute or content criteria (like Evolution's vFolders but available to any app). These views would be up-to-the-instant accurate at all times, with no dead links and no problem with apps replacing links with actual files instead of updating the file that the link pointed to. Filesystems could also benefit from other things like referential-integrity checks, triggers, and cross-file transactional behavior. In fact, there has been a lot of work in the kernel-hacker community to figure out how just that last feature could be added to Linux filesystems. Basing a filesystem on a database also allows you to leverage all of the tools (e.g. efficient snapshots and replication) that have been developed for the database. There's a lot more here than just journaling and BeOS-style metadata.
It's not that I think basing a filesystem on a database is a great idea. For one thing, it's a pretty good bet that performance is going to suck because of all the extra DB-related overhead. Administration might become more of a PITA too. I'm just trying to explain that the idea of a database-oriented filesystem has much broader implications than the trivial crap (much of which is relevant to neither filesystems nor databases) that people seem to be focusing on in this thread so far.
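To make the "views" and cross-file transaction ideas a bit more concrete, here's a minimal sketch using SQLite as a stand-in for the database underneath the filesystem. The `files` table, the `big_alice` view and the `list_view` helper are all made up for illustration; no real database-backed filesystem is claimed to look like this.

```python
import sqlite3

# Hypothetical schema: one row per file, standing in for filesystem metadata.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE files (path TEXT PRIMARY KEY, owner TEXT, size INTEGER)")

# A "view directory": every file owned by alice that is larger than 1000 bytes.
db.execute(
    "CREATE VIEW big_alice AS "
    "SELECT path FROM files WHERE owner = 'alice' AND size > 1000"
)

def list_view(name):
    # 'name' is a fixed identifier in this sketch; never interpolate untrusted input.
    return [row[0] for row in db.execute(f"SELECT path FROM {name} ORDER BY path")]

db.execute("INSERT INTO files VALUES ('/home/alice/a.txt', 'alice', 5000)")
db.execute("INSERT INTO files VALUES ('/home/bob/b.txt', 'bob', 9000)")
db.commit()
before = list_view("big_alice")

# Delete one file and create another; the view follows with no dead links.
db.execute("DELETE FROM files WHERE path = '/home/alice/a.txt'")
db.execute("INSERT INTO files VALUES ('/home/alice/c.txt', 'alice', 2000)")
db.commit()
after = list_view("big_alice")

# Cross-file transaction: both updates take effect together or not at all.
try:
    with db:  # commits on success, rolls back if the block raises
        db.execute("UPDATE files SET size = 0 WHERE path = '/home/alice/c.txt'")
        db.execute("UPDATE files SET size = 0 WHERE path = '/home/bob/b.txt'")
        raise RuntimeError("simulated failure before commit")
except RuntimeError:
    pass
sizes = dict(db.execute("SELECT path, size FROM files"))
```

The view is never "refreshed" by hand; it is just a stored query, which is why it can't go stale the way a directory full of symlinks can.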
Re:How to Google Whack... (on Google Juice · Score: 3, Interesting)
The more common the words, the cleverer the Googlewhack is considered to be -- few Googlewhacks use words you would consider "common."
I've seen a few scoring systems that formalize this idea. Most start by multiplying the numbers of hits that each word would have gotten by itself. Personally, I like adding extra twists instead of trying to go for a high score. For example, alliterative whacks are harder to find because there are enough word lists out there that you're likely to get multiple hits on any two words that start with the same letter, so you have to pick words obscure enough not to be in the lists but real enough to be in the dictionary. It's a fun way to spend 5-10 minutes.
I don't know about the real low end, but your $/MHz ratio certainly starts to break down even in the mid-range. I'm typing this on a laptop with a 600MHz CPU, that I just got from uBid for US$700 plus shipping, and I know that I could have gotten an even better $/MHz ratio with a bulkier machine. With that CPU and memory, USB, FireWire etc. this machine will still be viable a lot longer and ultimately provide more practical use per dollar than some low-end machine that's already at the end of its lifespan. Unless you're looking for something that will basically function as an embedded system (in which case you can skip the cost of a screen and get a true embedded SBC) I suggest you consider spending a little more to get a better overall value.
If you want to play the link game and not address my points, fine.
You're the one who's persisting in a digression, Grasshopper. As I said earlier, it's not necessary to describe any X that meets a standard to prove that Y does not. Y is Freenet. Freenet does not meet semantic or integration standards to be considered a filesystem, and since I was talking about filesystems I was not talking about Freenet. All this other bullshit about whether it's possible for some other system to meet that standard is interesting but beside the point.
Either you're saying it can't be done, or it's possible with P2P, or it's client/server.
OK, Sparky, I'll spell it out just for the slowest child in the class. I do believe that strong access control can be implemented in a decentralized ("P2P" if you prefer buzzwords) system. I might be incorrect in that belief, and you're welcome to dispute that belief if you choose, but saying that it's not what I'm talking about is just a non-starter.
I have mathematical proof that any such system can be compromised as P2P authentication is isomorphic to copy protection and DRM in software.
You're misunderstanding the result, which applies only to obfuscation. There are forms of authentication that are mathematically quite distinct from what that paper discusses.
Prove you're one of those people
Ahhh, imitation is the sincerest form of flattery. It's nice to see that you're familiarizing yourself with that list of fallacies. It'd be nicer still if you read it as a list of things to avoid, and not as a list of things to try in your next post.
Firstly, I didn't intend that sentence as a refutation of your argument but as an admonition regarding the same sort of "disrespect" you complained about earlier. It's just slightly hypocritical for you to demand respect for your five minutes of thought while showing none for others' years of study.
As it happens, distributed storage systems are my professional specialty, but I wasn't actually referring to myself. I was thinking more of people like those behind MNet (formerly Mojo Nation), CFS, SFS, OceanStore or Farsite, who all seem to share a belief that decentralization does not preclude strong authentication. They're the ones who've spent years thinking about the authentication angle (I personally have focused more on efficiency and coherency angles). It's your dismissiveness of their efforts, not my own, that I find offensive.
It's impossible to retrieve two different files with the same key, by design
Reexamine how SSKs work, or DBRs, and I'm sure that even you can figure out how inconsistency can occur.
I agree that Freenet today is not a file system.....However, it can serve as a good basis from which to build one
Not really. It might or might not be possible to reconcile strong access control with decentralization, but the prospects seem even gloomier for such a reconciliation with Freenet's anonymity. Similarly, Freenet's insertion and caching behaviors conflict at a pretty fundamental level with the levels of coherency expected of a filesystem. Of course, the entire implementation would have to change as well. My contention is that Freenet and filesystems differ in enough ways - and deep enough ways - that a "Freenet-based filesystem" would no longer resemble Freenet as we know it. Again, that doesn't mean there's anything wrong with Freenet. Certainly the Freenetistas would tell you - as they've told me many times - that filesystem-like behavior is not a goal for Freenet, and that's just fine.
and arguably is as close to a real file system as one can make with P2P technology.
The truth of the matter is, a decentralized modern file system cannot be made without sacrificing security. It requires centralized authentication servers
Untrue, but I'm not paid to teach idiots the basics on Slashdot. Do your own homework.
But, if you're so certain it can be done with P2P and typical authenticated security models, why not spend a little time researching it?
I really find credentialism quite distasteful, but since you seem so insistent on making your own appeals to authority I'll play along. As I mentioned, this is my professional specialty. I have a pretty well-documented record of keeping up with developments in this area, and engaging other "leading figures" in dialog as a peer. Your insinuation that I don't know the terrain is absurd, but might apply to yourself. What can you do to demonstrate that your statements here are based on more than three minutes of reading and one minute of thought? Are you really so sure you want to pursue the issue of background, or can we get back to the actual issues?
Then you're proposing this 'shared drive network of computers' have a central server.
I really wish you'd stop telling me what I'm saying. I'm not talking about central servers now any more than I was talking about Freenet earlier.
I absolutely defy you to come up with a pure P2P way to do it with identical security to modern OSes without a central authority.
"Pure" P2P? Identical security? That's commonly referred to as moving the goalposts and it should be beneath you. It's not necessary to describe any X that meets a standard to prove that Y does not.
You obviously don't think it's possible to reconcile strong access control with decentralization. That's fine, but don't you think it's a little disrespectful to assume that other people who've spent a lot more time than you studying the problem have given up too? You're basing your argument on an axiom that's not shared with your interlocutor, but then I guess it doesn't matter because it's a digression anyway.
The article did not appear to be promoting someone running servers to authenticate users, so my assertion is entirely appropriate.
Appropriate, but inadequate. Freenet's SSKs are still not equivalent to real directories, or real access control, no matter how much you bluster.
FreeNet has (mostly) the same properties as a WORM drive's file system. Once written, data cannot be changed.
WORM drives don't drop data like Freenet does. They might reject new writes when they're full, but they don't toss some arbitrary piece of old data on the floor to make room.
FreeNet would appear to the user similarly as a cdrom drive that they can write to. Isn't ISO9660 a coherent model?
False equivalence. You haven't shown that Freenet is in any way like a CD-ROM, and in fact it differs from CD-ROM in this particular regard. Two nodes attempting to read the same data from Freenet simultaneously might well get different data, if one finds a stale copy in someone's cache first and the other finds a fresh copy in another cache. That is not consistent/coherent behavior.
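The stale-read scenario above can be sketched with a toy model. This is not Freenet's protocol or anybody's real cache design; `Store`, its per-node caches and the `invalidate_on_write` switch are hypothetical names used only to show the difference between coherent and incoherent behavior.

```python
class Store:
    """Toy authoritative store plus per-node caches (all names hypothetical)."""

    def __init__(self, invalidate_on_write):
        self.data = {}       # authoritative copy: key -> value
        self.caches = {}     # node name -> {key: cached value}
        self.invalidate_on_write = invalidate_on_write

    def write(self, node, key, value):
        self.data[key] = value
        self.caches.setdefault(node, {})[key] = value
        if self.invalidate_on_write:
            # Coherent variant: drop every other node's cached copy.
            for other, cache in self.caches.items():
                if other != node:
                    cache.pop(key, None)

    def read(self, node, key):
        cache = self.caches.setdefault(node, {})
        if key not in cache:
            cache[key] = self.data[key]   # cache miss: fetch the current value
        return cache[key]                 # cache hit: may be stale

def b_reads_after_a_writes(invalidate_on_write):
    store = Store(invalidate_on_write)
    store.write("A", "k", "v1")
    store.read("B", "k")            # B now holds a cached copy of "v1"
    store.write("A", "k", "v2")     # A updates the value
    return store.read("B", "k")     # what does B see?
```

Without invalidation B keeps getting "v1" after A has written "v2", which is exactly the kind of behavior a filesystem is not allowed to exhibit. With invalidation B sees "v2".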
Again: Freenet is not a filesystem. Not only is it not implemented as one, but its very protocol does not support features expected of filesystems (some of which I haven't even gotten around to mentioning yet). Neither of these can change without Freenet becoming something totally different from what it is now, perhaps without abandoning its central goals of strong anonymity and resistance to censorship. There's nothing wrong with Freenet not being a filesystem. Perhaps it's something better; certainly many people seem to see it that way. All it means is that when people are talking about filesystems they're not talking about Freenet and you shouldn't tell them that they are.
Perhaps I was overly curt with you earlier. I just get really tired of hearing "you mean Freenet" any time distributed storage is discussed. How would you like it if somebody said "you mean Windows" every time you mentioned operating systems, no matter how un-Windows-like the proposed operating system was? How "classy" would you be in correcting such a statement? Would you, perhaps, call it horseshit?
You're wrong about the last three points, though. It does have encrypted shared private namespaces, where people would have to have your public key to read the files. That's rudimentary file permissions for read. You also cannot publish to that directory unless you use the private key, which is rudimentary file permissions for write.
Private namespaces are not the same as directories, and the rudimentary access control they offer is in no way comparable to the sorts of permissions that any legitimate filesystem on any modern general-purpose OS is expected to support.
No data consistency? I'm not sure what you mean here, since it's checksummed and encrypted and passed around in pieces all over the place, it seems very self-consistent.
"Consistency" (a.k.a. coherency) has a very specific and generally well-understood meaning in this context, which you should learn before you start spouting off about whether Freenet exhibits it. In a consistent system, if node A writes to a location and then node B reads it, B will (assuming no other writes in the interim) receive the value A wrote and not some older "stale" value. There are varying levels and types of consistency, representing different guarantees about the conditions under which the system guarantees that B will get current data, but Freenet does not ensure consistency according to even the loosest definitions.
How does it improve the file which is only used in one place, by one person, when sitting at a specific computer? It doesn't.
You're apparently not considering the advantage of not losing data if that one computer fails. Some people would certainly consider that advantage to be considerable.
In any case, I don't think I ever said that all data should be placed in the distributed data store. In fact, I rather distinctly remember saying the exact opposite. Modern operating systems permit the use of multiple filesystem types concurrently, so there's nothing keeping you from keeping data local if you so choose.
USB drives in the 1gb range that are the size of a pocket key are available today, for about $900.
$900/GB? And you're seriously comparing that to a software-only solution that might carry zero dollar cost? Do you really think your silver USB bullet is the ideal solution for everyone, i.e. that there aren't plenty of people who would be better served by the distributed-storage alternative?
Apparently someone took seriously the suggestion of recycling the highly-moderated posts from the previous ISOS thread. The parent is an exact copy of this post by Ian Clarke on that thread.
BTW, the answer to the (implied) question in Ian's original paper is no. A useful "distributed decentralized data processing system" cannot be built on top of Freenet, or any other storage system that drops data as soon as the herd stops requesting it.
No, I am most emphatically not talking about Freenet. For one thing Freenet is not a filesystem. I can't mount it, I can't use plain old read/write (have to use Freenet-specific APIs), I can't use memory mapping (or even access individual blocks without reading the whole file), I can't execute images off it, there are no directories, no permissions, no data consistency. It flunks just about every test of whether something is a filesystem. Worse, Freenet drops data that's not being actively requested; that's OK for what Freenet was designed to do, but totally unacceptable for a filesystem. Got it? Good. Now we can move on.
Replication of data has tremendous cost: bandwidth, time, and storage space.
Replication also has tremendous benefits, most notably robustness and performance. I alluded to the latter in my last post. If nodes are smart enough to get data from the nearest replica, then total bandwidth use goes down. The more replicas there are, the fewer network resources each replica-served request will consume (unless somebody's so stupid that they put all the replicas right next to one another). It's the same principle used by FTP mirrors and content-distribution networks, and it works.
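A back-of-the-envelope sketch of the nearest-replica argument, assuming a deliberately crude model: nodes sit at positions on a line and the network cost of a request is just the distance to the replica that serves it. The positions and the `total_cost` helper are invented for illustration.

```python
def total_cost(clients, replicas):
    """Sum of distances from each client to its nearest replica."""
    return sum(min(abs(c - r) for r in replicas) for c in clients)

clients = [0, 10, 20, 90, 100]

central = total_cost(clients, [50])           # one central server
spread = total_cost(clients, [10, 95])        # two well-placed replicas
clumped = total_cost(clients, [49, 50, 51])   # three replicas next to each other
```

In this toy model the single central server costs 210, two sensibly placed replicas cost 30, and three replicas clumped together cost 205, i.e. barely better than one. More replicas only help if they're actually spread out.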
Local data is by far more manageable
...until you, or you plus multiple other people, need to access that same data from multiple places - perhaps concurrently. Then you get right into the same sort of replication/consistency mess you were trying to avoid, except that instead of having the attendant problems solved once for everyone using the filesystem each person has to solve it separately.
What does make sense is that people would prefer to carry their data with them.
Actually I'd rather not have one more physical object to carry around, drop/damage/misplace, etc., or have to remember to copy the data I want for a business presentation onto my portable device. What I'd prefer would be that when I move to a new location connecting to the network also connects me to my files, wherever they may be, without unnecessary compromises in performance or security. Those of limited imagination might not believe it, but that will be possible quite soon.
Consider instead...a key-sized multi-gig USB memory drive.
Aside from the administrative-inconvenience issues noted above, where are you going to find such a device? How much will it cost, compared to the software-solution cost of zero dollars? How fast will it be? How reliable? What will you do when it breaks and you didn't make a backup?
No need to add the complexity of network distribution at all.
The complexity of network distribution should be hidden from the user anyway. The whole idea of a distributed data store is that the complexity is hidden in the system so that users' lives are simpler. What you're proposing is to "shield" users from complexity that they wouldn't see anyway, and leave them responsible for decisions (replication, data placement, backup) that the system should be handling for them. That's not a positive tradeoff.
Too often, visionaries put faith in a silver bullet to cure all ails
So do non-visionaries, and the word you're looking for is "ills" not "ails". Your silver USB bullet doesn't solve anything.
I wonder how upset this individual in Helsinki would be if Mary decided to format her hard disk in the midst of his movie
The Helsinki user is no worse off in this scenario than if Mary's machine were a web server.
I wonder how much bandwidth that cost to prevent this 'just in case' scenario?
We all know that such "just in cases" do actually occur. The only solution to data-loss is redundant copies of the data, maintained either manually (explicit backups) or automatically (transparent mirroring or replication). The authors' idea is to go for automatic replication, and once you have that you might as well use the replicas to improve performance by allowing them to serve data to nearby nodes. This can actually result in less overall bandwidth than traditional approaches, because each node is going somewhere relatively close to get data instead of choking up a central server.
That actually highlights a flaw in the example as given in the article. It would be quite abnormal for someone in Helsinki to be going half-way around the world to get the data, because there should be a nearer replica. It would be more accurate, though perhaps less compelling, to say that Mary's machine was being used as a "staging area" for other local users watching the same movie from Helsinki that Mary just watched ten minutes ago. That would IMO convey the idea of an ISOS (actually the data-store part of it) actually reducing network bandwidth while also improving robustness.
I don't try to imagine this Kernel32.dll part is blocked on some other computer and hang mine
The ISOS as described in the article runs on top of a traditional operating system; the files you need to boot that traditional OS would still reside locally, as would your applications. It's only the data that would reside elsewhere, which really isn't that different from what happens today with NFS- or CIFS-based fileservers from the user's perspective. The difference, supposedly, is that replacing the single NAS server with a fully distributed network results in a more robust system, and one that can scale beyond the local LAN to the whole Internet.
Freenet is more of a data transmission method than a true data store. Even Ian says so, when pressed on the data-loss issue. MNet, OceanStore, Farsite or CFS would all be better examples of actual distributed storage.
The question to you was whether these people you know (let me guess.. all from the same dotBomb startup?)
Wrong. Some day you should try developing a theory to fit facts instead of making up "facts" that fit your pet theory.
had tried consulting-related Open Source business models. Judging by the fact that you didn't answer, it is my guess that what I "excluded" was what they in fact tried. And if so, I'm not surprised they failed miserably.
That is truly one of the most contorted statements I've seen in a long time. What you actually ended up saying is that you think my friends did try what you suggest, and you're not surprised that they failed. Truly, you have a dizzying intellect.
Some of the people I mentioned were involved in product-related endeavors, others were in consulting. Both groups have fared poorly, but of the two it's the consultants who've been hurt most. Don't believe me? Find any half dozen people who were out there a year ago trying to do what you suggest. Ask them whether they believe in your theories. The two who are still employed might refrain from slapping you upside the head, but I can't vouch for the other four.
What you don't seem to realize is that when money gets tight expenditures on all forms of outsourcing - consultants, freelancers, custom development and support contracts - are the first to go. The service income that you posit as a substitute for product income dries up, leaving nothing for open-source developers. It's not a coincidence that open-source software rose to prominence during an economic boom, and has receded during the ensuing decline. That's reality for you, and it's right there for you to see if you'd just pull your head out of your ass and take a look.
What exactly did they try? If you're talking about people who wrote free software and then put a 'donations' box on their website, that doesn't count. Silly dotBomb attempts like making their own distro or trying to provide generic tech support also don't count.
How many programmers actually try? Not many. I'm trying to change that.
I personally know about two dozen who tried. About half eventually ended up working on proprietary software. The other half are unemployed. How big a sample do you need before you'll face facts? I'm sure the people I know constitute a very small percentage of all those who tried, and that more examples could be found.
What are you doing to change anything, besides ranting here? Just about everyone I know who actually has a business isn't shy about putting the word out. You, by contrast, haven't even bothered putting a link to your project/company in your profile. What do you do, exactly, that's so good for open source? BTW, it's not ad hominem when one's interlocutor has made their character or identity relevant by trying to use it as the basis for their argument.
All this "Open Source" stuff is just "statistical noise."
Strawman. What I meant, and it was quite clear from the context, was that all this "making money from open source" was statistical noise.
Been there? Done that? No, didn't think so.
How about you, bud? You ever try it? No, didn't think so.
Then again, I'm not the one claiming to be doing that, so that's not relevant. I'll take your (lack of an) answer to my question as a no, and so I suspect would anyone else reading this exchange.
Wow. Slashdot is pretty full of people with lots of ideals and no skills or experience, but you completely take the cake.
There are plenty of legitimate and highly stable ways to make money writing free software.
Bold claim. Got any proof? Got any numbers for how many people are actually doing it? How many programmers actually make a living doing open-source programming full time today? How about a year from now, if the economy doesn't pick up? How many total programmers are there in the world? This "phenomenon" you rant about, this wave that's going to overwhelm us all, was barely statistical noise even at its peak, and that peak has passed.
Find some buddies who are also into Open Source and form a consulting group
Been there? Done that? No, didn't think so. Open source or closed, your zealotry would be fatal in business. Those few people who are making money off open source have survived by learning not to piss off the guys with the money with that kind of extremism.
There is absolutely NO need for ANY proprietary software in this world.
You might actually be right there. "Need" is a funny word. No, the world doesn't need proprietary software or copyright law. But they exist, and people - real people, not just big corps - benefit from them. You haven't provided any compelling argument that society would be better off without them. Heck, far better programmers and writers than you have tried to make such arguments, and they haven't succeeded either.
Those who argue otherwise do so only because they have a vested interest in proprietary venues and are afraid
That, my friend, is called argumentum ad hominem and it's frowned upon as a fallacy. I'm not just nit-picking either; logic and debate are essential skills in the business world, regardless of whether your source is open or closed. There are myriad reasons why people participate in the creation of open source. Lambasting them all as parasites or cowards is as absurd as characterizing all open-source programmers as thieves. There's a grain of truth in each case, but no more.
Choose your sides.
Even if I were the most ardent advocate of open source - and I've probably done more for open source than you ever will - I'm too much of a pragmatist to back the losing side in any fight. You'll find that such pragmatism is a common trait among real engineers.
Using telnet IAC for higher-level messaging is just perverse...worse than using HTTP for RPC, even. The whole point here is to use a protocol that's designed to support needed features, instead of hacking those features on top of an existing protocol that was designed for something else.
Session management would be up to the server, wouldn't it?
Session management is probably always going to involve an element of negotiation between servers and clients (and users) at some level. The important thing is that the responsibility for establishing, tracking, and maintaining sessions move out of the applications running on the server. It would be much better to have a single standard way to do these things for all applications, instead of having every application do it just a little bit differently. In-protocol support for sessions also provides a convenient way to deal with request-ordering issues, the status indicators that another poster mentioned, etc.
I don't mean Telnet as Telnet, I mean a protocol similar to what Telnet has to offer.
I'm not sure what definition of "telnet" you're using, then. Telnet is an extremely simple protocol that does almost nothing besides negotiating a few connection characteristics and terminal settings. Even the authentication you might be thinking of is not actually part of telnet; it's part of the login process on the target machine. The term "generic telnet" is therefore almost meaningless; HTTP as it exists today is much more telnet-like than what I'm suggesting, since it's based on a single bytestream over a single TCP connection instead of a real (potentially multi-stream multi-connection) session model.
The problem with HTTP, as with any stateless protocol, is that there often are (or should be) relationships between requests. Ordering relationships are common, for example, as are authentication states. Stateless protocols are easier to implement, and thus should be preferred when such "implicit state" is not an issue, but in many other situations a protocol that knew something about state could be more efficient. All of this session-related cookie and URL-munging BS could just go away if the RPC-like parts of HTTP were changed to run on top of a generic session protocol.
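Here's a rough sketch of what moving session state out of the application might look like. It is an assumption-laden toy, not a proposal for an actual wire protocol: `SessionLayer`, `open` and `deliver` are hypothetical names, and the only "state" tracked is who the session belongs to and which request sequence number comes next.

```python
class SessionLayer:
    """Toy session layer that owns identity and request ordering (hypothetical)."""

    def __init__(self):
        self.sessions = {}   # session id -> per-session state
        self._next_id = 1

    def open(self, user):
        # Authentication would happen once here, not in every application.
        sid = self._next_id
        self._next_id += 1
        self.sessions[sid] = {"user": user, "next_seq": 0, "pending": {}}
        return sid

    def deliver(self, sid, seq, request):
        """Accept a request that may arrive out of order.

        Returns the (user, request) pairs that are now ready for the
        application, in sequence order.
        """
        session = self.sessions[sid]
        session["pending"][seq] = request
        ready = []
        while session["next_seq"] in session["pending"]:
            ready.append((session["user"], session["pending"].pop(session["next_seq"])))
            session["next_seq"] += 1
        return ready
```

For example, if request 1 arrives before request 0, `deliver` returns an empty list for the first call and both requests, in order and tagged with the session's user, for the second. The application never sees a cookie or a munged URL.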
Another error embodied in HTTP - and it's one of my pet peeves - is that it fails to separate heartbeat/liveness checking from the operational aspects of the protocol. Failure detection and recovery gets so much easier when any two communicating nodes track their connectedness using one protocol and every other protocol can adopt a simple approach of "just keep trying until we're notified [from the liveness protocol] that our peer has died". This is especially true when there are multiple top-level protocols each concerned with peer liveness, or when a request gets forwarded through multiple proxies. As before, having the RPC-like parts of HTTP run on top of a generic failure detection/recovery layer would give us a web that's much more robust and also (icing on the cake) easier to program for.
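The "just keep trying until we're notified that our peer has died" approach can be sketched as follows. This is a minimal illustration under stated assumptions: `LivenessTracker`, `call_with_retry`, `FakeClock` and `make_flaky_sender` are hypothetical helpers, time is a plain counter, and a lost request is modeled as `send()` returning `None`.

```python
class LivenessTracker:
    """Separate liveness layer: tracks heartbeats, knows nothing about requests."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.last_seen = {}   # peer -> time of last heartbeat

    def heartbeat(self, peer, now):
        self.last_seen[peer] = now

    def is_alive(self, peer, now):
        seen = self.last_seen.get(peer)
        return seen is not None and now - seen <= self.timeout

def call_with_retry(send, peer, tracker, clock):
    """Operational protocol: retry until a reply arrives or the peer is declared dead."""
    while tracker.is_alive(peer, clock()):
        reply = send()
        if reply is not None:
            return reply
    raise ConnectionError(f"{peer} declared dead by liveness layer")

class FakeClock:
    """Deterministic clock for the example: advances one tick per reading."""

    def __init__(self):
        self.t = 0

    def __call__(self):
        self.t += 1
        return self.t

def make_flaky_sender(failures, reply):
    """Return a send() that loses the first `failures` requests, then succeeds."""
    state = {"left": failures}

    def send():
        if state["left"] > 0:
            state["left"] -= 1
            return None
        return reply

    return send
```

Note that the request code contains no timeouts or failure heuristics of its own. Every protocol layered on top can share the one tracker, which is the whole point.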
I don't know if any of this is what Don Box was getting at, but in very abstract terms he's right about HTTP being a lame protocol.
Network servers are bandwidth-limited, not cpu limited
Not to detract from your main point, which is a good one and well made, but that particular statement is pretty dubious. Some network servers are bandwidth-bound, some are CPU-bound, some are memory-bound or disk-bound, some are crappy-API-bound, some are bound by complex synchronization/serialization requirements. Most are affected by more than one of these limitations, and by the tradeoffs that must be made between them.
None of this refutes your argument that C is not the best language for servers. Its lack of type-safety, range/bounds checks, proper overflow handling (which requires exceptions), garbage collection and so on is well known. Java is a much better language in these regards while still remaining fairly familiar, and even for completely CPU-bound programs there's a compelling argument for HotSpot-style JIT as an alternative to traditional compilation. If only Java supported true MI instead of the inadequate "interface" hack/substitute (and I do understand how the requirements for code mobility made that a reasonable choice at the time). Other, more "exotic", languages such as those in the Scheme or ML categories might appeal to purists, but their chances of achieving widespread adoption will remain almost nil until the "impedance mismatch" with imperatively-oriented system programming interfaces is lessened.
Yep, exactly right. Changing from a custom overlay-segment scheme to semi-real VM involves some serious pain. Switching from direct hardware access to OS-approved APIs can require hundreds or even thousands of changes, and often wholesale restructuring of the code. Resolving timing dependencies is a bitch; ask any chip designer about those, because it's the same set of issues.
If the program being ported is well designed, with an internal abstraction layer that just happens to match the new-OS API, and with a minimum of timing or hardware dependencies, porting might not be too bad. However, few old games were designed that way, and it's not just because the authors were sloppy (though that's often a factor). At the time many of these games were written, these issues were not well understood, and they're only well understood now precisely because so many missteps were made. Maybe "everyone knows that" now, just like everyone knows that CFCs are bad, but there was a time not so very long ago when pretty much nobody knew these things.
For all that is wrong with it, DMCA did not abandon or overturn this concept. This item, like a VCR, obviously has significant utility that does not involve violation of copyright. Nintendo doesn't stand a chance.
It's not that I think basing a filesystem on a database is a great idea. For one thing, it's a pretty good bet that performance is going to suck because of all the extra DB-related overhead. Administration might become more of a PITA too. I'm just trying to explain that the idea of a database-oriented filesystem has much broader implications than the trivial crap (much of which is relevant to neither filesystems nor databases) that people seem to be focusing on in this thread so far.
I've seen a few scoring systems that formalize this idea. Most start by multiplying the numbers of hits that each word would have gotten by itself. Personally, I like adding extra twists instead of trying to go for a high score. For example, alliterative whacks are harder to find because there are enough word lists out there that you're likely to get multiple hits on any two words that start with the same letter, so you have to pick words obscure enough not to be in the lists but real enough to be in the dictionary. It's a fun way to spend 5-10 minutes.
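For anyone who wants the multiplication rule spelled out, here's a minimal sketch. The function name and the hit counts are made up for illustration; real counts would come from whatever search engine you're whacking against.

```python
def whack_score(hits_a: int, hits_b: int) -> int:
    """Score a two-word query that returned exactly one result.

    The score is the product of the hit counts each word gets on its
    own, so two individually common words that almost never appear
    together score highest.
    """
    if hits_a < 1 or hits_b < 1:
        raise ValueError("each word needs at least one hit by itself")
    return hits_a * hits_b


# Hypothetical per-word hit counts, invented for this example.
print(whack_score(1_200_000, 350_000))
```

Under this rule obscure words are penalized, which is exactly why the alliterative variant pulls in the opposite direction.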
I don't know about the real low end, but your $/MHz ratio certainly starts to break down even in the mid-range. I'm typing this on a laptop with a 600MHz CPU, that I just got from uBid for US$700 plus shipping, and I know that I could have gotten an even better $/MHz ratio with a bulkier machine. With that CPU and memory, USB, FireWire etc. this machine will still be viable a lot longer and ultimately provide more practical use per dollar than some low-end machine that's already at the end of its lifespan. Unless you're looking for something that will basically function as an embedded system (in which case you can skip the cost of a screen and get a true embedded SBC) I suggest you consider spending a little more to get a better overall value.
You're the one who's persisting in a digression, Grasshopper. As I said earlier, it's not necessary to describe any X that meets a standard to prove that Y does not. Y is Freenet. Freenet does not meet semantic or integration standards to be considered a filesystem, and since I was talking about filesystems I was not talking about Freenet. All this other bullshit about whether it's possible for some other system to meet that standard is interesting but beside the point.
OK, Sparky, I'll spell it out just for the slowest child in the class. I do believe that strong access control can be implemented in a decentralized ("P2P" if you prefer buzzwords) system. I might be incorrect in that belief, and you're welcome to dispute that belief if you choose, but saying that it's not what I'm talking about is just a non-starter.
You're misunderstanding the result, which applies only to obfuscation. There are forms of authentication that are mathematically quite distinct from what that paper discusses.
Ahhh, imitation is the sincerest form of flattery. It's nice to see that you're familiarizing yourself with that list of fallacies. It'd be nicer still if you read it as a list of things to avoid, and not as a list of things to try in your next post.
Firstly, I didn't intend that sentence as a refutation of your argument but as an admonition regarding the same sort of "disrespect" you complained about earlier. It's just slightly hypocritical for you to demand respect for your five minutes of thought while showing none for others' years of study.
As it happens, distributed storage systems are my professional specialty, but I wasn't actually referring to myself. I was thinking more of people like those behind MNet (formerly Mojo Nation), CFS, SFS, OceanStore or Farsite, who all seem to share a belief that decentralization does not preclude strong authentication. They're the ones who've spent years thinking about the authentication angle (I personally have focused more on efficiency and coherency angles). It's your dismissiveness of their efforts, not my own, that I find offensive.
Reexamine how SSKs work, or DBRs, and I'm sure that even you can figure out how inconsistency can occur.
Not really. It might or might not be possible to reconcile strong access control with decentralization, but the prospects seem even gloomier for such a reconciliation with Freenet's anonymity. Similarly, Freenet's insertion and caching behaviors conflict at a pretty fundamental level with the levels of coherency expected of a filesystem. Of course, the entire implementation would have to change as well. My contention is that Freenet and filesystems differ in enough ways - and deep enough ways - that a "Freenet-based filesystem" would no longer resemble Freenet as we know it. Again, that doesn't mean there's anything wrong with Freenet. Certainly the Freenetistas would tell you - as they've told me many times - that filesystem-like behavior is not a goal for Freenet, and that's just fine.
Arguably indeed. Would anyone like some cake?
Untrue, but I'm not paid to teach idiots the basics on Slashdot. Do your own homework.
I really find credentialism quite distasteful, but since you seem so insistent on making your own appeals to authority I'll play along. As I mentioned, this is my professional specialty. I have a pretty well-documented record of keeping up with developments in this area, and engaging other "leading figures" in dialog as a peer. Your insinuation that I don't know the terrain is absurd, but might apply to yourself. What can you do to demonstrate that your statements here are based on more than three minutes of reading and one minute of thought? Are you really so sure you want to pursue the issue of background, or can we get back to the actual issues?
I really wish you'd stop telling me what I'm saying. I'm not talking about central servers now any more than I was talking about Freenet earlier.
"Pure" P2P? Identical security? That's commonly referred to as moving the goalposts and it should be beneath you. It's not necessary to describe any X that meets a standard to prove that Y does not.
You obviously don't think it's possible to reconcile strong access control with decentralization. That's fine, but don't you think it's a little disrespectful to assume that other people who've spent a lot more time than you studying the problem have given up too? You're basing your argument on an axiom that's not shared with your interlocutor, but then I guess it doesn't matter because it's a digression anyway.
Appropriate, but inadequate. Freenet's SSKs are still not equivalent to real directories, or real access control, no matter how much you bluster.
WORM drives don't drop data like Freenet does. They might reject new writes when they're full, but they don't toss some arbitrary piece of old data on the floor to make room.
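To make the difference concrete, here's a toy sketch of the two behaviors. Both classes are hypothetical. In particular, the least-recently-requested eviction rule is my stand-in for "drops whatever isn't being asked for", not a description of Freenet's actual replacement policy.

```python
from collections import OrderedDict


class StoreFull(Exception):
    """Raised when a write-once store has no room left."""


class WriteOnceStore:
    """WORM-like: refuses new writes when full, never discards old data."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.data = {}

    def write(self, key, value):
        if key in self.data:
            raise KeyError(f"{key!r} was already written")
        if len(self.data) >= self.capacity:
            raise StoreFull("no room; existing data is left untouched")
        self.data[key] = value

    def read(self, key):
        return self.data[key]


class EvictingCache:
    """Always accepts writes; silently drops the least recently requested item."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.data = OrderedDict()

    def write(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        elif len(self.data) >= self.capacity:
            self.data.popitem(last=False)  # oldest-requested entry is lost
        self.data[key] = value

    def read(self, key):
        value = self.data[key]
        self.data.move_to_end(key)  # a request keeps the entry "popular"
        return value
```

With a capacity of two, a third write to `WriteOnceStore` fails loudly and the first two items survive. The same third write to `EvictingCache` succeeds and one of the earlier items is simply gone, with no error reported to whoever stored it.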
False equivalence. You haven't shown that Freenet is in any way like a CD-ROM, and in fact it differs from CD-ROM in this particular regard. Two nodes attempting to read the same data from Freenet simultaneously might well get different data, if one finds a stale copy in someone's cache first and the other finds a fresh copy in another cache. That is not consistent/coherent behavior.
Again: Freenet is not a filesystem. Not only is it not implemented as one, but its very protocol does not support features expected of filesystems (some of which I haven't even gotten around to mentioning yet). Neither of these can change without Freenet becoming something totally different from what it is now, perhaps without abandoning its central goals of strong anonymity and resistance to censorship. There's nothing wrong with Freenet not being a filesystem. Perhaps it's something better; certainly many people seem to see it that way. All it means is that when people are talking about filesystems they're not talking about Freenet and you shouldn't tell them that they are.
Perhaps I was overly curt with you earlier. I just get really tired of hearing "you mean Freenet" any time distributed storage is discussed. How would you like it if somebody said "you mean Windows" every time you mentioned operating systems, no matter how un-Windows-like the proposed operating system was? How "classy" would you be in correcting such a statement? Would you, perhaps, call it horseshit?
Private namespaces are not the same as directories, and the rudimentary access control they offer is in no way comparable to the sorts of permissions that any legitimate filesystem on any modern general-purpose OS is expected to support.
"Consistency" (a.k.a. coherency) has a very specific and generally well-understood meaning in this context, which you should learn before you start spouting off about whether Freenet exhibits it. In a consistent system, if node A writes to a location and then node B reads it, B will (assuming no other writes in the interim) receive the value A wrote and not some older "stale" value. There are varying levels and types of consistency, each defining different conditions under which the system guarantees that B will get current data, but Freenet does not ensure consistency according to even the loosest definitions.
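Here's a small sketch of that definition, assuming a single origin copy plus a set of read caches. The class and the invalidate-on-write flag are hypothetical, and real systems use far more sophisticated protocols, but the stale-read case is the one I mean.

```python
class CachedStore:
    """One origin copy plus per-node read caches.

    With invalidate_on_write=True, a write purges cached copies, so any
    later read sees the new value. With False, old copies linger in
    caches and readers can be handed stale data.
    """

    def __init__(self, n_caches: int, invalidate_on_write: bool):
        self.origin = {}
        self.caches = [{} for _ in range(n_caches)]
        self.invalidate_on_write = invalidate_on_write

    def write(self, key, value):
        self.origin[key] = value
        if self.invalidate_on_write:
            for cache in self.caches:
                cache.pop(key, None)

    def read(self, key, via: int):
        """Read through cache number `via`, filling it on a miss."""
        cache = self.caches[via]
        if key not in cache:
            cache[key] = self.origin[key]
        return cache[key]


def demo(invalidate_on_write: bool):
    store = CachedStore(n_caches=2, invalidate_on_write=invalidate_on_write)
    store.write("doc", "old")
    store.read("doc", via=0)       # cache 0 now holds "old"
    store.write("doc", "new")      # node A writes
    # Node B reads, once through each cache.
    return store.read("doc", via=0), store.read("doc", via=1)


print("coherent:  ", demo(True))
print("incoherent:", demo(False))
```

In the incoherent run the two reads disagree even though no write happened between them, which is precisely the behavior a filesystem is not allowed to have.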
You're apparently not considering the advantage of not losing data if that one computer fails. Some people would certainly consider that advantage to be considerable.
In any case, I don't think I ever said that all data should be placed in the distributed data store. In fact, I rather distinctly remember saying the exact opposite. Modern operating systems permit the use of multiple filesystem types concurrently, so there's nothing keeping you from keeping data local if you so choose.
$900/GB? And you're seriously comparing that to a software-only solution that might carry zero dollar cost? Do you really think your silver USB bullet is the ideal solution for everyone, i.e. that there aren't plenty of people who would be better served by the distributed-storage alternative?
Apparently someone took seriously the suggestion of recycling the highly-moderated posts from the previous ISOS thread. The parent is an exact copy of this post by Ian Clarke on that thread.
BTW, the answer to the (implied) question in Ian's original paper is no. A useful "distributed decentralized data processing system" cannot be built on top of Freenet, or any other storage system that drops data as soon as the herd stops requesting it.
No, I am most emphatically not talking about Freenet. For one thing Freenet is not a filesystem. I can't mount it, I can't use plain old read/write (have to use Freenet-specific APIs), I can't use memory mapping (or even access individual blocks without reading the whole file), I can't execute images off it, there are no directories, no permissions, no data consistency. It flunks just about every test of whether something is a filesystem. Worse, Freenet drops data that's not being actively requested; that's OK for what Freenet was designed to do, but totally unacceptable for a filesystem. Got it? Good. Now we can move on.
Replication also has tremendous benefits, most notably robustness and performance. I alluded to the latter in my last post. If nodes are smart enough to get data from the nearest replica, then total bandwidth use goes down. The more replicas there are, the fewer network resources each replica-served request will consume (unless somebody's so stupid that they put all the replicas right next to one another). It's the same principle used by FTP mirrors and content-distribution networks, and it works.
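A back-of-the-envelope sketch of the bandwidth argument, under a deliberately crude assumption: nodes sit on a line and the network cost of a request is just the distance to the nearest replica. The positions are invented for illustration.

```python
def total_cost(clients, replicas):
    """Sum, over all clients, of the distance to each one's nearest replica."""
    return sum(min(abs(c - r) for r in replicas) for c in clients)


# Ten clients spread evenly along a line (hypothetical positions).
clients = list(range(0, 100, 10))

single = total_cost(clients, [0])            # one central copy
spread = total_cost(clients, [0, 50, 90])    # replicas placed apart
clustered = total_cost(clients, [0, 1, 2])   # replicas right next to each other

print(single, spread, clustered)
```

Three well-placed replicas cut the total cost to a fraction of the single-copy case, while three replicas bunched together barely help at all.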
...until you, or you plus multiple other people, need to access that same data from multiple places - perhaps concurrently. Then you get right into the same sort of replication/consistency mess you were trying to avoid, except that instead of having the attendant problems solved once for everyone using the filesystem each person has to solve it separately.
Actually I'd rather not have one more physical object to carry around, drop/damage/misplace, etc., or have to remember to copy the data I want for a business presentation onto my portable device. What I'd prefer would be that when I move to a new location connecting to the network also connects me to my files, wherever they may be, without unnecessary compromises in performance or security. Those of limited imagination might not believe it, but that will be possible quite soon.
Aside from the administrative-inconvenience issues noted above, where are you going to find such a device? How much will it cost, compared to the software-solution cost of zero dollars? How fast will it be? How reliable? What will you do when it breaks and you didn't make a backup?
The complexity of network distribution should be hidden from the user anyway. The whole idea of a distributed data store is that the complexity is hidden in the system so that users' lives are simpler. What you're proposing is to "shield" users from complexity that they wouldn't see anyway, and leave them responsible for decisions (replication, data placement, backup) that the system should be handling for them. That's not a positive tradeoff.
So do non-visionaries, and the word you're looking for is "ills" not "ails". Your silver USB bullet doesn't solve anything.
The Helsinki user is no worse off in this scenario than if Mary's machine were a web server.
We all know that such "just in cases" do actually occur. The only solution to data-loss is redundant copies of the data, maintained either manually (explicit backups) or automatically (transparent mirroring or replication). The authors' idea is to go for automatic replication, and once you have that you might as well use the replicas to improve performance by allowing them to serve data to nearby nodes. This can actually result in less overall bandwidth than traditional approaches, because each node is going somewhere relatively close to get data instead of choking up a central server.
That actually highlights a flaw in the example as given in the article. It would be quite abnormal for someone in Helsinki to be going half-way around the world to get the data, because there should be a nearer replica. It would be more accurate, though perhaps less compelling, to say that Mary's machine was being used as a "staging area" for other local users watching the same movie from Helsinki that Mary just watched ten minutes ago. That would IMO convey the idea of an ISOS (actually the data-store part of it) actually reducing network bandwidth while also improving robustness.
The ISOS as described in the article runs on top of a traditional operating system; the files you need to boot that traditional OS would still reside locally, as would your applications. It's only the data that would reside elsewhere, which from the user's perspective really isn't that different from what happens today with NFS- or CIFS-based fileservers. The difference, supposedly, is that replacing the single NAS server with a fully distributed network results in a more robust system, and one that can scale beyond the local LAN to the whole Internet.
Freenet is more of a data transmission method than a true data store. Even Ian says so, when pressed on the data-loss issue. MNet, OceanStore, Farsite or CFS would all be better examples of actual distributed storage.
Wrong. Some day you should try developing a theory to fit facts instead of making up "facts" that fit your pet theory.
That is truly one of the most contorted statements I've seen in a long time. What you actually ended up saying is that you think my friends did try what you suggest, and you're not surprised that they failed. Truly, you have a dizzying intellect.
Some of the people I mentioned were involved in product-related endeavors, others were in consulting. Both groups have fared poorly, but of the two it's the consultants who've been hurt most. Don't believe me? Find any half dozen people who were out there a year ago trying to do what you suggest. Ask them whether they believe in your theories. The two who are still employed might refrain from slapping you upside the head, but I can't vouch for the other four.
What you don't seem to realize is that when money gets tight expenditures on all forms of outsourcing - consultants, freelancers, custom development and support contracts - are the first to go. The service income that you posit as a substitute for product income dries up, leaving nothing for open-source developers. It's not a coincidence that open-source software rose to prominence during an economic boom, and has receded during the ensuing decline. That's reality for you, and it's right there for you to see if you'd just pull your head out of your ass and take a look.
I personally know about two dozen who tried. About half eventually ended up working on proprietary software. The other half are unemployed. How big a sample do you need before you'll face facts? I'm sure the people I know constitute a very small percentage of all those who tried, and that more examples could be found.
What are you doing to change anything, besides ranting here? Just about everyone I know who actually has a business isn't shy about putting the word out. You, by contrast, haven't even bothered putting a link to your project/company in your profile. What do you do, exactly, that's so good for open source? BTW, it's not ad hominem when one's interlocutor has made their character or identity relevant by trying to use it as the basis for their argument.
Strawman. What I meant, and it was quite clear from the context, was that all this "making money from open source" was statistical noise.
Then again, I'm not the one claiming to be doing that, so that's not relevant. I'll take your (lack of an) answer to my question as a no, and so I suspect would anyone else reading this exchange.
Wow. Slashdot is pretty full of people with lots of ideals and no skills or experience, but you completely take the cake.
Bold claim. Got any proof? Got any numbers for how many people are actually doing it? How many programmers actually make a living doing open-source programming full time today? How about a year from now, if the economy doesn't pick up? How many total programmers are there in the world? This "phenomenon" you rant about, this wave that's going to overwhelm us all, was barely statistical noise even at its peak, and that peak has passed.
Been there? Done that? No, didn't think so. Open source or closed, your zealotry would be fatal in business. Those few people who are making money off open source have survived by learning not to piss off the guys with the money with that kind of extremism.
You might actually be right there. "Need" is a funny word. No, the world doesn't need proprietary software or copyright law. But they exist, and people - real people, not just big corps - benefit from them. You haven't provided any compelling argument that society would be better off without them. Heck, far better programmers and writers than you have tried to make such arguments, and they haven't succeeded either.
That, my friend, is called argumentum ad hominem and it's frowned upon as a fallacy. I'm not just nit-picking either; logic and debate are essential skills in the business world, regardless of whether your source is open or closed. There are myriad reasons why people participate in the creation of open source. Lambasting them all as parasites or cowards is as absurd as characterizing all open-source programmers as thieves. There's a grain of truth in each case, but no more.
Even if I were the most ardent advocate of open source - and I've probably done more for open source than you ever will - I'm too much of a pragmatist to back the losing side in any fight. You'll find that such pragmatism is a common trait among real engineers.
Using telnet IAC for higher-level messaging is just perverse...worse than using HTTP for RPC, even. The whole point here is to use a protocol that's designed to support needed features, instead of hacking those features on top of an existing protocol that was designed for something else.
Session management is probably always going to involve an element of negotiation between servers and clients (and users) at some level. The important thing is that the responsibility for establishing, tracking, and maintaining sessions move out of the applications running on the server. It would be much better to have a single standard way to do these things for all applications, instead of having every application do it just a little bit differently. In-protocol support for sessions also provides a convenient way to deal with request-ordering issues, the status indicators that another poster mentioned, etc.
I'm not sure what definition of "telnet" you're using, then. Telnet is an extremely simple protocol that does almost nothing besides negotiating a few connection characteristics and terminal settings. Even the authentication you might be thinking of is not actually part of telnet; it's part of the login process on the target machine. The term "generic telnet" is therefore almost meaningless; HTTP as it exists today is much more telnet-like than what I'm suggesting, since it's based on a single bytestream over a single TCP connection instead of a real (potentially multi-stream multi-connection) session model.
Wrong. Telnet doesn't have session-management or heartbeat behavior either.
The problem with HTTP, as with any stateless protocol, is that there often are (or should be) relationships between requests. Ordering relationships are common, for example, as are authentication states. Stateless protocols are easier to implement, and thus should be preferred when such "implicit state" is not an issue, but in many other situations a protocol that knew something about state could be more efficient. All of this session-related cookie and URL-munging BS could just go away if the RPC-like parts of HTTP were changed to run on top of a generic session protocol.
Another error embodied in HTTP - and it's one of my pet peeves - is that it fails to separate heartbeat/liveness checking from the operational aspects of the protocol. Failure detection and recovery gets so much easier when any two communicating nodes track their connectedness using one protocol and every other protocol can adopt a simple approach of "just keep trying until we're notified [from the liveness protocol] that our peer has died". This is especially true when there are multiple top-level protocols each concerned with peer liveness, or when a request gets forwarded through multiple proxies. As before, having the RPC-like parts of HTTP run on top of a generic failure detection/recovery layer would give us a web that's much more robust and also (icing on the cake) easier to program for.
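Here's roughly what I mean by that separation, as a minimal sketch. Everything in it (the class, the function, the timeout, the injected clock) is hypothetical and exists only to show the shape: one component owns liveness, and the operational code just retries until that component says the peer is dead.

```python
class PeerDead(Exception):
    """Raised when the liveness layer reports that the peer is gone."""


class LivenessTracker:
    """Owns failure detection: records heartbeats, answers "is it alive?"."""

    def __init__(self, timeout: float, clock):
        self.timeout = timeout
        self.clock = clock          # injected so the sketch is testable
        self.last_seen = {}

    def heartbeat(self, peer):
        self.last_seen[peer] = self.clock()

    def is_alive(self, peer) -> bool:
        seen = self.last_seen.get(peer)
        return seen is not None and self.clock() - seen <= self.timeout


def call_until_dead(peer, attempt, liveness, wait):
    """Operational side: keep trying until success or a death notice.

    `attempt` performs one request and raises ConnectionError on
    failure; `wait` pauses between tries. No timeouts or failure
    heuristics live here.
    """
    while liveness.is_alive(peer):
        try:
            return attempt()
        except ConnectionError:
            wait()
    raise PeerDead(peer)


# Tiny demo with a fake clock: two failures, then success.
now = [0.0]
tracker = LivenessTracker(timeout=5.0, clock=lambda: now[0])
tracker.heartbeat("server")
outcomes = iter([ConnectionError(), ConnectionError(), "ok"])


def flaky_request():
    result = next(outcomes)
    if isinstance(result, Exception):
        raise result
    return result


def advance_clock():
    now[0] += 1.0


print(call_until_dead("server", flaky_request, tracker, advance_clock))
```

Every higher-level protocol can reuse the same tracker, so the policy for declaring a peer dead lives in exactly one place instead of being reinvented per application.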
I don't know if any of this is what Don Box was getting at, but in very abstract terms he's right about HTTP being a lame protocol.