Kahn Overhauling the Internet
Whanana sent us an article about information objects as visualized by Robert Kahn.
The article is written from a fairly childish place (it explains DNS for crying out loud, and the bulk of it is a history lesson obviously designed for a mainstream paper) but Kahn's Digital Object Identifier concept is interesting. If anyone has links to RFCs and the like, please post them in the comments.
Try http://www.handle.net/. Stumbled across it some time ago with some Python doco, IIRC. I had no idea it had acquired any kind of acceptance.
First of all, what's scary about the DoD putting its library on-line?
Second of all, only people who create nothing think that the creative work of others should be free. If copyright holders want to be able to track their work and make sure that their work is only available to people who have acquired a licence, I don't see a problem. In fact, it will be a HUGE help to individual authors/musicians/artists/whatevers, since they can take care of managing distribution all by themselves without needing a big company to handle it. Of course, promotion is still an issue, but that's another debate...
If you want to be a thief, you'll hate this. If you want to actually use the net to find stuff and be reimbursed for the things you create, you'll love it.
-jon
Remember Amalek.
The idea of objects being passed around by handles is the original concept for the Internet as espoused by Dr. Alan Kay. This is how he originally envisioned Object-Oriented information models. Now the Internet is being re-invented to change it from the simple collection of connection paths to a real highway where real self-contained objects can be passed around. This may be better, maybe not. I guess it depends on how it's implemented. If each object has to be accompanied by a slew of "helpers" to allow the receiving node to interpret it, this could get ugly. But if a single, open method is used, this could be beautiful. Imagine a fully portable object going from platform to platform totally transparent to the user!
Of course, it'll have to compete with .NET, and I just hope the geniuses who are behind this idea don't get mown down by Microsoft's marketing muscle.
Steven
-- I have marked myself unwilling to moderate-- I don't have other accounts to artificially inflate the karma of
A central database would not necessarily have the same problems as our current DNS system if OIDs were not human readable. Unfortunately, there would still be two serious problems:
(1) Who would do the human readable -> OID translation?
(2) Using a centralized database to find things would make censorship really easy. I've seen a lot of people here asking "who would own the centralized database." This question is totally irrelevant, as any government would strongly regulate the database owners. The real question is "what country would be able to pass laws about it?" i.e. whose version of censorship are we going to force on the world.
First, there may be a solution to (1), but it's not totally clear how to implement it. Specifically, you need a "philosophical" cross between search engines and alternative DNS servers. I do not see how to do this, but it seems like you want to have the "authoritative" qualities of DNS, but allow people to switch as easily as going to a different search engine.
Second, the only real solution to (2) is to eliminate the centralized database. Actually, you really should just junk all this guy's ideas and use freenet. Now, information on freenet is not permanent, but there are solutions to that too. Specifically, get people to permanently rehost things they think are important.
Anyway, issue (1) is central to freenet too, so there is really no point in even considering this guy's proposals. Freenet beats these proposals in every way.
The Christian religion has been and still is the principal enemy of moral progress in the world. -- Bertrand Russell
I think it would be great to be able to access the closest copy of an article (or music, or the drawings of a historical organ, or the latest Linux kernel) without worrying whose computer it is on, and if they have moved it to a different location.
As far as I can see, this scheme does nothing towards solving the (admittedly real) problems of intellectual property. If I can fire up nslookup or its relative, and translate the ID to a URL and then to an IP address and a filename, then at most it can obscure the path to direct access. And we all know how badly "security by obscurity" has performed...
This brings up the whole philosophical discussion of what is information, and how it can be or should be owned or controlled. Not all information wants to be free - at least my credit card number wants not.
No matter how the legal and philosophical discussions go, this scheme may provide a valuable tool for identifying information, and I see that as something positive. But will it take off? Only time will tell.
In Murphy We Turst
I'm guessing that real radicals like to eat and have shelter. If they are going to give away what they produce for the Greater Good, they have to live off of the money/goods/etc. from other people, and they can get it either by force or the good graces of other people. The second option is wonderful in theory, but the first is more common in practice.
The sole reason why the Open Source movement exists is students and professors who have been living off of parents and/or government grants.
Granted, there are companies which are trying to make money from Open Source projects, but they are trying to profit from obscurity; their products are so hard to use, that people are willing to pay for support. I don't see that happening with books or music any time soon, and when people start putting easy-to-use interfaces on these Open Source products, these companies are sunk.
I've written code in my spare time which I've given away, but if I (or my company) was unable to make money from the code I write for them, I wouldn't be writing code, and I probably wouldn't have the spare time to write code that I give away. As much as I love to code, I love to take care of my family even more.
-jon
Remember Amalek.
If there is already a DNS root in place, why not visualize setting up a central object database at the specific site? I.e. cod.slashdot.org/whatever.object.you.want. That way, each individual site would administer its own object database. This would eliminate the need for standardizing the server systems and would take the power away from a central authority (that could possibly screw it up...)
This could be implemented in the current DNS system.
DNS has a record called "HINFO" for host information; however, due to security concerns, not many people use it now. The record is just a text string that can be almost anything to describe the machine, including hardware information, physical location, etc.
We could use this record for the IHS information without any changes to the current DNS system.
Comments?
Did anybody notice that to be able to assign handles you had to have a "naming authority", as in:
Under the handle system, my last column might have an identifier like: "10.12345/nov0700-zaret". "10.12345" is MSNBC's naming authority, and "nov0700-zaret" is the name of the object. MSNBC would then keep a record in its handle registry that told the computer what server the object is on, what file it's stored in, as well as the copyright information and anything else it may want in that record.
Scary stuff given the recently introduced $2000 price of the .biz domains. I mean, so if I as a person want to use this new scheme, I've not only got to apply for an ICANN controlled domain name, I've now got to apply and pay for a "naming authority". What's to keep them from pricing this naming authority out of reach of the common person? I think this is a large looming threat to independent posting of material on the internet. Or am I being paranoid (again, heh)?
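For anyone who wants to see the quoted identifier format in concrete terms, here is a minimal Python sketch of splitting a handle such as "10.12345/nov0700-zaret" into its naming authority and object name and then looking up the record. The registry layout, the field names and the example.com server are all made up for illustration; the article does not describe the real record format, so treat this as a toy model rather than how the handle system actually works.

```python
# Hypothetical sketch of the handle format quoted from the article.
# REGISTRY, its field names and the example.com server are assumptions.

def parse_handle(handle):
    """Split 'authority/name' at the first slash."""
    authority, sep, name = handle.partition("/")
    if not sep or not authority or not name:
        raise ValueError(f"not a valid handle: {handle!r}")
    return authority, name


# One registry per naming authority, keyed by object name.
REGISTRY = {
    "10.12345": {
        "nov0700-zaret": {
            "server": "www.example.com",
            "path": "/columns/nov0700-zaret.html",
            "copyright": "placeholder copyright notice",
        }
    }
}


def resolve(handle, registry=REGISTRY):
    """Return the record stored for a handle, or raise LookupError."""
    authority, name = parse_handle(handle)
    try:
        return registry[authority][name]
    except KeyError:
        raise LookupError(f"unknown handle: {handle!r}") from None
```

Calling resolve("10.12345/nov0700-zaret") returns the made-up record above. The point of the sketch is only that whoever controls the top-level mapping of naming authorities controls who gets to publish handles at all.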
EMUSE.NET
"We're sorry, but the website you're trying to reach has been disconnected."
And that's simply not true. If you remove the cost of creation, then distribution and mass production costs predominate. It would be trivial for a large company to steal a novel, song, movie, whatever from a person who works alone and produces something. It's virtually certain that the creator will be screwed and the large company will profit. This is what people refuse to understand: copyright and patent are intended to protect the little guy against the big guy and the tyranny of the masses, not the other way around.
others: Linus Torvalds has a day job and still finds time to direct kernel development, the KDE team is largely made up of people who work for TrollTech, and there are many many sysadmins who Open Source tools they have created to help themselves in their jobs.
Linus started Linux when he was at school. His work for Transmeta is owned by Transmeta, not him. It pays for him to spend time working on Linux. If he didn't get paid by Transmeta, he probably wouldn't be working on Linux.
I'm not sure about TrollTech's funding, but how do they make money? VC funding? How profitable is the company? Companies based on open source are probably long-term doomed.
Sys admins who are contributing stuff done during work hours are technically stealing from their company; it's almost certain that they signed a work contract which stated that anything they create during work hours is company property, and anything RELATED to company work created at any time is owned by the company, too. Just wait until the first lawsuits which try to remove that sort of code from an Open Source project...
If you believe in "Intellectual Property", that's your business, but it doesn't give you the right to denigrate the work of those who believe differently than you. There is not *yet* a law that states that all intellectual activity must be undertaken in service of the profit motive.
If you subsidize your salable creative work with non-creative work, even if you don't enjoy your non-creative work solely because you think it's morally wrong to profit from creative work, you're either a saint or a moron. I can't decide which. If you're doing some creative work for money and some creative work for free, then you're a hypocrite.
-jon
Remember Amalek.
jim
BTW, if you are looking for the current incarnation of Xanadu, look for zigzag.
jim
It seems to me this sort of discussion has been handled many times with the same results. Object descriptions and addresses ought to remain separate. DOI looks like a big directory structure for the net; your objects, be they computers, printers, or individual files, are given handles which are then in turn given directory registrations. Am I following so far? It seems like this is just restructuring overhead without making it particularly more efficient or effective. TCP/IP packets can be run through a stack which pretty quickly gives the receiver information about the packet but leaves the content alone. This is very simple and amorphous, which is why it caught on (you can even use different routing/addressing schemes as long as it follows the header-has-little-to-do-with-the-packet concept). Directory structures, on the other hand, need a lot of overhead due to the fact that something somewhere has to know exactly where something is. Let's say all of the DNS servers around today had to hold references for every file available on the internet. That is amazing overhead just to access a text file on a server someplace. Overhead that is distributed over the WHOLE network (the entire internet), as you've only got so many directory servers you can possibly access. TCP/IP combined with transfers that overhead to the computers that are actually talking rather than the entire network. It's easy to upgrade the speed of your hardware to handle an increased demand or whatnot which is generating the extra overhead, but it is truly hard to squeeze more oomph out of a network that is forced to access a limited number of nodes to do absolutely everything.
I'm a loner Dottie, a Rebel.
The handle approach wants to extend and replace URLs. Right now you type a URL into your browser and it goes to a DNS server and places a query. The DNS server matches up slashdot.org with an IP address which your browser sends an HTTP request to. The slashdot server then finds the file and sends it to you. With the handle approach the file and address are stored in a directory, so you type in slashdot.org.index and your browser automatically goes to the index file on the slashdot server. That directory entry for slashdot.org.index is dynamic, though: if the file moves to a different computer or server, the name still points to it.
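A rough Python sketch of that last point, the name staying put while the location changes. The class, its method names and the host names are invented for illustration and are not part of any real handle software:

```python
# Toy directory: names are stable, locations can be updated behind them.
# HandleDirectory and the host names used with it are hypothetical.

class HandleDirectory:
    def __init__(self):
        self._locations = {}

    def register(self, name, host, path):
        """Create a directory entry for a new object."""
        if name in self._locations:
            raise ValueError(f"{name!r} is already registered")
        self._locations[name] = (host, path)

    def move(self, name, host, path):
        """Point an existing name at a new location."""
        if name not in self._locations:
            raise KeyError(name)
        self._locations[name] = (host, path)

    def resolve(self, name):
        """Return the current (host, path) for a name."""
        return self._locations[name]


directory = HandleDirectory()
directory.register("slashdot.org.index", "server-a.example", "/index.html")
directory.move("slashdot.org.index", "server-b.example", "/home/index.html")
```

After the move, anyone holding the name "slashdot.org.index" still resolves it successfully and simply gets the new host and path. A plain URL would have broken at that point, which is the whole difference being described.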
I'm a loner Dottie, a Rebel.
> More content-id, rights management, copy
> control stuff. Very interesting but users
> will reject it. Sorry!
You presume that you will have a choice. Bad mistake.
I refer you to the volume of deCSS discussion @ /. and the discussion on on-line legislation at k5
http://www.kuro5hin.org/?op=displaystory;sid=2000/11/22/17051/683
May I also remind you that there is nothing stopping either nationalisation or "registration" of ISPs and POPs.
After all, a modem might be considered a burglary tool.
-- Butlerian Jihad NOW!
Is he going to use the Genesis device?
More content-id, rights management, copy control stuff. Very interesting but users will reject it. Sorry!
sulli
RTFJ.
RFCs can be found at http://www.cotse.com/references.htm
Why don't we just use the current DNS system to resolve to the hostname, and each host has its own database of object id's? This seems most reasonable to me. Each site can (if it chooses to) migrate to using OID's at its own leisure. Then, we could use this along with the current protocols and filesystems, without having to create a whole new internet. It sounds like this is a good solution for administering a single domain, but not for the entire internet. Can you imagine the size of the database necessary to store id & location of every page on the net? Geez...
It may look like I'm doing nothing, but I'm actively waiting for my problems to go away.
--Scott Adams
Freenet stores files under a unique name in a distributed filesystem (i.e. freenet). All you need to retrieve a file is its name. It appears to me that this is Kahn's idea taken to the extreme. Freenet takes care of storing and retrieving objects with a unique identifier. The system could easily be extended with databases coupling relevant keywords to the identifier. It is also safe: freenet is explicitly designed to hide the location of the files. Even the owner can't touch a file after it has been put into freenet.
Jilles
A system serial number with bits reversed, and packed against the top of the 64 bit word.
An object creation counter for that system serial number -- under localized control/increment.
I had to continually fight off people who wanted to subdivide the 64 bits into fields, the way IP was. The primary discipline I wanted people to follow was to keep routing information out of the object identifier so that object locations could be changed dynamically. It was amazing how many times I had to explain this to people who should have known better.
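As a hedged illustration of the two-part 64-bit identifier described above, here is a small Python sketch. It assumes the bit-reversed serial number and the counter are simply OR-ed into one word, with the serial growing down from the top and the counter growing up from the bottom; the post does not spell out how the two parts were combined, so the function names and the overlap check are my guesses.

```python
# Sketch of a 64-bit object ID: bit-reversed system serial at the top,
# per-system creation counter at the bottom. The combining rule (OR) and
# the overlap check are assumptions, not taken from the original design.

WORD_BITS = 64
WORD_MASK = (1 << WORD_BITS) - 1


def reverse_bits(value, width=WORD_BITS):
    """Reverse the low `width` bits of value."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def make_object_id(system_serial, counter):
    """Pack a reversed serial and a counter into one 64-bit word."""
    if not 0 <= system_serial <= WORD_MASK:
        raise ValueError("system_serial out of range")
    if not 0 <= counter <= WORD_MASK:
        raise ValueError("counter out of range")
    # The reversed serial occupies the top serial.bit_length() bits,
    # so the counter must fit in whatever is left below it.
    if system_serial.bit_length() + counter.bit_length() > WORD_BITS:
        raise OverflowError("serial and counter would overlap")
    return reverse_bits(system_serial) | counter
```

Because there is no fixed boundary between the two parts, small serial numbers leave a lot of room for the counter and busy systems can keep counting upward. Nothing in the word says where the object lives, which is the discipline the post argues for.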
Unfortunately, I didn't explain it to the right people at DARPA, although I did have a couple of meetings with David P. Reed about it when he was still at MIT's LCS.
I touch on some of this history in a couple of documents, one written recently and one written at the time.
Until I read the article about Kahn, I didn't realize that DARPA chose the IP nonsense at almost exactly the time that the AT&T/Knight-Ridder project that was funding me made a bad choice of vendors, which led me to resign from that particular high-profile effort and try to strike out on my own turning 8MHz PCs into multiuser network servers (which I actually succeeded in doing after a lot of blood letting, but that's another story).
Seastead this.
Since I've been involved in this discussion for some time I thought I'd recycle some of my old comments :-)
Date: Thu, 20 May 1999 16:46:26 -0400 (EDT)
From: "Arthur P. Smith"
To: discuss-doi@doi.org
Subject: Re: [Discuss-DOI] DOI: Current Status and Outlook
On Wed, 19 May 1999, Norman Paskin wrote:
> A paper which provides a summary of the current thinking on DOI has
> just been published in D-Lib Magazine at
> http://www.dlib.org/dlib/may99/05paskin.html
This does answer a lot of questions we had, mostly in what seems
to be the right direction. The relationship with INDECS on metadata
issues looks like a particularly good resolution ("functional granularity"
is essentially what I was looking for in one of my earlier
questions). It looks like a specific metadata "Genre" needs to be
worked out in detail for journal articles (re reference linking) - and
it's not clear who has responsibility for this (the IDF or someone else?)
but at least at the level specified in this article it looks workable.
But to some extent the paper shows the DOI is a solution in search
of a "killer application" (mentioned several times in the article).
There's a chicken-and-egg problem here: the potential applications seem
to require widespread adoption before they become useful.
As one of the final bullets says: "Internet solutions are unlikely to
succeed unless they are globally applicable and show convincing power
over alternatives" - does the DOI as described show convincing power
over the alternatives?
It's sometimes hard to know what counts as an alternative, but the
following systems (some listed in the article) could be
alternatives for at least some of the things the DOI does:
1. the handle system itself
2. uniform resource names
3. IETF's DNS-based Naming Authority Pointer
4. Persistent URL's (PURL's)
5. rule-based reference linking (link managers, Urania, S-Link-S)
6. a global LDAP/directory service
Alternatives 1-4 provide a variety of routes for creating a unique
digital identifier for something - we really don't NEED the DOI just
to have digital identifiers, though DOI does provide a handy rallying
point for those of us providing intellectual property in digital form.
Alternative 2 is the highest level of digital identifier, but perhaps
that is all we really need? There is room for many "naming authorities" -
perhaps even each publisher could be their own naming authority. That
would depend on widespread adoption of (3) which may or may not happen,
and resolution of general registration processes too.
As the article mentions, general implementation of URN's is quite
limited even after almost a decade of work. Is there a reason why
nobody has found it particularly useful yet?
Alternative 1 is, to some extent, a non-issue (a DOI is, after all,
just a handle) and is also, to some extent, the same issue. Any
publisher could, with or without DOI, register as a handle naming
authority and create handles for its digital objects. Is some of
the DOI work duplicating what has already been done (or should have
been done) for the handle system itself? As the handle system web
pages mention (http://www.handle.net/) it is at least receiving some
use as a digital identifier of intellectual property by NCSTRL,
the Library of Congress, DTIC, NLM, etc. Does the DOI provide
convincing power over using the handle system directly?
Alternative 4 (PURL's) is critiqued at length in the article,
particularly on the issue of resolution (section 3). Perhaps I
don't understand properly, but I don't quite agree with some of
the arguments against PURLs. Any digital identifier can be used to
offer great flexibility in resolution - a local proxy can redirect to a local
cache or resource, for example, for ANY of the unique identifiers
under question. Once resolved, the "document" resolved to can
itself contain multiple alternative resolutions. And a handle is only
going to have multiple resolutions if the publisher puts it there
(who else has the authority to insert the data?). So I think the
single vs. multiple redirection issue is a red herring. I do agree it's
nice to have a more direct protocol (though from looking at the details
of the way handles are supposed to resolve there is a lot of
back-and-forth there too). As far as being a URN or not, there's
no reason why PURLs couldn't be treated as legitimate digital identifiers,
even if they are simply URL's at the moment. On "scalability" - the
current handle implementation doesn't seem particularly scalable
either. Only 4 million handles per server? Only 4 global servers
(with 4 backups that seem to point to the very same machines on
different ports)? And those servers seem to all be in the D.C. area...
Not that I think PURLs are wonderful, but does the DOI provide
convincing power over using PURLs, as far as identification and
resolution goes?
Which is presumably why we've been told DOI's have to do
more than just identification and resolution. Hence metadata, to
provide standard information to allow "look-up", multiple-resolution,
and digital commerce applications. This actually makes a lot of
sense. And the other id/resolution alternatives do not
seem to meet the INDECS criteria as well as the DOI can.
But what does this have to do with reference linking, the
first "killer application" mentioned? The look-ups required there
are almost certainly going to be more easily performed with
specialized databases (A&I services) or direct rule-based
linking (alternative 5) and in fact this is already
being done, generally without the use of DOI's. The DOI does not seem to
make the linking process easier, so there's no "convincing power"
here it would seem.
I added alternative 6 (global directory service) as a wild-card -
this seems to be a major focus of "network operating system" vendors -
Novell's NDS, Oracle's OID, Microsoft's Active Directory - these seem
to be systems intended to hold information on hundreds of
millions of "objects" available on a network - an example being the
personal information of a subscriber to an internet service provider.
But another potential application of these is to identify and provide
data on objects available on the net - intellectual property or other
things available for commerce. Is this something the DOI could
fit into, or is it something that could sweep URN's, handles, DOI and
all the rest away? I really don't know, but it seems like
something to watch closely over the next year or so.
Energy: time to change the picture.
People won't take the IP protection crap and once it leaves the protected lands for the real world they'll rip it to bits.
The idea of content objects with unique IDs isn't at all new but is a good one. I always liked the idea of using encryption signatures as the keys. Give it a sig for itself and one for its owner and build a simple search engine mechanism into the Net itself and you have a nice lil system. An important note might be that such a system does not need to, and possibly should not, replace TCP/IP or even rely on TCP/IP as its only supported carrier. It should be as agnostic about transports as possible for the most flexibility.
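A minimal Python sketch of the "one sig for the object, one for its owner" idea, using only the standard library. Two loud assumptions: the object's own key is taken to be a SHA-256 digest of its content, and an HMAC with a shared secret stands in for the owner's signature. A real system would want public-key signatures so that anyone can verify without holding the secret. All function names here are invented.

```python
import hashlib
import hmac

# Assumption: the object's key is the SHA-256 digest of its bytes.
def content_key(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


# Assumption: HMAC stands in for a real public-key owner signature.
def owner_tag(owner_secret: bytes, key: str) -> str:
    return hmac.new(owner_secret, key.encode("ascii"), hashlib.sha256).hexdigest()


def verify(data: bytes, key: str, owner_secret: bytes, tag: str) -> bool:
    """Check that data matches its key and that the owner tag matches the key."""
    key_ok = hmac.compare_digest(content_key(data), key)
    tag_ok = hmac.compare_digest(owner_tag(owner_secret, key), tag)
    return key_ok and tag_ok
```

The key depends only on the bytes, so it says nothing about where the object is stored or how it is carried, which fits the transport-agnostic point above. Any copy from any node can be checked against the key before it is trusted.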
Jabber might be a good start for this layer since it is a very flexible system for transporting XML-ized content and contact-type information. I really expect something like this to assimilate the web in a couple years. Maybe Jabber merged w/ FreeNet.
Someone who doesn't know the resource they want could search for it by known facts just as they do now at Yahoo, Google, etc. Once they find it they could store the object's unique id, and then every time they needed that object again they could ask the net for it and the closest copy found would be returned.
At what price learning? At what cost wisdom? The price is a man's peace of mind, and the cost is his life.
The general consensus is that RFCs have been obsoleted by patents. The only difference is that with a patent, comments must be officially sanctioned by the holder. Woohoo! fuzzylogic!
Anyone else notice the part about the central Object Id database? Just think how much grief ICANN has caused with the DNS root. I'd love to think what the Object Id Root owner will be like. This is a lame duck. Companies will probably love it (in theory); however, it can't function in the real world unless almost everyone adopts it. And god knows that won't happen.