Slashdot Mirror


Kahn Overhauling the Internet

Whanana sent us an article about information objects as visualized by Robert Kahn. The article is written from a fairly childish place (it explains DNS for crying out loud, and the bulk of it is a history lesson obviously designed for a mainstream paper) but Kahn's Digital Object Identifier concept is interesting. If anyone has links to RFCs and the like, please post them in the comments.

28 of 72 comments

  1. CNRI's handle system website by Zigg · · Score: 2

    Try http://www.handle.net/. Stumbled across it some time ago with some Python doco, IIRC. I had no idea it had acquired any kind of acceptance.

  2. Re:DOIs are cool and scary by TWR · · Score: 3
    If you read the MSNBC article carefully, you notice a few scary things mentioned, like "[it] is using it to build the Defense Virtual Library" and "another problem is with copyrights and other protections of intellectual property."

    First of all, what's scary about the DoD putting its library on-line?

    Second of all, only people who create nothing think that the creative work of others should be free. If copyright holders want to be able to track their work and make sure that their work is only available to people who have acquired a licence, I don't see a problem. In fact, it will be a HUGE help to individual authors/musicians/artists/whatevers, since they can take care of managing distribution all by themselves without needing a big company to handle it. Of course, promotion is still an issue, but that's another debate...

    If you want to be a thief, you'll hate this. If you want to actually use the net to find stuff and be reimbursed for the things you create, you'll love it.

    -jon

    --

    Remember Amalek.

  3. Internet becoming what was originally envisioned? by Mtgman · · Score: 3

    The idea of objects being passed around by handles is the original concept for the Internet as espoused by Dr. Alan Kay. This is how he originally envisioned Object-Oriented information models. Now the Internet is being re-invented to change it from the simple collection of connection paths to a real highway where real self-contained objects can be passed around. This may be better, maybe not. I guess it depends on how it's implemented. If each object has to be accompanied by a slew of "helpers" to allow the receiving node to interpret it, this could get ugly. But if a single, open method is used, this could be beautiful. Imagine a fully portable object going from platform to platform totally transparent to the user!

    Of course, it'll have to compete with .NET and I just hope the geniuses who are behind this idea don't get mown down by Microsoft's marketing muscle.

    Steven

    --
    -- I have marked myself unwilling to moderate-- I don't have other accounts to artificially inflate the karma of
  4. Re:ICANN part 2 by Weezul · · Score: 2

    A central database would not necessarily have the same problems as our current DNS system if OIDs were not human readable. Unfortunately, there would still be two serious problems:

    (1) Who would do the human readable -> OID translation.

    (2) Using a centralized database to find things would make censorship really easy. I've seen a lot of people here asking "who would own the centralized database." This question is totally irrelevant as any government would strongly regulate the database owners. The real question is "what country would be able to pass laws about it?" i.e. whose version of censorship are we going to force on the world.

    First, there may be a solution to (1), but it's not totally clear how to implement it. Specifically, you need a "philosophical" cross between search engines and alternative DNS servers. I do not see how to do this, but it seems like you want to have the "authoritative" qualities of DNS, but allow people to switch as easily as going to a different search engine.

    Second, the only real solution to (2) is to eliminate the centralized database. Actually, you really should just junk all this guy's ideas and use freenet. Now, information on freenet is not permanent, but there are solutions to that too. Specifically, get people to permanently rehost things they think are important.

    Anyway, issue (1) is central to freenet too, so there is really no point in even considering this guy's proposals. Freenet beats these proposals in every way.

    --
    The Christian religion has been and still is the principal enemy of moral progress in the world. -- Bertrand Russell
  5. Philosophical by heikkile · · Score: 2
    This is damned difficult! The idea of labeling information instead of data is a good one - but we need to sort out what is information, what is labels, what is data, and how to make it all work, and why!

    I think it would be great to be able to access the closest copy of an article (or music, or the drawings of a historical organ, or the latest Linux kernel) without worrying whose computer it is on, and if they have moved it to a different location.

    As far as I can see, this scheme does nothing towards solving the (admittedly real) problems of intellectual property. If I can fire up nslookup or its relative, and translate the ID to a URL and then to an IP address and a filename, then at most it can obscure the path to direct access. And we all know how badly "security by obscurity" has performed...

    This brings up the whole philosophical discussion of what is information, and how it can be or should be owned or controlled. Not all information wants to be free - at least my credit card number wants not.

    No matter how the legal and philosophical discussions go, this scheme may provide a valuable tool for identifying information, and I see that as something positive. But will it take off? Only time will show.

    --

    In Murphy We Turst

  6. Re:DOIs are cool and scary by TWR · · Score: 2
    However, just because selfish and immature cynics exist doesn't mean that real radicals don't.

    I'm guessing that real radicals like to eat and have shelter. If they are going to give away what they produce for the Greater Good, they have to live off of the money/goods/etc. from other people, and they can get it either by force or the good graces of other people. The second option is wonderful in theory, but the first is more common in practice.

    The sole reason why the Open Source movement exists is students and professors who have been living off of parents and/or government grants.

    Granted, there are companies which are trying to make money from Open Source projects, but they are trying to profit from obscurity; their products are so hard to use, that people are willing to pay for support. I don't see that happening with books or music any time soon, and when people start putting easy-to-use interfaces on these Open Source products, these companies are sunk.

    I've written code in my spare time which I've given away, but if I (or my company) was unable to make money from the code I write for them, I wouldn't be writing code, and I probably wouldn't have the spare time to write code that I give away. As much as I love to code, I love to take care of my family even more.

    -jon

    --

    Remember Amalek.

  7. Why make it central? by toddler420 · · Score: 2

    If there is already a DNS root in place, why not visualize setting up a central object database at the specific site? I.e. cod.slashdot.org/whatever.object.you.want. That way, each individual site would administer its own object database. This would eliminate the need for standardizing the server systems and would take the power away from a central authority (that could possibly screw it up...)

    1. Re:Why make it central? by Graymalkin · · Score: 2

      What do you think HTTP already does? You store (or link) the files you want people to access in your www directory. Your server receives an HTTP request looking for a file; the HTTP server either finds it and sends it back or doesn't find it and tells the person so. You can transfer anything you want over HTTP and it allows for contextual information (size and MIME type) of the object requested to be transferred. DNS doesn't touch your system's internal components; it just helps computers find each other, which is all it ought to do.

      --
      I'm a loner Dottie, a Rebel.
  8. Re:IHS (Information Handle Server) by n3rd · · Score: 2

    This could be implemented in the current DNS system.

    DNS has a record called "hinfo" for Host Information; however, due to security concerns, not many people use it now. The record is just a text string that can be almost anything to describe the machine including hardware information, physical location, etc.

    We could use this record for the IHS information without any changes to the current DNS system.

    Comments?

  9. Naming authority? by laetus · · Score: 2

    Did anybody notice that to be able to assign handles you had to have a "naming authority" as in:

    Under the handle system, my last column might have an identifier like: "10.12345/nov0700-zaret". "10.12345" is MSNBC's naming authority, and "nov0700-zaret" is the name of the object. MSNBC would then keep a record in its handle registry that told the computer what server the object is on, what file it's stored in, as well as the copyright information and anything else it may want in that record.

    Scary stuff given the recently introduced $2000 price of the .biz domains. I mean, so if I as a person want to use this new scheme, I've not only got to apply for an ICANN controlled domain name, I've now got to apply and pay for a "naming authority". What's to keep them from pricing this naming authority out of reach from the common person? I think this is a looming large threat to independent posting of material on the internet. Or am I being paranoid (again, heh)?
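
As a rough illustration of the structure the quoted article describes, here is a minimal Python sketch that splits a handle into its naming authority and object name and looks the object up in that authority's registry. The `REGISTRY` contents, field names, host name, and file path are all invented for illustration; this is not the actual Handle System protocol or record format.

```python
def parse_handle(handle):
    """Split 'authority/name' at the first slash."""
    authority, sep, name = handle.partition("/")
    if not sep or not authority or not name:
        raise ValueError(f"not a handle: {handle!r}")
    return authority, name


# Hypothetical registry kept by each naming authority.
# Every value below is made up for the example.
REGISTRY = {
    "10.12345": {
        "nov0700-zaret": {
            "server": "www.example.com",
            "file": "/columns/nov0700-zaret.html",
            "copyright": "example copyright notice",
        },
    },
}


def resolve(handle, registry=REGISTRY):
    """Return the record the naming authority keeps for this handle."""
    authority, name = parse_handle(handle)
    return registry[authority][name]


record = resolve("10.12345/nov0700-zaret")
print(record["server"], record["file"])
```

The point of the split is that only the part before the slash has to be assigned by someone else; everything after it is under the authority holder's own control.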

    EMUSE.NET

    --

    "We're sorry, but the website you're trying to reach has been disconnected."
  10. Re:DOIs are cool and scary by TWR · · Score: 2
    Putting patentable ideas into the public domain, or GPL'ing them, doesn't prevent the creator from making money by being first to sell products based on them or by creating better products based on those ideas than others can make.

    And that's simply not true. If you remove the cost of creation, then distribution and mass production costs predominate. It would be trivial for a large company to steal a novel, song, movie, whatever from a person who works alone and produces something. It's virtually certain that the creator will be screwed and the large company will profit. This is what people refuse to understand: copyright and patent are intended to protect the little guy against the big guy and the tyranny of the masses, not the other way around.

    others: Linus Torvalds has a day job and still finds time to direct kernel development, the KDE team is largely made up of people who work for TrollTech, and there are many many sysadmins who Open Source tools they have created to help themselves in their jobs.

    Linus started Linux when he was at school. His work for Transmeta is owned by Transmeta, not him. It pays for him to spend time working on Linux. If he didn't get paid by Transmeta, he probably wouldn't be working on Linux.

    I'm not sure about TrollTech's funding, but how do they make money? VC funding? How profitable is the company? Companies based on open source are probably long-term doomed.

    Sys admins who are contributing stuff done during work hours are technically stealing from their company; it's almost certain that they signed a work contract which stated that anything they create during work hours is company property, and anything RELATED to company work created at any time is owned by the company, too. Just wait until the first lawsuits which try to remove that sort of code from an Open Source project...

    If you believe in "Intellectual Property", that's your business, but it doesn't give you the right to denigrate the work of those who believe differently than you. There is not *yet* a law that states that all intellectual activity must be undertaken in service of the profit motive.

    If you subsidize your salable creative work with non-creative work, even if you don't enjoy your non-creative work solely because you think it's morally wrong to profit from creative work, you're either a saint or a moron. I can't decide which. If you're doing some creative work for money and some creative work for free, then you're a hypocrite.

    -jon

    --

    Remember Amalek.

  11. Been done, didn't work, but fragments are in use by Jim+McCoy · · Score: 2
    The sort of read capabilities Kahn is talking about were the cornerstone of the Xanadu project and its plans for handling copyright protection and payments for creators. Systems like Mojo Nation and Freenet create these sorts of absolute references (usually based on SHA1 hashes and the like) and flexible addressing schemes a la SPKI/SDSI deal with all of the namespace issues Kahn is talking about. This is basically a not-well-researched rehash of some old ideas; the bits of those old ideas which are of value are already being incorporated into systems, but the central registry/indirection via tollbooths bit is new and does not seem to add much real value to the users of such a system.
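
To make the "absolute reference based on a SHA1 hash" idea concrete, here is a small Python sketch of a content-addressed store: the identifier is derived from the bytes themselves, so no registry has to hand it out. The in-memory `store` dict and the function names are assumptions for illustration, and this is not the actual key scheme of Mojo Nation or Freenet.

```python
import hashlib

# Hypothetical in-memory store standing in for a network of nodes.
store = {}


def content_key(data: bytes) -> str:
    """Derive the reference from the content itself (SHA-1, hex encoded)."""
    return hashlib.sha1(data).hexdigest()


def put(data: bytes) -> str:
    """Store the data under its own hash and return that hash as the handle."""
    key = content_key(data)
    store[key] = data
    return key


def get(key: str) -> bytes:
    """Fetch by hash and verify that the bytes still match the reference."""
    data = store[key]
    if content_key(data) != key:
        raise ValueError("stored content does not match its key")
    return data


key = put(b"the text of some article")
print(key, get(key))
```

Because anyone can recompute the hash, a reader can check that what came back is what was referenced, without trusting whichever node served it.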

    jim

  12. Re:What about Xanadu? by Jim+McCoy · · Score: 2
    There was nothing particularly crappy about Xanadu, it just tried to do too much and expected the rest of the world to stop what it was doing for a few years while they finished this uber-cool thing. It goes something like this: Ted Nelson has an idea of the 6 things hypertext "must have" to work and gathers too many mad scientists and not enough hunchbacks to work on things. A couple of years later Tim Berners-Lee figures out that you only need two of the six "requirements" and creates the web. Five years later Ted presents a variation of the original idea that is trying to find footing against a system (http/html) which is demonstrably inferior, but good enough. The rest, as they say, is history...

    BTW, if you are looking for the current incarnation of Xanadu, look for zigzag.

    jim

  13. Shooting a camel out of a cannon by Graymalkin · · Score: 2

    It seems to me this sort of discussion has been handled many times with the same results. Object descriptions and addresses ought to remain separate. DOI looks like a big directory structure for the net; your objects, be they computers, printers, or individual files, are given handles which are then in turn given directory registrations. Am I following so far? It seems like this is just restructuring overhead without making it particularly more efficient or effective. TCP/IP packets can be run through a stack which pretty quickly gives the receiver information about the packet but leaves the content alone. This is very simple and amorphous, which is why it caught on (you can even use different routing/addressing schemes as long as it follows the header-has-little-to-do-with-the-packet concept). Directory structures, on the other hand, need a lot of overhead due to the fact something somewhere has to know exactly where something is. Let's say all of the DNS servers around today had to hold references for every file available on the internet. That is amazing overhead just to access a text file on a server someplace. Overhead that is distributed over the WHOLE network (the entire internet) as you've only got so many directory servers you can possibly access. TCP/IP combined with transfers that overhead to the computers that are actually talking rather than the entire network. It's easy to upgrade the speed of your hardware to handle an increased demand or whatnot which is generating the extra overhead, but it is truly hard to squeeze more oomph out of a network that is forced to access a limited number of nodes to do absolutely everything.

    --
    I'm a loner Dottie, a Rebel.
  14. Re:Will it hang together? by Graymalkin · · Score: 2

    The handle approach wants to extend and replace URLs. Right now you type a URL into your browser and it goes to a DNS server and places a query. The DNS server matches up slashdot.org with an IP address which your browser sends an HTTP request to. The slashdot server then finds the file and sends it to you. With the handle approach the file and address are stored in a directory so you type in slashdot.org.index and your browser automatically goes to the index file on the slashdot server. That directory entry to slashdot.org.index is dynamic though; if the file moves to a different computer or server the name still points to it.

    --
    I'm a loner Dottie, a Rebel.
  15. Re:Yawn. by thoglette · · Score: 2

    > More content-id, rights management, copy
    > control stuff. Very interesting but users
    > will reject it. Sorry!

    You presume that you will have a choice. Bad mistake.

    I refer you to the volume of deCSS discussion @ /. and the discussion on on-line legislation at k5
    http://www.kuro5hin.org/?op=displaystory;sid=2000/11/22/17051/683

    May I also remind you that there is nothing stopping either nationalisation or "registration" of ISPs and POPs.

    After all, a modem might be considered a burglary tool.

    --
    -- Butlerian Jihad NOW!
  16. heh by British · · Score: 3

    Is he going to use the Genesis device?

  17. Yawn. by sulli · · Score: 2

    More content-id, rights management, copy control stuff. Very interesting but users will reject it. Sorry!

    --

    sulli
    RTFJ.
  18. RFC's by Anonymous Coward · · Score: 2

    RFC's can be found at http://www.cotse.com/references.htm

  19. Why ICANN? by dmatos · · Score: 2

    Why don't we just use the current DNS system to resolve to the hostname, and each host has its own database of object id's? This seems most reasonable to me. Each site can (if it chooses to) migrate to using OID's at its own leisure. Then, we could use this along with the current protocols and filesystems, without having to create a whole new internet. It sounds like this is a good solution for administering a single domain, but not for the entire internet. Can you imagine the size of the database necessary to store id & location of every page on the net? Geez...
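
The two-step lookup proposed here can be sketched in a few lines of Python. Plain dictionaries stand in for DNS and for each host's own object database; the host name, address, object IDs, and paths are all invented, and a real deployment would of course query DNS and the host over the network.

```python
# Stand-in for DNS: hostname -> address (documentation-range address).
DNS = {"objects.example.org": "192.0.2.10"}

# Stand-in for the per-host object databases, keyed by host address.
HOST_OBJECTS = {
    "192.0.2.10": {"oid-0001": "/articles/kahn.html"},
}


def resolve(hostname, oid):
    """Step 1: DNS finds the host. Step 2: the host maps the OID to a file."""
    address = DNS[hostname]
    path = HOST_OBJECTS[address][oid]
    return address, path


def relocate(hostname, oid, new_path):
    """Moving a file only touches that host's own table, not any global one."""
    HOST_OBJECTS[DNS[hostname]][oid] = new_path


print(resolve("objects.example.org", "oid-0001"))
relocate("objects.example.org", "oid-0001", "/archive/2000/kahn.html")
print(resolve("objects.example.org", "oid-0001"))
```

In this arrangement the global system stores one entry per host rather than one per page, which is the size concern raised above; the trade-off is that an object cannot move to a different host name without its identifier changing.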

    --

    It may look like I'm doing nothing, but I'm actively waiting for my problems to go away.
    --Scott Adams
  20. freenet seems to be similar by jilles · · Score: 2

    Freenet stores files under a unique name in a distributed filesystem (i.e. freenet). All you need to retrieve a file is its name. It appears to me that this is Kahn's idea taken to the extreme. Freenet takes care of storing and retrieving objects with a unique identifier. The system could easily be extended with databases coupling relevant keywords to the identifier. Also it is safe, freenet is explicitly designed to hide the location of the files. Even the owner can't touch it after it has been put into freenet.

    --

    Jilles
  21. The Net Object ID That Wasn't by Baldrson · · Score: 3
    In 1982, Apple, Atari, Packet Cable, AT&T, Knight-Ridder News and Xerox PARC were all part of a group I was putting together to push for a standard object identifier for network communications. It was going to be 64 bits with 2 pieces:

    A system serial number with bits reversed, and packed against the top of the 64 bit word.
    An object creation counter for that system serial number -- under localized control/increment.

    I had to continually fight off people who wanted to subdivide the 64 bits into fields, the way IP was. The primary discipline I wanted people to follow was to keep routing information out of the object identifier so that object locations could be changed dynamically. It was amazing how many times I had to explain this to people who should have known better.
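
One plausible reading of the two-piece layout described above, sketched in Python: the system serial number is bit-reversed so it grows downward from the top of the 64-bit word, while the creation counter grows upward from the bottom, so neither needs a fixed field width. The post does not give the exact encoding, so the function names and the overlap check are assumptions made for illustration.

```python
WORD_BITS = 64


def reverse_bits(value, width=WORD_BITS):
    """Mirror the low `width` bits of value (bit 0 becomes bit width-1)."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def make_object_id(system_serial, creation_counter):
    """Pack a bit-reversed serial at the top and a counter at the bottom."""
    if system_serial < 0 or creation_counter < 0:
        raise ValueError("serial and counter must be non-negative")
    # The two parts share one word with no fixed boundary, so they must
    # not need more than 64 bits between them.
    if system_serial.bit_length() + creation_counter.bit_length() > WORD_BITS:
        raise ValueError("serial and counter would overlap")
    return reverse_bits(system_serial) | creation_counter


print(hex(make_object_id(3, 5)))
```

Nothing in the resulting word says where the object lives, which matches the stated goal of keeping routing information out of the identifier.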

    Unfortunately, I didn't explain it to the right people at DARPA, although I did have a couple of meetings with David P. Reed about it when he was still at MIT's LCS.

    I touch on some of this history in a couple of documents, one written recently and one written at the time.

    Until I read the article about Kahn, I didn't realize that DARPA chose the IP nonsense at almost exactly the time that the AT&T/Knight-Ridder project that was funding me made a bad choice of vendors that resulted in my resigning from that particular high-profile effort and trying to strike out on my own turning 8MHz PC's into multiuser network servers (which I actually succeeded in doing after a lot of blood letting, but that's another story).

  22. DOI's and alternatives to them by apsmith · · Score: 3

    Since I've been involved in this discussion for some time I thought I'd recycle some of my old comments :-)

    Date: Thu, 20 May 1999 16:46:26 -0400 (EDT)
    From: "Arthur P. Smith"
    To: discuss-doi@doi.org
    Subject: Re: [Discuss-DOI] DOI: Current Status and Outlook
    On Wed, 19 May 1999, Norman Paskin wrote:

    > A paper which provides a summary of the current thinking on DOI has
    > just been published in D-Lib magazine at
    > http://www.dlib.org/dlib/may99/05paskin.html

    This does answer a lot of questions we had, mostly in what seems
    to be the right direction. The relationship with INDECS on metadata
    issues looks like a particularly good resolution ("functional granularity"
    is essentially what I was looking for in one of my earlier
    questions). It looks like a specific metadata "Genre" needs to be
    worked out in detail for journal articles (re reference linking) - and
    it's not clear who has responsibility for this (the IDF or someone else?)
    but at least at the level specified in this article it looks workable.

    But to some extent the paper shows the DOI is a solution in search
    of a "killer application" (mentioned several times in the article).
    There's a chicken-and-egg problem here: the potential applications seem
    to require widespread adoption before they become useful.
    As one of the final bullets says: "Internet solutions are unlikely to
    succeed unless they are globally applicable and show convincing power
    over alternatives" - does the DOI as described show convincing power
    over the alternatives?

    It's sometimes hard to know what counts as an alternative, but the
    following systems (some listed in the article) could be
    alternatives for at least some of the things the DOI does:

    1. the handle system itself
    2. uniform resource names
    3. IETF's DNS-based Naming Authority Pointer
    4. Persistent URL's (PURL's)
    5. rule-based reference linking (link managers, Urania, S-Link-S)
    6. a global LDAP/directory service

    Alternatives 1-4 provide a variety of routes for creating a unique
    digital identifier for something - we really don't NEED the DOI just
    to have digital identifiers, though DOI does provide a handy rallying
    point for those of us providing intellectual property in digital form.

    Alternative 2 is the highest level of digital identifier, but perhaps
    that is all we really need? There is room for many "naming authorities" -
    perhaps even each publisher could be their own naming authority. That
    would depend on widespread adoption of (3) which may or may not happen,
    and resolution of general registration processes too.
    As the article mentions, general implementation of URN's is quite
    limited even after almost a decade of work. Is there a reason why
    nobody has found it particularly useful yet?

    Alternative 1 is, to some extent, a non-issue (a DOI is, after all,
    just a handle) and is also, to some extent, the same issue. Any
    publisher could, with or without DOI, register as a handle naming
    authority and create handles for its digital objects. Is some of
    the DOI work duplicating what has already been done (or should have
    been done) for the handle system itself? As the handle system web
    pages mention (http://www.handle.net/) it is at least receiving some
    use as a digital identifier of intellectual property by NCSTRL,
    the Library of Congress, DTIC, NLM, etc. Does the DOI provide
    convincing power over using the handle system directly?

    Alternative 4 (PURL's) is critiqued at length in the article,
    particularly on the issue of resolution (section 3). Perhaps I
    don't understand properly, but I don't quite agree with some of
    the arguments against PURLs. Any digital identifier can be used to
    offer great flexibility in resolution - a local proxy can redirect to a local
    cache or resource, for example, for ANY of the unique identifiers
    under question. Once resolved, the "document" resolved to can
    itself contain multiple alternative resolutions. And a handle is only
    going to have multiple resolutions if the publisher puts it there
    (who else has the authority to insert the data?). So I think the
    single vs. multiple redirection issue is a red herring. I do agree it's
    nice to have a more direct protocol (though from looking at the details
    of the way handles are supposed to resolve there is a lot of
    back-and-forth there too). As far as being a URN or not, there's
    no reason why PURLs couldn't be treated as legitimate digital identifiers,
    even if they are simply URL's at the moment. On "scalability" - the
    current handle implementation doesn't seem particularly scalable
    either. Only 4 million handles per server? Only 4 global servers
    (with 4 backups that seem to point to the very same machines on
    different ports)? And those servers seem to all be in the D.C. area...

    Not that I think PURLs are wonderful, but does the DOI provide
    convincing power over using PURLs, as far as identification and
    resolution goes?

    Which is presumably why we've been told DOI's have to do
    more than just identification and resolution. Hence metadata, to
    provide standard information to allow "look-up", multiple-resolution,
    and digital commerce applications. This actually makes a lot of
    sense. And the other id/resolution alternatives do not
    seem to meet the INDECS criteria as well as the DOI can.

    But what does this have to do with reference linking, the
    first "killer application" mentioned? The look-ups required there
    are almost certainly going to be more easily performed with
    specialized databases (A&I services) or direct rule-based
    linking (alternative 5) and in fact this is already
    being done, generally without the use of DOI's. The DOI does not seem to
    make the linking process easier, so there's no "convincing power"
    here it would seem.

    I added alternative 6 (global directory service) as a wild-card -
    this seems to be a major focus of "network operating system" vendors -
    Novell's NDS, Oracle's OID, Microsoft's Active Directory - these seem
    to be systems intended to hold information on hundreds of
    millions of "objects" available on a network - an example being the
    personal information of a subscriber to an internet service provider.
    But another potential application of these is to identify and provide
    data on objects available on the net - intellectual property or other
    things available for commerce. Is this something the DOI could
    fit into, or is it something that could sweep URN's, handles, DOI and
    all the rest away? I really don't know, but it seems like
    something to watch closely over the next year or so.

    --

    Energy: time to change the picture.

    1. Re:DOI's and alternatives to them by apsmith · · Score: 2

      Another old comment here:

      Date: Mon, 24 May 1999 13:21:35 -0400 (EDT)
      From: "Arthur P. Smith"
      To: discuss-doi@doi.org
      Subject: Re: [Discuss-DOI] DOI: Current Status and Outlook

      On Sun, 23 May 1999, Larry Lannom wrote:

      > [ ...] I agree with
      > Stu's comments on policy development being key. In talks about the
      > handle system I usually describe DOI and other handle uses as policy
      > laid on top of infrastructure.

      I found myself agreeing with Stu's comments on this too. But policies
      and practices won't be adopted unless they are either evolutionary,
      based on existing well-tested standards, or truly revolutionary,
      allowing some wonderful new thing to be accomplished that can't
      be done any other way. As I was trying to convey earlier, we have a lot of
      choices for both the technology and the content of unique identifiers,
      including long-lived ones, and it doesn't look like DOI's or even handles
      meet the revolutionary criteria. There are also more application-specific
      alternatives to the DOI (such as SICI) that I didn't include earlier, many
      of which have also not received much use despite their ease of creation.
      If we're talking about identification for the purposes of intellectual
      property, shouldn't the Copyright Clearance Center and the other
      Reproduction Rights Organizations be at the center of
      determining such standards? Don't they already have unique identifiers
      that they use (there is some CCC number at the foot of every page
      we publish now)?

      > [...] there
      > are hard technical issues around ease of use, both from an end user as
      > well as an administrative point of view. Especially from an
      > administrative side, there is a 'good intentions' factor that I believe
      > has been here since we all started talking about this stuff almost ten
      > years ago now. The net makes it easy to distribute information in an ad
      > hoc fashion. It also makes it easy to lose things.

      Things get "lost" either through neglect, deliberate removal, or
      relocation (though I would call that "misplaced" rather than "lost").
      DOI is unlikely to help either of the first two situations.
      If there is no economic incentive for anybody to
      support the preservation of some piece of digital information, there
      will certainly be no incentive to keep the DOI pointer up to date
      for it. And if the owner of a piece of information wants to remove
      it, how could a DOI stop them?

      Where the DOI would help is if a piece of information is relocated -
      but so would any other unique identifier coupled with a location
      system (PURL in general, and S-Link-S, Urania, PubMed, etc. specifically
      for scholarly articles already exist - A&I services are also doing a lot
      in this area). The more such systems pop up
      and gain "market share" in different applications, the stronger the
      incentive for the publisher never to change the location of anything
      ever again because of the work required to keep them all up to date.

      Administrative ease is basically a factor of how much work is required
      to register each new published item, plus how much work is required
      to change all the location information when things are relocated.
      One can even write an equation for this:

      Burden/year = B * New items/year + R * (total items) * relocations/year

      where B is the "burden" associated with inserting a new item,
      and R is the "burden" associated with updating an existing item.
      Even if much of this is handled with automated systems that make
      the initial per-item burdens tiny, there is still a need for quality
      control, assurance of the interoperability of systems (for example,
      what is the standard for representation of author names containing
      special characters? mathematics in titles? etc) and programming
      work whose complexity is at least proportional to the per-item
      information and translations required. DOI without metadata
      had the advantage that the per-item information required
      was minimal. With metadata it's not clear which would have
      lowest burdens, though the unfamiliarity and lack of applications
      for the handle system could be a disadvantage to DOI here (increasing
      the required programming effort).

      Except that this formula does not apply to S-Link-S, and in
      some cases PURLs. S-Link-S uses rules to locate ALL the articles
      for a particular scholarly journal, not on an article by article basis.
      PURLs can handle relocation of a large number of URLs with a single
      change - but the "suffix" URLs must be unchanged for this to work,
      which is not true of many publisher relocations. In those cases
      where it is true, and especially for S-Link-S, the burden becomes:

      Rule-based Burden/year = B' * New journals/year +
      R' * (total journals) * relocations/year

      where B' and R' are probably larger than B and R, but comparable
      at least for smaller publishers that don't have enough items
      to justify a lot of programming work. Once a journal has 10 or so
      items to publish, rule-based locating is the easiest approach, and
      for larger publishers the zero per-item burden would always be
      an advantage.
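
      To make the comparison concrete, here is a minimal Python sketch of
      the two burden formulas above. The function names and every number
      in it (B, R, B', R', the item and journal counts, the relocation
      rate) are made up for illustration and are not measurements from
      any publisher.

```python
def per_item_burden(b, r, new_items, total_items, relocations_per_year):
    """Burden/year = B * new items/year + R * total items * relocations/year."""
    return b * new_items + r * total_items * relocations_per_year


def rule_based_burden(b_rule, r_rule, new_journals, total_journals,
                      relocations_per_year):
    """Rule-based burden/year = B' * new journals/year
    + R' * total journals * relocations/year."""
    return b_rule * new_journals + r_rule * total_journals * relocations_per_year


# Hypothetical publisher: 20000 items spread over 5 journals, 1000 new items
# and 1 new journal per year, everything relocated once every four years.
# B' and R' are assumed to be ten times larger than B and R.
per_item = per_item_burden(b=1.0, r=1.0, new_items=1000,
                           total_items=20000, relocations_per_year=0.25)
rule_based = rule_based_burden(b_rule=10.0, r_rule=10.0, new_journals=1,
                               total_journals=5, relocations_per_year=0.25)

print("per-item burden/year:  ", per_item)
print("rule-based burden/year:", rule_based)
```

      Even with a per-rule cost ten times the per-item cost, the rule-based
      total in this made-up case stays far below the per-item total,
      because it scales with the number of journals rather than the number
      of articles.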

      Now rule-based locating systems are not global unique digital identifiers -
      but they keep the administrative burden very low, and so are by
      far the most likely candidates to solve the "lost" information problem
      as far as it can be solved.

      > [...]
      > Re. Arthur Smith's wondering about handle system scalability and the
      > number of current servers: the global system currently consists of four
      > servers - two on each of the US coasts. The primary use of the global
      > service currently is to point to other services, e.g., the DOI service,
      > for clients who don't know where to start. Most handle clients, e.g.,
      > the http proxy, do know where to start most of the time since they cache
      > this information, so in fact the global service is not much stressed and
      > four servers are plenty at the moment.

      Thanks for the clarification - however if we're proposing to put direct
      HDL or DOI clients in every web browser, that burden is going to
      go way up, unless we get cracking on installing local handle
      resolvers in the same way we have local DNS resolvers all
      over the place. And then who's going to administer them and ensure
      that every client is configured to point to the local servers rather
      than the global ones? We at least have an established system for DNS,
      that when new machines are configured with an IP address they are
      also assigned a local DNS resolver, with several backups. Are we
      proposing to add another "local HDL resolver" to the setup
      procedure of every machine on the net?

      The http proxy of course is even less scalable, since it's a single
      machine somewhere (admittedly http servers can be scaled pretty
      large, but this really doesn't solve the problem).

      And as far as I could tell, the handle system doesn't seem to have
      the same redundancy built in that DNS has. Perhaps I misunderstood,
      but the four global handle servers seem not to contain duplicate
      information - rather they each are responsible for a different group
      of handles based on the MD5 hash. The redundancy is really just
      a single secondary server, which also as far as I could tell right
      now resides on the same physical machine (at least the same IP
      address) for all four existing global servers.
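
      As an illustration of the kind of partitioning I mean, here is a
      minimal Python sketch that assigns each handle to one of four
      servers from its MD5 hash. The modulo mapping, the function name and
      the example handles are all assumptions on my part; the real handle
      system may derive the server from the hash differently.

```python
import hashlib
from collections import Counter

NUM_SERVERS = 4  # the four global servers mentioned above


def server_for_handle(handle: str, num_servers: int = NUM_SERVERS) -> int:
    """Map a handle to a server index using its MD5 hash (assumed scheme)."""
    digest = hashlib.md5(handle.encode("utf-8")).digest()
    # Treat the 16-byte digest as one big integer and reduce it modulo
    # the number of servers.
    return int.from_bytes(digest, "big") % num_servers


# Hypothetical handles, for illustration only.
handles = [f"10.9999/example-{i}" for i in range(1000)]
load = Counter(server_for_handle(h) for h in handles)

for server in range(NUM_SERVERS):
    print(f"server {server}: {load[server]} handles")
```

      With a scheme like this each server holds a disjoint slice of the
      handles, so losing one server together with its secondary makes that
      whole slice unresolvable, which is different from having four full
      replicas.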

      And remember the DOI/HDL system needs to be able to handle
      hundreds of millions or billions of digital objects - that is
      one or two orders of magnitude beyond what DNS has to deal with now.

      > [...]
      > The four million handles per server is a specific implementation
      > limit that will go away later this year, to be replaced by some
      > extremely large number that escapes me at the moment.

      Well that's good. I'm guessing a 2GB or 4GB file size limit was
      the problem? The DOI has several hundred thousand items with
      handles - how many do the global handle servers contain right now
      for DOI and other uses?

      Arthur (apsmith@aps.org)

      --

      Energy: time to change the picture.

    2. Re:DOI's and alternatives to them by apsmith · · Score: 2

      And another old reference on this:

      Date: Tue, 1 Jun 1999 14:02:29 -0400 (EDT)
      From: "Arthur P. Smith"
      To: discuss-doi@doi.org
      Subject: December '98 JEP article?

      See: http://www.press.umich.edu/jep/04-02/davidson.html

      An article by L. Davidson and K. Douglas in the December 1998 issue of
      the Journal of Electronic Publishing raised in a different sense many
      of the issues I recently expressed some concern on with the DOI, as
      well as other issues I haven't seen discussed here at all. Was there
      ever a discussion here of the points in the Davidson and Douglas paper?
      The authors indicate a feeling of encouragement that these problems
      will be resolved, but has much changed in the six months
      since their paper appeared? I'm enclosing their "summary of selected
      concerns" below. Point 2 was the one that I particularly was concerned
      with in the most recent exchange.

      Arthur Smith (apsmith@aps.org)

      ----------------------------
      Summary of Selected Concerns
      The importance of the work being done on the design of the DOI
      System, and its consequences with respect to digital identifiers in
      general, would be difficult to overrate. Solving the problems of
      identifying specific objects on the Internet is extremely important,
      and the work being done on the DOI System will help with that
      solution. Still, there are a number of current issues concerning this
      system that have no easy solutions and particularly concern us:

      1. At present, only established commercial and society
      publishers are purchasing publisher prefixes and so are
      allowed to issue DOIs. This means that most individual or
      non-traditional publishers are not participating directly in
      the DOI System, but are merely acting as end users. Since
      the biggest problems with URL stability and the lack of
      persistence of Internet objects lie outside the products
      provided through large publishers, it is unclear how the DOI
      System is going to have any generally beneficial effect on the
      solution of the Internet's problems.

      2. Those who participate in the DOI System will need to
      include in their operating costs the overhead of detailed
      housekeeping of the DOIs and each item's associated
      metadata, upon which many of the DOI's more advanced
      functions will depend. In addition, there are the fees that the
      Foundation will need to levy to support the maintenance of
      the resolver-databases server for the continued tracking of
      traded, retired, erased, or simply forgotten and abandoned
      identifiers. Even with computerized aids, the cost to
      publishers of maintaining the robust and persistent matrix of
      numbers and descriptive text that a handle-based system
      requires will be considerable. Under the current model, the
      annual fees exacted by the Foundation from its participating
      publishers must cover operating expenses. Since no one yet
      knows how high these fees might be, we are concerned that
      costs for smaller publishers and not-for-profit participants
      might be so prohibitive that they will be largely excluded.

      3. At up to 128 characters, DOIs are simply too long to be
      practical outside of the digital universe. The Publisher Item
      Identifier (PII), for example, at seventeen characters, is a
      much more reasonable length and probably is still long
      enough to identify every item we will ever need to identify.
      Indeed, Norman Paskin estimates that only 10^11 digital
      objects will ever require identification.[33] Since it is
      unlikely that we will never need to copy DOIs manually
      from print into electronic format, and since both their length
      and limited affordance (mnemonic content) will make it
      difficult to transfer them accurately by any manual means,
      this could turn out to be a nuisance factor that will hinder
      their widespread acceptance. Long identifiers are also
      harder to code into watermarks, especially in text objects
      that lack background noise in which to hide such data.

      4. DOIs will probably not lead to more open access to online
      materials, at least to those commercially published. In fact,
      most DOI queries from most users, except for those that can
      demonstrate access rights, will probably lead to invoice
      forms of one sort or another rather than directly to the
      primarily requested object. This aspect of the DOI System
      could make the Internet even more frustrating for the
      majority of the users than it is now.
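
      A rough check of the length argument in point 3: the sketch below
      computes the shortest identifier that can label a given number of
      objects for a given alphabet size. The function name and the
      alphabet sizes (10 for digits only, 36 for digits plus
      case-insensitive letters) are assumptions chosen for illustration,
      not part of the DOI or PII specifications.

```python
def min_identifier_length(num_objects: int, alphabet_size: int) -> int:
    """Smallest length L such that alphabet_size ** L >= num_objects."""
    length = 0
    capacity = 1  # number of distinct identifiers of the current length
    while capacity < num_objects:
        capacity *= alphabet_size
        length += 1
    return length


OBJECTS = 10 ** 11  # the estimate attributed to Norman Paskin above

digits_only = min_identifier_length(OBJECTS, 10)
alphanumeric = min_identifier_length(OBJECTS, 36)

print("digits only: ", digits_only, "characters")
print("alphanumeric:", alphanumeric, "characters")
```

      Under those assumptions, 11 digits or 8 alphanumeric characters
      already cover 10^11 objects, so both a 17-character and a
      128-character identifier have capacity to spare; the extra length
      buys structure, not room.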

      --

      Energy: time to change the picture.

  23. Good idea but scrap the IP protection crap.. by MikeFM · · Score: 2

    People won't take the IP protection crap and once it leaves the protected lands for the real world they'll rip it to bits.

    The idea of content objects with unique IDs isn't at all new, but it is a good one. I always liked the idea of using encryption signatures as the keys. Give each object a signature for itself and one for its owner, build a simple search engine mechanism into the Net itself, and you have a nice lil system. An important note might be that such a system does not need to, and possibly should not, replace TCP/IP or even rely on TCP/IP as its only supported carrier. It should be as agnostic about transports as possible for the most flexibility.

    Jabber might be a good start for this layer since it is a very flexible system for transporting XML-ized content and contact-type information. I really expect something like this to assimilate the web in a couple years. Maybe Jabber merged w/ FreeNet.

    Someone who doesn't know the resource they want could search for it by known facts just as they do now at Yahoo, Google, etc. Once they find it, they could store the object's unique ID, and then every time they needed that object again they could ask the net for it and the closest copy found would be returned.

    --
    At what price learning? At what cost wisdom? The price is a man's peace of mind, and the cost is his life.
  24. RFCs? Who are you fooling? by Andrew+Dvorak · · Score: 2

    The general consensus is that RFCs have been obsoleted by patents. The only difference is that with a patent, comments must be officially sanctioned by the holder. Woohoo! fuzzylogic!

  25. ICANN part 2 by PureFiction · · Score: 3

    Anyone else notice the part about the central Object Id database? Just think how much grief ICANN has caused with the DNS root. I love to think what the Object Id Root owner will be like. This is a lame duck. Companies will probably love it (in theory); however, it can't function in the real world unless almost everyone adopts it. And god knows that won't happen.