Slashdot Mirror


Open Meta Tools Make It Big

Morgahastu writes "Byte.com has a great article about open meta tools and open software in general: 'After more than 10 years of open-software development in the scientific community, open software now holds a preeminent place in the operation of the computing community. The three products I have written about simply scratch the surface of the powerful tools available. OpenLDAP and OAI both enable a wide variety of sharing and automated access.'"

1 of 68 comments

  1. Article text in case of /. effect by Rock 'N' Troll · Score: -1, Redundant

    I was going to post this anonymously, but I really need the karma;

    Open Meta Tools

    By Bill Nicholls

    February 25, 2002

    I've been writing about metaclass systems for some time now; see Meta Clusters in the Wild, Meta Clusters; OS Updates, and Digital Libraries in the Large.

    "Meta" is often used as a prefix to other words, such as metadata, which means higher level data, or data about data. Analogously, metatools are tools that control tools. This hierarchy may be extended beyond two levels. Meta Tools (MT) are tools that enable use and control of resources. This implies that an MT is logically a higher level function than the resources themselves.

    In this column, I use the term Meta Tools to define a class of tools that enable the use of complex distributed resources. These individual resources may themselves be complex entities that use a different set of tools to control their own components.

    I'll start with an open implementation of the Lightweight Directory Access Protocol (OpenLDAP). This provides an interface to a database of information about people and resources, defining permissions and access from one to the other. Its use is not limited solely to that application.

    Next will be the Open Archive Initiative (OAI), which defines an interface for harvesting metadata from an archive, usually containing published scientific papers. It too is not limited to that one specific application.

    Finally I'll explore some of the capabilities of the Globus Toolkit (GT), which is a set of tools that allows a collection of distributed computer systems to be operated as a single resource. Each of these systems may span multiple different computers and storage in a local network.

    OpenLDAP
    LDAP is a lightweight open protocol for accessing information resources. It is an alternative to the X.500 Directory Access Protocol (DAP) for use on the Internet. It runs over TCP/IP rather than the more complex OSI protocol stack.

    OpenLDAP treats a directory as a database rather than a rigidly defined set of names and pointers to files. The information in this database is usually related to people and their permissions to use resources defined there. However, this is not the limit of its application.

    Staying within the broad definition of resource directory, resources for computers as well as people can be defined. This could also be used as a powerful library card file replacement, where books and other types of information sources could be defined, and access enabled through the powerful search functions.
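
    To make the card-file idea concrete, here is a minimal Python sketch of a directory treated as a searchable database. It is not OpenLDAP code and uses no LDAP library; the Directory class, the entry names, and the example.org suffix are all invented for illustration.

```python
class Directory:
    """Toy in-memory directory: entries keyed by distinguished name (DN)."""

    def __init__(self):
        self._entries = {}  # lower-cased DN -> (original DN, attribute dict)

    def add(self, dn, **attrs):
        # Each attribute holds a list of values, since directory attributes may repeat.
        self._entries[dn.lower()] = (dn, {k: list(v) for k, v in attrs.items()})

    def search(self, base, attr, substring):
        """Return DNs at or under `base` whose `attr` contains `substring`."""
        base = base.lower()
        needle = substring.lower()
        hits = []
        for key, (dn, attrs) in self._entries.items():
            in_subtree = key == base or key.endswith("," + base)
            if in_subtree and any(needle in v.lower() for v in attrs.get(attr, [])):
                hits.append(dn)
        return hits


# Hypothetical entries: one person and two library holdings.
directory = Directory()
directory.add("cn=Ada Reader,ou=people,dc=example,dc=org", cn=["Ada Reader"])
directory.add("cn=book-001,ou=books,dc=example,dc=org",
              title=["Computational Grids"], subject=["metacomputing"])
directory.add("cn=book-002,ou=books,dc=example,dc=org",
              title=["Directory Services"], subject=["directories"])

grid_books = directory.search("ou=books,dc=example,dc=org", "title", "grid")
```

    The same search call serves both people and books because the directory only knows about names and attributes; what the entries describe is up to whoever fills it in.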

    OpenLDAP is designed to be used in a global environment. It has replication capabilities to distribute information that was entered locally, and can access information from other LDAP servers. A combination of OpenLDAP and UDDI could be used to create a directory of computer services that can be discovered and used by computer agents, enabling completely automated operation.
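
    One way to picture replication is as a change log that is replayed on other servers. The Python sketch below illustrates only that general idea; it is an assumption for illustration, not a description of how OpenLDAP's replicator actually works, and the Master and Replica classes and the example entry are hypothetical.

```python
class Master:
    """Accepts local writes and records each one in an ordered change log."""

    def __init__(self):
        self.data = {}
        self.log = []  # (dn, attrs) pairs in the order they were written

    def write(self, dn, attrs):
        self.data[dn] = dict(attrs)
        self.log.append((dn, dict(attrs)))


class Replica:
    """Holds a copy of the master's data and remembers how far it has synced."""

    def __init__(self):
        self.data = {}
        self.applied = 0  # number of log records already applied

    def sync(self, master):
        # Replay only the records this replica has not seen yet.
        pending = master.log[self.applied:]
        for dn, attrs in pending:
            self.data[dn] = dict(attrs)
        self.applied += len(pending)
        return len(pending)


master = Master()
replica = Replica()
master.write("cn=printer-1,ou=devices,dc=example,dc=org", {"location": "Room 12"})
master.write("cn=printer-1,ou=devices,dc=example,dc=org", {"location": "Room 14"})
first = replica.sync(master)   # replays both writes
second = replica.sync(master)  # nothing new to replay
```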

    OpenLDAP can be found at http://www.openldap.org/. The current release of OpenLDAP is 2.0.22. Downloads of the LDAP server, replicator, and libraries, including Java class libraries contributed by Novell, are available at that site.

    Version 3 of LDAP is a "Proposed Standard" and is documented by the following RFCs:

    RFC2251: Lightweight Directory Access Protocol (v3)
    RFC2252: LDAPv3: Attribute Syntax Definitions
    RFC2253: LDAPv3: UTF-8 String Representation of Distinguished Names
    RFC2254: The String Representation of LDAP Search Filters
    RFC2255: The LDAP URL Format
    RFC2256: A Summary of the X.500(96) User Schema for use with LDAPv3
    RFC2829: Authentication Methods for LDAP
    RFC2830: LDAPv3: Extension for Transport Layer Security

    Copies of these RFCs are available at http://www.rfc-editor.org/. Version 3 of the standard is currently being revised for publication as a "Draft Standard," and will replace the current V2.
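
    Two of these RFCs are easy to try out by hand. The Python sketch below builds a search filter string in the style of RFC 2254 and an LDAP URL in the style of RFC 2255, using only the standard library. The helper names and the host ldap.example.org are hypothetical, and the sketch covers only equality and AND filters, not the full grammar.

```python
from urllib.parse import quote

# Characters that must be escaped inside a filter value (RFC 2254).
_FILTER_ESCAPES = {"\\": r"\5c", "*": r"\2a", "(": r"\28", ")": r"\29", "\0": r"\00"}


def escape_filter_value(value):
    """Replace special characters with a backslash and two hex digits."""
    return "".join(_FILTER_ESCAPES.get(ch, ch) for ch in value)


def equality_filter(attribute, value):
    return "({}={})".format(attribute, escape_filter_value(value))


def and_filter(*filters):
    """Combine already-formatted filters so that all of them must match."""
    return "(&" + "".join(filters) + ")"


def ldap_url(host, base_dn, attributes, scope, search_filter):
    # RFC 2255 layout: ldap://host/dn?attributes?scope?filter
    return "ldap://{}/{}?{}?{}?{}".format(
        host,
        quote(base_dn, safe="=,"),
        ",".join(attributes),
        scope,
        quote(search_filter, safe="=()&|!*"),
    )


tricky = equality_filter("cn", "Smith (Bill)*")
both = and_filter(equality_filter("objectClass", "person"), equality_filter("sn", "Nicholls"))
url = ldap_url("ldap.example.org", "dc=example,dc=org", ["cn", "mail"], "sub",
               equality_filter("cn", "Bill Nicholls"))
```

    Escaping matters because an unescaped asterisk or parenthesis in user-supplied text would change the meaning of the filter rather than being matched literally.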

    In essence, OpenLDAP is a powerful enabler of services, starting with its own service of search and replication. It could replace the function of DNS (Domain Name System), though the overhead of a general-purpose directory would be larger than that of a special-purpose DNS server. The big difference is that OpenLDAP is not limited to one function, and can be easily extended without forcing a complete software upgrade.

    Its uses are bounded only by our imagination.

    Metadata Is Not Enough
    The Open Archives Initiative is an organization that is developing tools to enable metadata harvesting with a standard interface. This means that search and directory tools can collect metadata from any OAI-compliant system for access by people or computers.
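
    As a rough illustration of what harvesting through a standard interface can look like, the Python sketch below builds a request URL and pulls Dublin Core titles out of an XML response. The archive address is hypothetical, the sample response is invented and heavily simplified, and the verb and metadataPrefix names follow the OAI protocol as I understand it, so check them against the specification version your archive supports.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core element namespace


def harvest_url(base_url, verb, **arguments):
    """Build a harvesting request as a plain HTTP GET URL."""
    query = {"verb": verb}
    query.update(arguments)
    return base_url + "?" + urlencode(query)


def titles(xml_text):
    """Collect every Dublin Core title found anywhere in a response."""
    root = ET.fromstring(xml_text)
    return [element.text for element in root.iter("{%s}title" % DC_NS)]


# Hypothetical archive address.
url = harvest_url("http://archive.example.org/oai", "ListRecords", metadataPrefix="oai_dc")

# Invented stand-in for a response; a real one carries much more structure.
SAMPLE = """<response xmlns:dc="http://purl.org/dc/elements/1.1/">
  <record><dc:title>A Sample Paper</dc:title></record>
  <record><dc:title>Another Sample Paper</dc:title></record>
</response>"""

found = titles(SAMPLE)
```

    Because every compliant archive answers the same kind of request, the same two functions could be pointed at any of them; only the base address changes.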

    The initial release of OAI 1.0 was in early 2001. It was specifically planned as a prototype; the project developers wanted to gain experience for a year before beginning work on an enhanced version. Work on V2 has just been organized, but OAI has already achieved significant use and is a part of a number of tools that are freely available.

    One of the original motivations for OAI was to solve a problem with scientific papers archived on the Web. The problem affected both the users of an archive and the people supporting it: each archive group had to develop its own tools and metadata, and each user had to get or build tools to access that metadata.

    Once there were more than a few archives, this would become so expensive as to prevent broad use of these new sources of information. People involved in the development of digital libraries discovered this early on and started the development of OAI. Here are some links to selected OAI web sites from members of the OAI community:

    Cyclades Project -- an Open Collaborative Virtual Archive Environment funded by the European Union.
    Kepler's Home Page -- self-contained software that allows the user to create and maintain a small OAI-compliant archive.
    Open Archive Forum -- an EC-funded "accompanying measures" initiative, supporting projects interested in using an open archive approach to interoperability.
    Virginia Tech Digital Library Research Lab Projects.
    Open Language Archives Community -- a cross-archive searching service for the language community.

    OAI is now used by many organizations, such as arXiv, which uses the software developed by eprints. Eprints software is currently used by many organizations -- a link list is on their site. Version two of eprints is now in alpha and is expected to be released shortly.

    Kepler -- A Personal OAI Server
    OAI is designed for large collections of peer-reviewed information, typically scientific papers. OAI can be used by individuals, but that was not its original purpose, and for that task it is unnecessarily complex.

    Instead of requiring users to develop or adapt a full OAI server, a part of the Digital Library Group at Old Dominion University has developed an open-software implementation named Kepler. Kepler is an OAI-compliant package for individuals that has been tested on Windows, Solaris, and Linux. It is written in Java and includes the tested Java runtime.

    One of the major issues that Kepler solves is the problem of unreliable access to the small archives they call "archivelets." An organization that handles OAI professionally can support high levels of uptime. The individuals and small organizations who are likely to use Kepler don't have the resources for sustained uptime. Kepler addresses this through an architecture designed to be robust in the face of unreliable access.

    The installation instructions for Kepler are less than a page long. An article in D-Lib Magazine for April 2001 has a good overview of Kepler. This package should fit the "small community" model for shared access to published information while making part or all of it available more broadly.

    Meta Computer Toolkit
    The Globus Project has developed a toolkit needed to build computational grids (metacomputing). Grids are virtual environments, collections of computers with a common interface. This toolkit enables computer and information resources to be used from any location despite their being owned by organizations in geographically distributed locations.

    The Globus Toolkit has just reached the first public beta of Version 2. Current support includes toolkits for Linux 2.x, IRIX 6.5, and Solaris 2.8. Along with the toolkit, the Globus Project has developed a specification for integrating grids and web services, called the Open Grid Services Architecture (OGSA). This new architecture will be the design goal for Version 3 of the GT.

    A prototype of the OGSA toolkit was demonstrated on Jan 29, 2002 at Argonne National Laboratory during the Globus Toolkit tutorials (see http://www.globus.org/ogsa/deliverables/prototype.html). The demonstration showed the integration of a weather service that was implemented and exposed on XMethods.com, which has an extensive list of remotely accessible services and their type, function, and implementation.

    Grid Computing Research
    If you're interested in research about Globus and the grid capability, the Globus Project has a list of technical papers that cover overviews of the Globus project and system; Globus toolkit components; higher level services and tools built on the Globus toolkit; applications; and other Globus-related research. Currently, over 75 papers are listed in PS and/or PDF formats, and range from The Physiology of the Grid: An Open Grid Services Architecture for Distributed Systems Integration to Enabling Technologies for Web-Based Ubiquitous Supercomputing.

    Overview of the Toolkit
    The new features in the Globus Toolkit 2.0 are centered around three core areas that provide the main functions: the Data Grid, the Metacomputing Directory Service (MDS), and the Globus Resource Allocation Manager (GRAM). The whole system has also been improved by substantial packaging and installation enhancements.

    The Globus Data Grid project is currently developing several core capabilities for data and communications. GridFTP is a high-performance, secure, robust data transfer mechanism; the Globus Replica Catalog is a mechanism for maintaining a catalog of dataset replicas; and Globus Replica Management is a mechanism that ties together the Replica Catalog and GridFTP technologies, allowing applications to create and manage replicas of large datasets.
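
    The division of labor among the three pieces can be sketched in a few lines of Python. This is not the Globus API: the ReplicaCatalog class, the replicate function, the dataset name, and the storage hosts are all hypothetical, and the transfer argument is a stand-in for a data mover such as GridFTP.

```python
class ReplicaCatalog:
    """Maps a logical dataset name to the physical locations that hold a copy."""

    def __init__(self):
        self._locations = {}

    def register(self, logical_name, url):
        self._locations.setdefault(logical_name, []).append(url)

    def lookup(self, logical_name):
        return list(self._locations.get(logical_name, []))


def replicate(catalog, logical_name, destination, transfer):
    """Copy a dataset to `destination`, then record the new replica.

    `transfer(source, destination)` stands in for the data mover.
    """
    sources = catalog.lookup(logical_name)
    if not sources:
        raise KeyError("no known replica of " + logical_name)
    transfer(sources[0], destination)
    # Register only after the copy has completed without raising.
    catalog.register(logical_name, destination)


copies = []  # records the transfers the stand-in mover was asked to make
catalog = ReplicaCatalog()
catalog.register("climate-run-42", "gsiftp://storage-a.example.org/data/run42")
replicate(catalog, "climate-run-42", "gsiftp://storage-b.example.org/data/run42",
          lambda source, destination: copies.append((source, destination)))
```

    Keeping the catalog separate from the mover means an application can ask where a dataset lives without caring how the bytes got there.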

    The Metacomputing Directory Service (MDS) is the information services component of the Globus Toolkit. Major feature enhancements include the following:

    High-performance GIIS and GRIS.
    Revised resource representation model.
    Support for GSI authentication and access control.
    Integrated GRIS/GIIS server.
    Newer versions of OpenLDAP and OpenSSL.
    Support for the new Globus Toolkit 2.0 packaging model.

    The Globus Resource Allocation Manager (GRAM) is the resource management component of the Globus Toolkit. The major feature change in GRAM 1.5 is the addition of new GRAM protocol and API features to support more robust job submission and management capabilities. The API specification is not yet available on the Web.

    In addition to these areas of development, Globus has made installation and configuration substantially easier with a complete repackaging of the products. They have now enabled installation from binary or source, whole systems or selected parts, patches to track third-party software changes, and easier configuration. This enhanced packaging system for the Globus Toolkit provides easier installation for Grid builders, application developers, and end users.

    It's an Open World
    After more than 10 years of open-software development in the scientific community, open software now holds a preeminent place in the operation of the computing community. The three products I have written about simply scratch the surface of the powerful tools available. OpenLDAP and OAI both enable a wide variety of sharing and automated access.

    The Globus Toolkit is a major accomplishment in integration of diverse systems into a virtual global supercomputer. While much has been accomplished, much remains to be done. The way ahead for Globus is pointed to by the Open Grid Services Architecture document.

    As an observer of all this activity in the software space, I continue to be excited about the potential for our country and the world in the near future. We are approaching the capability to model the real world accurately, with benefits for everyone.

    With specific reference to weather, we will be able to model storms, tornados, and hurricanes well enough to understand their structure in detail, and learn ways to steer their paths away from populated areas. For cancer and other illnesses, the development of models of the disease should lead to better understanding and treatment, possibly even cures.

    The key to all of these breakthroughs is understanding. Computer models assist this process by processing data and images, testing theoretical models against reality, searching real data and models for ways to change them, and just by marshalling the intellectual effort needed to develop an accurate model.

    I can't wait to see what develops next!