Domain: dublincore.org
Stories and comments across the archive that link to dublincore.org.
Comments · 27
-
Web server with a script and a cron job
Come up with some classification system that works for you.
- Don't organize by color... my wife did this to me once, and I could never find things again.
- Don't use a library-strength scheme like Dewey, LoC or Cutter... you'll kill yourself later. I promise.
- It's much easier to split things into bookstore classification: by general subject, then by author, then by title... but if something else makes sense for you, then do that instead. After all: this is your library.
- If you want to give ID numbers to each book, don't get too hung up on order: this is just a way to find the book in a database. It only needs to make sense to a computer.
Make yourself a basic SQLite database, maybe hosted on a PHP server or whatever you dig. I like Sinatra. After that, the interface is just a matter of how much pizzazz you want to add, and if you want it to be public.
Once you have items tagged with an ID number, saved in your trusty database, you can play with metadata. For something simple, try Dublin Core. If you want to show your collection to the world, try Omeka.
After that, you're going to need a script with reminders for the people who will inevitably want to read your books. Every time a person borrows a book, make the script use cron to email them every week, to remind them to give it back.
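For what it's worth, the database-plus-reminder idea above could be sketched roughly like this. The table layout, the one-week threshold and the names `open_library` and `overdue_reminders` are all assumptions made for illustration, and the script only prints where a real one would send mail.

```python
import sqlite3
from datetime import date, timedelta

# Hypothetical schema: one table for books, one for loans.
SCHEMA = """
CREATE TABLE IF NOT EXISTS books (
    id      INTEGER PRIMARY KEY,   -- arbitrary ID, only the computer cares
    title   TEXT NOT NULL,
    creator TEXT,                  -- Dublin Core calls the author "creator"
    subject TEXT
);
CREATE TABLE IF NOT EXISTS loans (
    book_id     INTEGER NOT NULL REFERENCES books(id),
    borrower    TEXT NOT NULL,     -- email address of the borrower
    lent_on     TEXT NOT NULL,     -- ISO 8601 date, e.g. 2024-01-01
    returned_on TEXT               -- NULL while the book is still out
);
"""

def open_library(path=":memory:"):
    """Open (or create) the library database and make sure the tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn

def overdue_reminders(conn, today, max_days=7):
    """Return (email, message) pairs for unreturned loans at least max_days old."""
    # ISO dates sort correctly as plain strings, so a text comparison is enough.
    cutoff = (today - timedelta(days=max_days)).isoformat()
    rows = conn.execute(
        "SELECT l.borrower, b.title, l.lent_on FROM loans l "
        "JOIN books b ON b.id = l.book_id "
        "WHERE l.returned_on IS NULL AND l.lent_on <= ? "
        "ORDER BY l.lent_on",
        (cutoff,),
    ).fetchall()
    return [(email, f"Please return '{title}' (borrowed {lent_on}).")
            for email, title, lent_on in rows]

if __name__ == "__main__":
    # A weekly crontab entry for a script like this might look like:
    #   0 9 * * 1  python3 /path/to/remind.py      (hypothetical path)
    conn = open_library()
    conn.execute("INSERT INTO books (id, title, creator) VALUES (1, 'Dune', 'Frank Herbert')")
    conn.execute("INSERT INTO loans (book_id, borrower, lent_on) "
                 "VALUES (1, 'friend@example.org', '2024-01-01')")
    for email, message in overdue_reminders(conn, date(2024, 1, 15)):
        print(email, message)  # a real script would send an email here
```

Keeping the reminder logic in a function that takes `today` as an argument makes it easy to check by hand before handing it to cron.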
-
Re:Why is there such a thing as XML encryption?
XML has all kinds of extras - XML-RPC, for example. A list of XML markup languages at Wikipedia suggests there are waaaaaaay too many. There are even two different competing standards for marking up web pages for search engines (besides the archaic metatags) - Schema, a Google/Microsoft invention, and DublinCore, invented by everyone else and a kitchen sink. Of course, XML isn't the only meta-language these days. RDF is the basis for SPARQL - the W3C's answer to Cold Fusion.
The entire point of HTML was that it was simple. A billion custom standards, many of which require some sort of library or other handler specifically for them, isn't simple. I'm not clear that any of them provide anything that cannot be provided more efficiently, more effectively and in a more distributed/cloud-friendly manner using servers and utilities that have been around longer, been tested more thoroughly and are genuinely "enterprise-ready" (which I'll take to mean Mr Spock wouldn't object to installing it).
-
Re:HTML5 Will Help Change The Web
There's already a lot of metadata available that is ignored by search engines.
Back in 2006 Slashdot covered an analysis of the tag content of 1 billion web pages. Dublin Core tags/attributes were found to be widely used, but at that time were ignored by the search engines.
-
Ah, if anybody could follow Dublin Core...
http://dublincore.org/ is making an effort on document metadata, improving indexing through document headers. To me this is a straight line to follow.
-
RDF promotes interoperability and extensibility
Stephen's argument is based on the belief that "The Semantic Web will never work because it depends on businesses working together, on them cooperating." He says:
"But the big problem is they believed everyone would work together:
- would agree on web standards (hah!)
- would adopt a common vocabulary (you don't say)
- would reliably expose their APIs so anyone could use them (as if)"
While the argument he makes is grounded in his distrust of corporations, which I share to some degree, his second point above is off the mark, at least for RDF.
One of the features of the W3C's model (based on RDF) is that it doesn't push the idea that everyone should adopt the same vocabulary (or ontology) for a topic or domain. Instead it offers a way to publish vocabularies with some semantics, including how terms in one vocabulary relate to terms in another. In addition, the framework makes it trivial to publish data in which you mix vocabularies, making statements about a person, for example, using terms drawn from FOAF, Dublin Core and others.
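As a rough illustration of mixing vocabularies, here is a small sketch that describes a person with FOAF terms and a document with Dublin Core terms in the same set of statements, written out as N-Triples. The two namespace URIs are the published ones; the example.org resources, the names and the helper functions are made up for the example, and a real project would more likely use an RDF library than format strings by hand.

```python
# Published namespace URIs for the two vocabularies.
FOAF = "http://xmlns.com/foaf/0.1/"
DCTERMS = "http://purl.org/dc/terms/"

def uri(value):
    """Format a URI as an N-Triples term."""
    return f"<{value}>"

def literal(value):
    """Format a plain string literal, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'

def to_ntriples(triples):
    """Serialize (subject, predicate, object) terms, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)

# Hypothetical resources, invented for this sketch.
person = uri("http://example.org/people/alice#me")
report = uri("http://example.org/reports/42")

triples = [
    # Statements about the person use FOAF terms...
    (person, uri(FOAF + "name"), literal("Alice Example")),
    (person, uri(FOAF + "mbox"), uri("mailto:alice@example.org")),
    # ...statements about the document use Dublin Core terms,
    # and the two vocabularies meet in the same graph.
    (report, uri(DCTERMS + "title"), literal("Annual Report")),
    (report, uri(DCTERMS + "creator"), person),
]

print(to_ntriples(triples))
```

Nothing here requires the FOAF and Dublin Core authors to have coordinated: each term is just a URI, so statements from both sit side by side.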
The RDF approach was designed with interoperability and extensibility in mind, unlike many other approaches. RDF is showing increasing adoption, showing up in products by Oracle, Adobe and Microsoft, for example.
If this approach doesn't continue to flourish and help realize the envisioned "web of data", and it might not after all, it will have left some key concepts, tested and explored, on the table for the next push. IMHO, the 'semantic web' vision -- a web of data for machines and their users -- is inevitable.
-
Dublin Core ... a cautionary tale
So a while ago, there was an agreed standard for web metadata, the Dublin Core Metadata Initiative, aka ISO Standard 15836-2003. Very few people use it.
-
Uhmm...
The "Semantic Web" is basically the intersection of RDF+OWL, that is to say, it is entirely about taxonomy. The whole idea is that you have a certain nomenclature that you assert against known values, someone else has a different nomenclature that they assert against the same values. You can now cross-reference with a high degree of confidence. For example, using the Dublin Core.
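The cross-referencing idea boils down to equivalence assertions that chain together, so a claim nobody made explicitly can still follow from the ones that were made. A minimal sketch, assuming strict equivalence only (roughly what OWL's equivalence statements express) and using made-up term names; the `Equivalences` class is hypothetical, not part of any RDF or OWL toolkit.

```python
class Equivalences:
    """Track which terms have been asserted equivalent (a small union-find)."""

    def __init__(self):
        self._parent = {}

    def _find(self, term):
        """Return the representative term of the group that `term` belongs to."""
        self._parent.setdefault(term, term)
        root = term
        while self._parent[root] != root:
            root = self._parent[root]
        # Path compression: point every term on the way directly at the root.
        while self._parent[term] != root:
            next_term = self._parent[term]
            self._parent[term] = root
            term = next_term
        return root

    def assert_same(self, a, b):
        """Record an explicit assertion that a and b mean the same thing."""
        self._parent[self._find(a)] = self._find(b)

    def same(self, a, b):
        """True if a and b are equivalent, explicitly or by implication."""
        return self._find(a) == self._find(b)


eq = Equivalences()
eq.assert_same("fr:maison", "en:House")   # asserted by one schema author
eq.assert_same("de:Haus", "fr:maison")    # asserted by another

# Nobody asserted this pair directly; it follows from the two above.
print(eq.same("de:Haus", "en:House"))
# An unrelated term stays unrelated.
print(eq.same("en:House", "en:Boat"))
```

Nothing is inferred unless someone states a link somewhere in the chain, which matches the point that "it" only knows what it has been told.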
I get people all the time dismissing the whole idea because "man, you'd have to agree on definitions" or "how does 'it' know?" Right. "It" doesn't unless it is explicitly told. If what you call a "House" is in a well-known schema, you simply add an equivalency in your schema et voila, une maison est une 'House.' So, someone else comes along and they want to assert that "ein Haus ist ein 'maison'," so they assert against the previous schema, and now implicitly ein Haus=une maison=a House. No one had to make the last assertion as it was implicitly true from the previous assertions. So now, in your schema, you make all sorts of categorical assertions about other things relating to houses. Your French and German counterparts now have them for free, as do you theirs. Yes, it takes work, no it isn't completely automatic, yes it is limited to strict taxonomies, but it is still very, very powerful.
-
maybe Semantic Web is close...
Too bad that the Semantic Web is a pipe dream at the moment.
You can download the Semantic MediaWiki extension right now and add semantics to a wiki. Currently all the links between pages in a MediaWiki have no meaning, and all the facts in each page can only be extracted by humans reading it. With the upgrade a page can state [[is located in::California]] to explain the type of relationship implied by a link, and can express attribute values like [[population:=1,305,736]]. The current version summarizes all such facts in each page and can export them as RDF. It's a simple extension, but once it's implemented in Wikipedia, you could query for, e.g. the population of every major city in California. Doing such semantic queries using Google is basically impossible, you'll just get a list of pages and have to read and filter each one to create your own list.
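To give a feel for how little machinery the annotations need, here is a rough sketch that pulls the two kinds of facts out of page text using only the syntax quoted above (`::` for relations, `:=` for attribute values). The regular expression, the `extract_facts` helper and the sample page are assumptions for illustration and are not how the extension itself is implemented.

```python
import re

# Matches [[name::value]] (relation) and [[name:=value]] (attribute).
# Plain links such as [[Los Angeles]] or [[Category:Cities]] do not match.
ANNOTATION = re.compile(r"\[\[([^\[\]:|]+?)(::|:=)([^\[\]|]+)\]\]")

def extract_facts(page_title, wikitext):
    """Return (page, property, kind, value) tuples found in the wikitext."""
    facts = []
    for name, marker, value in ANNOTATION.findall(wikitext):
        kind = "relation" if marker == "::" else "attribute"
        facts.append((page_title, name.strip(), kind, value.strip()))
    return facts

# Hypothetical page text using the annotation syntax from the comment.
SAMPLE = ("San Diego [[is located in::California]] and has "
          "[[population:=1,305,736]] people. See also [[Los Angeles]].")

facts = extract_facts("San Diego", SAMPLE)
for fact in facts:
    print(fact)

# A toy "semantic query": every page stated to be located in California.
in_california = [page for page, prop, kind, value in facts
                 if prop == "is located in" and value == "California"]
print(in_california)
```

Once facts are tuples like these, the "population of every major city in California" question becomes a filter over data rather than a reading exercise.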
Sharing semantics between datastores would require people agreeing on ontologies, which according to people like Clay Shirky is indeed a pipe dream. I'm not so sure, that's like saying categories in Wikipedia are useless because they're disorganized. Just using the Dublin Core metadata to identify authors of information in a common way would be a big breakthrough, and there are simple enough ways to do it in XHTML that I think it'll pick up steam in the next few years.
-
Think about the Dublin Core
In HTML, you can consider the data in the head to be 'metadata'. See the Dublin Core Metadata Initiative. The data in the head is 'invisible' to a web surfer (save for the title), but quite useful for the upcoming 'Semantic Web' and even for filtering on Google. However, since, statistically speaking, there are more people who lie than people who correctly use this metadata, it doesn't seem that it helps your PageRank with Google to have accurate metadata. In any case, this sort of data will not corrupt the rest of the file, e.g. the 'body' of the HTML.
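As a small illustration of head metadata being invisible to readers but easy for software to pick up, this sketch collects Dublin Core fields from `meta` tags whose names start with `DC.`, which is the usual naming convention for Dublin Core in HTML. The sample page and the `dublin_core_fields` helper are invented for the example.

```python
from html.parser import HTMLParser

class DublinCoreMeta(HTMLParser):
    """Collect <meta name="DC.xxx" content="..."> values from a page."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name") or ""
        content = attributes.get("content")
        if name.lower().startswith("dc.") and content:
            # Drop the "DC." prefix; a field may legitimately repeat.
            self.fields.setdefault(name[3:].lower(), []).append(content)

def dublin_core_fields(html):
    """Return a dict mapping Dublin Core field names to lists of values."""
    parser = DublinCoreMeta()
    parser.feed(html)
    parser.close()
    return parser.fields

# Hypothetical page: the DC fields live in the head, not in the visible body.
PAGE = """<html><head>
<title>Example</title>
<meta name="DC.title" content="An Example Page">
<meta name="DC.creator" content="A. Author">
<meta name="keywords" content="spam, spam, spam">
</head><body><p>Visible text.</p></body></html>"""

print(dublin_core_fields(PAGE))
```

The ordinary `keywords` tag is ignored here on purpose; only the `DC.`-prefixed names are treated as Dublin Core.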
-
Re:Who cares, they both suck.
if I pick an arbitrary feed I see ISO8601 dates. Has the spec changed?
No, most are no longer using pubDate now, an extension (Dublin Core) is used instead (that's why the tag name is "dc:date"). If anybody puts an ISO8601 date into pubDate, he deserves to be shot (yes, I had to write a parser for RSS).
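For anyone who also has to write that parser, here is a rough sketch of reading both date styles: `dc:date` as ISO 8601 and `pubDate` as the RFC 822 style dates that RSS 2.0 uses. The feed snippet and the `item_date` helper are made up for illustration, and real feeds are messier than this handles.

```python
import xml.etree.ElementTree as ET
from datetime import datetime
from email.utils import parsedate_to_datetime

DC_NS = "http://purl.org/dc/elements/1.1/"

def item_date(item):
    """Return the item's date as a datetime, preferring dc:date over pubDate."""
    dc_date = item.findtext(f"{{{DC_NS}}}date")
    if dc_date:
        text = dc_date.strip()
        # Older Python versions do not accept a trailing "Z" in fromisoformat.
        if text.endswith("Z"):
            text = text[:-1] + "+00:00"
        return datetime.fromisoformat(text)
    pub_date = item.findtext("pubDate")
    if pub_date:
        return parsedate_to_datetime(pub_date.strip())
    return None

# Hypothetical feed with one item in each style, both naming the same instant.
FEED = """<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <item><title>Old style</title>
      <pubDate>Sun, 01 Jun 2003 12:00:00 GMT</pubDate></item>
    <item><title>Dublin Core style</title>
      <dc:date>2003-06-01T12:00:00Z</dc:date></item>
  </channel>
</rss>"""

for item in ET.fromstring(FEED).iter("item"):
    print(item.findtext("title"), item_date(item))
```

An ISO 8601 date stuffed into `pubDate` is deliberately not handled; `parsedate_to_datetime` will reject it, which is arguably the right outcome.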
-
Half way there
Marking up Hilton as <motel> or <celebrity> is all very well. This is what XML is for.
One of the key points behind the semantic web is to define meanings to your meta tags. My system has a <partnumber> tag and so does yours, but that doesn't mean they're the same. I can publish my definition of <partnumber> so that other apps can know how to interpret my partnumbers. Complex definitions can be provided in computer-readable format, which can then be looked up, referenced, shared etc. with other systems.
Take Dublin Core, for example. A standard set of tags to describe document attributes, such as title and author. Why should I write my own <author> tag when I can simply pull in part of Dublin Core's vocabulary? Not only does that save me (the developer) time, but it means any app that knows about Dublin Core will know what I mean when I say "author". Or, if an app doesn't know about a particular term it can simply go look it up.
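A sketch of what pulling in part of the vocabulary might look like in practice: the record below borrows `title` and `creator` from the Dublin Core element set (Dublin Core's name for what is called "author" above is `creator`) and keeps a home-grown `partnumber` in its own namespace. The `http://example.org/ns/parts#` namespace and the `describe_part` helper are hypothetical.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"   # published Dublin Core element set
MY_NS = "http://example.org/ns/parts#"       # hypothetical in-house namespace

# Ask ElementTree to use readable prefixes when serializing.
ET.register_namespace("dc", DC_NS)
ET.register_namespace("my", MY_NS)

def describe_part(title, creator, partnumber):
    """Build a record mixing borrowed Dublin Core terms with a local one."""
    record = ET.Element(f"{{{MY_NS}}}part")
    # Borrowed terms: any Dublin Core aware application can interpret these.
    ET.SubElement(record, f"{{{DC_NS}}}title").text = title
    ET.SubElement(record, f"{{{DC_NS}}}creator").text = creator
    # Local term: its meaning is whatever the owner of MY_NS publishes for it.
    ET.SubElement(record, f"{{{MY_NS}}}partnumber").text = partnumber
    return record

record = describe_part("Widget datasheet", "J. Smith", "WX-1001")
print(ET.tostring(record, encoding="unicode"))
```

Because each tag is qualified by a namespace, my `partnumber` and yours can coexist in one document without being mistaken for each other.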
Sharing vocabularies is time-saving, but also helps computers process information automatically. Mr Berners-Lee and some colleagues had a good article published in Scientific American a while ago which explains their vision of intelligent software agents doing the sorts of things computers should be doing with the information the web has to offer. Such as automatically adjusting your schedule if your gym's online timetable has changed and your squash game needs to be moved. OK, that's a very basic example, but the point is that although the information needed to do this sort of stuff is already on the web, it is currently only readable by humans.
If anyone is interested in learning more about this stuff then have a look at the Resource Description Framework (RDF) which is a foundation technology of the Semantic Web (There's more to it than HTML META tags!). There's a lot of activity involving RDF-based technologies such as OWL, FOAF and the popular RSS.
-
Re:How can a court enforce the ruling
6) Does anyone really use meta-keywords other than spammers
No, and it is unfortunate. I work with a web cataloguing effort in computational science education (The Computational Science Education Reference Desk) and I spend a lot of time trying to define standard metadata for pages on the web.
The job of building digital libraries will be much easier and will better reflect the intention of people who create web content when web content creators put, at a minimum, title, description, and keyword metadata into their pages (and preferably much much more).
The more that meta-spam is used to beat search engines, the less that people will put metadata into their pages, and as a result, the less time that people will spend actually thinking about and creating good metadata.
I have no idea what impact this law will have, if any, but I would like to see more search engines that use metatags, but include some sort of "meta-spam" filter, perhaps a penalty on excessive use of keywords.
-
How about the TEI XML format?
> However, it insists on at least a plain vanilla version of a text, as that format has proven to be the most durable and accessible.
Sometimes the illustrations that accompany a text are crucial for its understanding.
How about using the Text Encoding Initiative's TEI XML format instead? Graphics can be included using its figure tag. Combine the TEI XML markup with Dublin Core metadata and people could search PG's library by author, publication date, publisher, etc.
The markup can be stored as ASCII text and edited with a simple text editor. This format can also be rendered to ASCII for legacy purposes...
-
Check out Dublin Core Metadata
You might find Dublin Core Metadata as an easier way to start than the W3C page for OWL.
-
Re:Meta data is seductive, but its a fools method.
Insightful? Splutter, choke, coffee splatter on VDU.
There are so many things wrong with this post it's hard to decide where to bite. And at AC too...I feel foolish even typing this, but...
In the context of an image file, the datum(*) is the image. The metadata is information that is describing that datum. Whether it is stored in the file or outside the file is irrelevant, conceptually. I could have a text file, then I could write some metadata describing the text across various defined categories (Dublin Core fields, perhaps). I could store this in another file. Then I could concat the two files into one. What do I have? One file, two files, doesn't matter essentially. Conceptually I have a datum, and metadata. Regardless. To me, storing metadata in the same file as the datum itself is MUCH MUCH more sensible as it keeps everything together. You can't lose or unlink to associated files / databases etc. Unfortunately, the format du jour, the JFIF (JPEG basically) is not very rich in this regard. SPIFF is way better, as are GIF and PNG. The concept itself is very, very good and has not yet come of age. The file formats are improving their capability in this regard.
A great search capability can be made with internal metadata just as easily. More so, perhaps. There are only a few image formats to worry about, so it isn't that hard to support them all.
And, files can be objects, dumbass. At least, they absolutely can be static representations of objects, dumberass.
On a personal note, you are an ignorant jerk. Go away. Irritant.
(* or data, as you prefer)
--
Slashdot sucks
-
I Used to Work for OCLC
I obviously can't speak for them, but I can provide some background on what they do. OCLC is a nonprofit org providing services for approx 45,000 libraries around the world. If you are a librarian and need to figure out how to catalog a new book in your collection, you go to OCLC to see how others have done it. Ever needed an item that wasn't in your library? OCLC handles the system for arranging inter-library loans. They do a fair amount of original research for libraries and they even open source some of the results. PURL is another OCLC project that some of you may be familiar with. The Dublin Core MetaData Initiative was co-founded by a researcher who got his start at OCLC and is now running the W3C's Semantic Web Initiative. OCLC is very well known and respected in the library community.
Library budgets the world over are under attack given the current economic situation. This leaves less and less money available for building the kind of common infrastructure that will help libraries continue to provide new and relevant services for their patrons as more and more of the content becomes digital. OCLC certainly has both the right and the need to defend the Dewey Decimal Trademarks from infringers.
-
Re:Open source, anyone?
No need to reinvent the wheel. Since the use of physical cards, paper, or books is obviously an obsolete method of cataloging in the electronic era, the following are likely what future cataloging systems will look like (based on XML, of course):
W3C's RDF Specification: http://www.w3.org/RDF
Dublin Core: http://dublincore.org
-
A better link for AVEL
Here is a better link for the AVEL website. The AVEL portal is based on DSTC's MetaSuite software for managing Dublin Core metadata. The portal provides simple and advanced search functions, as well as browsing by category.
(Disclaimer: I work for DSTC ...)
-
Not original but not bad.
I like this guy's enthusiasm for open source.
I have questions, though, about the user's ability to apply meaningful attributes to large amounts of content. If the user fails to provide meaningful attributes, the system fails to provide the user with meaningful results. In which case I would judge this system to be less user-friendly, because the files would be returned in one big lump.
This idea strikes me as an implementation of something similar to the Dublin Core Metadata Initiative, except for local content. Wouldn't this project benefit from enabling the user to manage ALL types of information, even remote? It wouldn't be a large stretch of the imagination to take that step.
If anybody is interested in how the Dublin Core works in application, you might want to check out the Zope CMF (Content Management Framework).
My experience from using Zope's CMF is that the initial learning process for a user of this method of organization was slow and bumpy. Although I must point out that my experience with the system was only with a single implementation, so I'm not making the assertion that an implementation couldn't be designed that could improve the learning curve for users.
I would also like to point out to the people that have said this would ruin Linux that they don't understand exactly what this tool does. It's a means of efficiently cataloguing and managing content. Any use of the tool does not restrict the user to that tool alone; it can be used in conjunction with the traditional HFS. The author even says so in the article.
-
metadata not filesystem
From a quick look at the spec, this is a metadata format, not a filesystem. It's intended for more than CD/DVD media as well, notably flash filesystems, which are different of technical necessity.
It seems to be mostly oriented toward labelling, describing and presenting collections of images. For what a first look is worth, it doesn't necessarily suck, either. They mention Dublin Core metadata.
A nice add on to comments in the jpeg header, anyway.
-
The Weblog MetaData Initiative
I like sites like this ... but isn't there already an effort to define and tie blogging communities via the Weblog MetaData Initiative?
I mean, Waypath is at one level convenient, but no more so than well established weblog communities such as blo.gs, the Eaton WebPortal and blogs4God. Moreover, when it comes to gleaning headline news via a blog, I would suspect the real weapon of choice would be our personal aggregators such as Amphetadesk and HotSheet?
Which is where the WMDI comes in. It helps me identify sites via xml-ish mechanisms such as the Dublin Core Initiative ... which is why I would think someone who's blogging their brains out for the hottest headlines might not be better served by the WMDI.
Then again, your mileage may vary.
-
Re:Librarians
I doubt it. Someone still has to understand the standards, how it all fits together. Your library catalog might have a slick user interface, but there's a lot more to library science than just the dewey decimal system. (If you don't believe me, knock yourself out reading MARC standards, for starters). Librarians will do more and more with technology, but somebody needs to understand at a deep level how the technology maps to the underlying standards and practices, and if AI has taught us anything, it's that it's a lot harder to encode human expertise than you might think. Knowing how to (re)search is far from a trivial skill, and knowing how to assign meaning or metadata to data is something I think computers will never be able to do as well as humans.
-
meta tags GOOD
Author, generator, description (very important when your content doesn't look too hot in a search engine summary; hello ALA and your dumbass "this site will look much better in.." blurb), content type and the way too often overlooked text encoding, and things like DCMI.
They're also useful for keeping your documents in a form you can process later; you can, for instance, embed creation dates, CVS revisions, shorter/alternate titles and summaries for links.
<slaps timothy for spreading FUD against a perfectly useful HTML tag>
EAT FLAMING DEATH TIMMY!
-
Re:Better Metadata
It seems to be a chicken-and-egg situation at the moment -- I'm doing quite a lot of work producing Dublin Core metadata in XHTML and RDF format for a content management system, however no search engines yet support the indexing or searching of this metadata.
When they do then a proposal like this might make (some) sense.
-
LINK tags
For general reference, the HTML4 LINK tags are defined here
You can add your own, but if you do, you should use a profile statement. See the Dublin Core for the usual example.
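As a sketch of that usual example, the function below writes an XHTML head with a `schema.DC` link declaring the Dublin Core namespace and one `meta` tag per field. The `schema.DC` link and the `DC.` name prefix follow the common Dublin Core convention; the default profile URI is an assumption and should be checked against the current DCMI documentation before relying on it, and `dc_head` itself is a made-up helper.

```python
from html import escape

DC_ELEMENTS = "http://purl.org/dc/elements/1.1/"

# Assumption: a profile URI for the DCMI HTML encoding; verify before use.
DEFAULT_PROFILE = "http://dublincore.org/documents/dcq-html/"

def dc_head(fields, profile=DEFAULT_PROFILE):
    """Return an XHTML <head> carrying the given (name, value) Dublin Core pairs."""
    lines = [
        f'<head profile="{escape(profile, quote=True)}">',
        # The link declares what the "DC." prefix in the meta names refers to.
        f'  <link rel="schema.DC" href="{DC_ELEMENTS}" />',
    ]
    for name, value in fields:
        lines.append(
            f'  <meta name="DC.{escape(name, quote=True)}" '
            f'content="{escape(value, quote=True)}" />'
        )
    lines.append("</head>")
    return "\n".join(lines)

print(dc_head([
    ("title", "Notes on LINK tags"),
    ("creator", "A. Author"),
    ("date", "2003-06-01"),
]))
```

Escaping the values matters more than it looks: titles with quotes or ampersands would otherwise break the attribute they are written into.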
-
What's wrong with America ?
This dispute seems to encapsulate a lot of what's wrong with America these days. 8-(
Why does anyone need an "Official Geocaching Site" ? Get off your fat SUV-encased butts, get out there and be your own "official" leagues and teams. You don't need some corporate Disney-wannabee telling you how to enjoy yourselves!
Geocaching needs a minimum of two people, some cheap tech, and a flyposted wall poster to communicate between them (oh, and several billions of technology funded by those nice people at the military-industrial complex). You don't need an "official" site, a hierarchy, a league, or a figurehead chairman (especially not a self-appointed one).
Ignore geocaching.com. Don't boycott it, that's itself too organised, just go and do something else instead. There's a whole internet to play with (thanks again to those helpful mil-ind people) - read PhilG's book, and build your own geocaching list server.
What is it with America, "Land of the Free", that can't even fix itself lunch these days without a degree of regimentation and standardised prole-feeding-centres that would put North Korea to shame ? Did you throw off the yoke of colonial British Redcoats, just so that you could be fed by uniformed redshirts ?
Secondly, the map site is legally screwed. He's not providing map references, he's providing direct references to someone else's collection of information. As any amount of legal precedent has shown, a collected work like this is material protected by copyright (and rightly so).
If this map site just listed links to locations, links as DCMI points or to the Getty Thesaurus, then there would be no problem -- but that's not what it's doing.
-
RDF vocabulary
If you're looking for a standard "vocabulary" to use in the context of RDF, W3's RDF FAQ has a link to suggestions about how to implement the Dublin Core tags via RDF. For a more specific and extensive vocabulary, you're probably right - there's very little agreement about what sort of standard to use. It's kind of ironic actually; libraries have been using one of two different organizational systems (Dewey or LOC) for roughly a century, either of which seems like it would lend itself handily to indexing the web topically. Yet in the quickest-growing body of knowledge on the planet, nobody wants either of those, and nobody seems to be able to agree on anything new either.