Slashdot Mirror


RDF For Desktop Metadata?

claes writes "There is an article "Metadata for the desktop" that suggests that RDF should be used to describe data in desktop environments. This is an interesting idea. RDF is already used by Creative Commons to attach license metadata to its works. Mozilla also supports it. RDF was designed for the web, but can it also find its way to the desktop? And what metadata is most important to describe?"

167 comments

  1. The killer app for metadata on the desktop by foidulus · · Score: 5, Funny

    is porn!
    Suppose today I want to see shaved asian hardcore action. Now provided that metadata searches are integrated into the OS(like they will be in Tiger), all I need to do is a quick metadata search on my hard drive and boom, there is what I am looking for.
    I mean provided there was a decent standard (a porn standards body would rule!) and good regex capabilities built into the OS, I would be willing to pay for porn. I know that there are comments built into the JPEG standard, but there are all sorts of porn file formats, it would be helpful to have a universal standard across them. It saves time, beats trying to search on Google and going through a lot of crap just to get to something good. I am a man on the run, I have places to go, I can't be bogged down by my porn. Plus, think of the people that get to categorize this stuff (well, the fun stuff anyway, not goatse), what an awesome job that would be!
    I should probably post AC, but I figure this post is bound to earn me at least one fan and/or freak.

    1. Re:The killer app for metadata on the desktop by PowerBook2k · · Score: 5, Funny
      Suppose today I want to see shaved asian hardcore action.


      Just check your email. If it's not there now, it will be soon enough.
    2. Re:The killer app for metadata on the desktop by lawpoop · · Score: 1
      "... today I want to see shaved asian hardcore action."

      Are you, perchance, a geek?

      --
      Computers are useless. They can only give you answers.
      -- Pablo Picasso
    3. Re:The killer app for metadata on the desktop by foidulus · · Score: 1

      That obvious huh?
      I guess even more so, in that I watch but have yet to participate in it.
      Geek squared!

    4. Re:The killer app for metadata on the desktop by scupper · · Score: 1

      You speak for the "silent majority". Lead on!

    5. Re:The killer app for metadata on the desktop by Anonymous Coward · · Score: 0

      "Suppose today I want to see shaved asian hardcore action."

      Why do I get the feeling this isn't just "supposition"?

    6. Re:The killer app for metadata on the desktop by Anonymous Coward · · Score: 1, Funny

      Awesome! I love it. I'd clap, but, uh... I can't right now...

    7. Re:The killer app for metadata on the desktop by Anonymous Coward · · Score: 0

      I don't think that virgin, "shaved asian hardcore action" watching, slashdot subscribers are any kind of majority...

    8. Re:The killer app for metadata on the desktop by Anonymous Coward · · Score: 0

      I mean provided there was a decent standard(a porn standards body would rule!)

      Have you ever actually been to a porn site? Nearly half of the metadata would be wrong or misleading. Your "shaved asian hardcore action" would give you a hairy fat lady in Nebraska fucking goats. Or, more likely, a list of links to sites listing links to sites that claim to have porn, but really are just an experiment in showing small flashing penises to geeks.

    9. Re:The killer app for metadata on the desktop by foidulus · · Score: 1

      Yeah, that is why I would only pay for CORRECT stuff, people would try to game the system, just like they try to game every other system, eventually(one hopes) the cheaters fall out and the only ones left are the ones who are actually providing a wanted service.

    10. Re:The killer app for metadata on the desktop by cbogart · · Score: 1

      Any porn you download will pop up in pretty much every search you do on your own system, just like it does on google, because they'll copy the entire dictionary into their metadata fields.

      I guess we'll need "don't index" flags for files that we know are maliciously mismetadated.

  2. Definition:...? by bogaboga · · Score: 5, Interesting

    Why don't Slashdotters define what metadata is in the first place? Google's define: metadata lists no fewer than 20 definitions. Are we talking about "data about data"?

    1. Re:Definition:...? by ResidntGeek · · Score: 2, Informative

      Yes.

      --
      ResidntGeek
    2. Re:Definition:...? by doshell · · Score: 2, Insightful

      So, data describing metadata would be called metametadata?

      --
      Score: i, Imaginary
    3. Re:Definition:...? by Anonymous Coward · · Score: 3, Informative

      In short, Yes.

      Say you have a digital photo. It's from a vacation you took in 2002, to Hawaii, and shows you, your partner, and one of your children, but not your other kids and no pets. All that info could be kept as metadata of that picture, and more.

      The same can be done for finance info for the year 1999 for you, or 2001 for your partner, or music files bought from a certain place, by a certain artist and band.

      While each of the filetypes above can have its own metadata (EXIF for images, comments for Excel spreadsheets and MP3 tags for music), not all of it is accessible and searchable through a single OS mechanism.

      This is a good goal.
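
The single search mechanism described above can be sketched in a few lines. This is only a minimal illustration, assuming the per-format metadata (EXIF, spreadsheet comments, MP3 tags) has already been extracted into plain key/value pairs; `FileRecord`, `search`, and the sample paths are hypothetical names, not part of any real OS API.

```python
from dataclasses import dataclass, field


@dataclass
class FileRecord:
    """One file plus whatever format-specific metadata was extracted from it."""
    path: str
    metadata: dict = field(default_factory=dict)


def search(records, **criteria):
    """Return paths of records whose metadata matches every key=value criterion."""
    return [
        r.path
        for r in records
        if all(r.metadata.get(k) == v for k, v in criteria.items())
    ]


# Hypothetical index: in practice these values would come from EXIF, ID3 tags, etc.
index = [
    FileRecord("photos/beach.jpg", {"kind": "photo", "year": 2002, "place": "Hawaii"}),
    FileRecord("finance/taxes.xls", {"kind": "spreadsheet", "year": 1999}),
    FileRecord("music/song.mp3", {"kind": "music", "artist": "Some Band"}),
]

print(search(index, kind="photo", year=2002))
```

The point of the sketch is that once every file type reports into one key/value index, a single query function covers photos, spreadsheets, and music alike.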

    4. Re:Definition:...? by Jugalator · · Score: 3, Insightful

      Yep, it's called like that.

      I don't see with the thread started wanted a definition by Slashdotters in the first place, since it's already pretty well described and AFAIK the word doesn't have several meanings.

      --
      Beware: In C++, your friends can see your privates!
    5. Re:Definition:...? by Jugalator · · Score: 0, Offtopic

      Wow, I'm typing like I'm drunk...

      s/with the thread started/why the thread starter/

      --
      Beware: In C++, your friends can see your privates!
    6. Re:Definition:...? by ResidntGeek · · Score: 1

      Oh, but there were 20 definitions in google's define:metadata, and that's just so much to read. Had he read it, he'd have noticed that 13 have the exact phrase "data about data", 3 say "information about data" instead, one says "information about a file", one says "data about the data", one says "data that describes something", and one says "Data that provides information about, or documentation of, other data". I wouldn't have modded it interesting.

      --
      ResidntGeek
    7. Re:Definition:...? by 0racle · · Score: 1

      All those definitions say the same thing, so what was your problem?

      --
      "I use a Mac because I'm just better than you are."
    8. Re:Definition:...? by zephc · · Score: 2, Funny

      I never Metadata I didn't like.

      --
      "I would say that 99 per cent of what my father has written about his own life is false." - L. Ron Hubbard Jr.
    9. Re:Definition:...? by Lehk228 · · Score: 1

      it is file info, resolution, dimensions, bitrate, keywords, framerate, previous owners of the file, access history, what colors are most common in a picture, who is in a pic. basically any information about a file you may want to know to sort or find that file

      --
      Snowden and Manning are heroes.
    10. Re:Definition:...? by Anonymous Coward · · Score: 0

      Why don't _you_ define define?

    11. Re:Definition:...? by perlchild · · Score: 1

      You got it, except that "data about data" or "information about information" is usually interesting, but a bit vague. In this case, we'd be talking about "Data about computer contents". We're actually super-classing, as it were, RDF, which is usually "data about written text", an article on slashdot, for instance.

      Your computer already stores data about its files and such, but that metadata's readability by humans is a bit questionable (all the concepts except file name and an eventual comment only make sense inside a computer). What's interesting about this new idea is that RDF will allow you to encode data that makes no sense to your computer, like, say, "Saved voice mail from hottie". Yes, this could be in a comment, but it might also be, say, a MacOSX label, or something more formal. Note that the article also implies you can attach several different types of metadata to a single "datum", and you can also specify a role for each. Say you could associate a second bit of metadata once you learn the hottie's name. Using comments means you'd have to alter the comment.

      To get back to your question, we can't define metadata, not in this case, since you have the exact definition. What we need to do, is specify what DATA we're talking about.

      Data about desktop contents is a bit different than say webserver contents, or database contents, or metadata about confidential information about a patient, but it's still data about data, hence, metadata.

  3. Implicit feedback for filesystem information by PureFiction · · Score: 3, Informative

    I am a big fan of implicit filesystem feedback. This can support all kinds of services from file sharing to most recently accessed search requests. Even fine tuning access controls in an RSBAC security policy.

    The big concern is keeping this data protected and private. You don't want to share all of your metadata with everyone, so security of these systems should be something to look at carefully.

  4. What happened to forked files? by Amiga+Lover · · Score: 4, Insightful

    Are there any filesystems left that use forked files? Resource, Data and Metadata forks? Any at all?

    While MacOS was at a disadvantage being one of the only ones to use it, wouldn't it have been an excellent advantage for ALL filesystems to be forked?

    (I don't know the answer to this - anyone who knows more about filesystems, give your thoughts)

    1. Re:What happened to forked files? by djcapelis · · Score: 1

      I think NTFS actually has a similar stream feature where such things could be embedded. Reiser has a concept that everything should be a file, so you might as well hope for M$ to release a driver for Reiser than Reiser to do forks... not sure about ext.

      NOTE: take this with a grain of salt, I know very little about filesystems.

      --
      I touch computers in naughty places
    2. Re:What happened to forked files? by ResidntGeek · · Score: 1

      According to http://www.tux.org/lkml/#s9-15, it's not happening in Linux filesystems.

      --
      ResidntGeek
    3. Re:What happened to forked files? by Jugalator · · Score: 4, Informative

      Forks? Would that be the NTFS streams?

      I think the new filesystem WinFS in Longhorn is basically just an evolution of NTFS streams to make them more accessible for the users. They've always been there, just not very accessible besides a limited set of text fields in the file properties dialog box in Windows. (i.e. they've always been able to hold custom data and have custom key names)

      --
      Beware: In C++, your friends can see your privates!
    4. Re:What happened to forked files? by k98sven · · Score: 2, Informative

      While MacOS was at a disadvantage being one of the only ones to use it, wouldn't it have been an excellent advantage for ALL filesystems to be forked?

      Well, one problem immediately springs to mind: The translation between different metadata formats. It's already a pain in the butt when transferring files of not-so-popular types to the Mac.

      The second gripe I have with the Mac is that it's so friggin' hard to edit the metadata. AFAIK you can't even do it on OS 9 without software. Now assuming the user is too stupid to change this manually is good. But not providing the ability at all, even for people who know what they're doing is just stupid.

      (Windows first hides the extensions, then if you try to change them, it warns you first. That feels about right for me. - Not that extensions aren't a kludge.)

      Apart from that, I agree.. anything is better than file extensions.

    5. Re:What happened to forked files? by Anonymous Coward · · Score: 3, Insightful

      > wouldn't it have been an excellent advantage
      > for ALL filesystems to be forked?

      Yes, but the trouble of compatibility remains. But there is a simple solution for this: fork as dir bundles: Instead of a file with a metadata fork you simply put the metadata file and the datafile into a dir and give that folder the name of the datafile. The current users copy the dir around and use its contents. But modern OSes treat the dir as if it is the datafile when the user interacts with it.

      The metadata file says 'treat this dir as a file, when the user opens it please open the datafile called ... instead'

      This is what Mac OS X does.

      This has some cool advantages for the future of metadata because the metadata file can refer to multiple files inside the dir. Not just point out the datafile but also point out the Mac OS X icon (which is simply a tiff file) and even a custom kde icon. Yes, you could have complete container documents, like a webpage, where individual objects can be accessed individually by the knowledgeable user simply by opening the dir, or accessed as a whole.

      It gets even better when you look at Applications in Mac OS X. Seemingly a file you can doubleclick to execute but actually a dir you can access with files organized in subdirs. Language dirs with UI files and text files you can translate, executables for different platforms, the required libs. It could even contain the source code yet it looks like, and by default works like, a single file which you can copy to the harddisk to install and drag to the trash to uninstall. That's how simple computing should be.
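
The "fork as dir bundle" idea in this comment can be sketched roughly as follows. It is an illustration only: the manifest name `metadata.json`, its JSON layout, and the `create_bundle`/`resolve` helpers are assumptions made for the example, not the format Mac OS X actually uses for its bundles.

```python
import json
import tempfile
from pathlib import Path

MANIFEST = "metadata.json"  # hypothetical manifest name, chosen for this sketch


def create_bundle(parent, name, data, extra=None):
    """Create a directory named like the data file, holding data plus a manifest."""
    bundle = Path(parent) / name
    bundle.mkdir()
    (bundle / "data").write_bytes(data)
    manifest = {"treat_as_file": True, "datafile": "data"}
    manifest.update(extra or {})  # e.g. pointers to icons or other members
    (bundle / MANIFEST).write_text(json.dumps(manifest))
    return bundle


def resolve(path):
    """Return the file to open: the bundle's data file, or the path itself."""
    path = Path(path)
    manifest = path / MANIFEST
    if path.is_dir() and manifest.is_file():
        info = json.loads(manifest.read_text())
        if info.get("treat_as_file"):
            return path / info["datafile"]
    return path


with tempfile.TemporaryDirectory() as tmp:
    bundle = create_bundle(tmp, "report.txt", b"hello", {"icon": "icon.tiff"})
    print(resolve(bundle).read_bytes())
```

A system that knows nothing about the manifest still sees an ordinary directory it can copy around, which is the compatibility argument the comment makes.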

    6. Re:What happened to forked files? by Shachaf · · Score: 0

      WinFS is not a filesystem. It's "the active storage subsystem in "Longhorn" that is used for searching, organizing, and sharing data".

    7. Re:What happened to forked files? by roshi · · Score: 1
      A couple of things about this:

      First off, the ability to use file type and other arbitrary metadata still exists in OSX (or HFS+, as the case may be). (More here.) This is above and beyond the much maligned resource fork.

      The real issue both with resource forks and (to a lesser extent) filesystem level metadata is inter-system transport, ie how do you ftp the metadata along with the file. This is what made resource forks such a PITA.

      Apple, it seems, has now moved away from putting the metadata in the FS, despite having the ability to do so, even as MS scrambles to stick metadata in their FS. I'm skeptical of the centralized DB approach, the FS approach seems a cleaner design, but the central DB does have the advantage of constantly pulling metadata out of the files and apps, thereby updating itself on the fly. Furthermore, in the separate DB approach, if the DB gets corrupted, you can trash and rebuild, if your FS gets corrupted.... that's a bigger headache. Time will tell which approach is better.

      The author of the linked article seems to propose RDF as a solution, but I'm not convinced how well storing all that metadata as text in a "dot-file" will scale. And you still have the problem of getting that metadata from one system to another, despite having a common format.

      One hopes that both Apple and MS can solve the problem of having their own systems cleanly exchange metadata.

      Just some thoughts...

    8. Re:What happened to forked files? by mrchaotica · · Score: 1

      The weird thing is, though, that Mac OS X has bundles and single-file resource forks. I understand using one or the other, but both?

      --

      "[Regarding the 'cloud,'] ownership was what made America different than Russia." -- Woz

    9. Re:What happened to forked files? by sydtsai · · Score: 0

      Spotlight
      Here is your solution.
      It's pretty extensible so you can add-in the file formats that you wanna search.

    10. Re:What happened to forked files? by sploo22 · · Score: 1

      I thought Ext2/3 supported "extended attributes", which are basically the same thing.

      --
      Karma: Segmentation fault (tried to dereference a null post)
    11. Re:What happened to forked files? by hunterx11 · · Score: 1

      You mean Apple will have to make its system exchange data with Windows which will use a system without the slightest thought to portability. I'm not trying to troll--this is less an issue of Apple believing in better code than it is a consequence of the fact that almost everybody uses Windows, so MS can afford to act like they are the only player, whereas Apple would be stupid to act the same.

      --
      English is easier said than done.
    12. Re:What happened to forked files? by The+Vulture · · Score: 1

      Going back some ways, GEOS (on the Commodore 64 and 128, I don't know about other versions) had a version of this that they called VLIR (Variable Length Index Record) files.

      A VLIR file would have one sector, and that one sector pointed to multiple other sectors. One of the sectors was used for the "information sector" (info on the file), and simple VLIR files would then have the data in one of the other pointers.

      More complex applications, like geoWrite, would use one pointer per page of the document, thus limiting you to a page/graphics count.

      VLIR files were nice, but they caused problems for almost every non-GEOS program, since files were expected to be a series of raw bytes (sectors), not segments. Even in GEOS, I had problems with VLIR files, when writing applications (I tried writing a DeskTop replacement, eventually I succeeded with a geoWrite word counter Desk Accessory).

      GEOS VLIR Information

      -- Joe

    13. Re:What happened to forked files? by kinema · · Score: 1

      Wouldn't extended attributes be a type of metadata fork?

    14. Re:What happened to forked files? by Anonymous Coward · · Score: 0
      The second gripe I have with the Mac is that it's so friggin' hard to edit the metadata. AFAIK you can't even do it on OS 9 without software.

      You can do it with AppleScript, which is kind of like using a program, except that it's included with the system.

      I still have a bunch of file-typers that I used to keep in a folder in the lower-left corner of my desktop. They're just AppleScript droplets. Don't need 'em anymore.

    15. Re:What happened to forked files? by womby · · Score: 1

      the forks are there for historical reasons
      bundles are the replacement technology

      until all data on a macos system has no resource fork they can't remove the support.

      --
      **** lying is wrong even for sleeping dogs
    16. Re:What happened to forked files? by bcrowell · · Score: 1
      The real issue both with resource forks and (to a lesser extent) filesystem level metadata is inter-system transport, ie how do you ftp the metadata along with the file. This is what made resource forks such a PITA.
      People used Binhex format for that, back in the MacOS 5-ish days, and there was nothing about it that was inherently a pain. It was just that in those days, the open-source movement hadn't really taken off, and there were a lot of people still wasting their time on the dead-end shareware scene. So yes, finding free encoding and decoding software was often a pain, but that's ancient history. Since we were running neither open-source apps nor an open-source OS, the system and the apps also couldn't be taught to encode and decode Binhex automatically, but again, that's just a historical accident.

      The other thing that made it a bit of a pain was that old MacOS had no real concept of a plain old vanilla text file, without metadata attached. The typical user would not have any application on his system that would even allow him to create such a file. Well nowadays, things are different. Emacs is installed by default on MacOS X, for instance. If I don't need metadata, I can create a plain text file with emacs on a Mac.

      I think metadata is a good idea, but to make it an "I gotta have it" kind of thing for Unix, you'd really have to redesign the Unix shell and all the command-line utilities completely. What people have always liked about Unix was the ability to do stuff like 'cat -n foo | grep "bar" | ...'. The designers of Unix made text files the main event. To achieve that level of greatness in design with a metadata system, you'd have to have the equivalents of cat and grep and sh that knew which files were plain text, which were formatted text, and which were goatsex pictures.

    17. Re:What happened to forked files? by Anonymous Coward · · Score: 0

      It gets even better when you look at Applications in Mac OS X. Seemingly a file you can doubleclick to execute but actually a dir you can access with files organized in subdirs. Language dirs with UI files and text files you can translate, executables for different platforms, the required libs. It could even contain the source code yet it looks like, and by default works like, a single file which you can copy to the harddisk to install and drag to the trash to uninstall. That's how simple computing should be.

      Yeah, and aren't Apple amazing for thinking of it!

      Oh, wait, RiscOS had that feature like a decade earlier.

      Really we owe more to Acorn than many people realise. The Archimedes was the first RISC home computer (way before Apple moved over to PowerPC), the first true multitasking 32-bit desktop OS... and it never made an impact outside the British education market. Sad or what?

    18. Re:What happened to forked files? by Anonymous Coward · · Score: 0

      huh? Archie only had coop multitasking. Amiga had preemptive multitasking and was out a year earlier in the US.

    19. Re:What happened to forked files? by MobyDisk · · Score: 1

      Windows XP uses them on NTFS filesystems. If you set Explorer to Thumbnail preview mode, it creates a hidden file named thumbs.db with a separate stream that has the actual preview data in it. It's a terrible misfeature in many ways:

      1) The thumbnail file can get corrupted and the folder cannot be viewed.
      2) The thumbnail file takes space.
      3) The thumbnail file cannot be copied -- so explorer complains every time you do a select-all of the folder, or try to copy the file.
      4) If you burn the folder to disk, it prompts you to ask if you really want to burn that file too.

    20. Re:What happened to forked files? by Anonymous Coward · · Score: 0

      Amiga STILL doesn't have memory protection, what the hell is multitasking good for if anything can and does trash everything else at will?

  5. Integration by mrchaotica · · Score: 4, Interesting

    Why does the document complain about the lack of integration, then mention that Microsoft, Apple, the ReiserFS people, etc. are coming up with solutions, and then add a completely new one? Shouldn't they just be supporting one of Apple's or ReiserFS's efforts?

    --

    "[Regarding the 'cloud,'] ownership was what made America different than Russia." -- Woz

    1. Re:Integration by ResidntGeek · · Score: 1

      Nah, that's how they tried to fix the Western Schism. Everyone started lining up on different sides, until they gave up, kicked all three popes out, and elected a new one.

      --
      ResidntGeek
  6. What is wrong with you people? by Anonymous Coward · · Score: 2, Insightful

    Sure. I have no objection to a more extensive use of metadata. In fact I crave it - must have it.

    But why oh why do people think that XML-based solutions are the way to go? An RDF solution would be bloat beyond belief. Ok, so it's not that bad for a few files, but when we get down to it - we don't have just a few files. We have plenty of them.

    So why not use something smaller? A simpler protocol?
    We can still have RDF-frontends for those that crave their daily XML-fix. Get real.

    1. Re:What is wrong with you people? by claes · · Score: 1

      RDF does not equal XML. RDF is a way to express relationships through graphs. RDF/XML is one way to express these relationships, but there are other ways too. I thought that RDF always had to be expressed with XML too, but then I read the RDF primer. At first I thought it was extremely overcomplicated, but after reading some more I started to grasp the concepts. And they are not about storage formats. They are about semantics.

    2. Re:What is wrong with you people? by jsled · · Score: 1

      Yeah ... RDF is the simplest thing in the world. Subject+Property+Object ... Object is either another Subject, or a literal value.

      And N3 [or Turtle] is a far better serialization than XML.

      Semantics are important, but _agreement_ is even more so. The hope of RDF is that when we get away from the silliness of XML and start agreeing about how to speak about relations [in terms of SPO], we can start talking about more interesting things like schema and semantics.
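
The Subject+Property+Object model mentioned above needs no XML at all to demonstrate. Below is a minimal sketch in plain Python, assuming nothing beyond the standard library; the `ex:` property names, the file URI, and the `match` helper are made up for illustration and do not come from any real RDF vocabulary or library.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str  # either another subject's identifier or a literal value


def match(graph, subject=None, predicate=None, obj=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in graph
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


# Hypothetical desktop metadata expressed as triples.
graph = [
    Triple("file:///home/me/voicemail.wav", "ex:label", "Saved voice mail"),
    Triple("file:///home/me/voicemail.wav", "ex:from", "ex:person1"),
    Triple("ex:person1", "ex:name", "Alice"),
]

for t in match(graph, subject="file:///home/me/voicemail.wav"):
    print(t.predicate, t.obj)
```

Note how the object of the second triple is itself the subject of the third, which is what makes the collection a graph rather than a flat property list.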

    3. Re:What is wrong with you people? by kfg · · Score: 1

      XML is also about semantics and graphs, which is why RDF is expressed in it. As such it isn't about relationships, it's about hierarchy. It most certainly isn't about data storage at all. That's a hardware issue that can't be solved by a markup language. The fact that some people are stretching it beyond the breaking point to try to make it into a hierarchical database doesn't alter that fact. It's a semantic markup language. Period. Its theoretical basis in mathematics is graph theory.

      In fact, XML is only intended for transferring data, which can be more efficiently, and more easily, transferred by the simple expedient of agreeing to semantics ahead of time and not bothering with all the bloody tags. XML itself recognizes this by bundling that agreement with the document in the form of a DTD, which, once you have, you don't need all the bloody tags.

      RDF, from what I can gather quickly, may be formulated with a predefined semantics, so you don't need to use XML and thus you don't need all the bloody tags, but, as per above, the same goes for XML itself.

      Did I mention that you don't need all the bloody tags?

      KFG

    4. Re:What is wrong with you people? by Anonymous Coward · · Score: 0

      And what about all the bloody tags. Do we still need them?

    5. Re:What is wrong with you people? by kfg · · Score: 1

      And what about all the bloody tags. Do we still need them?

      Did I mention that we don't need all the bloody tags?

      KFG

  7. This is largely irrelevant if you have experience by Real+Troll+Talk · · Score: 4, Insightful

    Since most of us are advanced computer users or even computer experts, I think we largely know how to search for content.

    For one thing, I always give my filenames relevant titles, not things like document06.doc.

    Also, I already know how to search through files for content using basic grep or advanced Windows searching.

    I mean, sure, meta data like ID3 tags for MP3s that I steal offline are important because my Nomad mp3 player indexes based on that info, but in general I'd say meta data is not quite as important as some may suspect.

    --

    If you liked my post,
  8. Calling Autopr0n! by mrchaotica · · Score: 2, Funny

    If ever there was an appropriate thread for him to post in, this is it! : D

    --

    "[Regarding the 'cloud,'] ownership was what made America different than Russia." -- Woz

    1. Re:Calling Autopr0n! by Anonymous Coward · · Score: 0

      Yeah, I too thought he'd be "all over" this thread, wonder where he i...oh, he might already be hacking away at triples and ontologies like there's no tomorrow huh?

  9. FS support for metadata by doshell · · Score: 5, Interesting

    I've heard the NTFS file system is designed to allow the system to add any number of properties (besides the obvious filename, last access time and permissions) to any stored file. This is likely to be exploited by Longhorn, which is planned to be capable of appending metadata to newly created files (for example, if you download a file from the Internet, the system would likely append an Originated-From-URL property to it).

    What I wonder is, is there any filesystem in the FOSS world that supports something like this, or are there plans to make it supported before 20??, when Longhorn hits the stores? I see this as a critical feature that must be made available by non-Windows OSes.

    --
    Score: i, Imaginary
    1. Re:FS support for metadata by PhrostyMcByte · · Score: 2, Insightful

      I don't think you can attach metadata to files with NTFS. If you can, I haven't seen the API for it anywhere while coding.

      Longhorn is using WinFS, which afaik is just a metadata layer slapped on top of NTFS.

    2. Re:FS support for metadata by doshell · · Score: 1

      Quoting from http://www.digit-life.com/articles/ntfs/:

      Each file on NTFS has a rather abstract constitution - it has no data, it has streams. One of the streams has the habitual for us sense - file data. But the majority of file attributes are also streams! Thus we have that the base file nature is only the number in MFT and the rest is optional. The given abstraction can be used for the creation of rather convenient things - for example it is possible to "stick" one more stream to a file, having recorded any data in it - for example information about the author and the file content as it was made in Windows 2000 (the most right bookmark in file properties which is accessible from the explorer).

      Does this qualify as provision for metadata in the FS?

      --
      Score: i, Imaginary
    3. Re:FS support for metadata by Tobias+Luetke · · Score: 1
      Longhorn is using WinFS, which afaik is just a metadata layer slapped on top of NTFS.

      The storage engine for WinFS will come from the mssql team so that's hardly "slapped on top"

    4. Re:FS support for metadata by Jeff+DeMaagd · · Score: 1

      I do know that NTFS supports "threads" or some such that there are alternate streams within a file. Alternate streams aren't called unless requested. There was a warning that a virus could hide itself within an alternate stream, such that a scanner wouldn't find it because they ignored the concept. Several years later there was an exploit made.

      Streams don't look too hard to deal with; it was just an ignored feature, like Windows Scripting: few paid attention until it was exploited with a virus.

    5. Re:FS support for metadata by Anonymous Coward · · Score: 1, Funny

      Yeah, it's more like dropping an aircraft carrier on top of a cow - there really isn't any way to describe it.

    6. Re:FS support for metadata by pizzarobot · · Score: 5, Informative

      Actually, you can. To add a metadata item called "hidden.txt" to a file called picture.jpeg, just type on the command line:

      notepad picture.jpeg:hidden.txt

      Notepad should say that it "created the file." You should notice that no new files have been created: just look for them with explorer. But you can later open this "file" and read and edit it.

      You can do this with any file with any metadata name.
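
The same `file:stream` naming shown with notepad can, as far as I know, be used from a program by passing that name to an ordinary file-open call. The following is a hedged sketch: `stream_path` and `write_stream` are hypothetical helpers, and the write is only expected to work on Windows with an NTFS volume, so the sketch refuses to try anywhere else.

```python
import sys


def stream_path(filename, stream):
    """Build the 'file:stream' name used to address a named NTFS stream."""
    return f"{filename}:{stream}"


def write_stream(filename, stream, text):
    """Write text into a named stream; only meaningful on Windows with NTFS."""
    if sys.platform != "win32":
        raise OSError("alternate data streams require Windows and NTFS")
    with open(stream_path(filename, stream), "w", encoding="utf-8") as fh:
        fh.write(text)


print(stream_path("picture.jpeg", "hidden.txt"))
```

On a suitable system, `write_stream("picture.jpeg", "hidden.txt", "some note")` would be the programmatic counterpart of the notepad command; on other filesystems the colon is just part of an ordinary filename or is rejected.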

    7. Re:FS support for metadata by fedux · · Score: 1

      Do you mean this?:

      Extended Attributes

      Extended attributes are arbitrary name/value pairs which are associated with files or directories. They can be used to store system objects like capabilities of executables and access control lists, as well as user objects. The attr(5) manual page describes which kinds of extended attributes are defined.

      http://acl.bestbits.at/about.html

    8. Re:FS support for metadata by Anonymous Coward · · Score: 0

      Arbitrary name=value metadata for files is supported by several "FOSS" filesystems, including ext3fs, XFS, JFS and ReiserFS.

      On Linux you need a relatively new kernel, and you need to read the documentation; the concept is called Extended Attributes (EAs). If it becomes popular, it may eventually become the default behaviour rather than an optional feature.

      In the real world the usefulness of this feature depends on application software. Ordinary users don't manually tag metadata onto everything, even if goaded to do so. So the main source of metadata for such attributes will be applications. If your applications don't do anything useful with it, the feature itself is worthless.
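
      A minimal sketch of what "an application tagging a file" could look like in Python, assuming Linux and a filesystem with user extended attributes enabled. `os.setxattr` and `os.getxattr` are real standard-library calls (Linux only); the helper names, the `origin-url` attribute and the example URL are made up for illustration.

```python
import os
import tempfile


def tag_file(path, name, value):
    """Try to store value under the extended attribute user.<name>.

    Returns True on success, False if the platform or filesystem
    does not support user extended attributes.
    """
    if not hasattr(os, "setxattr"):  # os.setxattr exists only on Linux
        return False
    try:
        os.setxattr(path, "user." + name, value.encode("utf-8"))
        return True
    except OSError:
        return False


def read_tag(path, name):
    """Return the attribute as a string, or None if absent or unsupported."""
    if not hasattr(os, "getxattr"):
        return None
    try:
        return os.getxattr(path, "user." + name).decode("utf-8")
    except OSError:
        return None


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as f:
        # Hypothetical attribute name and URL, purely for illustration.
        if tag_file(f.name, "origin-url", "http://example.org/picture.jpeg"):
            print(read_tag(f.name, "origin-url"))
        else:
            print("extended attributes not available here")
```

      Whether the tag survives a copy, an archive, or a move to another filesystem depends on the tools involved, which is the application-support problem described above.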

    9. Re:FS support for metadata by follower-fillet · · Score: 1

      > if you download a file from the Internet, the system would likely append a Originated-From-URL property

      Software on Classic Mac OS did this years ago with the comment field--it was mighty handy.

    10. Re:FS support for metadata by janbjurstrom · · Score: 1

      Forgive my ignorance, but when I copy/move 'picture.jpeg' - does 'hidden.txt' follow with it (either "physically" or with a reference)? If not, how do I keep the connection ('predicate' in RDF-speak I guess) between the two?

      --
      668.5
    11. Re:FS support for metadata by EvanED · · Score: 1

      At least between different NTFS folders and partitions. I don't know about zipping it, or emailing it, or moving it to FAT and back to NTFS. Those'd be good experiments, but I don't feel like it at the moment.

      (Just to point out: hidden.txt doesn't actually show in the filesystem anywhere. You can name it whatever you want too.)

    12. Re:FS support for metadata by doshell · · Score: 1

      Yup. That seems promising enough. Make it as good as it can be, slap the sources into the kernel, and make sure it compiles by default.

      --
      Score: i, Imaginary
    13. Re:FS support for metadata by zsau · · Score: 2, Informative

      I think XFS does; at least, some versions of ROX-Filer are capable of writing additional metadata about the filetype on XFS drives. My understanding was that ReiserFS v 3.x can, but I've never seen anything that uses it. Of course, Reiser4 will be able to, but I think it and Longhorn have joined Duke Nukem Forever in a race to the bottom...

      --
      Look out!
    14. Re:FS support for metadata by fedux · · Score: 1

      It's been in the kernel for 'long' now. It's part of XFS and ReiserFS at least.

  10. How much do you pay for HDD's by oliverthered · · Score: 1

    Last time I checked you can pick up HD storage space for $0.70 a GB.

    --
    thank God the internet isn't a human right.
  11. let's keep the Meta data simple... by howman · · Score: 3, Insightful

    Who
    What
    Where
    When
    Why
    and possibly How...

    --
    flinging poop since 1969
    1. Re:let's keep the Meta data simple... by Anonymous Coward · · Score: 0

      and Which.

      Can I have my +1 Insightful now?

    2. Re:let's keep the Meta data simple... by jsled · · Score: 1

      Yup, there's a vocabulary for that... http://ideagraph.net/xmlns/ibis/w6/

    3. Re:let's keep the Meta data simple... by bert.cl · · Score: 1

      While this might be a good idea, I don't totally agree.

      If you were to get to know people then you would really want to know more about them who, what, where, when, why and how.

      When you want to "know" your data (searching in every possible way) you might want to know more stuff about it. That's why the xml syntax used is so extensible.

      But since this is slashdot, getting to know people might be a bad analogy.
    4. Re:let's keep the Meta data simple... by Anonymous Coward · · Score: 0

      That's a good start (and one obvious enough that it's been used all over in the metadata world), but you need to go a lot further to have metadata that's useful to humans (who can handle a lot of implicit context) let alone machines (that need more hand-holding)

      Who: John; Dave; Sarah McLachlan; Dr Foster
      What: "I like it"
      Where: 148906,23805
      When: 2004-01-06T17:21:80.105

      That looks like a lot of useful stuff, right? But look again. Is "I like it" a song, an album, or what? Did Sarah sing on this track, or produce it, or play the guitar or...? Is Dr Foster actually a doctor of any sort, a medical doctor or is it just a name? Is this "John" the same as any other "John" in my metadata catalog? Why is the time specified so precisely? So that's when it was... recorded? edited? mastered? pressed? ripped? The location co-ordinates don't point anywhere sensible in any co-ordinate scheme I have, so what system was used?

      That doesn't even begin to touch on problems like temporality (you get some metadata that says Dave is Jo's boyfriend, or that Fred lives at 18 Spring Crescent but WHEN was the metadata itself written? it may be out of date, and relying on it may make you look very foolish).

    5. Re:let's keep the Meta data simple... by jsled · · Score: 1

      XML != extensibility.

      If you think about it, the S-P-O relation from RDF is the thing that actually allows interoperability between different [namespaced] properties... since it's clear into which role all extension goes -- as new Properties.
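
      To make the S-P-O point concrete, here is a toy sketch in Python using plain tuples rather than any RDF library. The file URI, the `ex:` namespace and the values are invented; `dc:title` and `dc:creator` are written as plain strings in the style of Dublin Core property names.

```python
# Each statement is a (subject, predicate, object) triple.
triples = [
    ("file:///home/me/song.ogg", "dc:title", "I like it"),
    ("file:///home/me/song.ogg", "dc:creator", "Sarah McLachlan"),
    # An extension from a made-up namespace slots in as one more property;
    # nothing about the existing statements has to change.
    ("file:///home/me/song.ogg", "ex:ripper", "some-ripper"),
]


def match(store, s=None, p=None, o=None):
    """Return the triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in store
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]
```

      A tool that only understands `dc:` properties can call `match(triples, p="dc:creator")` and simply never ask about `ex:ripper`, which is the interoperability argument in miniature.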

    6. Re:let's keep the Meta data simple... by jsled · · Score: 1

      Yeah, W6 is "cute", but ultimately not semantically rich enough.

      Dublin Core covers a lot of the digital artifact information [title, authors, publisher, &c.] Where is WGS84, and when is actually covered pretty well by W6, if only because it's not clear what "when" is referring to.

      If anything, I'd say W6 is really useful as a set of stakes in the ground for being super-classes ... it might work better as the roots of an upper ontology.

  12. Spotlight by Kesh · · Score: 2, Interesting

    I'm mostly wondering if the new Spotlight feature of MacOS X 10.4 is going to be based on this, or a proprietary technology. I've been itching for cross-platform metadata file support for years now...

    1. Re:Spotlight by aristotle-dude · · Score: 2, Informative
      I don't see how, considering that Spotlight is a search technology that leverages metadata already existing in files on OSX today, while this article talks about tagging files with metadata.

      The search technology in Spotlight probably is inspired by live query from BeOS but first appeared at Apple in iTunes and later Preview for Panther.

      Many former Be Inc. employees work at Apple now and some had worked at Apple before joining Be.

      --
      Jesus was a compassionate social conservative who called individuals to sin no more.
    2. Re:Spotlight by Kesh · · Score: 1

      However, I believe Spotlight would also have to allow the end-user to tag other files in order for it to be really useful. Otherwise it would only return results from that narrow list of filetypes.

    3. Re:Spotlight by aristotle-dude · · Score: 1
      That is not the job of Spotlight but rather the application developer to provide metadata in the file format. I believe that most major developers such as Adobe, Macromedia and Microsoft include metadata in their formats.

      It might be useful to have an interface in the finder to access/edit this metadata however in Tiger.

      Currently, there is a way to tag items with the comments field accessed from Get Info. The rest of the metadata is created in application. Steve Jobs touted Spotlight as working with current applications so it seems they expect the user to be entering in the project name into their document's properties within the application they are using to create it.

      Searches by date would go by file creation/modification times.

      --
      Jesus was a compassionate social conservative who called individuals to sin no more.
  13. Can't wait.. by bigattichouse · · Score: 2, Funny

    for when I can just throw out the whole desktop in favor of a "cloud" of data... using google-like interfaces to find my stuff. I think it would be interesting to figure out how to tell a compiler where to find stuff...

    --
    meh
    1. Re:Can't wait.. by mrchaotica · · Score: 1

      I've thought about using hard links (or maybe symlinks would do) to turn my file tree into a graph. I was particularly interested in sorting things like MP3s, where I could have all of them in one big /Music directory, but also have /Music/Artist/[ArtistName]/[MusicFile] and /Music/Genre/[GenreName]/[MusicFile] without actually duplicating the file. The only hard part would be writing tools to create the links automatically.

      It would be good for doing things like grepping, but I wonder if a system-wide SQL database kind of thing would be better?
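
      The link-creating tool could be sketched roughly like this in Python. It is only an illustration: it assumes the tags are already available as a dictionary (reading them out of the MP3s is left out), and `build_views` and the example names are hypothetical.

```python
import os
import tempfile


def build_views(music_dir, tags):
    """Create Artist/<name>/ and Genre/<name>/ symlinks under music_dir.

    tags maps a filename in music_dir to {"artist": ..., "genre": ...}.
    The real file stays where it is; only links are added.
    """
    for filename, info in tags.items():
        target = os.path.join(music_dir, filename)
        for view, key in (("Artist", "artist"), ("Genre", "genre")):
            view_dir = os.path.join(music_dir, view, info[key])
            os.makedirs(view_dir, exist_ok=True)
            link = os.path.join(view_dir, filename)
            if not os.path.lexists(link):  # leave existing links alone
                os.symlink(target, link)


if __name__ == "__main__":
    music = tempfile.mkdtemp()
    open(os.path.join(music, "Albuquerque.mp3"), "w").close()
    build_views(music, {"Albuquerque.mp3": {"artist": "Weird Al", "genre": "Pop"}})
    for dirpath, _dirs, files in os.walk(music):
        for name in files:
            print(os.path.join(dirpath, name))
```

      Symlinks are used instead of hard links so the views can be deleted and regenerated without touching the originals; with hard links every path would be an equal name for the same file.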

      --

      "[Regarding the 'cloud,'] ownership was what made America different than Russia." -- Woz

    2. Re:Can't wait.. by tono · · Score: 1

      you just said a whole lot of stuff that doesn't mean anything.

      --
      cheese logs keep my wang warm at night.
    3. Re:Can't wait.. by tono · · Score: 1

      that's not a graph, it's a more detailed tree, and it's already arranged in a tree. I already arrange it by genre and then have itunes put it in one big playlist.. there done.

      --
      cheese logs keep my wang warm at night.
    4. Re:Can't wait.. by bigattichouse · · Score: 1

      Let me Clarify. I would like to get rid of the entire "tree" nature and move to something where I can interact with my files in a sort of "cloud"... by searching for things, or referencing things directly... things are found by reference, or relationships, and not by tree organization.

      --
      meh
    5. Re:Can't wait.. by mrchaotica · · Score: 1

      I thought it was a graph because there was more than one path to the node (file), i.e. "/Music/Albuquerque.mp3", "/Music/Weird Al/Running With Scissors/Albuquerque.mp3", and "/Music/Pop/Albuquerque.mp3" were the exact same file.

      Or as another example, for a "various artists" album you could have the songs available as /$Artist1/$Album/$Song1, /$Artist2/$Album/$Song2, /$Artist3/$Album/$Song3, etc., as well as /$Album/$Song1, 2, 3, etc. (in the same directory)

      --

      "[Regarding the 'cloud,'] ownership was what made America different than Russia." -- Woz

    6. Re:Can't wait.. by tono · · Score: 1

      so you like not knowing what you're looking for, or not knowing what it is you're looking for? I'm sorry but I fail to see how your idea of a cloud is better than a well organized file tree?

      --
      cheese logs keep my wang warm at night.
    7. Re:Can't wait.. by EvanED · · Score: 1

      I too have wanted to do something like this. Is anyone aware of a tool or filesystem that would do this? It's sorta in the back of my mind as an idea for a thesis, but I dunno how appropriate it'd be.

    8. Re:Can't wait.. by tono · · Score: 1

      so it's a family tree that doesn't branch.. heheh. So let me get this straight, metadata takes data that already exists in the file, and makes it easy to search your files. So there's this whole big thing on making it easier to find your files on your harddrive that you put there?? Maybe we should consider a national spring clean your harddrive day..

      --
      cheese logs keep my wang warm at night.
    9. Re:Can't wait.. by pyrrhonist · · Score: 1
      that's not a graph, it's a more detailed tree, and it's already arranged in a tree.

      Trees are graphs. A tree is a connected acyclic simple graph (i.e. any two vertices are connected by exactly one path).

      What mrchaotica has done is add edges to the tree so that it has more than one path between some vertices. This makes the tree into just a graph.
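
      A small Python sketch of that definition, for an undirected simple graph given as a vertex list and an edge list (no duplicate edges assumed). The function name and the example vertices are made up for illustration.

```python
def is_tree(vertices, edges):
    """True if the undirected simple graph is connected and acyclic."""
    vertices = list(vertices)
    if not vertices:
        return False  # convention here: the empty graph is not a tree
    # A connected graph on n vertices is a tree exactly when it has n - 1 edges.
    if len(edges) != len(vertices) - 1:
        return False
    adjacency = {v: set() for v in vertices}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    # Depth-first search to check that every vertex is reachable.
    seen = {vertices[0]}
    stack = [vertices[0]]
    while stack:
        for nxt in adjacency[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return len(seen) == len(vertices)


nodes = ["Music", "Artist", "Genre", "song.mp3"]
tree_edges = [("Music", "Artist"), ("Music", "Genre"), ("Artist", "song.mp3")]
print(is_tree(nodes, tree_edges))                            # True
print(is_tree(nodes, tree_edges + [("Genre", "song.mp3")]))  # False
```

      Adding the second path to `song.mp3` gives four edges on four vertices, so the structure is still a graph but no longer a tree.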

      --
      Show me on the doll where his noodly appendage touched you.
    10. Re:Can't wait.. by pyrrhonist · · Score: 1

      You were right. See here.

      --
      Show me on the doll where his noodly appendage touched you.
    11. Re:Can't wait.. by ryanmfw · · Score: 1

      While I agree that it is stupid to get rid of the tree, having a cloud does not eliminate knowledge of what you're looking for. Well, that's all I have to say. I couldn't care one way or the other; all I want is for Linux to support this *and* normal filesystems.

      --
      Hurricane Ivan: A 17th century prison collapsed. All of the inmates escaped.
    12. Re:Can't wait.. by mrchaotica · · Score: 1

      Oh, I don't actually do that (I use iTunes now that I got a Mac; before then I just did mpg123 * and hit ctrl-c every time something came on I didn't like, and before that I just double-clicked mp3 files in Windows Explorer), I just thought it might be a neat idea.

      --

      "[Regarding the 'cloud,'] ownership was what made America different than Russia." -- Woz

    13. Re:Can't wait.. by mrchaotica · · Score: 1

      I know it; I learned that in CS 1321 : )

      --

      "[Regarding the 'cloud,'] ownership was what made America different than Russia." -- Woz

    14. Re:Can't wait.. by pyrrhonist · · Score: 1
      I know it; I learned that in CS 1321 : )

      How I learned it:

      Father: Is this your graph?
      Son: No... I mean I...
      Father: Answer me! Who taught you how to do this?
      Son (tearful): You, OK? I learned it by watching you!
      Parents who use graphs have kids who use graphs.
      --
      Show me on the doll where his noodly appendage touched you.
    15. Re:Can't wait.. by Anonymous Coward · · Score: 0

      So let me get this straight you want it to be a "Cloud"? I'm sorry, but I liken this more to a "Clod" than a "Cloud," and am reminded of an old "Piles of Trash" joke on the ZSNES board about the directory layout.

      (Basically, there's ./zsnes, and ./zsnes/Piles_Of_Trash, holding everything else. It's actually called "Saves" though).

    16. Re:Can't wait.. by ryanmfw · · Score: 1

      No, not necessarily a cloud, that was the OP's point. I was really just saying that a cloud does not really remove knowledge of what individual files are. That's what metadata is for. :-)

      --
      Hurricane Ivan: A 17th century prison collapsed. All of the inmates escaped.
    17. Re:Can't wait.. by Tony-A · · Score: 1

      I'm sorry but I fail to see how your idea of a cloud is better than a well organized file tree?

      The well organized file tree is best assuming that you use the exact same tree to store the file as to access the file. Problems are that well organized doesn't come cheap and that the optimum tree structure changes over time. Add to that the fact that you really want the ability to recover information from files based on entirely different criteria than those used to initially store the file. The cloud is not a mechanism for finding stuff, it is the initial starting point from which he wants to be able to find stuff.

    18. Re:Can't wait.. by smallfries · · Score: 1

      It is a graph; to be precise it's a DAG (Directed Acyclic Graph), which in non-technical terms is like a tree but with bits that appear in multiple places.

      I guess it's a common enough idea. I've been planning to write some scripts to do it for ages, but they have to cope with the crappy organisation in my music folder at the moment. Naming consistency is the key to making it work successfully. I was thinking of having one big flat folder with album sub-folders, each containing the artist / album name, then having text files with genre tags. A script can run over the data and generate a hierarchy built out of sym-links.

      Ahh, I sense a busy sunday afternoon coming on...

      --
      Slashdot: where don knuth is an idiot because he cant grasp the awesome power of php
    19. Re:Can't wait.. by zog+karndon · · Score: 1

      Yes, it's called the Single-Instance Store on Windows Server 2003. When you create files on a Single-instance volume, the system creates a hash of the file, and (lazily) merges files with identical hashes. Copy-on-write semantics apply, so if you modify one of the merged files, the file is split.
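
      The hashing step can be illustrated with a short Python sketch. This is not how Single-Instance Store itself is implemented; it only finds files with identical SHA-256 digests and does no merging or copy-on-write. The function names are made up.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Map digest -> sorted list of paths, for digests shared by 2+ files."""
    by_digest = defaultdict(list)
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            # Skip symlinks so a link to a file is not counted as a copy.
            if os.path.isfile(path) and not os.path.islink(path):
                by_digest[file_digest(path)].append(path)
    return {d: sorted(p) for d, p in by_digest.items() if len(p) > 1}
```

      Each list returned by `find_duplicates` is a set of candidates that a merging tool could replace with links to a single stored copy; the copy-on-write behaviour described above would have to come from the filesystem.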

  14. discussions about winfs and rdf by scupper · · Score: 3, Interesting

    Danny Ayers has some interesting discussion on his blog about winfs and rdf. There's also discussion of Jon Udell's Questions about Longhorn.

  15. Re:This is largely irrelevant if you have experien by k4_pacific · · Score: 1

    Well, I have a file called DocumentNo5.mp3, but it's a rip of an R.E.M. album.

    --
    Unknown host pong.
  16. Haystack and Metadata efforts by Knight2K · · Score: 4, Interesting

    A group at MIT is using RDF for an integrated data management system. It's sorta like Outlook (or Kontact, if you prefer ;-) on steroids. It's called Haystack.

    I have to say, their ideas are intriguing, but after using it... I think the big shortcoming is that it's tough to come up with a generalized user interface for manipulating any data thrown at it. Haystack tries to do this and, I think, fails to provide any kind of cues or context that tell you what you are dealing with. In Haystack, every task and piece of information you deal with looks very much like every other piece of data, because, as a design choice, Haystack gives every piece of data the same rank as every other piece of data.

    Having different applications for different types of data usually makes sense, if only to limit the number of options presented to the user so they can make an intelligent decision about what action they want to perform. See this article on Slashdot about how users need limited choices, since too many options make decision-making too difficult psychologically.

    Inevitably, discussions around RDF and metadata always devolve into hand-wavy discussions on how the computer will be able to "magically" do smart things based on the metadata. But it really isn't magic and it isn't automatic at all. Equivalencies and mappings have to be created by humans along with the rules about what to do.

    RDF uses many concepts from AI research. Anybody who has read about this branch of computer science knows that the discipline has pretty much given up on creating AI in the 'sci-fi' sense as an impractical dream. That's what makes the Loebner prize so controversial. I don't expect that computers will be intelligent enough to relieve users of too much of the burden of assigning metadata.

    RDF is a promising approach, but if you read the article, it makes a lot of assumptions about what needs to happen to make the benefits real. Among them are establishing standards for what metadata fields apply to different types of objects: photos, people, music, etc. That kind of standardization won't happen overnight, if at all.

    The computer also needs to know what to do when it encounters that kind of data. The article mentions MIME and browsers and, in effect, says the browser can make a rational decision even if it hasn't seen a particular MIME type before. That isn't really true.. you have to install a plugin that tells the browser what to do, or have a registry that someone has put together where the browser can install the right plugin at the right time.

    That said, KDE's unification of contact information and passwords does show some of the promise of metadata efforts. And Apple's Spotlight looks like a good solution as far as it goes. I guess I'm just trying to make the point that the magic of metadata needs to be taken with a fairly large hunk of salt.

    --
    ======
    In X-Windows the client serves YOU!
  17. Many community websites don't permit RDF by MichaelCrawford · · Score: 2, Interesting
    I have a couple of articles that have Creative Commons licenses, and I tried at first to include RDF in them.

    But when I tried to publish one article at Kuro5hin, the RDF code, which took the form of HTML comments, was displayed literally in the visible body of my article. That is, all the tags had been turned into entities so the tags appeared literally in the rendered text.

    I think Kuro5hin's Scoop content management system doesn't permit HTML comments. Maybe it's not trying to suppress comments, but it didn't occur to scoop's developers to allow them.

    RDF on the web would likely be much more popular if one could count on publication sites allowing it in the submitted markup.

    Another problem I had is that Creative Commons' recommended way to apply a license to a web page is not permitted by any of the community sites I frequent. CC-licensed web pages usually have a small banner that links to the license text. But for obvious reasons, sites like Slashdot and Kuro5hin don't permit images in article or comment submissions.

    The result is that, even for the copies of my articles on my own website, I use neither RDF nor the CC banner, because I want to make it easy for others to copy my CC-licensed articles to sites that don't permit RDF or graphics.

    The way I apply the license is the much-less-cool method recommended for plain text files. I have the following text appear in the body of my articles:

    This work is licensed under the Creative Commons Attribution-NoDerivs License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nd/1.0/ or send a letter to Creative Commons, 559 Nathan Abbott Way, Stanford, California 94305, USA.

    --
    Request your free CD of my piano music.
    1. Re:Many community websites don't permit RDF by tono · · Score: 1

      So a glorified forum is a content management system now? Christ.

      --
      cheese logs keep my wang warm at night.
    2. Re:Many community websites don't permit RDF by mlinksva · · Score: 2, Informative

      You have another option -- put the RDF in a separate file and reference it with a link tag. See http://creativecommons.org/technology/metadata/extend#link

  18. You answered your own question by Anonymous Coward · · Score: 0

    There's no need to provide definitions for terms which are so easily looked up. Metadata isn't an obscure concept, as you found when you did your search.

  19. libferris && RDF by monkeyiq · · Score: 1

    Hi,
    A bit of a shameless plug, but none the less: I think that folks who
    liked the ideas in Edd's article might also be interested in my
    project, libferris.

    Ferris allows metadata to be extracted from files and presented through
    a uniform interface. It supports inference on metadata and has the
    ability to index that metadata in many ways (e.g. Berkeley DB, ODBC,
    LDAP). Note that the metadata index can be used to index anything
    libferris can mount (XML, ODBC, RDF, LDAP, http, ftp...)

    A cool thing related to Edd's piece is that you can read an inferred
    attribute "as-rdf" to obtain all the metadata that libferris knows
    about for a file as a single RDF/XML file.

  20. I have 300,000 files on my Windows box by MichaelCrawford · · Score: 2, Informative
    I know this because ad-aware tells me so when I have it scan all my disks.

    The vast majority are very small files. How much more space would be required to give each one some RDF? And remember that disk space is allocated in terms of sectors, or sometimes in blocks of several sectors, so small files waste proportionately more space.

    And that's just on the Windows installation for my PC. I also have Slackware Linux and BeOS on other partitions. Quite likely there are very nearly a million files on my PC alone.
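
    A back-of-the-envelope version of that worry, as a Python sketch. The 4096-byte cluster size and the 300-byte size of each RDF description are assumptions picked for illustration, and it assumes the worst case where every description is stored as its own separately allocated file.

```python
def allocated_size(file_size, cluster_size=4096):
    """Bytes actually occupied when space is handed out in whole clusters."""
    clusters = (file_size + cluster_size - 1) // cluster_size  # round up
    return clusters * cluster_size


def slack(file_size, cluster_size=4096):
    """Bytes allocated to the file but not used by its contents."""
    return allocated_size(file_size, cluster_size) - file_size


rdf_bytes = 300        # assumed size of one small RDF description
file_count = 300_000   # the file count mentioned above

print(allocated_size(rdf_bytes))               # 4096
print(slack(rdf_bytes))                        # 3796
print(file_count * allocated_size(rdf_bytes))  # 1228800000
```

    Under those assumptions the metadata would occupy about 1.2 GB on disk while holding only 90 MB of actual RDF. A filesystem that packs small items together would change the numbers entirely.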

    --
    Request your free CD of my piano music.
    1. Re:I have 300,000 files on my Windows box by k8to · · Score: 1

      Yes, let's optimize our operating system information architecture for the physical layout of our current filesystems, because god knows we're already certain that it could never be done efficiently and the cost would certainly be too high in all cases.

      BZZT.

      There are many solutions, such as allocating the metadata and file contiguously in the filesystem, optimizing the filesystem for small files or file bits (e.g. BeFS or Reiser), and perhaps future techniques that have not been considered. This problem has been solved on 8 MHz machines using 800k floppy disks. I'm certain it can be solved again for your P4 and its 500GB hard drive.

      --
      -josh
  21. It's coming by Anonymous Coward · · Score: 0

    Of course you need a few extra things, like a universal schema database so everyone can use the same schemas and a way to organise them, ratings, a recommendation engine, etc. I can say, as an AC, that it is nearly here: Windows to begin with (the core is cross-platform C++; only the GUI is Windows-specific) before transitioning to a Mozilla-based cross-platform app. The possibilities are far greater than the article discusses: a final unity of metadata and, more importantly, search and aggregation across systems, datatypes, and languages (human and machine).

    1. Re:It's coming by tono · · Score: 1

      The "core" of windows is what makes windows, well, windows.. and is therefore not cross platform. do you even know what you just said?

      --
      cheese logs keep my wang warm at night.
  22. Come on, mods... by Anonymous Coward · · Score: 0

    Interesting? The Voynich Manuscript is interesting. The Linux kernel is interesting. A guy who can't be bothered to read more than 3 words of 1 definition of a word isn't interesting. This post should be at -1, Troll. Might as well get a good mod on this post too: That and installing Gentoo on my home box.

    1. Re:Come on, mods... by Anonymous Coward · · Score: 0

      And by "This post should be at -1, Troll", I meant the grandparent should be at -1, Troll. Oops.

  23. An incomplete acronym by mangu · · Score: 1

    Read Da Fucking what?

    1. Re:An incomplete acronym by Anonymous Coward · · Score: 0

      Just shut up and read it, n00b. We can't be bothered to tell you everything.

  24. Isn't it the same problem? by pyrrhonist · · Score: 3, Insightful
    After reading this article, I'm wondering if metadata is really going to be as effective as the author thinks it is. The author points out that "the computer makes us do the work of a filing clerk". In other words, when you place files on your computer, you normally place them into folders to organize them, which is "not fun". The author implicitly claims that metadata will solve this situation.

    But that's the problem! If it's not fun to organize items into folders, how is it any more fun to add metadata to a file? I'm not talking about text files. Text files are easy, because you can pull the metadata out of them automatically (in fact, you can do this now with search tools). I'm talking about files that have to be explicitly tagged with metadata, like pictures. How is adding metadata to each picture file to categorize your vacation pictures any less laborious than placing the vacation pictures into their own directory?

    That's the problem as I see it. You still end up being a filing clerk! If people don't even organize their folders now, are people going to use metadata when it's available? Will improved search capabilities make users want to be clerks?

    In a nutshell, isn't it the same problem?

    --
    Show me on the doll where his noodly appendage touched you.
    1. Re:Isn't it the same problem? by tono · · Score: 1

      in a nutshell, yes, it is the same problem, one that can easily be solved with appropriate filenames. And don't tell me about the wonders of how metadata can hold more than just the name of the file and all this; by the time you add any useful information to the metadata, the data describing the data will be larger than the original data itself. it's ridiculous, and as I see it is just more self-masturbation on the part of bored jobless software engineers that aren't solving any problems that need solving.

      --
      cheese logs keep my wang warm at night.
    2. Re:Isn't it the same problem? by value_added · · Score: 2, Insightful

      When I was a kid and would ask aloud where something was, my mum would say, "Look where you put it." It annoyed me to no end, of course, but years later I find myself "putting things where they belong" and emptying my mind of everything else, much like putting phone numbers in a phone book so one doesn't have to clutter up one's mind remembering any of them.

      My own opinion is that there is no substitute for "putting things in folders." Boring, but true. Regular expressions and databases can go a long way (even for the average Joe), but it's as brainless as it is fast to look in an appropriately named folder. Not everyone agrees, of course:

      Apple Unveils Faster Searching
      Apple Throws Spotlight on Search

    3. Re:Isn't it the same problem? by CustomDesigned · · Score: 1
      How is adding metadata to each picture file to categorize your vacation pictures any less laborious than placing the vacation pictures into their own directory?

      It isn't. The file names are metadata. Links and Symlinks let you have multiple "metadata" entries. If directories represent categories, then you can link a picture into as many categories as applicable.

      In terms of power, metadata support is equivalent to support for links. In fact, metadata could also be encoded into long file names, but that could get pretty ugly. For instance, my company uses a homebrew filesystem where filenames can contain null chars. The normally visible part of a filename is terminated by a null char, and followed by arbitrary metadata in a conventional format, typically a database field table.

      So there is no real need for special metadata support. Anything stored as metadata can be equivalently stored using some filename or directory bundle convention. The important thing is to define common conventions.

      With the unix approach, in the worst case, with thousands of competing conventions, you can still backup and restore with your favorite tar or cpio like utility. If you go the special metadata route, on the other hand, you have to have specialized backup and restore utilities. This is a great feature for M$ (yet another way to lock you into their platform), but a huge drawback for open source.

    4. Re:Isn't it the same problem? by adamscottphotos · · Score: 1

      I do a lot of reading. I have somewhere in the neighborhood of 1200 folders on various topics. I am also a photographer. I have over 26000 digital shots from all around North America, plus scans of my analog shots. I plan my trips in excruciating detail; I have at least 10000 links to various online maps, logs, and trail descriptions.

      It's neither brainless NOR fast for me to 'look in an appropriately named folder'; there are simply vastly too many. Even with the best hierarchy I could conceive, I have over 70 top level folders.

      Short of dewey-decimal, hierarchical folder systems simply do not scale from the user's perspective.

      --
      So quit your job, pack your bags, and move on out to snow country!
    5. Re:Isn't it the same problem? by RdsArts · · Score: 1

      Imagine this (mainly because I imagine it daily, and slowly, ever so slowly, am hoping to code something like it :P ;) ):

      You insert your camera, and drag its folder (USB drive) to your image program. It opens up, and you have them all in a wee slideshow-esque format. Now you notice the little info box over the image. You double-click it, and it brings up something like:

      "Location:
      People:
      Year:"

      There can be more than 3 fields, but this is just a "I'm lazy and making a comment" thumbnail. ;) So you look at it, and you check two boxes for the fields "Location" and "People," then fill in the people and location. All the photos then get those values. It was a vacation, so the contents are relatively set. (Granted, it'd be a bit more laborious with just random photos, but that's a problem solved by a better UI. As I said, this is just a thumbnail.) Later that year, you want to find all those photos. You open a filer window, and enter "vacation." It pulls up all your photos. Then you say "2004." It pulls up all the photos from 2004. Why? Because it snagged the date data from the EXIF tags. Most importantly, though, you didn't have to go looking by filename, which since it's from a digital camera is meaningless, and since it's on a computer is too short and nondescript to mean anything. Instead you searched with the metadata, and now you go looking for it by thumbnail from a population that's already pretty close to what you want.

      As more apps did this (and more users want the camera to store more info in the EXIF tags so they don't have to ;) ) grabbing this info with even less user input at the app becomes more and more possible. In the end entering any metadata would be simple to do and wholly replace using filenames.

      You can already (to a point) do this with audio, and RoxCD comes with defaults that enter the artist, album, track number, and track name into the OGG Vorbis file it generates. (Which is one of the reasons I started it - I was annoyed so many rippers never bothered with that) Eventually I hope to get something coded up so I never have to search through the file system for a track again.

      So while it doesn't take all the work out of making files, and most importantly doesn't stop those of us (like me) who do keep files in a deep directory hierarchy ;), it does take the work out of storing and finding the files. And that is where the user spends most of their 'filing clerk' time. :)

    6. Re:Isn't it the same problem? by abreauj · · Score: 1
      If it's not fun to organize items into folders, how is it any more fun to add metadata to a file?

      Metadata solves the same problem that folders solve. The only meaningful difference that I see is that metadata allows you to have multiple ways to file a document.

      Suppose I have a folder for my "vacation 1999" photos. I also have a folder for "photos of my nephew timmy". In which folder do I file a photo of Timmy taken during my visit to his family during my 1999 vacation? Perhaps I've submitted this photo in a contest and it won first prize, and I have a third folder for "published photos" that it needs to be in.

      Sure, I can waste space with multiple copies that have no indication they're really the same photo, or I can screw around with hard links or symlinks, but managing multiple views like this with folders is a pain in the butt. Using metadata instead as the primary way of organizing these photos would make it a lot easier to manage this.
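
      A minimal sketch of that difference, in Python: one stored file carries several tags, and each "folder" becomes a query over those tags. The file names, tag names, and helper functions below are made up for illustration and do not come from any real desktop-search tool.

```python
from collections import defaultdict

# Hypothetical in-memory tag index: tag -> set of file paths.
index = defaultdict(set)

def tag(path, *tags):
    """Attach any number of tags to a single stored file."""
    for t in tags:
        index[t].add(path)

def find(*tags):
    """Return the files that carry every one of the given tags."""
    sets = [index.get(t, set()) for t in tags]
    return set.intersection(*sets) if sets else set()

# One copy of the photo, three "folders" it shows up in.
tag("IMG_0042.jpg", "vacation 1999", "timmy", "published")
tag("IMG_0043.jpg", "vacation 1999")

print(find("timmy"))
print(find("vacation 1999", "published"))
```

      The photo exists once on disk; the three views are just three lookups, so there are no duplicate copies or symlinks to keep in sync.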

  25. Creative Commons & Desktop Metadata by mlinksva · · Score: 1

    CC is interested in desktop metadata developments. See this CC weblog post from a few days ago.

  26. Separate the apps, not the data. by Cardinal · · Score: 1

    Having different applications for different types of data usually makes sense, if only to limit the number of options presented to the user so they can make an intelligent decision about what action they want to perform.

    I agree wholeheartedly that unifying desktop applications into one nebulous interface isn't a very useful way to give users access to their data. Mail clients make good mail clients, but they make lousy photo gallery browsers.

    That said, what I do wish we'd see more of is an effort for different applications to share the same information, because the dividing line between which application to use is much clearer than the dividing line between which application should be the keeper of particular types of data.

    I don't want to have to open my web browser to see if I've bookmarked a URI that somebody mentioned in an IRC channel. I also don't want to have to open my PIM to find the phone number of somebody who I'm talking to in that IRC channel.

    These are the sorts of data access issues I'd like to see resolved, and I do see RDF as a possible, even attractive, approach to solving the problem. However, as you've pointed out, we can't simply modify our applications to all spit out RDF, and expect everything to fall into place. Some degree of consensus about how to represent data is required. Rather than writing new applications like Haystack, or looking for new approaches to managing one's information, I'd rather see efforts to modify existing applications to share data sources more effectively.

  27. RDF is not practical by Anonymous Coward · · Score: 1, Insightful

    I've yet to see a real world example of how to use RDF that wasn't for research purposes (i.e., to prove RDF works). Most of the projects listed for semantic web are purely research, toy projects, or completely unproven. I know of several companies that have tried, but they usually end up extending the hell out of RDF to make it practical and useful. That makes me think RDF is flawed.

    1. Re:RDF is not practical by Anonymous Coward · · Score: 0

      Actually, several "unnamed" social-networking sites are using FOAF (defined in RDF) for information exchange via XML-RPC. It's pretty damn cool actually.

  28. Hey thanks for the tip by MichaelCrawford · · Score: 1
    I can do that. Thank you very much.

    --
    Request your free CD of my piano music.
  29. what metadata is most important to describe? by nusratt · · Score: 1

    1. Ditto to the post which said, "Separate the apps, not the data." The current proliferation of app-specific formats is absurd and counter-productive.

    2. I file hundreds of docs &/or URLs per day. I need something which offers some degree of assistance in immediate auto-categorization (e.g. Bayes) with feedback, while still allowing user-defined hierarchy. "Yes, thank you for intelligently recognizing that this new info is about device interrupts; but now I need to tell you that it's about kernel-coding vs. crash-debugging vs. performance-analysis."

    3. One poster calls the article, "self-masturbation on the part of bored jobless software engineers that aren't solving any problems that need solving".
    Speak for yourself. Yeah, I'm a developer, but most of my minute-to-minute usage of my desktop isn't all that different from "lusers" or PHBs, i.e. massaging info.
    Get some perspective. Your statement is like saying, "Cars are really primarily made for mechanics and automotive engineers, not for soccer moms and commuters."

    4. Forked-data: sure, as long as it's restricted to the app-specific stuff. Take that table the user just created: use forked-data for the meta-data which is specific to the spreadsheet or WP app, but leave the table data as ASCII data which anyone can read.

    5. Someone said, "a file-name should be enough". Speak for yourself; a lot of my needs go waaayy beyond that. If the metadata goes beyond your needs, then your course is clear: just don't use it. It costs you nothing to architecturally allow for its use by other people.

    6. re: "clouds", there are times when I'd really like to know -- what app created this file? what OS? which host? which user? what other files had been opened (e.g., stdin)? what was the original volume label? etc.
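
    The auto-categorization with feedback asked for in point 2 could look roughly like the Python sketch below: a toy naive Bayes classifier that suggests a category and learns from every correction. The class name, category labels, and sample texts are all invented for illustration; a real filing tool would need proper tokenizing and far more training data.

```python
import math
from collections import Counter, defaultdict

class TinyBayes:
    """Toy multinomial naive Bayes with add-one smoothing (illustrative only)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # category -> word -> count
        self.doc_counts = Counter()              # category -> documents seen
        self.vocab = set()

    def learn(self, text, category):
        """Record one filed document; also used for user corrections."""
        words = text.lower().split()
        self.word_counts[category].update(words)
        self.doc_counts[category] += 1
        self.vocab.update(words)

    def suggest(self, text):
        """Return the most probable category, or None if nothing was learned."""
        if not self.doc_counts:
            return None
        total_docs = sum(self.doc_counts.values())
        best, best_score = None, -math.inf
        for cat, n_docs in self.doc_counts.items():
            counts = self.word_counts[cat]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for w in text.lower().split():
                score += math.log((counts[w] + 1) / denom)  # smoothed likelihood
            if score > best_score:
                best, best_score = cat, score
        return best

nb = TinyBayes()
nb.learn("kernel interrupt handler driver code", "kernel-coding")
nb.learn("crash dump stack trace panic", "crash-debugging")
guess = nb.suggest("interrupt handler for a new driver")
print(guess)
```

    The feedback step is just another `learn` call: when the suggestion is wrong, the user picks the right category and the same document is recorded under it, so later suggestions shift accordingly.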

  30. Watching the XML kiddies reinvent the wheel by Animats · · Score: 3, Informative
    It's fun watching the XML kiddies re-invent concepts from LISP. They just re-invented property lists, "is-a" links, and much of the baggage that made SGML painful.

    Knowledge representation via "is-a" links has been tried, and it breaks down rather quickly. Read "Artificial Intelligence meets Natural Stupidity", by Drew McDermott, for a 20 year old critique of this concept. It's overkill for searching, and not powerful enough for reliable automated question answering.

    The Cyc debacle illustrates how much work you have to put into tagging to get very little out. After twenty years of that money sink, it's still useless.

    1. Re:Watching the XML kiddies reinvent the wheel by benson+hedges · · Score: 2, Interesting

      if you had RTFA, or even read anything in the last 20 years, you would probably know that XML != RDF. there is an XML serialization of RDF, called (duh) RDF/XML, but that's far from the only way to describe RDF data. I have to agree though that RDF/XML is one of the worst ways to do RDF.

      have a look at N3 or N-Triples for starters.
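
      For a sense of how plain that syntax is: an N-Triples document is one statement per line, written as subject, predicate, object, then a full stop. The Python below is a rough sketch that handles only a simplified subset (IRIs in angle brackets and plain string literals, with no escapes, blank nodes, language tags, or datatypes). The file IRI and the values are invented; the two Dublin Core element IRIs are the standard ones.

```python
import re

# Simplified N-Triples line: <subject> <predicate> (<object> | "literal") .
# Assumption: no blank nodes, escapes, language tags or datatypes.
TRIPLE = re.compile(r'^<([^>]*)>\s+<([^>]*)>\s+(?:<([^>]*)>|"([^"]*)")\s*\.$')

def parse_ntriples(text):
    """Yield (subject, predicate, object) tuples from the simplified subset."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        m = TRIPLE.match(line)
        if m is None:
            raise ValueError(f"unsupported line: {line!r}")
        subj, pred, iri_obj, literal = m.groups()
        yield subj, pred, iri_obj if iri_obj is not None else literal

doc = '''
<file:///home/me/trip.jpg> <http://purl.org/dc/elements/1.1/title> "Trailhead at dawn" .
<file:///home/me/trip.jpg> <http://purl.org/dc/elements/1.1/creator> "me" .
'''

triples = list(parse_ntriples(doc))
for s, p, o in triples:
    print(s, p, o)
```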

      --
      Karma : Soylent Green (Mostly due to eating junk food and mocking religion)
    2. Re:Watching the XML kiddies reinvent the wheel by 12357bd · · Score: 1

      XML != RDF

      1-RDF as XML needs a 'neutral' or 'established' set of vocabularies to be really useful (interoperability).
      2-Both are verbose.

      Not sooo different, if you remember that metadata is data. XML formats data, RDF formats metadata, big deal!

      And please don't forget the data acquisition bottleneck that all those annotated formats produce.

      --
      What's in a sig?
    3. Re:Watching the XML kiddies reinvent the wheel by ambrosen · · Score: 1

      Strangely enough, I have actually seen people giving papers where they use CYC for inference in Question Answering systems. Amazingly.

    4. Re:Watching the XML kiddies reinvent the wheel by Animats · · Score: 1
      MIT used to have a Cyc-based system on line, but it was so lame they took it down.

      They'd loaded it up with information about the MIT/Cambridge area, and information about the Middle East. Suggested queries were things like "Who is the king of Jordan", and "Is MIT in Cambridge?". So I tried queries like "Who is the king of Israel", which returned the name of the premier of Israel.

      Inference was broken. I asked "Is MIT in Cambridge" - Yes. "Is Cambridge in Massachusetts?" - Yes. "Is MIT in Massachusetts?" - Don't know.

      That was lame. It was really no smarter than a search engine. It seemed to be on a par with Ask Jeeves. That's embarrassing. Cyc is supposed to be able to do simple inference.
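
      The inference that failed there is a plain transitive lookup. A minimal Python sketch, assuming "is located in" facts are stored as simple pairs (this is an illustration of the idea, not how Cyc or the MIT system represents knowledge):

```python
# Hypothetical "is located in" facts, stored as (thing, container) pairs.
facts = {
    ("MIT", "Cambridge"),
    ("Cambridge", "Massachusetts"),
}

def located_in(thing, place, facts):
    """True if `place` is reachable from `thing` by following the facts."""
    seen = set()
    frontier = [thing]
    while frontier:
        current = frontier.pop()
        for inner, outer in facts:
            if inner == current and outer not in seen:
                if outer == place:
                    return True
                seen.add(outer)       # guard against cycles in the facts
                frontier.append(outer)
    return False

print(located_in("MIT", "Cambridge", facts))
print(located_in("MIT", "Massachusetts", facts))
print(located_in("Massachusetts", "MIT", facts))
```

      The first two queries succeed and the reversed one fails, which is the behaviour the three questions above were probing for.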

    5. Re:Watching the XML kiddies reinvent the wheel by wkearney99 · · Score: 0

      Without effective data the overarching search system will never be smart enough. Without effective searching tools it will never be easy to extract it.

      What's worse, bad tools and no data or NO tools because there's no data?

    6. Re:Watching the XML kiddies reinvent the wheel by Phillip2 · · Score: 1

      "The Cyc debacle illustrates how much work you have to put into tagging to get very little out. After twenty years of that money sink, it's still useless."

      And after five years (and yes a lot of cash), the Gene Ontology is an incredibly useful tool for biologists.

      It's not the answer to everything, but it makes some things easier. This is enough.

      Phil

  31. RDF (and OWL) in Pike by janbjurstrom · · Score: 3, Interesting

    I noticed the article made no mention of Pike (also the name of a fish - see language logo). Pike's a fine C-like scripting language ...that I know extremely poorly myself, but anyway..

    From Pike's official homepage (at the University of Linköping, Sweden):

    The release of Pike 7.6 marks the first results of a long-running project to make Pike the first scripting language for the Semantic Web. The current highlight in that respect is the support for W3C's standard formats RDF and OWL.

    Worth downloading and checking out for other reasons than "just" RDF & OWL. Free software, available under LGPL, GPL, and MPL (Mozilla Public License).

    --
    668.5
  32. None of you have mentioned by GNAA+Goat-See · · Score: 0, Interesting

    the fact that Mac OS X uses .plist files to represent creator code and application information in .APP bundles.

  33. Look also at XMP by mughi · · Score: 1

    When looking into metadata, people should probably be sure to check out XMP

    It's from Adobe, and whereas RDF just says how to format metadata, XMP addresses what to include in your RDF, and how to place it into different types of files. They have free libraries, but it's simple enough to follow even with your own code. And... given that it's how all Adobe products are doing metadata, at least in the publishing world it will probably stay something to pay attention to.

    Creative Commons has addressed this, and I first hit it in researching implementing metadata support for Inkscape.

    The more things play nice together, the more users are likely to adopt using them.
  34. NTFS streams by Otto · · Score: 2, Informative

    This "metadata" is actually called an "NTFS stream" and has been around since at least NT4.

    If you move the file around the NTFS drive, or from one NTFS drive to another, then yes, the metadata goes with it. If you move it to a FAT volume though, the metadata is lost forever. Not a huge deal as NTFS is getting more and more users nowadays.

    XP uses these metadata streams to some degree, actually. Some of the things in the properties page for a file are actually NTFS streams.

    Longhorn will make more extensive use of them, I'm certain.

    --
    - Give a man a fire and he's warm for a day, but set him on fire and he's warm for the rest of his life.
    1. Re:NTFS streams by Tony-A · · Score: 1

      Longhorn will make more extensive use of them, I'm certain

      Ditto the viruses.

    2. Re:NTFS streams by janbjurstrom · · Score: 1

      I suspected the properties for files were met^H^H^H NTFS streams. Interesting, thank you.

      --
      668.5
    3. Re:NTFS streams by juhaz · · Score: 1

      Ditto the viruses.

      Indeed.

      That one's pretty harmless and easy to spot since it's a proof of concept, but it's nevertheless a scary thought to have viruses that could hide themselves in pseudo-files that are not visible in any way if you don't know what to search for, and even then only enumerable by weird, totally unrelated and/or undocumented functions...

    4. Re:NTFS streams by parksie · · Score: 1

      When I tried to copy a file with another stream from NTFS to FAT32, Win2K complained and said I was losing data. I don't know about zipping the file with, say, WinZip; is it even *possible* to get separate streams in? (Perhaps some kind of weird exported format that Windows can read/write so you can move your files around?)

  35. RDF by cyberfunk2 · · Score: 2, Funny

    Did anyone else read RDF and think.. Reality Distortion Field ( Steve Jobs)

    1. Re:RDF by xirtam_work · · Score: 1

      Yep. Totally, especially from the little information I got from the RSS feed, I thought the article was going to be about Jobs demoing Tiger's search tech.

  36. Another format.. by 12357bd · · Score: 2, Insightful

    Good, yet another format to use/suffer!

    No matter how good those formats are (XML/RDF/etc) they all fail at the simplicity norm, the KISS principle.

    In the example of the article, by not using a simple text oriented format they unnecessarily complicate access to these values by any program, and that leads to the second point.
    The second point is the computational cost involved in parsing and validating all those formats. The day our CPUs can process hundreds or thousands of simultaneous parsings without a noticeable impact on performance, it could start to make sense to popularize their usage; until then, they are a luxury and as such restricted to limited (specialized) usage.

    On the RDF case, metadata is data; the 'meta' part is a human ability and can be used wherever we want, with no need for a special format. By pretending to format the 'metadata' concept we are just defining a new stream format, and if we consider how wide the 'meta' concept is, it seems difficult to limit it to a simple ontology. The result? The need for another international consortium to establish a reasonable set of vocabularies, big deal!

    I think there are better ways to spend our CPU cycles than to parse verbose formats, but who knows?

    --
    What's in a sig?
  37. Desktop.ini by cRueLio · · Score: 1

    Maybe it can replace Windows' crappy Desktop.ini file, allowing for complex Desktop setups.. one use I see for it is to have links to all the icons, programs they start, etc...

  38. Re:This is largely irrelevant if you have experien by aralin · · Score: 0
    Are you a troll? Nobody asks about the importance of metadata, since it's widely known that they are important. This article is about a way to store them using an interesting technology meant for just that, in a place for which the technology was not originally intended.

    Nobody questions your ability to index and search your own data anyway; it's when you start to cooperate with other people that metadata become really useful. You might not name your document document06.doc, but someone else might. And not everything is grepable: pictures, music files, binary data files, all of those make great use of searchable metadata.

    Anyway, I think it would be great to have a unified system of metadata, so that you would not need specific system for every category of files, like mp3 and id3 tags.

    --
    If programs would be read like poetry, most programmers would be Vogons.
  39. XML?!?!? by RAMMS+EIN · · Score: 1

    I would say that RDF, or any XML format, is unacceptably wasteful for metadata. Besides, many filesystems already support extended attributes. Why not use the mechanisms that were developed exactly for this purpose, instead of introducing a new and inferior one?

    Forgive my zeal, I just really hate the XML for everything mentality.

    --
    Please correct me if I got my facts wrong.
  40. DEVONthink for the Mac by holygoat · · Score: 1

    If you use a Mac, you might be interested in DEVONthink (and its little brother, DEVONnote). It does (2), and very well.

    It text indexes all supported Mac document files (Web, RTF, text, PDF, etc.), and can store anything (links, movies, PDFs, whatever). You can then do very fast search.

    Have a look.

  41. Example of missing metadata by claes · · Score: 1

    I think there is too little metadata about installed applications in a regular Linux system. There is metadata in the packages (RPM for example), and there is metadata in the .desktop files. But the package metadata is on the package level, and does not describe each individual application it contains. The .desktop files are very sparse, and describe only things that fit on one line of text or less. This makes it hard to write new kinds of user interfaces. I can't find any way to make a Freshmeat-like user interface acting on the software that is actually installed in the system, since there are no descriptions that are longer than one sentence in a .desktop file.

  42. Re:XML?!?!? *sigh* by holygoat · · Score: 1

    Firstly, RDF is not XML; its canonical exchange format encodes to XML, but there are plenty of other representations.

    Secondly, please explain how the implicitly-described files in your NTFS streams can be seamlessly shared over the Web in a composable way.

    The point of RDF on the desktop is that it does statement-level meta-data very well, and is Web-integrated.

  43. Re:XML?!?!? *sigh* by RAMMS+EIN · · Score: 1

    Hmm, perhaps I should read up on RDF more, but everything I have seen that had anything to do with RDF was in XML. Saying that RDF is "Web-integrated" also says "XML" to me.

    As for sharing metadata over the web (I am not talking about NTFS here, because until today I didn't even know it supported extended attributes), I think HTTP headers perfectly fit this purpose - they are metadata, after all. Just encode every attribute in an HTTP header.
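
    A rough Python sketch of that idea, assuming a flat set of string attributes and an invented "X-Meta-" header prefix (the prefix is a made-up convention, not a standard, and the attribute names are examples only):

```python
def metadata_to_headers(metadata, prefix="X-Meta-"):
    """Turn a flat dict of attributes into HTTP-style header lines.

    The "X-Meta-" prefix is an invented convention, not a standard.
    """
    lines = []
    for name, value in sorted(metadata.items()):
        value = str(value)
        if "\r" in value or "\n" in value:
            raise ValueError("header values must stay on one line")
        lines.append(f"{prefix}{name.title()}: {value}")
    return lines

def headers_to_metadata(lines, prefix="X-Meta-"):
    """Recover the attribute dict from header lines produced above."""
    metadata = {}
    for line in lines:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so compare in lower case.
        if name.lower().startswith(prefix.lower()):
            metadata[name[len(prefix):].lower()] = value.strip()
    return metadata

attrs = {"author": "me", "year": "2004", "location": "Banff"}
headers = metadata_to_headers(attrs)
print("\n".join(headers))
```

    This only covers flat name/value pairs on a single resource; anything nested or relating two resources to each other would need some further encoding convention on top.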

    Besides, the main use I see for metadata is to improve organizing and finding objects. This works from localhost upwards; first, you slap meaningful metadata on your files, then you can use your (local) search functionality to efficiently find your files, and finally, with suitable protocols, others can find objects on your system, too.

    --
    Please correct me if I got my facts wrong.
  44. kSpaces? by Anonymous Coward · · Score: 0

    Isn't this what kSpaces does today? Granted, kSpaces is still a little rough around the edges, but it seems like a start.

    From the website:

    kSpaces is a metadata-driven, distributed knowledge management platform. It was designed to be lightweight, transparent and extensible. The kSpaces proof-of-concept allows files to be described with arbitrary RDF metadata. These descriptions can then be easily shared with and queried by other nodes in the system. Finally, kSpaces-managed files can be made available to all other nodes participating in the same kSpace.

    kSpaces employs file system monitoring and auto-tagging technologies in order to achieve almost full transparency to the user. Its lightweight, plugin-based design allows for maximum extensibility.

    The kSpaces reference implementation was written using Java for the server and C# for the client (on Microsoft's .NET platform). All client-server communication is done via SOAP, and client-client data transfers are made via HTTP.

    The kSpaces Node software works by monitoring a directory (My kSpace in the My Documents folder) and managing metadata about any files in that directory. Subdirectories are not supported.

    kSpaces automatically tags files through the use of plugins. The two autotagging plugins that are included analyze a file's ID3 and EXIF headers, and then generate the appropriate RDF metadata.

    Metadata associated with a file can be viewed and edited through the kSpaces Node application, supported by editor plugins. Five editor plugins have been included in the proof-of-concept, four of which are read only. These plugins allow the management of a subset of Dublin Core metadata, EXIF metadata, ID3 metadata and kSpaces-specific metadata. The Raw RDF plugin shows the raw RDF metadata associated with a knowledge asset.

    The metadata that is stored about knowledge assets in a kSpace can be queried using RDQL. In end-user applications, RDQL could be generated by using natural language processing or other technologies.

    Finally, the kSpaces node allows the kSpace contents to be viewed using browsing plugins. In this proof-of-concept, only a very basic one has been provided, which shows all assets present in the kSpace. Writing additional browsing plugins will allow users to see the kSpace assets from different facets that can be tailored to the user's needs.

  45. You mean... by Anonymous Coward · · Score: 0

    Bah.. you mean like DTP is irrelevant to anyone who can use a typewriter?

  46. Re:This is largely irrelevant if you have experien by johannesg · · Score: 1
    Don't you think that people who name their document document6.doc, won't bother to correctly set metadata tags as well? If you think about it, the filename is also just metadata. I know there are plenty of people out there who cannot be bothered to come up with a decent name, and they certainly won't be filling in topic, author, contents, etc. fields either.

    Anyway, the grandparent has it exactly wrong: "normal" users who won't correctly name things and store them in a badly-thought out directory tree will not be using metadata. We power users on the other hand, can use it to make our own systems far more useful to us.

  47. It's not irrelevant to any of us by Uninen · · Score: 1

    Since most of us are advanced computer users or even computer experts, I think we largely know how to search for content.

    But what good is your know-how if you can't find what you're looking for? Let's face it: most of us have hundreds, even thousands (or tens of thousands if you count all your pictures and mp3s), of files on our computer(s) and finding what you want depends nowadays greatly on your memory.

    With Spotlight-type (metadata driven) search you can narrow the results, query by query. You don't have to remember where you stored the file or with what name. Even if you know how to archive your files, I think these metadata based technologies are going to help us a lot in the future.

    And on a second note, let's remember that if you know how to find stuff on your computer, you belong to a minority of computer users. Most users have no idea how a computer works and therefore they don't know how to make efficient searches or how to archive their files practically. Metadata will help these people find their way around this machine that they are not familiar with.

  48. Re:This is largely irrelevant if you have experien by aralin · · Score: 1

    Nah, in large corporations, most of the metadata is stored by global information systems, which usually have a process in place that won't let you proceed unless you fill in the metadata. The point is that name-and-directory is as bad as hierarchical databases for all purposes, whereas metadata is on a par with relational databases in usability.

    --
    If programs would be read like poetry, most programmers would be Vogons.
  49. Metadata with clipart & Inkscape by Bryce · · Score: 1

    There's been work on adding Dublin Core metadata support to Inkscape, for its next release.

    The need for the metadata support is entirely practical in this case: the Open Clip Art Library requires all SVG submissions have proper metadata embedded, to ensure licensing and authorship correctness. Also, there is an SVG Clip Art Browser that uses the metadata info for its display.

    One interesting question that's come up recently and is being discussed on the lists: when you embed several pieces of clipart into a larger document, how do you access the RDF of the individual bits in Inkscape?