Slashdot Mirror


The Mac, Metadata, and the World

Rick Zeman writes: "ArsTechnica has posted yet another compelling article, this time on metadata, its history and the future of metadata storage as seemingly indicated by Apple in OS X. Extensions==Bad!"

3 of 307 comments

  1. BeOS has already SOLVED the FileType/Metadata prob by Eugenia Loli · · Score: 2, Informative
    I suggest everyone read here:

    http://www.beosbible.com/exc_filetype.html
    and here:
    http://www.beosbible.com/exc_query.html

    BeOS solved this problem years ago. BFS integrates all of these features into the OS itself, so every application can make use of them. The Byte.com BeOS articles by Scot Hacker are also a must-read!

  2. Re:Linux thoughts by TWR · · Score: 3, Informative
    I think part of the Mac fascination with file type is due to the monolithic program structure; you find the file, and then you open a single program that does to it anything that you will ever do to it. In this model, there is a right program, and which program is right is based on file type. Windows clearly suffers greatly from having this model but not having a more reliable fashion of determining file type than Linux.


    You clearly don't understand the type and creator fields.


    There are TWO separate fields for each file in the classic Mac OS. One (TYPE) indicates what kind of file it is. The other (CREATOR) indicates what program will open the file by default. Each is four bytes long.


    The nice thing about this system is that you get a clean separation between file typing AND default launching application. It's other OSes which have the "monolithic" structure you're talking about.
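
    As a rough illustration of that separation, here is a minimal sketch in Python. The class name, the helper functions, and the creator-to-application table are all made up for the example; they are not Mac OS APIs, and the real lookup was done by the system rather than by a dictionary like this one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FinderCodes:
    """Hypothetical container for the two four-byte codes described above."""
    file_type: bytes  # what kind of file it is, e.g. b"TEXT"
    creator: bytes    # which program opens the file by default

    def __post_init__(self):
        # Each field is exactly four bytes long.
        for name in ("file_type", "creator"):
            if len(getattr(self, name)) != 4:
                raise ValueError(f"{name} must be exactly four bytes")

    def pack(self) -> bytes:
        # Type followed by creator: eight bytes in total.
        return self.file_type + self.creator

    @classmethod
    def unpack(cls, raw: bytes) -> "FinderCodes":
        if len(raw) != 8:
            raise ValueError("expected exactly eight bytes")
        return cls(raw[:4], raw[4:])


# Illustrative table only (an assumption for this sketch).
DEFAULT_APPS = {b"ttxt": "SimpleText", b"R*ch": "BBEdit"}


def default_app(codes: FinderCodes) -> str:
    """Pick the launching application from the creator alone."""
    return DEFAULT_APPS.get(codes.creator, "unknown")


def same_kind(a: FinderCodes, b: FinderCodes) -> bool:
    """Compare file kinds using the type alone."""
    return a.file_type == b.file_type
```

    Two files built as FinderCodes(b"TEXT", b"ttxt") and FinderCodes(b"TEXT", b"R*ch") are the same kind of file, yet each opens in a different program by default.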



    Incidentally, has anyone else noticed that the MacOS scheme is equivalent to having 4 character extensions which aren't displayed, with the corresponding problem of having malicious executables named README.txt (or even README)?


    First of all, it'd be an 8-character extension. Secondly, List view on a Mac shows file type by default; an application is listed as "application program". Granted, icon view won't distinguish them unless you do a Get Info or sort by kind. Finally, if you don't trust the source of a file, don't open the file. This is common sense, no matter what extensions you are showing or what file system you are using.


    -jon

    --

    Remember Amalek.

  3. Re:Poor technical expertise from a Mac Apologist by John Siracusa · · Score: 2, Informative
    Barring anything academic, experimental, or "fancy," it's pretty clear he's never tried to think about UNIX linked-list style filesystems

    I assure you, that's not the case :)

    within the framework of his discussion, I would assert that a file's name is not part of its essential metadata in a UNIX-style FS. Why? All of the file information is contained in the file's inode and data blocks (the immediate decomposition into metadata and data being obvious). [...] unless one is willing to assert that inode + data blocks don't constitute a file, and that each instance of a reference to a particular inode is to be considered a file.

    But you can't get at the inode without the file's name and location. Inode numbers are not suitable as file identifiers, since they are not guaranteed to be unique across the multiple disks that make up a given file system. The combination of the file name and location is unique in a given file system. "inode + data blocks" do constitute a file, but the file is inaccessible unless the file name and location are known. Therefore, the file name is still essential metadata on a Unix-style file system.
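
    Both halves of this exchange can be seen in a small Python sketch, assuming a file system that supports hard links. The function name and the file names are invented for the example. Two names resolve to one inode, but every lookup still starts from a path, and the inode number only means something when paired with its device number.

```python
import os
import tempfile


def link_demo(directory):
    """Create one file with two names and report what stat() sees."""
    first = os.path.join(directory, "first.txt")
    second = os.path.join(directory, "second.txt")

    with open(first, "w") as f:
        f.write("same data\n")

    # Add a second directory entry (hard link) for the same inode.
    os.link(first, second)

    # stat() takes a path: the name and location are the way in.
    a = os.stat(first)
    b = os.stat(second)

    with open(second) as f:
        content_via_second = f.read()

    return {
        # The inode number is only meaningful together with the device.
        "same_file": (a.st_dev, a.st_ino) == (b.st_dev, b.st_ino),
        "link_count": a.st_nlink,
        "content_via_second": content_via_second,
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(link_demo(tmp))
```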

    Furthermore, the examples of "immutable" metadata (ill-considered vocabulary in the first place, I think)...

    I considered "data-dependent", but stuck with immutable, for better or for worse.

    ...are poorly considered. File size can be altered without altering the underlying data on BSD-style unices that provide truncation and extension system calls.

    Truncation is a modification of the data.
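
    A quick way to check this is to hash a file before and after changing its length. The sketch below uses Python's os.truncate; the function names and the sample file are assumptions made for the example, and it relies on the usual POSIX behaviour of padding an extended file with zero bytes.

```python
import hashlib
import os
import tempfile


def digest(path):
    """Return the SHA-256 hex digest of the file's contents."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


def truncate_demo(directory):
    """Record (size, digest) before and after shrinking and extending."""
    path = os.path.join(directory, "sample.bin")
    with open(path, "wb") as f:
        f.write(b"hello world")

    steps = [(os.path.getsize(path), digest(path))]
    for new_size in (5, 8):  # first shrink, then extend
        os.truncate(path, new_size)
        steps.append((os.path.getsize(path), digest(path)))
    return steps


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for size, checksum in truncate_demo(tmp):
            print(size, checksum[:16])
```

    Each step reports a different size and a different digest, so the size cannot be changed this way without the data changing as well.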

    Modification time often gets changed on many systems without any change to the underlying data

    See my previous post on the topic. Yes, the semantics of modification date vary wildly. But there's no reason that the semantics I chose in the example in the fundamentals section (which tries to ignore existing implementations) couldn't exist.

    "File type" is essentially a nonsense notion on most UNIX filesystems

    I agree, which is one of the reasons I didn't address the Unix philosophy of reducing everything to a sequence of bytes or blocks at the OS level.

    the notion of file type is at least partially bogus. There's nothing to stop me from interpreting data many different ways: an XPM is something I can edit with an ordinary text editor, and hence a file of type "text," but it can also define pixmaps, so depending on what I want to do with it, it might be of at least two file types.

    What you want is a type hierarchy that indicates that XPM is of general type "text" and, more specifically, it is an X pixmap. There's nothing "bogus" about the notion of file type. I think you're unnecessarily constraining yourself to very simple metadata values.
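
    One minimal way to picture such a hierarchy is a table mapping each type to its more general parent. The type labels and function names below are invented for the sketch; no real OS or type registry is implied.

```python
# Hypothetical type labels; each type points at its more general parent.
PARENT = {
    "xpm": "text",
    "c-source": "text",
    "text": "data",
    "raw-audio": "data",
    "data": None,  # root of the hierarchy
}


def lineage(file_type):
    """Return the chain from the specific type up to the root."""
    chain = []
    current = file_type
    while current is not None:
        chain.append(current)
        current = PARENT[current]
    return chain


def is_a(file_type, ancestor):
    """True if file_type is the ancestor or a specialisation of it."""
    return ancestor in lineage(file_type)
```

    Under this scheme a text editor can ask is_a("xpm", "text") and a pixmap viewer can ask is_a(some_type, "xpm"), with the file carrying a single, specific type value.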

    Similarly, I can try to view a raw audio file as a compiled pixmap, or, to recapitulate the famous joke, 'cat /boot/vmlinuz > /dev/audio'. The results of such voluntary file polymorphism aren't always useful, but they sometimes are.

    Storing file type metadata does not necessarily dictate what OS policies (if any) are based on that metadata, something the article tries to point out many times.

    It seems abundantly clear to me that the author is a thoughtful and well-educated person whose primary computing experience has been with Macs and post-DOS MS machines: and while he may have used UNIX-like operating systems, he doesn't know much about data representation of filesystems on them

    I'm not so sure about "well educated." ;-) My primary computing experience is on the Mac and in Unix. I just chose not to address the Unix angle, for various reasons.

    and clearly hasn't considered more modern developments like filesystems with journaling or ACLs instead of permission bits.

    I've certainly "considered" them, and I did mention ACLs in the article (spelled out rather than as an acronym; see page 4). That's all just more, richer metadata.